AC/E DIGITAL CULTURE ANNUAL REPORT 2017
Smart culture. Analysis of digital trends

BIG DATA IN THE DIGITAL HUMANITIES. NEW CONVERSATIONS IN THE GLOBAL ACADEMIC CONTEXT

Antonio Rojas Castro
@RojasCastroA (https://twitter.com/RojasCastroA)

Antonio Rojas Castro earned a doctorate in the humanities from the Universitat Pompeu Fabra (Barcelona, 2015). At the same university he was a pre-doctoral fellow (an FPI grantee) in the Todo Góngora II research group and a lecturer on academic writing and literary studies. In 2015 he co-edited a monographic issue of the magazine Ínsula on the Digital Humanities. He is currently editor of The Programming Historian en español, is in charge of communications at the European Association for Digital Humanities (EADH) and works as a post-doctoral fellow at the Cologne Center for eHumanities (Germany).

the methods that are currently available. This requirement is not unrelated to the work of humanists, who have always been in contact with neighbouring disciplines and currents such as anthropology, Marxism and gender studies. Indeed, in recent years humanists have established a fruitful dialogue with computer science and the social sciences, which has been called a “computational turn” (Berry, 2011). In this academic context, the expression “Big Data” has found its way directly into debates on “scale” (how can we study all the eighteenth- and nineteenth-century novels written in England, France, Germany, the United States or Japan?) or, more commonly, indirectly through concepts more familiar to humanists, such as “distant reading” (Moretti, 2007) or “macroanalysis” (Jockers, 2013).
These changes have been made possible by the fact that statistical and computing methods, as well as other methods related to the social sciences, have been modified and have succeeded in adapting their conceptual models to the complexity of texts (English and Underwood, 2016). In other words, we are dealing with a genuine conversation in which the various interlocutors talk and listen to each other.

Concerning the particular in the universal

The expression “Big Data” has been spreading in the experimental sciences and the media since 2011, as if an increased amount of available data were the next scientific breakthrough. The term is used in academia, industry and the media… but what exactly does it mean? Is it an object of study, a method, a group of technologies or a discipline?

Introduction

Christmas 2016. A perfect time to think back, sum up and publish lists of the main events of the year. Google Trends published the most popular searches grouped into categories such as “News”, “People”, “Technology”, “Films”, “Music”, “Sport” and “Deaths”. A few days earlier the Swedish company Spotify, which provides online access to millions of songs, launched an advertising campaign based on data produced by users. Some of the huge posters plastered all over the streets of London display messages such as: “Dear person who played ‘Sorry’ 42 times on Valentine’s Day, what did you do?”; or “Dear 3,749 people who streamed ‘It’s the End of the World as We Know It’ the day of the Brexit vote, hang in there.” Spotify’s campaign is both surprising and effective because it plays on the viewer’s engagement. But what has all this got to do with the humanistic disciplines that study documents, texts and images of the past?
Or, in other words, how can handling the large amounts of data amassed by companies help us gain a better understanding of the limits of our thought, language and historical events – in short, all the expressions of the human mind?

If we accept that humanistic disciplines such as philosophy, philology and history are characterised not only by a specific object of study but also by a method that seeks to understand particular, unusual and even unique cases through text commentary, then the answer will no doubt be negative: “nothing, or very little”. However, as Professor Rens Bod (2013) recently argued, since antiquity humanists have also sought general principles, laws and patterns to explain our culture, and have often (for better or worse) changed how we perceive the world. We should begin by dismissing certain clichés about the humanities and ask ourselves about their classic objects of study, bearing in mind

example of the type of projects carried out. Since 2015, the association has devoted a space on its website to documenting and promoting access to European Digital Humanities projects conducted in the past five years. The initiative is participatory in nature because any researcher (whether or not they belong to the association) can fill in the form available on the website and submit a description of their project, giving the name of the project, a descriptive summary, the collaborating institutions and the team in charge, among other fields. At the time of writing this article, the association has received 175 submissions.
If the titles and summaries are analysed with Voyant, a tool for counting the most frequently used words, it is easy to see that the projects abound in words related to the subject of this article, such as “data”, “information” and “database”, and others that denote the scale or size of the project, including “archive”, “collection”, “platform” and “library”.

The current state of the Digital Humanities in Europe can be gauged by three aspects: projects, tools and research groups. Prominent among the projects for making digital texts available online are the Oxford Text Archive, the Deutsches Textarchiv, the Eighteenth-Century Poetry Archive and DigiLibLT. Tools for textual analysis include Alcide, CATMA and Stylo R. Infrastructures and research groups such as CLARIN, CLiGS and the Electronic Text Reuse Acquisition Project are also important. These initiatives use algorithms to attribute authorship of texts (Burrows, 2002), discover latent themes underlying a large group of texts (Blei, 2012), or detect cases of intertextuality in several authors’ literary output (Ganascia, Glaudes and Del Lungo, 2015). Suffice it to say

[Figure: Words used most frequently to describe Digital Humanities projects in Europe. CC-BY]

One of the few articles to have shed some light on the matter is entitled “Undefined by Data: A Survey of Big Data Definitions”. The authors (Ward and Barker, 2013) collate the various definitions of “Big Data” provided by major technology companies like Oracle, Intel and Microsoft and by a few previous reports. In general, the definitions combine two important ideas: storing a large volume of data (some authors speak of 500 Terabytes per week); and analysing this data quantitatively and visually to find patterns, establish laws and predict behaviour.
The classic definition of “Big Data” is a formula that is easy to understand and memorise – the three Vs: Volume (Terabytes, Petabytes, Exabytes), Velocity (data that is constantly generated) and Variety (texts, images, sounds) (Ward and Barker, 2013). Some reports have subsequently added a fourth V, which stands for Veracity. However, this volume-based definition of Big Data only makes sense if we consider blogs, social media and sensors to be the main sources of data. In contrast, the classic objects of study of the humanities are usually texts and analogue images which have fortunately been digitised and published in machine-readable format.

In other words, if we take the three Vs as a basis, we have to admit that we cannot speak of Big Data in the strict sense in the humanities. For one thing, the classic works of Spanish Golden Age poetry fit onto a 4GB pen drive; for another, archives and libraries do not constantly produce new data at high speed on our poets, writers or artists (or rather, this data is not accessible to researchers). As for variety, we are dealing with image files in TIFF, JPEG or another similar format, and semi-structured text in XML format or, without markup, in TXT format.

Before the advent of Google Books in 2004, digital humanists worked to digitise corpora of texts and images in the form of digital editions, libraries and archives. The European Association for Digital Humanities (EADH) provides a good

algorithms for studying large holdings of texts and images quantitatively. Indeed, digital humanists have played an active part in the debates on the nature of data.

In a context in which data is equated with objective, irrefutable evidence, digital humanists have repeatedly pointed out that data is in fact a human construction; that is, it is conditioned by the time, place, language and ideology of the actors involved in gathering it.
For example, the researcher Johanna Drucker (2011) rejects the term “data” (Latin for “that which is given”) and uses instead the term “capta”, meaning “that which has been taken or collected”; this critical intervention highlights the partial and incomplete nature of data.

Digital humanists have also stressed the temporality of data, for all data has a date of creation and expiry, and the fallacy of separating data from metadata (that is, data about data, such as title, creator, theme, description, date, format, identifier, source, language, etc.). There is in fact no such thing as second-order data, as the prefix meta might suggest; metadata is just as important, selective and partial as data because it is produced by humans (or rather by algorithms designed by human beings). Equally invalid is the distinction, which dates back to Lévi-Strauss’s culinary triangle, between “raw data” and “cooked data” or between “data”, “raw material” and “information”.

Indeed, for researchers like Tom Boellstorff (2013), data is dense, interpretative and contextual, and it is therefore preferable to speak of “thick data”. Paraphrasing the anthropologist Clifford Geertz, data should be regarded as “our own constructions of other people’s constructions” of objects imagined by a particular community.

For example, the Text Encoding Initiative is a non-profit organisation that publishes recommendations on how to encode humanistic texts with XML markup language so that they

that many of these procedures are comparable to automatic image processing (Rosado, 2015). The ultimate aim is usually to find patterns that help understand literary and artistic creations.
But text commentary – close reading – continues to play an important role even when statistical methods are used to analyse texts, because researchers shift their attention from the whole to the detail and from the detail to the whole to check that their ideas about the work are correct and so gain a better understanding of the different layers of meaning, the central themes, the events and the style. Put another way, distant reading and close reading are not mutually exclusive, because researchers usually combine both strategies: they first gain an overview and then filter and examine the details to reach a deeper understanding. They usually complete their analysis with visualisations of information in the form of marginal annotations, parallel texts that are connected in some way (colours, density, contrast between form and substance, arrows) or more abstract structures like maps, trees and graphs (Jänicke, Franzini, Cheema and Scheuermann, 2015).

To sum up, although the volume of data is not comparable to that currently generated by social media, blogs and major companies, in the humanities (and specifically in literary studies) we can only speak of Big Data in connection with the technologies associated with this phenomenon, such as data mining, stylometry or natural language processing.

Data as a human construction

The conversation between the humanities and Big Data does not merely boil down to adopting
writings. Jean-Gabriel Ganascia (2015: 632–33), for example, claims that a theory or previous hypothesis is no longer necessary if we analyse all the existing data as opposed to a sample or small group, as has been done so far.
In contrast to this viewpoint, a considerable number of writings have confirmed the importance of theories, models and hypotheses for research. It should be remembered that our cultural heritage (documents, texts, paintings, images, sounds) is not fully digitised, despite the collective efforts of initiatives like Europeana. According to the latest report issued by the European Commission project ENUMERATE (Nauta and Van den Heuvel, 2015), only 23% of European collections have so far been digitised. The survey was answered by some 1,000 European institutions including libraries, museums and archives. These institutions have yet to digitise some 50% of their collections and admit that about 27% of their holdings will not be digitised. These figures highlight the fact that much of our heritage is not accessible on the internet.

Digitisation always involves making a selection based on the resources available to the institution or working group in charge of digitising the documents; but this selection also stems from ideological and identity-related motives. It should not be forgotten that museums, libraries and archives are publicly funded institutions and their role is to preserve and disseminate the cultural heritage of a community (for example, a nation). In addition, formats, markup languages and algorithms are also part of a particular culture and ideology and go hand in hand with many assumptions that vary depending on the context.

From a humanistic viewpoint, it is thus hard to believe that analysing large amounts of data could render the scientific method useless, because we never have all the existing data – one of the vectors of Big Data is the Velocity with which new data is generated – because the data is

are interchangeable and, more or less, standard. It is a participatory organisation in which any researcher can suggest changes or improvements, based on their experience, to the set of labels defined by the consortium.
Up until 2012, however, none of its members had questioned the fact that the label for describing the sex of a person mentioned in a text complied with standard ISO/IEC 5218:2004 and that the values of its attribute (@value) were given as the single-digit codes 1 (male), 2 (female), 9 (not applicable) and 0 (not known). The situation was re-examined when a female researcher pointed out that this typology was sexist, as it put women in second place with respect to men and codified patriarchal structures in markup language (Terras, 2013). With this I do not wish to detract from the importance of the TEI, especially in giving shape to the Digital Humanities, but rather to stress that technology, data, algorithms and standards are the product of an interpretation of the world and bear cultural marks. In conclusion, data should not be viewed as absolute truth but should be questioned critically.

In defence of theory

In the literature on Big Data it is also common to find theory discredited. The argument is basically as follows: if we have large amounts of data and effective statistical methods, we do not need theories, models and hypotheses, which need to be proven or refuted with experiments. Put another way, in the era of the Petabyte, the scientific method is obsolete (Anderson, 2008). The dismissal of theories and models has not only been given credit in the business world, but has also been accepted in a few humanistic

The connection between the external object (for example, an epigraphic inscription) and the representation (a 3D reconstruction that allows the tombstone to be viewed from various angles and in greater detail) is based on similarity; it is therefore important to place reflection on “modelling” in the context of the tradition of semiotics and the science of signs (Ciula and Eide, 2016). Naturally there are different degrees of similarity; the relationship can range from total likeness to metaphor, including a certain similarity between the properties of the object represented and the digital representation.

Digital models are thus icons that help us think and learn more about the original, the analogue object. This type of reasoning has been described as “abduction”, because it stands somewhere between induction and deduction and is based on the intuition and experience of the person who “models” (Bryant and Raja, 2014). In other words, the process of modelling is influenced by contextual elements such as starting hypotheses, theoretical assumptions, scientific methods, formats and technologies.

[Figure: 3D modelled epigraphic inscription. © Epigraphia 3D, http://www.epigraphia3d.es/]

erroneous or ambiguous, or because data processing (automatic or otherwise) is determined by our culture and, therefore, has ideological biases.

Take the case of CollateX, a tool designed to compare texts with slight variations and align the parts of the texts that are different. Among other assumptions of the algorithm, it should be stressed that for CollateX it is not relevant to distinguish between a transposition or change of place of a portion of text (for example, in a poem, a stanza that appears displaced or in a different place) and a substitution (that is, elimination of a stanza from one place and the addition of the same lines in another place) (Van Zundert, in press).
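To make the consequence of such a design choice concrete, the following is a minimal sketch in Python. It does not use CollateX or reproduce its algorithm: it relies on `SequenceMatcher` from Python's standard `difflib` module, which has no notion of transposition at all, and the stanza labels and the helper names `align` and `moved_segments` are hypothetical. Under those assumptions, a stanza displaced to the end of a poem is reported as a deletion plus an insertion, and reading it as a move requires a separate interpretive step.

```python
from difflib import SequenceMatcher


def align(witness_a, witness_b):
    """Return the edit operations (other than 'equal') that turn A into B."""
    matcher = SequenceMatcher(a=witness_a, b=witness_b, autojunk=False)
    return [
        (tag, witness_a[i1:i2], witness_b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def moved_segments(operations):
    """Interpretive step: segments deleted in one place and inserted unchanged elsewhere."""
    deleted = [old for tag, old, _ in operations if tag == "delete"]
    inserted = [new for tag, _, new in operations if tag == "insert"]
    return [segment for segment in deleted if segment in inserted]


# Two hypothetical witnesses of the same poem; stanza 1 is displaced to the end.
witness_a = ["stanza 1", "stanza 2", "stanza 3", "stanza 4"]
witness_b = ["stanza 2", "stanza 3", "stanza 4", "stanza 1"]

operations = align(witness_a, witness_b)
for tag, old, new in operations:
    # The aligner itself only knows about deletions, insertions and replacements.
    print(tag, old, new)

# Only this second pass treats the pair of operations as a single transposition.
print("read as transposition:", moved_segments(operations))
```

Whether the two operations are read as one transposition or as two independent changes is decided outside the alignment itself, so the same pair of witnesses can support different editorial interpretations depending on which reading the tool or the researcher adopts.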
Here the question is not to establish whether CollateX’s algorithm is correct. Researchers may or may not agree, but the key lies in knowing about this choice, this preference, and being aware that it conditions results and interpretations.

Indeed, a few authors argue that theories and models are even more important in the era of Big Data because it is necessary to explain and understand the phenomena analysed through abstractions. In the Digital Humanities the concept of “model” is very widespread because it helps explain the core of digitisation work. Models are taken as tools, schemes or designs used in a specific context for particular purposes that are sometimes practical (to make a group of texts available online), but are often, especially in the academic field, speculative (to understand the structure of texts). More than the finished product, what matters in the Digital Humanities is the creative process that takes place when a phenomenon is “modelled”, because the aim is to gain new knowledge, new meanings, by generating an external object that represents it.

Nevertheless, this type of data is not accessible because municipal libraries have a long tradition of data protection (Starr, 2004). They do, however, publish lists of the most frequently borrowed books, which function as indicators of contemporary taste. In order to be studied, this data would have to be published in an open format like XML or CSV and include metadata such as the place and time of the loan, but such practices would encroach on users’ privacy. For researchers interested in reading habits, enjoying access to so much data would be a breakthrough.
For example, it would be possible to ascertain how films, television and advertising influence people’s tastes and reading habits.

Manufacturers of e-book readers, for example, are already using reading statistics to discover which books can be regarded as good – because readers finish them – despite not being best sellers; or to identify the next Dan Brown based on readers’ degree of satisfaction with books written by unknown authors (Kobo, 2014). Basically, all the data generated by our e-book readers is amassed by publishing companies to learn more about the relationship between sales and customer satisfaction; this makes it easier to justify economic decisions about the publishing future of a particular author, literary saga or genre.

By this I do not mean to imply that public libraries and museums should act in the same way as companies. I merely wish to point out that the state of being watched existed before social media – just as spaces of resistance did. Just as companies like Twitter have been accused of exerting coercive power over research in the social sciences (Reichert, 2015), we should ask ourselves how humanists can study citizens’ cultural habits, in constant dialogue with libraries and museums and using methods to anonymise data. In my view, we should aim to ensure that companies like Spotify and Amazon do not know more about a particular society – about our tastes, interests and moods – than its own members do.

Inside the Panopticon

The constant production of large amounts of data in real time through social media also has a sinister counterpart. It is not unusual for Big Data to be compared to Big Brother or, better still, to the Panopticon – a type of penitentiary building devised by Jeremy Bentham in the eighteenth century which creates the sensation of being constantly watched – especially in the wake of the Edward Snowden case.
Governments monitor citizens to ensure their security; this is by no means new and is part of the history of power structures studied by Michel Foucault, among others. In the modern state people are watched and, at the same time, encouraged to reveal their deepest secrets through confession, psychoanalytical therapy or, nowadays, by posting their “statuses” on Facebook.

As we have seen, the object of study of the humanities tends to be external, autonomous and finished – a historical document, a literary text, a visual representation – and research therefore does not usually pose ethical dilemmas about the privacy of creators and recipients. However, as consumers of culture, our acts are registered every time we search for a book, film or song on the internet, and when we click on a product and buy it; the same is true when we visit a museum – the surveillance camera is there to protect our heritage from crime and theft, but also to keep a check on visitors; lastly, when we borrow a book from a public library a record is created in the database.

The case of public libraries is particularly interesting because they are a type of neighbourhood infrastructure accessible to everyone regardless of their economic status. Librarians record all loans, noting the date and borrower, in their databases.

Conclusions

Since 2011 the expression “Big Data” has been widely used in the experimental sciences and the media as if the increased amount of available data were the next scientific breakthrough. Although there is plenty of hype, the humanities have not been unaffected by this phenomenon; more specifically, although the digitisation of our cultural heritage is incomplete, several publications can be found which enter into conversation with Big Data and the social sciences. In European academia, there are many notable projects that process large amounts of data in order to study language, literature or art using techniques such as natural language processing, computer vision, topic modelling and stylometry.

After analysing the meaning of the expression “Big Data”, this article highlights the cultural nature of data and defends the validity of theories, models and hypotheses for carrying out scientific research. Lastly, it discusses the dialectic between privacy and control. In a sense, this issue falls outside the traditional field of the humanities, but it also deserves our attention as twenty-first-century citizens interested in the cultural practices of the present. Humanists no doubt have much to contribute to ethical and epistemological debates on the use of the data generated by citizens, recalling the “captured” and cultural nature of data, and bringing their experience to analysing particular cases bearing in mind the general context.

Bibliography

Anderson, Chris (23.06.2008). “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete”. Wired. https://www.wired.com/2008/06/pb-theory/

Berry, D. M. (2011). “The Computational Turn: Thinking about the Digital Humanities”. Culture Machine 12. http://www.culturemachine.net/index.php/cm/article/viewarticle/440

Blei, David M. (2012). “Probabilistic Topic Models”. Communications of the ACM 55.4: pp. 77–84. http://cacm.acm.org/magazines/2012/4/147361-probabilistic-topic-models/fulltext

Bod, Rens (2013). A New History of the Humanities. Oxford University Press.

Boellstorff, Tom (2013). “Making Big Data, in Theory”. First Monday 18.10. http://firstmonday.org/ojs/index.php/fm/article/view/4869

Bryant, Anthony and Raja, Uzma (2014). “In the Realm of Big Data…”. First Monday 19.2. http://firstmonday.org/ojs/index.php/fm/article/view/4991

Burrows, John (2002). “‘Delta’: a Measure of Stylistic Difference and a Guide to Likely Authorship”. Literary and Linguistic Computing 17.3: pp. 267–87.

Ciula, Arianna and Eide, Øyvind (2016). “Modelling in the Digital Humanities: Signs in Context”. Digital Scholarship in the Humanities.

Drucker, Johanna (2011). “Humanities Approaches to Graphical Display”. DHQ: Digital Humanities Quarterly 5.1. http://www.digitalhumanities.org/dhq/vol/5/1/000091/000091.html

English, James F. and Underwood, Ted (2016). “Shifting Scales: Between Literature and Social Science”. Modern Language Quarterly 77.3: pp. 278–95. http://mlq.dukejournals.org/content/77/3/277.full

Ganascia, Jean-Gabriel (2015). “Les Big Data dans les Humanités”. Critique 818–19: pp. 627–36. https://www.cairn.info/revue-critique-2015-8-page-627.htm

Ganascia, Jean-Gabriel, Glaudes, Pierre and Del Lungo, Andrea (2015). “Automatic Detection of Reuses and Citations in Literary Texts”. Literary and Linguistic Computing 29.3: pp. 412–21.

Jänicke, S., Franzini, G., Cheema, M. F. and Scheuermann, G. (2016). “Visual Text Analysis in Digital Humanities”. Computer Graphics Forum 35.2. DOI: 10.1111/cgf.12873

Jockers, Matthew (2013). Macroanalysis. Digital Methods and Literary History. University of Illinois Press.

Kobo (2014). “Publishing in the Era of Big Data”. http://news.kobo.com/_ir/159/20149/Publishing%20in%20the%20Era%20of%20Big%20Data%20-%20Kobo%20Whitepaper%20Fall%202014.pdf

Moretti, Franco (2007). Graphs, Maps, Trees: Abstract Models for Literary History. London: Verso.

Nauta, Gerhard Jan and Heuvel, Wietske van den (2015). Survey Report on Digitisation in European Cultural Heritage Institutions 2015. ENUMERATE. http://dataplatform.enumerate.eu/reports/survey-report-on-digitisation-in-european-cultural-heritage-institutions-2015/detail

Nudd, Tim (2016). “Spotify Crunches User Data in Fun Ways for This New Global Outdoor Ad Campaign”. Adweek. http://www.adweek.com/adfreak/spotify-crunches-user-data-fun-ways-new-global-outdoor-ad-campaign-174826

Reichert, Ramón (2015). “Big Data. Digital Media Culture in Transition”. Poetics and Politics of Data. The Ambivalence of Life in a Data-Driven Society. Sabine Himmelsbach and Claudia Mareis (eds.). Basel, pp. 147–66.

Rosado Rodrigo, Pilar (2015). Formas latentes: protocolos de visión artificial para la detección de analogías aplicados a la catalogación y creación artísticas. Doctoral thesis. Universitat de Barcelona. http://www.tdx.cat/handle/10803/300302

Starr, Joan (2004). “Libraries and National Security: An Historical Review”. First Monday 9.12. http://firstmonday.org/ojs/index.php/fm/article/view/1198/1118

Terras, Melissa (27.05.2013). “On Changing the Rules of Digital Humanities from the Inside”. https://melissaterras.org/2013/05/27/on-changing-the-rules-of-digital-humanities-from-the-inside/

Van Zundert, Joris (in press). Digital Scholarship in the Humanities.

Ward, Jonathan Stuart and Barker, Adam (2013). “Undefined By Data: A Survey of Big Data Definitions”. https://arxiv.org/abs/1309.5821

Digital resources

1. Alliance of Digital Humanities Organizations: http://adho.org/
2. Europeana: http://www.europeana.eu/portal/es
3. European Association for Digital Humanities: http://eadh.org/
4. FreeLing: http://nlp.lsi.upc.edu/freeling/node/1
5. Asociación Humanidades Digitales Hispánicas: http://www.humanidadesdigitales.org/inicio.htm;jsessionid=FDC5ED5B005786714E45936B6E127DF8
6. Text Encoding Initiative: http://www.tei-c.org/index.xml
7. The Programming Historian: http://programminghistorian.org/
8. Stylo R: https://sites.google.com/site/computationalstylistics/stylo
9. Voyant: http://voyant-tools.org/
10. Google Arts & Culture: https://www.google.com/culturalinstitute/beta/u/0/?utm_campaign=cilex_v1&utm_source=cilab&utm_medium=artsexperiments&utm_content=freefall

Tweeters

1. Ted Underwood: @Ted_Underwood
2. Lev Manovich: @manovich
3. Nuria Rodríguez Ortega: @airun72
4. Greta Franzini: @GretaFranzini
5. Deb Verhoeven: @bestqualitycrab
6. Frank Fischer: @umblaetterer
7. Matthew Lincoln: @matthewdlincoln
8. José Calvo: @eumanismo
9. Elena González Blanco: @elenagbg
10. Dan Cohen: @dancohen