Crowd-sourcing as a Component of Humanities Research Infrastructures

Stuart Dunn, Mark Hedges

Centre for e-Research, Department of Digital Humanities, King's College London, 26-29 Drury Lane, London, UK
mark.hedges@kcl.ac.uk, stuart.dunn@kcl.ac.uk

Abstract: Crowd-sourcing, the process of leveraging public participation in or contribution to a project or activity, is relatively new to academic research, but is becoming increasingly important as the Web transforms collaboration and communication and blurs the boundaries between the academic and non-academic worlds. At the same time, digital research methods are entering the mainstream of humanities research, and there are a number of initiatives addressing the conceptualisation and construction of research infrastructures for the humanities. This paper examines the place of crowd-sourcing activities within such initiatives, presenting a framework for describing and analysing academic humanities crowd-sourcing, and using this framework of 'primitives' as a basis for exploring potential relationships between crowd-sourcing and humanities research infrastructures.

Keywords: crowd-sourcing, research infrastructures, citizen science, scholarly primitives, typology.

Introduction

Crowd-sourcing, 1 the process of leveraging public participation in or contribution to a project or activity, is relatively new to academic research, and even more so to the humanities. However, at a time when the Web is transforming the way in which people collaborate and communicate, and is blurring boundaries between the spaces inhabited by the academic and non-academic worlds, it has never been more important to examine the role that public communities are beginning to play in academic humanities research. At the same time, digital research methods are starting to enter the mainstream of humanities research, and there are a number of initiatives addressing the conceptualisation and construction of research infrastructures that would support a shift from ad hoc projects and centres to an environment that is more integrated and sustainable.
Such an environment will inevitably be distributed, integrating knowledge, services and people in a loosely-coupled, collaborative 'digital social marketplace'. 2 The question naturally arises as to where crowd-sourcing activities fit within this framework. More specifically, what contributions can public participants, and the communities to which they belong, make to a humanities research infrastructure, and conversely how can these participants and communities, and the academic researchers who make use of the knowledge and effort that they contribute, benefit from such participation? To begin to address these questions is one of the aims of this paper. The paper is organised as follows: we begin by describing the context in which the work was carried out, and the methodology used. We then review a number of existing terminologies and typologies for crowd-sourcing and related concepts, and follow this with an analysis of the main motivations for engaging with crowd-sourcing, from both the volunteer's and the academic's points of view. Finally, we build upon this by presenting the outline of a framework for describing and analysing academic humanities crowd-sourcing projects, and use this framework of 'primitives' as a basis for exploring the potential relationships between various forms of crowd-sourcing activity and humanities research infrastructures.

Background and Methodology

The research described in this paper was mostly carried out as part of the Crowd-sourcing Scoping Study project (Ref. AH/J01155X/1), which ran for nine months, from February to November 2012, and was funded by the Arts and Humanities Research Council as part of its Connected Communities programme. The study's methodology had four main components:

• a literature review covering academic humanities research that has incorporated crowd-sourcing, research into crowd-sourcing as a method, and less formal outputs such as blogs and project websites;
• two workshops facilitating discussion between, respectively, humanities academics who have used crowd-sourcing, and contributors to crowd-sourcing projects;
• an online survey of contributors to crowd-sourcing projects, exploring their backgrounds, histories of participating in such projects, and motivations for doing so;
• interviews with academics and contributors.

The study does not claim to be comprehensive: there are bound to be important projects, publications, individuals and activities that have been omitted, and there is a strong UK and Anglophone focus in the activities studied. In particular, while the survey was widely publicised, it was self-selecting and makes no claim to being statistically representative; it functioned rather as a means of gathering qualitative information about contributors' backgrounds and motivations.

Crowd-sourcing and related concepts

The term crowd-sourcing was coined in a Wired article by Jeff Howe, 3 in which he draws a parallel between reducing labour costs by outsourcing to cheaper countries, and utilising 'the productive potential of millions of plugged-in enthusiasts'. In an academic context, the term has developed from an economic focus to an information focus, in which this productive potential is used to achieve research aims. However, the term is problematic and requires further analysis. It is first necessary to distinguish crowd-sourcing from some related concepts.
It is broader and less easy to define than 'citizen science', which is commonly understood to refer to activities whereby members of the public undertake well-defined and (individually) small-scale tasks as part of larger-scale scientific projects. 4 Another related concept is the 'Wisdom of Crowds', 5 which holds that large-scale collective decision-making can be superior to that of individuals, even experts. Although academic crowd-sourcing can involve decision-making, the decisions involved are rarely as neatly packageable as those implied in the world of business, where the 'good' or 'bad' nature of a decision can be evaluated on the basis of profitability. 6 Such collective decision-making also lacks the elements of collaboration around activities conceived and directed for a common purpose that characterise crowd-sourcing as commonly understood. Another important distinction is that between crowd-sourcing and 'social engagement'. 7 According to Holley, social engagement involves 'giving the public the ability to communicate with us and each other', and is 'usually undertaken by individuals for themselves and their own purposes', whereas crowd-sourcing 'uses social engagement techniques to help a group of people achieve a shared, usually significant, and large goal by working collaboratively together as a group'. Holley also notes that crowd-sourcing is likely to involve more effort, and implies a level of commitment and participation that goes beyond casual interest, whereas social engagement is an extension of the kinds of online activities, such as Tweeting and commenting, that millions do on a daily basis anyway. In one way, this aligns crowd-sourcing with 'citizen science'. Indeed, Wiggins and Crowston develop this theme by highlighting a distinction between citizen science and community science, and stating as a key ingredient of the former that it is not self-organising and 'does not represent peer production ... because the power structure of these projects is usually hierarchical'. 8 A fundamental aspect of citizen science is thus that the goal is defined by a particular person or group (almost always as part of a professional academic undertaking), and the participants (recruited through an open call) provide some significant effort towards achieving that goal. However, the different intellectual traditions of the sciences and the humanities embrace, and are embraced by, different kinds of non-academic community. Indeed, as Trevor Owens has noted, most successful crowd-sourcing activities in the humanities and cultural sectors are not really about crowds at all, in the sense of 'large anonymous masses of people', but are about 'participation from interested and engaged members of the public'. 9 While a crowd-sourcing project may have the capacity for involving large numbers of people, in many cases only a few contributors end up being actively engaged, and these contribute a large percentage of the work. While there may be a centralised recruitment process, at this level the body of contributors is self-organising and self-selecting. A number of attempts have been made to identify the key characteristics, or to formulate a typology, of crowd-sourcing and related activities.
Estellés-Arolas and González-Ladrón-de-Guevara identify eight characteristics, distilled from 32 distinct definitions identified in the literature: the crowd; the task at hand; the recompense obtained; the crowdsourcer or initiator of the crowdsourcing activity; what is obtained by the crowdsourcing process; the type of process; the call to participate; and the medium. 10 This extremely processual definition is comprehensive in identifying stages that map easily to business processes. For the humanities, the 'type of process' is both more significant and more problematic, given the great diversity of processes in the creation of humanities research material. A more task-oriented approach is taken by Wiggins and Crowston, 11 who construct a typology for 'citizen science' activities, identifying five areas of application: Action, Conservation, Investigation, Virtual, and Education. The factors that lead to an activity being assigned to a category are multivariate, and the identification of the categories was based on whether or not an activity occurs in a category, rather than on the frequency of such occurrences. The coverage is therefore extremely broad; 'Action', for example, covers self-organising citizen groups that use web technologies to achieve a common purpose, often to do with campaigns on local issues. Moreover, the use of the word 'science' (at least in the usual Anglophone sense) confines the activities reviewed (in terms of both the methods and the content) to a particular epistemic bracket, which inevitably excludes some aspects of humanities research. One widely-quoted set of definitions for citizen science projects was presented by Bonney et al. 12 This divided the field into three broad categories: contributory projects, in which members of the public, via an open call, contribute along lines that are tightly defined and directed by scientists; collaborative projects, which have a central design but to which members of the public contribute data, and may also help to refine project design, analyse data, or disseminate findings; and co-created projects, which are designed by scientists and members of the public working together, and for which at least some of the public participants are actively involved in the scientific process. This approach shares important characteristics with the 'task type' described below, in that it is rooted in the complexity of the task, and the amount of initiative and independent analysis required to make a contribution. The Galleries, Libraries, Archives and Museums (hereafter GLAM) sectors have in particular seen efforts to develop crowd-sourcing typologies. One such typology has been proposed by Mia Ridge in a blog post, 13 and includes the following categories: Tagging, Debunking (i.e. correcting/reviewing content), Recording a personal story, Linking, Stating preferences, Categorizing, and Creative responses. Again, these categories imply a processual approach, concerning the type of task being carried out, and are potentially extensible across different types of online and physical content and collections. Another typology from the GLAM domain was developed by Oomen and Aroyo. 14
Their categories include Correction and Transcription, defined as inviting users to correct and/or transcribe outputs of digitisation processes (a category that Ridge's 'Debunking' partially, but not entirely, covers); Contextualisation, or adding contextual knowledge to objects, by constructing narratives or creating User Generated Content (UGC) with contextual data; Complementing Collections, which is the active pursuit of additional objects to be included in a collection; Classification, defined as the gathering of descriptive metadata related to objects in a collection (Ridge's 'Tagging' is a subset of this); Co-curation, which is using the inspiration and expertise of non-professional curators to create (Web) exhibits (somewhat analogous to the co-created projects of Bonney et al., but more task-oriented); and Crowdfunding, or the collective cooperation of people who pool their money and other resources together to support efforts initiated by others. 15 Ridge explicitly rejects crowdfunding as a component of crowd-sourcing. 16 These typologies from the GLAM world perhaps represent best the different crowd-sourcing activities examined by the study, although such lists of categories do not reflect fully the complexity of the situations encountered. Instead, we propose a typology that is orientated along four distinct, although interdependent, facets, as described in 'Crowd-sourcing and research infrastructures' below.

Motivations

Motivations of participants

Overview

Most studies have concluded that crowd-sourcing contributors typically do not have a single motivation; our own survey indicated overwhelmingly (79%) that the contributors who responded have both personal and altruistic motivations. However, in many cases it is possible to identify a dominant motivating factor, which is almost always concerned directly with the activity's subject area. In an analysis of 207 forum posts and interview responses, for example, the Galaxy Zoo project found that the top motivations were an interest in astronomy (39%), a desire to contribute (13%) and a concern with the vastness of the universe (11%). 17 A study of volunteers for the Florida Fish and Wildlife Conservation Commission's Nesting Beach Survey found that concern for turtle conservation was the overwhelming motivating factor. 18 Moreover, studies of the motivations of the contributors to academic crowd-sourcing projects have emphasised personal interest in the subject area concerned, and the opportunities provided to exercise that interest and to engage with people who share it, without material benefit. Such interest is usually concerned with the outcome, but it can also be in the process, or some combination of both. For example, in her 2009 assessment of volunteers to the TROVE project, Holley notes that 'a large proportion was family history researchers', who were highly motivated and had 'a sense of responsibility towards other genealogists to help not only themselves but other people where possible'. 19 In general, it may be said that research into crowd-sourcing motivations suggests a clear primary, although not exclusive, focus on the subject or activity area, and that motivations can be personal or altruistic, and extrinsic or intrinsic.
Rewards

For the most part, crowd-sourcing projects do not reward their contributors directly in material or professional terms, and conversely contributors to crowd-sourcing projects are not subject to discipline (in either sense) or sanction in the way that members of conventionally-configured research projects are. Indeed, it is clear that the motivations of participants in academic crowd-sourcing tend to be intrinsic to the activity. However, we may regard more indirect benefits as constituting a form of reward: the fulfilment of an interest in the subject; personal gains such as skills, experience or knowledge; some form of status; or a feeling of gratification. In our survey, contributors mentioned a number of skills gained, including general IT competencies, such as editing wikis and using Skype for distributed collaboration, as well as specialised skills such as TEI encoding. Many contributors gained domain knowledge, for example through the opportunity to edit historical documents (ships' histories) resulting from participation in the Old Weather project. This project showed that the domain interests of the participants can differ from those of the project team, which in this case is solely interested in those parts of the documents being transcribed that relate to climate history, 20 whereas several contributors became interested in the histories of individual ships, and in addressing niches of history that had been hitherto unexplored. Participants can also pick up a basic grounding in research methods of collation, synthesis and analysis in the area of interest to them. Less concrete benefits also function as rewards. It was frequently noted that some form of 'feedback loop', through which a participant is informed that their contributions were correct and valuable, is a very important motivating factor for engaging with crowd-sourcing projects, and conversely that a lack of feedback can be very frustrating and discouraging to the participant. Feedback also plays a key role in building a sense of community, and making participants feel that they have a stake in the project. For complex tasks, feedback may also be a necessary part of improving volunteers' work practices, as in Transcribe Bentham. 21 This feedback can be immediate and specific to an individual contribution: for example, participants in the British Library's Georeferencer project (BLG) 22 could see the results of their work immediately. Or it can be deferred and cumulative, for example by means of rankings. Contributors may receive various 'social' rewards, for example through rankings, increased standing in the crowd-sourcing community, or (in the case of Galaxy Zoo) being credited and named in publications. Similarly, contributors may be subjected to social sanctions, such as banning (e.g. removal of pages or blocking of accounts on Wikipedia), which can adversely affect their reputation and enjoyment, and may even in rare cases reflect on their professional standing. As well as simple feedback interactions between the project and an individual user, the ability to interact with other participants, for example via a project forum, is an extremely important motivation. Such project-based social networks are used both for 'exchanging chit-chat' and for discussing and sharing information on the practical and technical issues raised, and can foster a sense of community among the participants that can extend beyond the immediate activities of the project itself.
A good example of this is the Old Weather forum, 23 which contains exchanges among participants that are indicative of a high degree of collaborative, communal working in addressing problems that arise during the process. The importance of forums was also noted by participants in Transcribe Bentham and the British Library's Georeferencer project.

Gamification

Some approaches have emphasised the importance of tasks being enjoyable, and have focused on the development of games for crowd-sourcing of different kinds. Prestopnik and Crowston discuss the role of games, and in particular possible approaches to creating an application for crowd-sourced natural history taxonomy classification using design science. 24 The Bodiam Castle project provides an example of the potential for games in the context of archaeological analysis of buildings, although this had a greater emphasis on visualisation than on competition. 25 However, Prestopnik and Crowston also note that 'gamification' can act as a disincentive to contributors who have expert knowledge of, or deep interest in, the subject. 26 Gamification can also be a barrier for users who simply want to engage with the assets or processes in question, and can trivialise the process of acquiring or processing data. 27 In their analysis of The Bird Network project, in which participants gathered data about the use of bird-boxes by birds and shared it with the scientific team, Brossard et al. note that participants' interest in ornithology was likely to overshadow awareness of scientific process, 28 and thus stymie efforts by the Lab to contribute to scientific awareness and education. 29

Competition

Although very few participants in our survey admitted to being motivated by competition with each other, among those who attended our workshop competition featured strongly as a factor, although this should be qualified by the fact that those present tended to be 'super contributors', who are likely to feel more competitive than those in the 'long tail' of the crowd. For many projects it is possible to track individual participants' contributions and to acquire statistics on contributions, and in such cases projects can establish 'leader boards' indicating which participants have made the biggest contributions (in whatever terms the project is using). For example, the British Library's Georeferencer project displayed the handles of the users who processed the most maps, and the 'winner' was invited to meet the Library's head of cartography. The Old Weather project also encouraged competition by assigning roles to contributors based on the number of pages transcribed. However, in order for competition to be a significant motivating factor, the tasks and their outcomes must be sufficiently quantifiable to allow mutual comparison; matters can become complex when tasks are not directly comparable. For example, in BLG some maps were more complex than others, and the team felt that this affected the meaningfulness of comparing the effort needed to georeference them. Where more creative or interpretive outputs are being created, this lack of commensurability is a still greater issue, and there may even be conflicts between outputs; simple rankings seem inappropriate to such scenarios. In any case, the encouragement of competition should not be at the cost of alienating potential participants who are not by nature competitive, nor of favouring speed and volume at the expense of quality and care.
Indeed, competition can be defined not just in this quantitative sense; volunteers may compete to produce more high-quality work, although in the absence of metrics this can amount to competing only against oneself. Note also that competition is not incompatible with a sense of common purpose; for example, Old Weather participants often 'feel like part of the ship' on which they are working.

Motivations of academics

At least part of the success of Galaxy Zoo and other Zooniverse projects is that they catered to clear and present academic needs. In the case of Galaxy Zoo itself, the assets (photographs of galaxies) were far too numerous to be examined individually by any research team, and the task (the classification of those galaxies) was not one that could be performed by computer software, although it could for the most part be carried out by a person without specialist expertise. 30 Quite simply, this is work that could not have been carried out without large-scale public engagement and participation. Most cases where humanities academics have engaged with crowd-sourcing have been driven by specific research questions or the need for a particular resource. For example, the Transcribe Bentham project was motivated by the fact that 40,000 folios of Bentham's work were untranscribed, and thus these valuable primary sources were inaccessible to people researching eighteenth- or nineteenth-century thought. 31 BLG was motivated by the desire to make the Library's map collections more searchable and thus more exploitable. In Old Weather, researchers were motivated by the desire to be able to use information contained within the assets to explore historic weather patterns, although these motivations may not necessarily be shared by the participants. 32 Although the research motivations are various, the key characteristic leading a project to use crowd-sourcing is that each case involves tasks that a computer could not carry out, and that a research team could do only with prohibitively large resources. Note, however, that during the initial six-month testing period of Transcribe Bentham, the rate of volunteer transcription compared unfavourably with that of professional researchers, 33 possibly due to the complexity of the material and the difficulty of Bentham's handwriting. There was also an extremely high moderation overhead, with significant staff time needed to validate the outputs and provide feedback to the contributors. Since then, the volunteer transcription rate has improved significantly, so there is potential for avoiding significant costs in the future. 34 However, this example can serve as a warning against assumptions that crowd-sourcing provides free labour. Other researchers, particularly those in the GLAM sector, see crowd-sourcing as a means of filling gaps in the coverage of their collections, 35 as it can be an effective way of obtaining information about assets (or the assets themselves) to which only certain members of the public have access, for example through personal or family connections. However, in order to be usable for academic purposes, a degree of curation is required, and this may involve expert input. It is clear that public engagement and community building is frequently an unintentional by-product of crowd-sourcing projects. In some cases it is seen as an explicit motivation, with the aim of encouraging public engagement with scholarly archives and research, and thus increasing the broader impact of academic research activities. 36
Crowd-sourcing and research infrastructures

A conceptual framework for crowd-sourcing

One of the outcomes of our study is a typology for crowd-sourcing in the humanities, which brings together the earlier work cited above with the experiences and processes uncovered during the study. It does not seek to provide an alternative set of categories specifically for the humanities, in competition with those considered above. Rather, we propose a model for describing and understanding crowd-sourcing projects in the humanities by analysing them in terms of four key facets (asset type, process type, task type, and output type) and of the relationships between them, and in particular by observing how the applicable categories in one facet are dependent on those in other facets. Figure 1 illustrates the four facets and their interactions.

• A process is composed of tasks through which an output is produced by operating on an asset. It is conditioned by the kind of asset involved, and by the questions that are of interest to project stakeholders (both organisers and volunteers) and can be answered, or at least addressed, using information contained in the asset.
• An asset refers to the content that is, in some way, transformed as a result of processing by a crowd-sourcing activity.
• A task is an activity that a project participant undertakes in order to create, process or modify an asset (usually a digital asset). Tasks can differ significantly as regards the extent to which they require initiative and/or independent analysis on the part of the participant, and the difficulty with which they can be quantified or documented. The task types were identified with the aim of categorising this complexity, and are listed below in approximately increasing order of complexity.
• The output is what is produced as the result of applying a process to an asset. Outputs can be tangible and/or measurable, but we make allowance also for intangible outcomes, such as awareness or knowledge.

Tables 1–4 list the categories that the study identified under each facet; these are based for the most part on an examination of existing crowd-sourcing practice, so it is to be expected that the lists will be extended and/or challenged by future work. Detailed descriptions of each category may be found in the report by Dunn and Hedges; 37 in the rest of this paper, we examine the framework specifically in relation to humanities research infrastructures.

From crowd-sourcing primitives to research infrastructures

Rather than attempting to map the elements of this crowd-sourcing framework to specific infrastructures or infrastructural components, we note instead that it may be thought of as a framework of 'primitives', in a sense analogous to that of 'scholarly primitives'. Scholarly primitives may be defined as 'basic functions common to scholarly activity across disciplines', 38 and they provide a conceptual framework for classifying scholarly activities. Given the diversity of humanities research, it is not surprising that there are various sets of candidates (in addition to Palmer et al. there are, for example, Unsworth, 39 Benardou et al. 40 and Anderson et al. 41), and such a structure has in particular been used as a framework for conceptualising and developing infrastructure for supporting humanities research. 42
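As a concrete aside on the four facets themselves: the following minimal sketch (in Python) shows how a single crowd-sourcing activity might be described as a record drawing one or more categories from each facet. The class and field names are our own inventions for illustration, not part of the published typology, and the category lists are abridged from Tables 1–4.

from dataclasses import dataclass

# Facet categories abridged from Tables 1-4; names are illustrative only.
PROCESS_TYPES = {"collaborative tagging", "linking", "transcribing",
                 "correcting/modifying content", "georeferencing"}
ASSET_TYPES = {"geospatial", "text", "image", "sound", "video"}
TASK_TYPES = {"mechanical", "configurational", "editorial",
              "synthetic", "investigative", "creative"}
OUTPUT_TYPES = {"transcribed text", "enhanced text", "metadata",
                "structured data", "knowledge/awareness"}

@dataclass
class CrowdSourcingActivity:
    """One crowd-sourcing activity described along the four facets."""
    name: str
    process: str   # what is done (conditioned by the asset)
    asset: str     # what is transformed
    tasks: set     # what participants actually do
    outputs: set   # what is produced

    def __post_init__(self):
        # Facet values must come from the controlled lists above.
        assert self.process in PROCESS_TYPES
        assert self.asset in ASSET_TYPES
        assert self.tasks <= TASK_TYPES
        assert self.outputs <= OUTPUT_TYPES

# A Transcribe Bentham-like activity might, for instance, be recorded as:
tb = CrowdSourcingActivity(
    name="manuscript transcription",
    process="transcribing",
    asset="text",
    tasks={"mechanical", "editorial"},
    outputs={"transcribed text", "enhanced text"},
)

The point of such a record is simply that no facet can be omitted: a description that names a process but not the asset it operates on, or the outputs it yields, is incomplete in terms of the model.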
The process facet in particular may be regarded as providing a set of primitives in this sense, and the output type 'composite digital collection with multiple meanings' may be regarded as a form of humanities 'research object', in the sense used by Bechhofer et al. 43 and Blanke and Hedges. 44 Of course, the categorisation into primitives described above is quite different to those in the works cited; this is only to be expected, as it represents the activities of quite different stakeholders, namely interested members of the public rather than professional scholars (although of course one person can play different roles in different circumstances). In particular, there is a greater emphasis on creating or enhancing digital assets in some way, rather than on using these assets in research (although again these activities can overlap). For the remainder of this paper, we will look in more detail at each of the process types in turn, using specific examples examined by the study, with a view to seeing how crowd-sourcing can contribute effectively to humanities research infrastructures.

COLLABORATIVE TAGGING

Collaborative tagging may be regarded as crowd-sourcing the organisation of information assets by allowing users to attach tags to those assets. Tags can be based on existing controlled vocabularies, but are more usually derived from free text supplied by the users themselves. Such 'folksonomies' are distinguished from deliberately designed knowledge organisation systems by the fact that they are self-organising, evolving and growing as contributors add new terms. It is possible to extract more formal vocabularies from folksonomies. 45 Collaborative tagging can result in two concrete outcomes: it can make a corpus of information assets searchable using keywords applied by the user pool, and it can highlight assets that have particular significance, as evidenced by the number of repeat tags they are accorded by the pool. Research in this area has examined the patterns and information that can be extracted from folksonomies. Golder found that patterns generated by collaborative tagging are, on the whole, extremely stable, meaning that minority opinions can be preserved alongside more highly replicated, and therefore mainstream, concentrations of tags. 46 Other research has shown that user-assigned tags in museums may be quite different from vocabulary terms assigned by curators, and that relating tags to controlled vocabularies can be very problematic, 47 although it could be argued that this allows works to be addressed from a different perspective than that of the museum's formal documentation. In any case, such approaches to knowledge organisation are likely to play a significant part in the organisation of humanities data in the future. An example is the BBC's YourPaintings project, 48 developed in collaboration with the Public Catalogue Foundation, which has amassed a collection of photographs of all paintings in public ownership in the UK. The public is invited to apply tags to these, which both improves discovery and enables the creation of an aggregation of specialised knowledge. A more complex example is provided by the Prism project. 49 Collaborative tagging typically assumes that the assets being tagged are themselves stable and clearly identifiable as distinct objects. Prism, by contrast, allowed readers to highlight significant areas of a text and apply tags to them, and thus build up a collective interpretation of the text.
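As an illustration only (this is not a description of Prism's actual implementation), contributions of this kind might be stored as offset-based annotations on a text, with repeat tags aggregated to surface the passages that the crowd finds most significant. A minimal sketch in Python, with invented names:

from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class SpanTag:
    """A single contributor's tag on a highlighted region of a text."""
    text_id: str
    start: int   # character offset where the highlight begins
    end: int     # character offset where it ends
    tag: str     # free-text tag supplied by the contributor

def most_tagged_regions(tags, top=3):
    """Count repeat tags on the same span; heavily repeated spans
    indicate passages the crowd finds significant."""
    counts = Counter((t.text_id, t.start, t.end, t.tag) for t in tags)
    return counts.most_common(top)

tags = [
    SpanTag("sonnet-18", 0, 28, "simile"),
    SpanTag("sonnet-18", 0, 28, "simile"),
    SpanTag("sonnet-18", 42, 60, "mortality"),
]
print(most_tagged_regions(tags))

Because the tagged object is a span rather than a whole asset, two contributors' highlights may overlap without coinciding, which is one reason such interpretive tagging is harder to aggregate than object-level tagging.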
Unlike many humanities crowd-sourcing activities, such as transcribing texts according to well-defined procedures, which have identifiable completions, interpretation can go on indefinitely, and there are no right or wrong answers.

LINKING

Linking covers the identification and documentation of relationships (usually typed) between individual assets. Most commonly, this takes the form of linking via semantic tags, where the tags describe binary relationships, in which case it is analogous to collaborative tagging. In principle, this could also include the identification of n-ary relationships.

TRANSCRIBING

Transcribing is currently one of the most prominent areas of humanities crowd-sourcing, as it can be used to address a fundamental problem with digitisation, namely the difficulty of rendering handwriting into machine-readable form using current technology. Typically, such transcription requires the human eye and, in many cases, human interpretation. In terms of our typology, the output of a transcribing process will typically be transcribed text. Two projects have contributed significantly to this prominence: Old Weather (OW) and Transcribe Bentham (TB). OW involved the transcription of ships' log-books held by The National Archives, in order to obtain access to the weather observations they contain, information that is of major significance for climate research. 50 TB encouraged volunteers to transcribe and engage with unpublished manuscripts by the philosopher and reformer Jeremy Bentham, by rendering them into text marked up using TEI XML. 51 The collaborative model needed for successful crowd-sourced transcription depends on the complexity of the source material. Complex material, as in these two cases, requires a high level of support, whether from the project team or from a participant's peers. Simpler material is likely to require less support; for example, when transcribing the more structured data found in family records, 52 the information (text or integers) to be transcribed is presented to the user in small segments (e.g. names, dates, addresses), and transcription requires different cognitive processes that are less dependent on interaction with peers and experts. Note that this category includes marked-up transcriptions, e.g. using TEI XML, as well as simple transcription of characters. There will be a point, however, at which the addition of semantic mark-up goes beyond mere transcription and counts as a form of collaborative tagging or linking, and the output will typically be enhanced text.

CORRECTING/MODIFYING CONTENT

While content is increasingly 'born digital', projects for digitising analogue material abound. Many mass-digitisation technologies, such as Optical Character Recognition (OCR) and speech recognition, can be error-prone, and any such enterprise needs to factor in quality control and error correction, which can make use of crowd-sourcing. The TROVE project, which produced OCR-ed scans of newspapers at the National Library of Australia, is an excellent example of this. 53 The volume of digitised material precluded the corrections being undertaken by the Library's own staff, and using uncorrected text would have significantly reduced the benefits of digitisation, as search capability would have been very restricted. Another potential application in this category is the correction of automated transcriptions of recorded speech, as such transcription is currently highly error-prone, with error rates of 30% or more. 54
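We make no claim that TROVE or Transcribe Bentham validates contributions in this way, but a common generic pattern for quality control in crowd-sourced correction and transcription is redundancy: the same OCR line or manuscript field is given to several volunteers, and their versions are reconciled by majority vote, with disagreements referred to a moderator. A minimal sketch, with hypothetical names:

from collections import Counter

def reconcile(versions, quorum=0.5):
    """Pick the correction most volunteers agree on; return None
    (i.e. refer the item to an expert moderator) if no version
    reaches the quorum. `versions` is a list of volunteer-submitted
    strings for one OCR line or transcription field."""
    if not versions:
        return None
    text, count = Counter(v.strip() for v in versions).most_common(1)[0]
    return text if count / len(versions) > quorum else None

# Three volunteers correct the same OCR-mangled line:
print(reconcile(["the quick brown fox", "the quick brown fox",
                 "the quiok brown fox"]))      # -> "the quick brown fox"
print(reconcile(["ship's log", "ships log"]))  # -> None: needs review

The quorum threshold trades throughput against moderation load: raising it sends more items to experts, which echoes the moderation overhead noted above in the Transcribe Bentham case.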
RECORDING AND CREATING CONTENT

Processes in this category frequently deal with ephemera and intangible cultural heritage. The latter covers any cultural manifestation that does not exist in tangible form; typically, crowd-sourcing is used to document such heritage through a set of processes and tasks, resulting in some form of tangible output. The importance of preserving intangible cultural heritage has been recognised by the UN, 55 and the ways in which it can be documented and curated by distributed communities are an important area for future research. Frequently this takes the form of a cultural institution soliciting memories from the communities it serves, for example the Tenbury Wells Regal Cinema's Memory Reel project. 56 Such processes can incorporate a form of editorial control or post hoc digital curation, and their outputs can be edited into more formal publications. Another example is the Scottish Words and Place-names (SWAP) project, 57 which gathered words in Scots, determining which words were in current use and where/how they were used, with the ultimate aim of offering selected words for inclusion in the Scottish Language Dictionaries resource. 58 Candidate words were gathered via the project website as well as via social media (Facebook in particular was an important venue for developing conversations around the material), and words that the project felt were suitable were passed to lexicographers for further scrutiny. By ephemera, we understand cultural objects that are tangible, but are at risk of loss because of their transitory nature, for example home videos or personal photographs. 59 There are a number of projects addressing such assets, for example the Europeana 1914-1918 project, 60 which is collecting digitised personal artefacts relating to the First World War. The ubiquity of the Web, and access to content creation and digitisation technologies, has led to the creation of non-professionally curated online archives. These have a clear role to play in enriching, augmenting and complementing collections held by memory institutions, and in developing curatorial narratives independent from those of library and archive professionals. 61 Processes in this category are also likely to have elements of the 'social engagement' model, in terms of Holley's distinction. 62

COMMENTING, CRITICAL RESPONSES AND STATING PREFERENCES

Processes of this type are likely to count as crowd-sourcing only if there is some specific purpose around which people come together. One example of this is the Shakespeare's Global Communities project, 63 which captured audience responses to the 2012 World Shakespeare Festival, with the aim of investigating how 'social networking technologies reshape the ways in which diverse global communities connect with one another around a figure such as Shakespeare'. 64 The question provides a focus for the activity, which, although not itself producing an academic output, provides a dataset for addressing research questions on the modern reception of Shakespeare. Appropriately managed blogs can provide a platform for focused scholarly interactions of this type. For example, a review by Sonia Massai of King Lear on the Year of Shakespeare site attracted controversial responses, leading to an exchange about critical methods as well as content. 65 What differentiates such exchanges from amateur blogging is the scholarly focus and context provided by the project, and its proactive directing of content creation.
The project thus provides a tangible link between the crowd and the subject.

CATEGORISING

Categorising involves assigning assets to predefined categories; it differs from collaborative tagging in that the latter is unconstrained.

CATALOGUING

Cataloguing, or the creation of structured, descriptive metadata, is a more open-ended process than categorising, but is nevertheless constrained to following accepted metadata standards and approaches. It frequently includes categorising as a sub-activity, e.g. by Library of Congress subject headings. Cataloguing is a time- and resource-consuming process for many GLAM institutions, and crowd-sourcing has been explored as a means of addressing this. For example, the What's the Score project at the Bodleian investigated a cost-effective approach to increasing access to music scores from its collections through a combination of rapid digitisation and crowd-sourced descriptive metadata. 66 Cataloguing is related to contextualising, as ordering, arraying and describing assets will also make explicit some of their context.

CONTEXTUALISING

Contextualising is typically a more broadly-conceived activity than the related process types of cataloguing or linking, and it involves enriching an asset by adding to it, or associating with it, other relevant information or content.

GEOREFERENCING

Georeferencing is the process of establishing the location of un-referenced geographical information in terms of a modern coordinate system such as latitude and longitude. Georeferencing can be used to enrich geospatial assets (datasets or texts, including maps, gazetteers or travelogues, that refer to locations on the earth's surface) that do not include such explicit information. A major example of crowd-sourcing activity in this area is the British Library Georeferencer project, which aimed to 'geo-enable' historical maps in its collections by asking participants to assign spatial coordinates to digitised map images, a task that would have been too labour-intensive for Library staff to undertake themselves. Once georeferenced, the digitised maps are searchable geographically due to the inclusion of latitude and longitude coordinates in the metadata. 67

MAPPING

Mapping (in the sense of this typology) refers to the process of creating a spatial representation of some information asset(s). This could involve the creation of map data from scratch, but could also be applied to the spatial mapping of concepts, as in a 'mind map'. The precise sense will depend on the asset type to which mapping is being applied. There is an important distinction between maps and related geospatial assets created by expert organisations, such as the Ordnance Survey, and those created by community-based initiatives. The former may have the authority of a governmental imprimatur, and the distinction of official endorsement. However, the recent emergence of crowd-sourced geospatial assets, a product of the recent global growth in the ownership of hand-held devices with the ability to record location using GPS, 68 has led to the emergence of resources such as OpenStreetMap, 69 which has in turn led to a discussion about the reliability of such resources. In general, it has been found that OpenStreetMap in particular is extremely reliable, 70 but that the specifications for such resources must be carefully defined. 71 The impact of OpenStreetMap on the cartographic community generally has been noted. 72
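Returning briefly to the georeferencing process described above: without making any claim about how the British Library Georeferencer is actually implemented, the computation underlying such tasks is typically a transform fitted to volunteer-supplied control points that match pixel positions on a scanned map to geographic coordinates. A minimal sketch, assuming a simple affine model and using invented function names:

import numpy as np

def fit_affine(pixel_pts, geo_pts):
    """Least-squares affine transform from image pixels to lon/lat,
    estimated from volunteer-supplied control points (at least 3).
    Returns a function mapping (x, y) pixels to (lon, lat)."""
    A = np.array([[x, y, 1.0] for x, y in pixel_pts])
    G = np.array(geo_pts)                           # shape (n, 2)
    coeffs, *_ = np.linalg.lstsq(A, G, rcond=None)  # shape (3, 2)
    return lambda x, y: tuple(np.array([x, y, 1.0]) @ coeffs)

# Volunteers matched three map corners/landmarks to modern coordinates
# (the points below are made up for illustration):
to_geo = fit_affine([(0, 0), (1000, 0), (0, 800)],
                    [(-0.5, 51.7), (-0.1, 51.7), (-0.5, 51.4)])
print(to_geo(500, 400))  # interpolated lon/lat for an arbitrary pixel

With more than three control points the fit is over-determined, and the least-squares residuals give a rough measure of how consistent a volunteer's control points are, which is one way disagreement between contributors can be detected.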
The importance of mapping as a means of conveying spatial significance means that this kind of asset is particularly open to different discourses, and possibly conflicting narratives. The digital realm, with its potential for accommodating multiple, diverse contributions and interpretations, holds great potential for such material. 73

TRANSLATING

This covers the translation of content from one language to another. In many cases, a crowd-sourced translation will require a strongly collaborative element if it is to be successful, given the semantic interdependencies that can occur between different parts of a text. However, in cases where a large text can be broken up naturally into smaller pieces, a more independent mode of work may be possible; an example is Suda On-Line, 74 which is translating the entries in a 10th-century Byzantine lexicon/encyclopaedia. A more modern, although non-academic, example is the phenomenon of 'fansubbing', where enthusiasts provide subtitles for television shows and other audiovisual material. 75

Conclusions

One of the main conclusions of our study is that research involving humanities crowd-sourcing can best be framed and understood through an analysis in terms of four fundamental facets (asset type, process type, task type, and output type) and of the relationships between them. Depending on the activity in question, and what it aims to do, some categories, or indeed some facets, will have primacy. Outputs might be original knowledge, or they might be more ephemeral and difficult to identify; however, considering the processes of both knowledge and resource creation as comprising these four facets gives a meaningful context to every piece of research, publication and activity we have uncovered in the course of this review. We hope the lessons and good practice we have identified here will, along with this typology, contribute to the development of new kinds of humanities crowd-sourcing in the future. Significantly, we have determined that most humanities scholars who have used crowd-sourcing as part of some research activity agree that it is not simply a form of 'cheap labour' for mass digitisation or resource enhancement; indeed, in a narrowly cost-benefit sense it does not always compare well with more conventional mechanisms of digitisation. In this sense, it has truly left behind its economic roots, as defined by Howe (2006). The creativity, enthusiasm and alternative foci that communities outside the academy can bring to academic research are a resource that is now ripe for tapping, and the examples above illustrate the rich variety of forms that this tapping can take. We have noted the similarity between some aspects of our typology and the concept of the 'scholarly primitive', which has proved valuable in humanities e-research for providing a conceptual framework of fundamental building blocks for describing scholarly activities and modelling putative research infrastructures for the humanities. We have used this relationship to investigate how crowd-sourcing activities falling under various process types can contribute effectively to such research infrastructures.

Acknowledgements and additional information

A list of the projects investigated by the study, and a description of the survey (including the questions and a summary of the results), may be found in Appendices B and A, respectively, of Dunn and Hedges (2012).
The project website is at http://humanitiescrowds.org, and additional information (in 'raw' form) from the workshops organised as part of the study may be found at http://humanitiescrowds.org/wp-uploads/2012/09/workshop_report1.pdf. We are very grateful to all those who have shared their knowledge and experience with us during the study, and in particular those who agreed to be interviewed, or participated in the workshops, or provided feedback on the project report.

Notes

1 We follow the convention of hyphenating 'crowd-sourcing'; other authors use 'crowdsourcing' or 'crowd sourcing'. In quotations, we preserve the original form.
2 T. Blanke, M. Bryant, M. Hedges, A. Aschenbrenner and M. Priddy, 'Preparing DARIAH', 7th IEEE International Conference on e-Science, Stockholm, Sweden (2011), 158-165, http://dx.doi.org/10.1109/eScience.2011.30.
3 J. Howe, 'The rise of crowdsourcing', Wired, 14.06 (2006), http://www.wired.com/wired/archive/14.06/crowds.html.
4 J. Silvertown, 'A new dawn for citizen science', Trends in Ecology & Evolution, 24, No. 9 (2009), 467-71; D. P. Anderson, J. Cobb, E. Korpela, M. Lebofsky and D. Werthimer, 'SETI@home: an experiment in public-resource computing', Communications of the ACM, 45, Issue 11 (2002), 56-61.
5 J. Surowiecki, The wisdom of crowds: why the many are smarter than the few (2004).
6 D. Brabham, 'Crowdsourcing as a model for problem solving: an introduction and cases', Convergence: The International Journal of Research into New Media Technologies, 14, Issue 1 (2008), 75-90.
7 R. Holley, 'Crowdsourcing: how and why should libraries do it?', D-Lib Magazine, 16, No. 3/4 (2010), http://www.dlib.org/dlib/march10/holley/03holley.html.
8 A. Wiggins and K. Crowston, 'From conservation to crowdsourcing: a typology of citizen science', Proceedings of the 44th Hawaii International Conference on System Sciences (HICSS) (2011), http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5718708.
9 http://www.trevorowens.org/2012/05/the-crowd-and-the-library
10 E. Estellés-Arolas and F. González-Ladrón-de-Guevara, 'Towards an integrated crowdsourcing definition', Journal of Information Science, 38, No. 2 (2012), 189-200.
11 A. Wiggins and K. Crowston, 'From conservation to crowdsourcing: a typology of citizen science'.
12 R. Bonney, H. Ballard, R. Jordan, E. McCallie, T. Phillips, J. Shirk and C. Wilderman, Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education, Center for Advancement of Informal Science Education, Washington D.C. (2009), http://caise.insci.org/uploads/docs/PPSR%20report%20FINAL.pdf.
13 http://openobjects.blogspot.co.uk/2012/06/frequently-asked-questions-about.htm
14 J. Oomen and L. Aroyo, 'Crowdsourcing in the cultural heritage domain: opportunities and challenges', Proceedings of the 5th International Conference on Communities and Technologies (2011), 138-149, http://www.cs.vu.nl/~marieke/OomenAroyoCT2011.pdf.
15 A. Agrawal, C. Catalini and A. Goldfarb, 'The geography of crowdfunding', NET Institute Working Paper Series, 10-8 (2011), 1-57, http://ssrn.com/abstract=1692661.
16 http://openobjects.blogspot.co.uk/2012/06/frequently-asked-questions-about.htm
17 M. J. Raddick, G. Bracey, P. L. Gay, C. J. Lintott, P. Murray, K. Schawinski, A. S. Szalay and J. Vandenberg, 'Galaxy Zoo: exploring the motivations of citizen science volunteers', Astronomy Education Review, 9 (2010), http://aer.aas.org/resource/1/aerscz/v9/i1/p010103_s1.
18 B. M. Bradford and G. D. Israel, 'Evaluating volunteer motivation for sea turtle conservation in Florida', Agricultural Education (2004), 1-9.
19 R. Holley, Many hands make light work: public collaborative OCR text correction in Australian historic newspapers, National Library of Australia (2009), http://www.nla.gov.au/ndp/project_details/documents/ANDP_ManyHands.pdf.
20 http://crowds.cerch.kcl.ac.uk/wp-uploads/2012/04/Brohan.pdf
21 T. Causer, J. Tonra and V. Wallace, 'Transcription maximized; expense minimized? crowdsourcing and editing The Collected Works of Jeremy Bentham', Literary and Linguistic Computing, 27, Issue 2 (2012), 1-19. Similar conclusions were drawn by the authors of the current article, based on their interviews with staff and volunteers from the Old Weather project and the British Library's Georeferencer project.
22 http://www.bl.uk/maps/
23 http://forum.oldweather.org
24 N. R. Prestopnik and K. Crowston, 'Gaming for (citizen) science: exploring motivation and data quality in the context of crowdsourced science through the design and evaluation of a social-computational system', Proceedings of the 'Computing for Citizen Science' workshop at the 7th IEEE eScience Conference (2011), http://crowston.syr.edu/sites/crowston.syr.edu/files/gamingforcitizenscience_ver6.pdf.
25 http://crowds.cerch.kcl.ac.uk/wp-uploads/2012/04/Masinton.pdf
26 N. R. Prestopnik and K. Crowston, 'Gaming for (citizen) science: exploring motivation and data quality in the context of crowdsourced science through the design and evaluation of a social-computational system' (2011).
27 See http://blog.tommorris.org/post/3216687621/im-not-an-experience-seeking-user-im-a for a combative assertion of this position.
28 D. Brossard, B. Lewenstein and R. Bonney, 'Scientific knowledge and attitude change: the impact of a citizen science project', International Journal of Science Education, 27, Issue 9 (2005), 1029-1121.
29 D. J. Trumbull, R. Bonney, D. Bascom and A. Cabral, 'Thinking scientifically during participation in a citizen-science project', Science Education, 84, Issue 2 (1999), 265-275.
30 C. J. Lintott, K. Schawinski, A. Slosar, K. Land, S. Bamford, D. Thomas, M. J. Raddick, R. Nichol, A. Szalay, D. Andreescu, P. Murray and J. Vandenberg, 'Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey', Monthly Notices of the Royal Astronomical Society, 389, Issue 3 (2008), 1179-1189.
31 http://humanitiescrowds.org/wp-uploads/2012/04/Causer.pdf
32 http://humanitiescrowds.org/wp-uploads/2012/04/Brohan.pdf
33 T. Causer, J. Tonra and V. Wallace, 'Transcription maximized; expense minimized? crowdsourcing and editing The Collected Works of Jeremy Bentham' (2012).
34 T. Causer and V. Wallace, 'Building a volunteer community: results and findings from Transcribe Bentham', Digital Humanities Quarterly, 6, No. 2 (2012), http://www.digitalhumanities.org/dhq/vol/6/2/000125/000125.html.
35 M. Terras, 'Digital curiosities: resource creation via amateur digitisation', Literary and Linguistic Computing, 25, No. 4 (2010), 425-438, doi:10.1093/llc/fqq019.
36 M. Moyle, J. Tonra and V. Wallace, 'Manuscript transcription by crowdsourcing: Transcribe Bentham', Liber Quarterly: The Journal of European Research Libraries, 20, Issue 3/4 (2011).
37 S. Dunn and M. Hedges, 'Crowd-sourcing scoping study: engaging the crowd with humanities research', Arts and Humanities Research Council report (2012), http://humanitiescrowds.org/wp-uploads/2012/12/Crowdsourcing-connected-communities.pdf.
38. C. L. Palmer, L. C. Teffeau and C. M. Pirmann, ‘Scholarly information practices in the online environment: themes from the literature and implications for library service development’, OCLC Research report (2009).
39. J. Unsworth, ‘Scholarly primitives: what methods do humanities researchers have in common, and how might our tools reflect this?’, paper presented at the ‘Humanities Computing, Formal Methods, Experimental Practice’ symposium, King’s College London (2000), http://people.lis.illinois.edu/~unsworth/Kings.5-00/primitives.html.
40. A. Benardou, P. Constantopoulos, C. Dallas and D. Gavrilis, ‘Understanding the information requirements of arts and humanities scholarship’, International Journal of Digital Curation, 5, No. 1 (2010), 18-33.
41. S. Anderson, T. Blanke and S. Dunn, ‘Methodological commons: arts and humanities e-science fundamentals’, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 368, No. 1925 (2010), 3779-3796.
42. T. Blanke and M. Hedges, ‘Scholarly primitives: building institutional infrastructure for humanities e-science’, Future Generation Computer Systems, 29, Issue 2 (2013), 654-661.
43. S. Bechhofer, I. Buchan, D. De Roure, P. Missier, J. Ainsworth, J. Bhagat, P. Couch, D. Cruickshank, M. Delderfield, I. Dunlop, M. Gamble, D. Michaelides, S. Owen, D. Newman, S. Sufi and C. Goble, ‘Why linked data is not enough for scientists’, Future Generation Computer Systems, 29, Issue 2 (2013), 599-611, http://dx.doi.org/10.1016/j.future.2011.08.004.
44. T. Blanke and M. Hedges, ‘Scholarly primitives: building institutional infrastructure for humanities e-science’ (2013).
45. H. Lin and J. Davis, ‘Computational and crowdsourcing methods for extracting ontological structure from folksonomy’, The Semantic Web: Research and Applications, Lecture Notes in Computer Science, 6089 (2010), 472-477, doi:10.1007/978-3-642-13489-0_46.
46. S. Golder, ‘Usage patterns of collaborative tagging systems’, Journal of Information Science, 32, Issue 2 (2006), 198-208.
47. J. Trant, Tagging, Folksonomy, and Art Museums: Results of steve.museum’s Research (2009), http://conference.archimuse.com/blog/jtrant/stevemuseum_research_report_available_tagging_fo; J. Trant, D. Bearman and S. Chun, ‘The eye of the beholder: steve.museum and social tagging of museum collections’, Proceedings of the International Cultural Heritage Informatics Meeting (ICHIM07), Toronto, Canada (2007).
48. http://www.bbc.co.uk/arts/yourpaintings/
49. http://www.scholarslab.org/category/praxis-program/
50. P. Brohan, R. Allan, J. E. Freeman, A. M. Waple, D. Wheeler, C. Wilkinson and S. Woodruff, ‘Marine observations of old weather’, Bulletin of the American Meteorological Society, 90, Issue 2 (2009), 219-230.
51. T. Causer, J. Tonra and V. Wallace, ‘Transcription maximized; expense minimized? crowdsourcing and editing The Collected Works of Jeremy Bentham’ (2012).
52. For example, http://www.familysearch.org.
53. R. Holley, Many hands make light work: public collaborative OCR text correction in Australian historic newspapers (2009).
54. M. Wald, ‘Crowdsourcing correction of speech recognition captioning errors’, Proceedings of the International Cross-Disciplinary Conference on Web Accessibility (W4A ’11) (2011), http://eprints.soton.ac.uk/272430/1/crowdsourcecaptioningw4allCRv2.pdf.
55. R. Kurin, ‘Safeguarding intangible cultural heritage in the 2003 UNESCO convention: a critical appraisal’, Museum International, 56, Issue 1-2 (2004), 66-77.
56. http://www.regaltenbury.org.uk/memory-reel/
57. C. Hough, E. Bramwell and D. Grieve, Scots Words and Place-Names Final Report, JISC (2011), http://www.jisc.ac.uk/media/documents/programmes/digitisation/swapfinalreport.pdf. See also http://swap.nesc.gla.ac.uk/.
58. http://www.scotsdictionaries.org.uk/
59. This usage differs from the standard usage of the term by museums.
60. http://www.europeana1914-1918.eu/en/contributor
61. M. Terras, ‘Digital curiosities: resource creation via amateur digitisation’ (2010).
62. R. Holley, ‘Crowdsourcing: how and why should libraries do it?’ (2010).
63. www.yearofshakespeare.com
64. http://humanitiescrowds.org/wp-uploads/2012/09/workshop_report1.pdf
65. http://bloggingshakespeare.com/year-of-shakespeare-king-lear-at-the-almeida
66. http://www.whats-the-score.org; http://scores.bodleian.ox.ac.uk
67. C. Fleet, K. C. Kowal and P. Pridal, ‘Georeferencer: crowdsourced georeferencing for map library collections’, D-Lib Magazine, 18, No. 11/12 (2012), http://www.dlib.org/dlib/november12/fleet/11fleet.html.
68. M. Goodchild, ‘Editorial: citizens as voluntary sensors: spatial data infrastructure in the world of Web 2.0’, International Journal of Spatial Data Infrastructures Research, 2 (2007), 24-32.
69. http://www.openstreetmap.org/
70. M. Haklay and P. Weber, ‘OpenStreetMap: user-generated street maps’, IEEE Pervasive Computing, 7, Issue 4 (2008), 12-18; M. Haklay, ‘How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets’, Environment and Planning B: Planning and Design, 37, Issue 4 (2010), 682-703.
71. C. Brando and B. Bucher, ‘Quality in user generated spatial content: a matter of specifications’, Proceedings of the 13th AGILE International Conference on Geographic Information Science, Guimarães, Portugal (2010), 1-8.
72. S. Chilton, ‘Crowdsourcing is radically changing the geodata landscape: case study of OpenStreetMap’, Proceedings of the 24th International Cartographic Conference, Santiago, Chile (2009), http://w.icaci.org/files/documents/ICC_proceedings/ICC2009/html/nonref/22_6.pdf.
73. C. Fink, ‘Mapping together: on collaborative implicit cartographies, their discourses and space construction’, Journal for Theoretical Cartography, 4 (2011), 1-14; M. Graham, ‘Neogeography and the palimpsests of place: Web 2.0 and the construction of a virtual earth’, Tijdschrift voor Economische en Sociale Geografie, 101, Issue 4 (2010), 422-436.
74. http://www.stoa.org/sol/
75. J. D. Cintas and P. M. Sanchez, ‘Fansubs: audiovisual translation in an amateur environment’, Journal of Specialised Translation, 6 (2006), 37-52.

Figure 1: Typology framework

Table 1: Process Types

Collaborative tagging
Linking
Correcting/modifying content
Transcribing
Recording and creating content
Commenting, critical responses and stating preferences
Categorising
Cataloguing
Contextualisation
Mapping
Georeferencing
Translating

Table 2: Asset Types

Geospatial
Text
Numerical or statistical information
Sound
Image
Video
Ephemera and intangible cultural heritage

Table 3: Task Types

Mechanical
Configurational
Editorial
Synthetic
Investigative
Creative

Table 4: Output Types

Original text
Transcribed text
Corrected text
Enhanced text
Transcribed music
Metadata
Structured data
Knowledge/awareness
Funding
Synthesis
Composite digital collection with multiple meanings
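The four tables define independent facets, and a particular crowd-sourcing activity is characterised by combining values from each. Purely as an illustration, and not as part of the study itself, the facets could be encoded as enumerations so that projects can be described and compared programmatically; in the following minimal Python sketch, every class and field name is our own hypothetical choice, and the output list is abridged.

```python
# A minimal, illustrative encoding of the four-facet typology in Tables 1-4.
# All names here are hypothetical; the paper defines the facets only in prose.
from dataclasses import dataclass
from enum import Enum
from typing import List


class ProcessType(Enum):  # Table 1 (Process Types)
    COLLABORATIVE_TAGGING = "Collaborative tagging"
    TRANSCRIBING = "Transcribing"
    GEOREFERENCING = "Georeferencing"
    TRANSLATING = "Translating"


class AssetType(Enum):  # Table 2 (Asset Types)
    GEOSPATIAL = "Geospatial"
    TEXT = "Text"
    NUMERICAL = "Numerical or statistical information"
    IMAGE = "Image"


class TaskType(Enum):  # Table 3 (Task Types)
    MECHANICAL = "Mechanical"
    EDITORIAL = "Editorial"
    SYNTHETIC = "Synthetic"
    CREATIVE = "Creative"


class OutputType(Enum):  # Table 4 (Output Types), abridged
    TRANSCRIBED_TEXT = "Transcribed text"
    STRUCTURED_DATA = "Structured data"
    METADATA = "Metadata"


@dataclass
class CrowdSourcingActivity:
    """One activity within a project, described by the four facets."""
    project: str
    process: ProcessType
    asset: AssetType
    task: TaskType
    outputs: List[OutputType]


# Example: a Transcribe Bentham-style activity expressed in the framework.
bentham = CrowdSourcingActivity(
    project="Transcribe Bentham",
    process=ProcessType.TRANSCRIBING,
    asset=AssetType.TEXT,
    task=TaskType.MECHANICAL,
    outputs=[OutputType.TRANSCRIBED_TEXT, OutputType.METADATA],
)
print(f"{bentham.project}: {bentham.process.value} of {bentham.asset.value}")
```

Encoded this way, an Old Weather-style activity would differ from the Bentham example only in its facet values (transcribing numerical or statistical information, with structured data as an output), which reflects the compositional character the framework is intended to capture.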