key: cord-0271284-132cu3dq
authors: Kirkham, Jamie J; Penfold, Naomi; Murphy, Fiona; Boutron, Isabelle; Ioannidis, John PA; Polka, Jessica K; Moher, David
title: A systematic examination of preprint platforms for use in the medical and biomedical sciences setting
date: 2020-04-28
journal: bioRxiv
DOI: 10.1101/2020.04.27.063578
sha: 29b78d7ee4606831cc5107b87004ebd4adfaec75
doc_id: 271284
cord_uid: 132cu3dq

Objectives
The objective of this review is to identify all preprint platforms with biomedical and medical scope and to compare and contrast the key characteristics and policies of these platforms. We also aim to provide a searchable database to enable relevant stakeholders to compare between platforms.

Study Design and Setting
Preprint platforms that were launched up to 25th June 2019 and have a biomedical and medical scope according to MEDLINE’s journal selection criteria were identified using existing lists, web-based searches and the expertise of both academic and non-academic publication scientists. A data extraction form was developed, pilot-tested and used to collect data from each preprint platform’s webpage(s). Data collected were in relation to scope and ownership; content-specific characteristics and information relating to submission, journal transfer options, and external discoverability; screening, moderation, and permanence of content; usage metrics and metadata. Where possible, all online data were verified by the platform owner or representative by correspondence.

Results
A total of 44 preprint platforms were identified as having biomedical and medical scope, 17 (39%) were hosted by the Open Science Framework preprint infrastructure, six (14%) were provided by F1000 Research Ltd (the Open Research Central infrastructure) and 21 (48%) were other independent preprint platforms.
Preprint platforms were either owned by non-profit academic groups, scientific societies or funding organisations (n=28; 64%), owned/partly owned by for-profit publishers or companies (n=14; 32%) or owned by individuals/small communities (n=2; 5%). Twenty-four (55%) preprint platforms accepted content from all scientific fields, although some of these had restrictions relating to funding source, geographical region or an affiliated journal’s remit. Thirty-three (75%) preprint platforms provided details about article screening (basic checks) and 14 (32%) actively involved researchers with content expertise in the screening process. The three most common screening checks related to the scope of the article, plagiarism, and legal/ethical/societal issues and compliance. Almost all preprint platforms allow submission to any peer-reviewed journal following publication and have a preservation plan for read access, and most have a policy regarding reasons for retraction and the sustainability of the service. Forty-one (93%) platforms currently have usage metrics, with the most common metric being the number of downloads presented on the abstract page.

Conclusion
A large number of preprint platforms exist for use in biomedical and medical sciences, all of which offer researchers an opportunity to rapidly disseminate their research findings onto an open-access public server, subject to scope and eligibility. However, the process by which content is screened before online posting and withdrawn or removed after posting varies between platforms, which may be associated with platform operation, ownership, governance and financing.

What is already known on this topic
In concurrence with an increase in the number of preprint servers and platforms supporting biomedical and medical sciences research since 2013, there has been substantial growth in the number of preprints posted in this research area.
The significant benefits of accelerated dissemination of research that preprints offer have attracted the support of many major funders. The raised profile of preprints has led to their wider acceptance in institutional and individual-level assessment.

What this study adds
This is the first full examination of the characteristics and policies of 44 preprint platforms with biomedical and medical scope. We use a robust methodological approach to extract relevant information from web-based material with input from preprint platform owners. Despite concerns regarding the permanence and quality of preprints, most preprint platforms have long-term preservation strategies and many have screening checks (for example, a basic check for the relevance of content) in place. For some platforms, these checks are performed by researchers with content expertise. We provide a searchable database as a valuable resource for researchers, funders and policymakers in the biomedical and medical science field to determine which preprint platforms are relevant to their research scope and which have the functionality and policies that they value most.
A preprint is a non-peer-reviewed scientific manuscript that authors can upload to a public preprint platform and make available almost immediately without formal external peer review. Posting a preprint enables researchers to 'claim' priority of discovery of a research finding; this can be particularly useful for early-career researchers in a highly competitive research environment. Some preprint platforms provide digital object identifiers (DOIs) for each included manuscript. This information can be included in grant applications. Indeed, progressive granting agencies are recommending that applicants include preprints in their applications (e.g., National Institutes of Health (NIH, USA) [1] and in the UK, preprints are becoming recognised as eligible

Preprints have been widely used in the physical sciences since the early 1990s, and with the creation of the repository of electronic articles, arXiv, over 1.6 million preprints or accepted/published manuscripts have been deposited on this platform alone [3]. Since September 2003, arXiv has supported the sharing of quantitative biology preprints under the q-bio category. The use of preprints in biomedical sciences is increasing, leading to the formation of the scientist-driven initiative ASAPbio (Accelerating Science and Publication in biology) to promote their use [4]. A preprint platform dedicated to life-science-related research (bioRxiv), founded in 2013, has already attracted nearly 80,000 preprints [5].
This platform was set up to capture science manuscripts from all areas of biology; however, medRxiv was launched in June 2019 to provide a dedicated platform and processes for preprints in medicine and health-related sciences [6]. It already hosts over 3400 preprints, becoming particularly popular for COVID-19-related work. The Center for Open Science [7] has also developed web infrastructure for these new 'Rxiv' (pronounced "archive") services [8], while F1000 Research Ltd has provided instances of its post-publication peer review and publishing platform for use by several funders (e.g. Wellcome Trust) and research institutions to encourage preprint-first research publishing [9]. Recently, several large publishers (Springer Nature, Wiley, Elsevier) have developed, co-developed or acquired preprint platforms or services, and in April 2020, SciELO launched a preprint platform that works with Open Journal Systems [10]. Many other preprint platforms also support dissemination of biomedical and medical sciences within their broader multi-disciplinary platforms.

Given the increase in the use and profile of preprint platforms, it is increasingly important to identify how many such platforms exist and to understand how they operate in relation to policies and practices important for scientific publishing. With this aim in mind, we conduct a review to identify all preprint platforms that have biomedical and medical science scope and contrast them in terms of their unique characteristics and policies. We also provide a searchable repository of the platforms identified so that researchers, funders and policymakers have access to a structured approach for identifying preprint platforms that are relevant to their research area.

articles as either a Word document or a PDF, with many platforms offering authors a choice of licensing, although where authors do not get a choice, the license required is commonly the CC-BY license.
In general, the Open Science Framework (OSF) and many of the other platforms allow authors to submit their articles to any journal, although in some cases there is facilitated submission to certain journals; for example, bioRxiv offers a host of direct-transfer journal options. Authors submitting to F1000 Research, the Open Research platforms and all First Look platforms can only submit articles to journals associated with the platform. Where the information is available, all platforms with the exception of Therapoid and ViXra are externally indexed, and most are indexed on Google Scholar.

Screening, moderation, and permanence of content (Table 3)
Thirty-three (75%) preprint platforms provided some detail about article screening, while two (FocUS Archive and SocArxiv) mention checks without giving details of what they involve. Therapoid does not perform any screening checks but relies on moderation by site users after an article is posted, and ViXra does not perform screening checks but will retract articles in response to issues. Fourteen (32%) preprint platforms that perform screening checks actively involved researchers with content expertise in this process. The three most common screening checks related to the scope of the article (e.g. scientific content, not spam, relevant material, language), plagiarism, and legal/ethical/societal issues and compliance. Only three preprint platforms (Research Square, bioRxiv and medRxiv) check whether the content is dangerous to human health. Preprints and Preprints.org describe policies online in relation to NIH guidance for reporting preprints [15] with regard to plagiarism, competing interests, misconduct and all other hallmarks of reputable scholarly publishing. Some preprint platforms do have policies but do not make them transparently visible online, while some platforms have no policies.
If content is withdrawn, some platforms ensure that the article retains a web presence (e.g. basic information on a tombstone page), although this was not standard across all platforms. Almost all platforms have (or are about to implement) a preservation plan for read access. Most commonly, platforms have set up an archiving agreement with Portico. Others have made their own arrangements: as a notable example, the OSF platforms are associated with a preservation fund provided by the Center for Open Science (COS) to maintain read access for over 50 years. In addition, most platforms have details on the sustainability of the service: for the OSF platforms this comes from an external source (e.g. grants to support the COS framework), while for the Open Research Central infrastructure platforms this comes from article processing charges covered by the respective funding agencies. For some of the other platforms, funding is received from either internal or external sources or from other business model services (e.g. from associated journal publishing).

With the exception of arXiv and MitoFit Preprint Archives (Therapoid metrics arriving soon), all preprint platforms have some form of usage metrics, and apart from JMIR Preprints and ViXra all provide the number of article downloads on the abstract page. The OSF preprints are limited to downloads, but the Open Research Central platforms also include the number of views, number of citations and altmetrics, whilst some of the independent platforms also include details of social media interactions direct from the platform (as opposed to the altmetric attention score). Most platforms have some form of commenting (n=33; 75%) and onsite search options (n=35; 80%), and some (mostly but not exclusively the independent platforms) have alerts such as RSS feeds or email alerts.

Forty (91%) platforms provided information on metadata, and all of these provide the manuscript title, publication date, abstract, and author names in the metadata.
Nearly all of these, with the exception of SciELO Preprints, provide a DOI or other manuscript identifier as well. The majority also offer subject categories (n=34) and license information (n=26), but fewer than half include author affiliations (n=17) and funder acknowledgements (n=13). Eleven platforms (all six platforms under the Open Research Central infrastructure, Authorea, bioRxiv, ChemRxiv, F1000 Research, Research Square) offer full-text content, but only five include references in the metadata. Half of the platforms (n=22) offer a relational link to the journal publication (if it exists) in the metadata.

Forty-four preprint platforms were identified as having biomedical and medical scope. This review characterises each of these preprint platforms so that authors can make a more informed choice about which ones might be relevant to their research field. Moreover, funders can use the data from this review to compare platforms if they wish to explicitly support and/or encourage their researchers to use certain platforms. Preprint platforms are fast evolving and, despite our cutoff of 25th June 2019, we are aware of new eligible preprint platforms that have been or are about to be launched after this date, for example Open Anthropology Research Repository (OARR) [16] and Cambridge Open Engage [17]. However, the recent advancements in the number of preprint platforms in this field have meant that one platform in this review (PeerJ Preprints) ceased to accept new preprints from the end of September 2019 to focus on their peer-reviewed journal activities [18]. Through our searchable database (https://asapbio.org/preprint-servers), we will endeavour to keep this information up-to-date.

Due to the lack of formal external peer review, preprint platforms that include medical content have been criticised because their content may lack quality, which can lead to errors in methods, results and interpretation and, in turn, has the potential to harm patients [19, 20].
This review has demonstrated that many preprints do undergo some sensible checks before going online, in contrast to the perception that preprints are not reviewed at all. Research Square, bioRxiv and medRxiv check specifically whether disseminating a preprint before peer review could cause harm. Research Square also offers a transparent checklist to indicate the status of various quality assurance checks (not equivalent to scientific peer review) for each preprint. Empirical evidence to support the use of editors and peer reviewers as a mechanism to ensure the quality of biomedical research is relatively weak [21, 22], although other studies have found peer review to be potentially useful [23, 24]. This review provides some justification that preprint platforms might be a reasonable option for researchers, especially given the time spent on and associated cost of peer review [25]. In a recent survey of authors who have published with F1000 Research, 70% of respondents found the speed of publication to be important or very important [26].

In some scenarios, the time to deliver research findings may be as important as research quality, and may be critical to health care provision. A good example of this is the current outbreak of novel coronavirus, where much of the preliminary evidence was made available through preprints at the time the World Health Organisation declared the epidemic a public health emergency [27]. The issue of preprints being available before peer review, and also the level of screening before a preprint is posted, has been particularly pertinent in this case. As an example, bioRxiv has rapidly adapted to ensure users appreciate that the COVID-19-related work presented on this platform has not been peer reviewed. In light of COVID-19, people, including patients and the public, might be interested in a quick and easy way to search across platforms.
As a start at improving discoverability, Europe PMC aggregates preprints from several repositories, and nearly 3000 preprint articles with 'COVID-19' in the title are already listed [28].

The strength of this study is that we developed robust methodology for systematically identifying relevant preprint platforms and involved platform owners/representatives wherever possible to verify data that was either unclear or not available on platform websites; when this was not possible, a second researcher was involved in the data acquisition process. Systematically identifying web-based data that is not indexed in an academic bibliographic database is challenging [29], though the methods employed here are compatible with the principles of a systematic search: the methods are transparent and reproducible. This approach builds on the work by Martyn Rittman to produce an earlier list of preprint servers [12], the process behind which did not use systematic methods or involve platform owners as far as we are aware. We undertook an internal pilot to develop and test the data collection form in collaboration with a preprint platform owner (John Inglis, bioRxiv, medRxiv) and ASAPbio staff and funders (promoters of preprint use) in order to ensure that the list of characteristics collected was both complete and relevant to different stakeholder groups, including academics and funders. Much of the general policy information for some platforms was not well reported or easy to find online; an unexpected but positive by-product of this research is therefore that several of these platforms have updated their webpages to improve the visibility and transparency of their policies. Similarly, some platforms became aware of policy attributes that they had not previously considered and are now in the process of considering these for future implementation.
One limitation is that we focussed our attention on the 'main' preprint article, although in some cases different policies existed for the supplementary material, e.g. acceptable formats and licensing options. This level of detail will be included in our searchable database. Another potential shortcoming was that some preprint platforms had a partner journal and, without verification, it was sometimes unclear whether the policy information related to the journal, the preprint platform or both. Finally, we defined preprint platforms as hosting work before peer review is formally complete, and we acknowledge that some platforms included here also host content that has already been peer-reviewed and/or published in a journal (e.g. post-prints) [30]; this is unlikely to affect the interpretation of policies for pre-printed works discussed herein.

With the increase in the number of preprint platforms available in the biomedical and medical research field, authors have the option to make their research findings publicly available, and gain some early ownership of them, at little or no cost to themselves. Moreover, with many preprint platforms there is little restriction on authors later publishing their preprints in peer-reviewed journals of their choice. While we did not tabulate information on this specifically, it was noted that some platforms (notably OSF platforms) did recommend that authors check the SHERPA/RoMEO service for details of a journal's sharing policy. There is also some evidence that pre-printing an article first may even boost citation rates [31], because such articles attract more attention from tweets, blogs and news articles than articles published without a preprint. With many platforms carrying out suitable quality-control checks and having long-term preservation strategies, preprint platforms offer authors direct control of the dissemination of their research in a rapid, free and open environment.
As well as primary research, preprints are also vital to users of research (systematic reviewers and decision makers). As an example, a living mapping systematic review of ongoing research into COVID-19 is currently being undertaken, and almost all included studies to date have been identified through preprint platforms [32].

There has been a sharp rise in the number of preprints being published each month, and it has been estimated that, as of June 2019, preprints in biology represent approximately 2.4% of all biomedical publications [33]; as of April 2020 there are already over 2.72 million preprints in the platforms that we evaluated. This review has summarised the key characteristics and policies of preprint platforms posting both medical and biomedical content, although there is a need for some of these platforms to update their policies and to make them more transparent online. As preprints are not formally validated through peer review, it is important to make it clear that their validity is less certain than for peer-reviewed articles (although even the latter may still not be valid). There is perhaps a growing need to standardise the checking process across platforms; such a process should not diminish the speed of publication (what authors value most about a preprint [22]). There is a temptation to make the checking process more rigorous, e.g. by including relevant researchers within the field as gatekeepers. However, this may slow down the process of making scientific work rapidly available and may promote groupthink, blocking innovative contrarian ideas from being circulated for public open review in the preprint platforms. Based on current checks, our review shows that most preprint platforms manage to post preprints within 48 hours and all within a week on average. Further resource challenges may arise if the number of preprints continues to rise at a similar rate and the number of new platforms begins to plateau.
And now, as several initiatives progress with work to build scientific review directly onto preprints (e.g. Peer Community in [34], Review Commons [35], PREreview [36]), it may become even more important to provide clarity about the level of checks a manuscript has already received and would need to receive to be considered "certified" by the scientific community. If anything, the wide public availability of preprints allows for far more extensive review by many reviewers, as opposed to typical journal peer review, where only a couple of reviewers are involved.

Our review identified 14 platforms linked to for-profit publishers and companies, but only F1000 Research currently charges a small article processing charge to authors. With the increase in demand and resources needed to maintain preprint platforms, we should be mindful that article processing charges may change downstream, meaning that platforms may have to charge authors; this would be unfortunate in light of the open science movement.

One outcome of this review has been to understand the various drivers behind the proliferation of preprint platforms for the life and biomedical sciences. While arXiv, bioRxiv, ChemRxiv and medRxiv aim to provide dedicated servers for academics within their respective fields, several academic groups have used the OSF infrastructure to offer alternative subject-specific or regional services in line with their own community's needs, such as sharing work in languages other than English. A third provider of preprint platforms is industry stakeholders: academic publishers providing or acquiring preprint services to support the content they receive as submissions to their journals, and biotechnology or pharmaceutical companies looking to support the sharing of relevant research content.
Whether any platform becomes dominant may be influenced by the communities who adopt them, the influencers who promote them (funders and researchers who influence hiring and promotion decisions) and the financial sustainability underpinning them. We hope that enabling transparency into the processes and policies at each platform empowers the research community (including researchers, funders and others involved in the enterprise) to identify and support the platform(s) that help them to share research results most effectively.

"The French server for Preprints in all the scientific fields"
All scientific fields
Ob: Small group of enthusiasts
OT: Individual or community
P: Non-profit or not-for-profit
T: Open Science Framework (open source)
T: A few days
C: No fee to author

INA-Rxiv [7] (17 August 2017; 16,637) Verified
"A preprint server for Indonesian academia to provide an open, free and sustainable scientific repository"
All scientific fields
Ob: Indonesia open science team
OT: Academic community group
P:

n interdisciplinary archive of articles focused on improving research transparency and reproducibility"
Relating to meta-science
Ob: The Berkeley Initiative for Transparency in the Social Sciences (BITSS), Centre for Effective Global Action, University of California, Berkeley
OT: Academic institution
P: Non-profit or not-for-profit
T: Unknown
C: No fee to author
T: Open Science Framework (open sourc

Open archive for research on mind and contemplative practices"
Relating to mind and contemplative practices, including medicine and health sciences, neuroscience and neurobiology, psychology, social and behavioural sciences
Ob: Mind and Life Institute
OT: Academic institution
P: Non-profit or not-for-profit
T: Open Science Framework (ope

is the first community-led and open access subject repository dedicated to sport, exercise, performance, and health research"
Relating to sports and exercise science, including rehabilitation and therapy, theatre, dance, physiology, physiotherapy, psychology, sociology
Ob: Society for Transparency, Openness, and Replication in Kinesiology (STORK)
OT: Scientific society
P: Non-profit or not-for-profit
T: Open Science Framework (open source)
T: A few days
C: No fee to author

Thesis Commons [17] (21 April 2017; 583) Verified
"An open archive of theses"
All scientific fields
Ob: Center for Open Science and small group of enthusiasts
OT: Academic community group; charity
P:

NIH enables investigators to include draft preprints in grant proposals
INLEXIO: The rising tide of preprint servers
Committee on Publication Ethics. Discussion document on preprints
Server List
Zenodo: Practices and policies of preprint platforms for life and biomedical sciences
Reporting Preprints and Other Interim Research Products
Preprints boost article citations and mentions
Living mapping and living systematic review of Covid-19 studies
Biology preprints over time

We thank John Inglis (co-founder of bioRxiv, medRxiv) for his advice on developing the data collection form and helpful comments on the manuscript. We also thank Robert Kiley, Geraldine Clement-Stoneham, Michael Parkin, Amy Riegelman, and Claire Yang for helpful feedback and conversations. We also would like to thank collectively the preprint platform owners and representatives who provided both data and verified information.

1. AfricaArxiv https://info.africarxiv.org/
2. AgriXiv https://agrixiv.org
3. Arabixiv https://arabixiv.org/
4. EcoEvoRxiv https://ecoevorxiv.org
5. FocUS Archive https://osf.io/preprints/focusarchive/
6 ..