To the Cloud! A Grassroots Proposal to Accelerate Brain Science Discovery

Neuron NeuroView

Neuro Cloud Consortium*
*Correspondence: jovo@jhu.edu (Joshua T. Vogelstein)
http://dx.doi.org/10.1016/j.neuron.2016.10.033

The revolution in neuroscientific data acquisition is creating an analysis challenge. We propose leveraging cloud-computing technologies to enable large-scale neurodata storing, exploring, analyzing, and modeling. This utility will empower scientists globally to generate and test theories of brain function and dysfunction.

Introduction

Technological advances from all around the globe (Grillner et al., 2016) are allowing neuroscientists to collect more precise, complex, varied, and extensive data than ever before (Sejnowski et al., 2014). How can we maximally accelerate our collective ability to extract meaning from such data? To answer this question, the United States Congress commissioned the National Science Foundation (NSF) to "convene government representatives, neuroscience researchers, private entities, and non-profit institutions" (https://www.congress.gov/congressional-report/113th-congress/house-report/448). The NSF funded two events. The first was a workshop of over 75 individuals from 12 countries and 5 continents that was broadcast live over the internet. Each person was invited to bring a single big idea: one that could have maximal impact while being both feasible, given existing resources, and universally inclusive. Four ideas emerged as grand challenges for global brain science (Vogelstein et al., 2016). A second event was organized to discuss these ideas with a larger (425 participants) and more diverse community, which will be the subject of another article. The goal of this NeuroView is to describe one of the four grand challenges and propose a strategy to overcome it, in order to gather feedback from the larger community.
The authors are participants in the first conference who volunteered to hash out these ideas via emails, online documents, conference calls, and in-person visits.

The kernel of the idea is based on a view of the scientific process as an "upward spiral": a collective effort where each new experiment yields data, upon which analysis is performed, leading to new or refined models, which suggest novel experiments (see Figure 1). Historically, the process of data analysis has been kept relatively simple by the small scale of data acquired. But recent advances in experimental technology, such as serial electron microscopy (Denk and Horstmann, 2004), light sheet microscopy (Weber et al., 2014), and models of the whole human brain at the microscopic level (Amunts et al., 2013), have made data analysis significantly more challenging. While experimental neuroscience is enabling the collection of ever larger and more varied datasets, information technology is undergoing a revolution of its own. Commercial development of artificial intelligence and cloud computing innovations are changing the computational landscape (The Economist, 2016). Computing is moving toward "cloudification," a "software as a service" model, in which locally installed software programs are replaced by web apps. These forces create a massive opportunity to develop new computational technologies that complement advances in data collection in order to accelerate and democratize model building, hypothesis testing, and model refinement.

Neuron 92, November 2, 2016 © 2016 Elsevier Inc.

What Would Change If We Capitalize on This Opportunity?

Consider sending a letter, watching a movie at home, or obtaining reference information. Ten to twenty years ago, to send a letter, we purchased paper, stamps, and envelopes; to watch a movie at home, we rented or purchased a VHS or DVD; to obtain reference information, we bought an encyclopedia and obtained yearly revisions. Today, each of those options is still available and indeed preferred in certain circumstances. However, web options exist for each activity as well. In each case, we have privacy, bandwidth, and financial concerns. Nonetheless, for many of our daily practices we use these cyber solutions, sometimes putting our most private information in the cloud. The everyday practice of brain science is just beginning to benefit from similar technology development.

Other scientific disciplines have already navigated similar waters with remarkable success. For example, the Sloan Digital Sky Survey (SDSS) changed the daily practice of astronomers and cosmologists (Kent, 1994). They still have the option to wait 6 months for telescope time, analyze their data locally on machines they own and maintain, and publish a summary of the results (and many do). Yet there are more accounts in SDSS than there are professional cosmologists. Astronomers can now log in to SDSS, find previously published data, run database queries (a skill they typically did not have prior to SDSS), and publish the queries and results. Similarly, molecular geneticists historically sequenced their own data (using machines that they owned and maintained), analyzed it locally, and published the results. Now, they can outsource the sequencing to avoid owning and maintaining the machines, upload the sequences to a national or international database, quantitatively compare their sequences to previously published sequences, and then publish their findings. The success of these efforts is evident from the cultural shift of daily practices by many, if not most, participants in each field. Both fields resolved issues of data privacy, data ownership, governance, and financial concerns, providing a proof of principle that other scientific disciplines can do the same.

In neuroscience, many of our scientific practices remain based on pre-internet methods.
A scientist designs an experiment, collects data, stores it locally, keeps metadata in his head or in some custom spreadsheet, analyzes it using software that he buys and installs on local computers that he updates regularly, and publishes a summary of the results. We predict that another strategy will be superior for many situations: as the scientist collects data, it gets stored privately or publicly in the cloud, and she then selects analyses to occur automatically, having the flexibility to pull from a variety of previously published analyses, and finally publishes entire "digital experiments," containing (some of) the data and the entire analysis pipeline.

Figure 1. The Upward Spiral of Science

What Are the Primary Goals?

We see two key goals that, if achieved, would leverage advances in computing to accelerate brain sciences. The first goal is to make reproducibility and extensibility of science as easy as possible, even for small amounts of data or simple data. The current practices of private data storage and siloed analyses make reproducing an analytic result tedious at best and impossible at worst. The steps can include requesting the data, identifying the formats and organization, requesting the code, deciding which functions to run and how, getting all necessary dependencies installed, making sure to use the same software versions, and accessing the same computational hardware. Solutions now exist to mitigate each of these challenges, though they are relatively disparate and unconnected.
Data can be uploaded to data repositories (e.g., https://figshare.com/), data standards have been proposed for several domains of brain science (e.g., http://bids.neuroimaging.io/ and http://www.nwb.org/), code can be stored in publicly accessible repositories (e.g., https://github.com/), interactive tutorials can be provided (e.g., using http://jupyter.org/), and all necessary software dependencies can be easily packaged together (e.g., using https://www.docker.com/) and run "in the cloud" (e.g., using http://mybinder.org/) on commercial service providers (e.g., on https://aws.amazon.com/ec2/ or https://cloud.google.com/). Nonetheless, given some new data, it is not obvious where to find reference algorithms or how to connect them to the data. Similarly, given a new model, it is not clear how to find reference data, figure out which standard it is using and then fit it, and determine if others have done the same to allow us to compare and assess the results. In either case, once the data are processed, it remains difficult to keep track of the resulting data derivatives and which version of which code resulted in which outputs. So although many of the pieces are in place, there is still no unified "glue" that makes everything work together seamlessly. Moreover, each of the above-mentioned tools can be used by some brain scientists, but most tools are designed for data scientists, so the learning curve can be incredibly steep. Ideally, there would be a place where brain scientists could find all relevant analyses and data, run each analysis on each dataset, and see a leaderboard comparing performances, without writing any lines of code. Cloud-based solutions simplify reproducibility and extensibility by essentially eliminating activation energy and extraneous sources of analytic variability.

The second goal is to enable such a system to work with "big data" (i.e., data too large to fit on a workstation).
Data are scaling in many domains in brain science, either because individual experiments are large (as in calcium imaging and whole-brain CLARITY imaging), there are thousands of subjects with gigabytes of data each (as in large-scale human brain imaging projects), or there are millions of time points (as in wearable sensor data). Regardless of source and modality, if it is "medium data" (meaning too large to fit in memory, but small enough to fit on your computer), tasks as simple as visualizing, rotating, and opening the data are challenging using standard tools such as MATLAB, Python, or ImageJ. For big data, the challenges are even larger because questions of how to store, compress, manage, and archive the data exceed the computational capabilities and resources of most experimental labs. Cloud-based solutions simplify big data analysis due to their inherently scalable nature.

What's the Big Idea?

We are proposing to design, build, and deploy an instance of "cloud neuroscience," meaning that the data, the code, and the analytic results all live in the cloud together. Cloud neuroscience can be thought of as an operating system, a set of programs that run on it, a file system that stores the data, and the data itself, all designed to run in a scalable fashion and to be accessible from anywhere.

What Are the Design Criteria?

First and foremost, the design and construction should be organic, grassroots, and open source, to ensure that it remains intimately connected to the needs of all scientific citizens. Over 100,000 people attend annual brain science conferences, including neuroscience, psychology, psychiatry, and neurology. This is a massive human capital resource, so the system should enable contributions from any of them, regardless of background or resources. Thus, the system needs to support data and workflows of all kinds, regardless of modality, complexity, or scale, including raw data, derived data, and metadata.
Doing so would also further democratize brain sciences, opening the door to the additional 3.5 billion people with mobile broadband access who could contribute if given the opportunity. Encouraging and supporting such involvement motivates an emphasis on ethical standards and cultural sensitivities.

Figure 2. Schematic of the Five Proposed Components
An individual can adopt any or all of the five roles (color-coded dashed rectangles). For each component, the cloud content is generated by individuals in one of the five roles.

Moreover, millions of hours and billions of dollars have been spent developing brain science resources, including vast quantities of data, algorithms, and models. The system should build upon such work. Because different people have different preferences, access controls should be flexible enough to satisfy everyone's needs. For resources that are open, reproducing and extending prior work should be "turn-key," allowing researchers to "swap in" different datasets or algorithms as desired. Industry is making tremendous headway in this regard, including digital notebooks to keep track of all analyses, software containers to ease the burden of installing and configuring software, and web services that dynamically provide computational resources as needed. To the extent possible, we should leverage these resources and engage with non-profit, institutional, and corporate partners to express our domain-specific needs.
The design should be highly adaptive, to capitalize on rapid advances from within and outside brain sciences, and, of course, open source with permissive licenses. And the entire system should be able to run not just in a single commercial cloud, but also on other clouds, national resources, institutional clusters, local workstations, and laptops, to enable maximal portability and utility. Perhaps most importantly, the system should be universally useful, helping to answer the grand challenges of brain science while facilitating much greater participation in the scientific process.

The motivation underlying this endeavor is to accelerate the scientific process by improving the experience of doing brain science. Thus, the community can determine the worst pain points in our process and design solutions around them. For example, if looking at data is the largest bottleneck, then one could use a cloud-based visualization app (like Google Maps, CATMAID, or NeuroDataViz). On the other hand, if the largest bottleneck is getting data into a common format before running analyses, then one would benefit from having all the data stored in a format with a standardized application programming interface (API) so every dataset can be accessed in the same way. In other words, it is time for the scientific community to prioritize the user experience to focus the subsequent software development.

How Might We Achieve It?

In this section, we propose a potential design of the constituent components that could comprise an instance of cloud neuroscience (see Figure 2). The required elements can be divided into five categories: data, infrastructure, apps, algorithms, and education.
The goal of breaking down the problem this way is to ensure that all brain scientists, professional and citizen alike, can contribute to and benefit from the system. Crucial to success will be tight integration across components, each of which is described in some detail below.

Some brain scientists are able to span the full range from design to analysis, including running experiments, analyzing data, making discoveries, and even writing articles. Such polymaths can seamlessly alternate between different roles. Others might be highly skilled in software engineering, but not data collection. To ensure that all brain scientists can contribute to this effort, we have organized types of activities according to the "role" of the individual performing those activities. These roles are not meant to be prescriptive; rather, they serve to help guide scientists to the different kinds of contributions they could make (see Box 1 for detailed description of the roles).

Data

The data component is intended to mitigate difficulties with storing and accessing data, regardless of the modality, scale, or complexity of the data. Anybody would be able to upload raw data, derived data, and metadata as they flow off the sensors and dynamically control access. Functionality would build on and incorporate existing brain science data repositories (Ascoli et al., 2007; Burns et al., 2013; Crawford et al., 2016; Poldrack et al., 2013; Teeters et al., 2008), as well as more general services (e.g., FigShare). Therefore, the technical challenges for small and large data storage and access, for the most part, already have reasonable solutions for many data types. The remaining challenges are to further lower the barrier to entry, making data upload and access easier, especially for multi-terabyte datasets. Data contributions will be able to come from anyone and could be stored in a variety of accessible places to minimize transfer cost and time.
Access controls would enable scalable sharing with minimal effort. Storage costs would be the responsibility of the data provider if the data are private; if public, others could financially contribute. In either case, economies of scale would reduce storage costs, and we would work with commercial clouds and national infrastructures to offset costs to the extent possible. The data storage formats would allow visualization and analysis at scale.

Data contribution would be desirable and possible from any lab, regardless of its financial resources or location. For example, some methods are relatively inexpensive, such as EEG, fNIRS, and wearable technologies. Moreover, certain important subpopulations are better represented in less wealthy countries, enabling unique contributions from those places. If the same measures are included in more expensive projects, analysis bridges could be established between the datasets. This would enhance translational research at a global scale. These factors would lead to important collaborations in which less wealthy countries could influence the content and usefulness of this effort (Neuroinformatics Collaboratory, 2016).

Box 1. Roles
We enumerate six different roles for participants. Note that these are not characterizing individuals but roles that any individual can play. Roles differ in their degree of interest and expertise in various aspects of the scientific process, all of which are important.
• Experimentalist: A person in this role is acquiring data. This includes activities such as recruiting subjects and specifying inclusion guidelines (for human studies), experimental setup, subject care, and data acquisition, as well as some aspects of data management and quality control. In this role, a person has extensive knowledge of the experiment details, though computational acumen can be quite modest.
• Architect: A person in this role is developing the infrastructure component. In this role, professional software engineer skills are required. Architects work collaboratively on open-source repositories, possibly co-localized.
• App Engineer: A person in this role is writing apps. These apps might wrap algorithms written by the engineer or others. In this role, best practices of software development for science, including proper scientific documentation, are crucial.
• Data Scientist: A person in this role is writing and running algorithms. These algorithms might serve any step of the scientific process. Data scientists have a wide variety of computational backgrounds, including engineering, physics, mathematics, statistics, and computer science.
• Scientific User: A person in this role is using tools to analyze and understand the data. This can take many forms, ranging from looking at images and figures generated directly from the data acquisition system to fitting statistical models and combining multiple disparate datasets. In this role, computational acumen is not required. Familiarity with the data, experimental details, etc. can vary widely.
• Educator: A person in this role is either creating or presenting educational content, including documentation, tutorials, and massive open online courses, as well as running workshops, hackathons, and summer courses.

Data types would include raw, derived, and metadata (see Box 2 for additional details). Raw data include data from any kind of experiment, including functional, structural, omics (e.g., genetic and epigenetic), behavioral, and medical data. Every experiment will be given a unique data identifier. Medical data will be given special attention to ensure compliance with national guidelines for patient privacy. Each data type will yield a wide diversity of derived data, including summary statistics, matrices, networks, shapes, and more.
Associated with each entry is a collection of metadata, including a community-driven controlled vocabulary, as well as custom ad hoc fields. Metadata on the derived data will include detailed provenance history. The system would be seeded with existing reference datasets spanning spatial, temporal, and phylogenetic scales, including data from the Human Brain Project, the Human Connectome Project, the Allen Institute for Brain Science's data portal, IARPA's MICrONS program, and more.

Infrastructure

The infrastructure component is intended to mitigate difficulties in finding data or tools, linking them together, installing software, managing computers, and reproducing and extending results. When the infrastructure is operational, much of the scientific process can be conducted from a tablet or smartphone, replacing the need to buy and maintain high-power computers or keep software up to date. The infrastructure is essentially the operating system upon which all the services would run, akin to NeuroDebian (Halchenko and Hanke, 2012), but designed specifically for the cloud. This virtual operating system will run in the commercial cloud, on institutional resources, national centers, or local workstations, regardless of hardware configuration (e.g., Mac, Windows, Linux, etc.). The software could be designed and written by a small and distributed team of architects to facilitate design decisions considering diverse use cases.

The infrastructure could be composed of two core sub-components. First, a data management system would store and organize all the data. This could include managing access, assigning digital object identifiers (DOIs), and supporting common data formats, and would be easily extensible to new or custom formats. Data could also be compressed with or without loss, as desired by the contributor. Technically, data would be stored in a set of databases optimized for different brain science use cases.
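To make the data management sub-component more concrete, the following is a minimal sketch of what a dataset entry might look like, assuming a record that carries an identifier, controlled-vocabulary metadata, ad hoc fields, and a provenance history for derived data. Every name here (`DatasetRecord`, `ProvenanceStep`, `derive`, the vocabulary keys, and the `example:` identifiers) is hypothetical and chosen only for illustration; the article does not specify a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical controlled vocabulary; the article says it would be community-driven.
CONTROLLED_VOCABULARY = {"modality", "species", "data_type"}


@dataclass
class ProvenanceStep:
    """One processing step recorded for a derived dataset."""
    algorithm: str      # name of the algorithm that was run
    code_version: str   # version of the code that produced the output
    parent_id: str      # identifier of the input dataset


@dataclass
class DatasetRecord:
    """A raw or derived dataset entry together with its metadata."""
    identifier: str                                        # stand-in for a DOI-like identifier
    controlled: Dict[str, str]                             # controlled-vocabulary fields
    ad_hoc: Dict[str, str] = field(default_factory=dict)   # custom free-form fields
    provenance: List[ProvenanceStep] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject metadata keys that are not part of the controlled vocabulary.
        unknown = set(self.controlled) - CONTROLLED_VOCABULARY
        if unknown:
            raise ValueError(f"fields not in controlled vocabulary: {sorted(unknown)}")


def derive(parent: DatasetRecord, new_id: str, algorithm: str, code_version: str) -> DatasetRecord:
    """Create a derived record that inherits metadata and extends the provenance history."""
    step = ProvenanceStep(algorithm=algorithm, code_version=code_version, parent_id=parent.identifier)
    return DatasetRecord(
        identifier=new_id,
        controlled=dict(parent.controlled, data_type="derived"),
        ad_hoc=dict(parent.ad_hoc),
        provenance=parent.provenance + [step],
    )


raw = DatasetRecord("example:raw-001", {"modality": "MRI", "species": "human", "data_type": "raw"})
graph = derive(raw, "example:derived-001", algorithm="connectome_estimation", code_version="0.1.0")
```

In this sketch, each derived record keeps the full chain of steps that produced it, which is one plausible way to answer the question of which version of which code resulted in which output.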
Second, a workflow management system would store and organize analyses, leveraging existing web services such as GitHub and continuous integration to the extent possible. This would enable "digital experiments," including all stages of data processing. Crucially, such experiments could be done on different hardware platforms, applied to different data (by merely swapping the DOI), or use different algorithms (a similarly simple modification). All infrastructure services would have easy-to-use APIs to maximize utility and extensibility.

Box 2. Types of Brain Science Data
• Functional data are fundamentally temporal and dynamic. Whether univariate or multivariate, the standard operations to apply include zooming in time, subsampling, smoothing, and converting to other domains such as Fourier. Functional data also have a spatial domain, which links them to structural data. The subdivision between functional and structural data may be, for some data, ambiguous.
• Structural data are fundamentally spatial in nature, include 2D images, 3D volumes, and 4D and 5D hypervolumes for multispectral and/or time-varying data (spatiotemporal data, such as fMRI and calcium imaging, are both structural and functional). This can include structural images, as well as sparse fluorescent images, gene expression maps, etc. Standard operations for these data include compression, downloads of volumes of arbitrary sizes and shapes, maximum projections, averages, and more.
• Omics data are sequential and categorical, including the genome, epigenome, metabolome, and microbiome. Standard queries for genetic data include sequence compression, alignment, and comparisons. Omics data may also have a spatial domain (e.g., gene expression data).
• Behavioral data can be of several different types.
For example, behavior can be captured via video capture (e.g., behavioral observation of children during play), time series of task events during physiological measurements, questionnaires (e.g., symptom checklists), performance testing instruments (e.g., the NIH Toolbox), and other devices (e.g., actigraphy and voice recorders). Each datum has unique qualities and, therefore, functionality.
• Medical data include all electronic health data, including semi-structured text. They are among the most challenging of data types to aggregate, for until recently, the vast majority of the field has relied on paper charts or poorly structured electronic health record (EHR) systems. Fortunately, regulatory and funding agencies are incentivizing the widespread use of EHRs, as well as common data elements that are more amenable to data aggregation for the purposes of discovery science (e.g., the eMerge Network). Additionally, informatics frameworks are being developed to safely link disparate EHR data (e.g., https://www.i2b2.org/), and calls for the creation of open APIs are gaining attention.

Apps

The apps component is intended to mitigate difficulties in maintaining software versions, paying for software, and finding tools appropriate to run on data. Apps are the programs that run on the system, akin to tools like Dropbox (to upload/download), Google Maps (to visualize), PubMed Central (to search for information), BLAST (to compare your data with other data), and pipelines (to process your data). Apps can be developed by anybody with minimal programming skills, due to the careful design of the APIs in the infrastructure. A specification would be formalized and quality standards agreed upon by the community of users to publish apps in the open app marketplace. Different apps would be designed for users with different backgrounds, roles, and goals.
For example, apps targeted at people in the experimentalist role could include features to enable uploading, downloading, and managing access without having to learn the APIs. On the other hand, apps targeted at people in the data analysis role could include pre-processing data, fitting models, testing hypotheses, plotting results, and running digital experiments. General purpose apps would include tools to visualize, manipulate, and manually annotate data. These general purpose apps enable a much broader community of users to participate in the scientific process, including those without extensive technical training or financial resources.

Algorithms

The algorithms component is intended to mitigate difficulties in analyzing data with increasing scale or complexity. Recent advances in artificial intelligence, including distributed machine learning libraries and deep learning, could be leveraged here. Algorithms operate on simulated, measured, or derived data to produce transformed representations or summary statistics of the data. Algorithms can be written by anybody with minimal data-science skills, including many current brain scientists, without knowledge of this proposed system (unlike apps). Algorithms are essentially "wrapped" in apps to run and therefore inherit many of the conveniences of the system. We partition algorithms into three different types. Scalable data-processing algorithms can be applied to a wide variety of data types. These will be easily daisy-chained together to obtain pipelines, which can similarly be adapted to apply different algorithms or data. Because algorithms will be applied more generally to less familiar data, or less familiar algorithms will be applied to familiar data, quality assessment will be particularly important. This would include both qualitative dashboards providing figures and quantitative metrics to evaluate and compare performances along different metrics.
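The daisy-chaining and quality-assessment ideas above can be illustrated with a small sketch, assuming that each algorithm is simply a function from a signal to a signal. The helper names (`make_pipeline`, `leaderboard`), the toy algorithms, the `roughness` metric, and the `example:` dataset identifier are all hypothetical stand-ins for illustration, not part of the proposal.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Signal = List[float]
Algorithm = Callable[[Signal], Signal]


def make_pipeline(steps: Sequence[Algorithm]) -> Algorithm:
    """Daisy-chain algorithms so that each step's output feeds the next step."""
    def run(data: Signal) -> Signal:
        for step in steps:
            data = step(data)
        return data
    return run


def demean(data: Signal) -> Signal:
    """Subtract the mean from every sample."""
    mean = sum(data) / len(data)
    return [x - mean for x in data]


def rectify(data: Signal) -> Signal:
    """Take the absolute value of every sample."""
    return [abs(x) for x in data]


def smooth(data: Signal) -> Signal:
    """Three-point moving average, with shorter windows at the edges."""
    out = []
    for i in range(len(data)):
        window = data[max(0, i - 1): i + 2]
        out.append(sum(window) / len(window))
    return out


def roughness(data: Signal) -> float:
    """Illustrative metric: total absolute change between neighbouring samples."""
    return sum(abs(b - a) for a, b in zip(data, data[1:]))


def leaderboard(
    pipelines: Dict[str, Algorithm],
    datasets: Dict[str, Signal],
    metric: Callable[[Signal], float],
) -> List[Tuple[str, str, float]]:
    """Run every pipeline on every dataset and rank the results (lower is better)."""
    rows = [
        (p_name, d_name, metric(pipeline(data)))
        for p_name, pipeline in pipelines.items()
        for d_name, data in datasets.items()
    ]
    return sorted(rows, key=lambda row: row[2])


datasets = {"example:recording-001": [1.0, 3.0, 2.0]}
pipelines = {
    "demean+rectify": make_pipeline([demean, rectify]),
    "demean+smooth": make_pipeline([demean, smooth]),  # same data, one step swapped
}
board = leaderboard(pipelines, datasets, roughness)
```

Swapping an algorithm or a dataset only changes one entry in a dictionary here, which is the kind of "turn-key" substitution the proposal describes; a real system would operate on far larger data and richer metrics.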
Finally, to optimize resources and avoid duplicating efforts across labs, experiments will need to be useful for a large number of people. Experimental design will therefore be a key algorithmic component as well.

Education

Just like there is a learning curve when switching from Windows to Mac, so too switching from current practices to this system will involve a learning curve. Therefore, the success of this endeavor will depend on extensive educational material, including documentation, tutorials, online courses, hackathons, workshops, and summer courses. All the content will be designed to complement existing educational resources, such as Coursera courses. The variety of educational resources would reflect the backgrounds and skills of the user and contributor communities, with the goal of universal access. Because of this variety, community-driven cultural sensitivity guidelines would be posted for all contribution types.

Discussion

Here we describe an immediately actionable grassroots proposal to marry recent advances in neurodata acquisition with scalable cloud computing to accelerate the process of discovery by scientists independently of how well resourced they are (we have developed a proof-of-concept example using multimodal MRI data; see http://neurodata.io for details). There are several mechanisms by which Cloud Neuroscience may yield benefits. Global collaborations may become much simpler and therefore more prevalent. Open science may be facilitated, and the barriers and benefits to conducting open science may become more transparent by virtue of the design. Many models can be tested on the same dataset, and individual models can be subjected to greater diversity of data-based reality checks. In the near term, any effort that generates reference data of interest to a large segment of the community can benefit from Cloud Neuroscience.
One example is the upcoming ~10 petabytes from the IARPA MICrONS program.

Several potential criticisms are worth addressing, and many details need to be fleshed out. Privacy concerns for human data will require careful additional thinking so that best practices of anonymization and security can be implemented; precedent is provided by ongoing large research initiatives (e.g., Jack et al., 2008; Murphy et al., 2010; Sarwate et al., 2014). A viable financial model will be required. Potential partners include national laboratories that could contribute computing and storage resources, or companies interested in providing cloud-based web services for specific scientific subdomains. Return on investment must be considered. Cosmology, molecular genetics, and plant biology (see http://www.cyverse.org/) are existing proofs that when designed well, such resources can yield a dramatic and positive impact on the field. Other cloud-computing neuroscience efforts that focus on the human brain are already underway, such as CBRAIN (Das et al., 2016) and the Human Brain Project. Such efforts are important; the proposed project has been designed to leverage the developments from those projects and extend them to address a greater diversity of brain science questions, species, data modalities, and functionalities.

The above plans and challenges suggest immediately actionable next steps. A field engineer has been appointed to develop a survey to determine which existing resources are most useful (pooling information from places like https://github.com/ and https://www.nitrc.org/) and what new resources would be most useful. A software engineer has agreed to contribute significant effort toward building a "Neuroscience as a Service" framework (the virtual operating system and apps described above) based upon existing related services. They will begin formalizing minimal specifications for all resources.
We have also obtained private seed funding to hire an additional senior software engineer. To gather community feedback, we will be monitoring https://neurostars.org/ for any posts that contain the tag “neurostorm.” Next, sustainable governance, funding, and advisory models will be devised.

Pablo Picasso famously quipped, “Every child is an artist. The problem is how to remain an artist once we grow up.” As the next generation of brain scientists grows up, we have an opportunity to provide them with a canvas on which they can craft ever more creative portraits of our minds. Cloud neuroscience is one step we can take in that direction.

SUPPLEMENTAL INFORMATION

Supplemental Information includes a complete author list with affiliations and can be found with this article online at http://dx.doi.org/10.1016/j.neuron.2016.10.033.

ABOUT THE AUTHORS

Joshua T. Vogelstein is a neurostatistician; an Assistant Professor of Biomedical Engineering at Johns Hopkins University (JHU); and a member of the Institute for Computational Medicine, Center for Imaging Science, and Kavli Neuroscience Discovery Institute (KNDI). Brett Mensh founded Optimize Science, a science consulting agency, and is Scientific Advisor at Janelia Research Campus. Drs. Vogelstein and Mensh co-organized the Global Brain Workshop, an event in April 2016 with Richard Huganir, Professor and Director of the Department of Neuroscience and Director of KNDI, JHU, and Michael I. Miller, Herschel and Ruth Seder Professor and University Gilman Scholar, Director of the Center for Imaging Science, and Co-director of KNDI, JHU. All the co-authors were invited to the Global Brain Workshop on the basis of their international leadership spanning different spatial, temporal, and phylogenetic scales. They each subsequently volunteered to continue discussing this content for the ensuing weeks and months.
REFERENCES

Amunts, K., Lepage, C., Borgeat, L., Mohlberg, H., Dickscheid, T., Rousseau, M.-E., Bludau, S., Bazin, P.-L., Lewis, L.B., Oros-Peusquens, A.-M., et al. (2013). Science 340, 1472–1475.
Ascoli, G.A., Donohue, D.E., and Halavi, M. (2007). J. Neurosci. 27, 9247–9251.
Burns, R., Roncal, W.G., Kleissas, D., Lillaney, K., Manavalan, P., Perlman, E., Berger, D.R., Bock, D.D., Chung, K., Grosenick, L., et al. (2013). Sci Stat Database Manag. http://dx.doi.org/10.1145/2484838.2484870.
Crawford, K.L., Neu, S.C., and Toga, A.W. (2016). Neuroimage 124 (Pt B), 1080–1083.
Das, S., Glatard, T., MacIntyre, L.C., Madjar, C., Rogers, C., Rousseau, M.-E., Rioux, P., MacFarlane, D., Mohades, Z., Gnanasekaran, R., et al. (2016). Neuroimage 124 (Pt B), 1188–1195.
Denk, W., and Horstmann, H. (2004). PLoS Biol. 2, e329.
Grillner, S., Ip, N., Koch, C., Koroshetz, W., Okano, H., Polachek, M., Poo, M.-M., and Sejnowski, T.J. (2016). Nat. Neurosci. 19, 1118–1122.
Halchenko, Y.O., and Hanke, M. (2012). Front. Neuroinform. 6, 22.
Jack, C.R., Jr., Bernstein, M.A., Fox, N.C., Thompson, P., Alexander, G., Harvey, D., Borowski, B., Britson, P.J., Whitwell, J.L., Ward, C., et al. (2008). J. Magn. Reson. Imaging 27, 685–691.
Kent, S.M. (1994). Science with Astronomical Near-Infrared Sky Surveys, N. Epchtein, A. Omont, B. Burton, and P. Persi, eds. (Springer), pp. 27–30.
Murphy, S.N., Weber, G., Mendis, M., Gainer, V., Chueh, H.C., Churchill, S., and Kohane, I. (2010). J. Am. Med. Inform. Assoc. 17, 124–130.
Neuroinformatics Collaboratory (2016). Neuroinformatics Collaboratory, http://www.neuroinformatics-collaboratory.org.
Poldrack, R.A., Barch, D.M., Mitchell, J.P., Wager, T.D., Wagner, A.D., Devlin, J.T., Cumba, C., Koyejo, O., and Milham, M.P. (2013). Front. Neuroinform. 7, 12.
Sarwate, A.D., Plis, S.M., Turner, J.A., Arbabshirani, M.R., and Calhoun, V.D. (2014). Front. Neuroinform. 8, 35.
Sejnowski, T.J., Churchland, P.S., and Movshon, J.A. (2014). Nat. Neurosci. 17, 1440–1441.
Teeters, J.L., Harris, K.D., Millman, K.J., Olshausen, B.A., and Sommer, F.T. (2008). Neuroinformatics 6, 47–55.
The Economist (2016). The future of computing. The Economist, http://www.economist.com/news/leaders/21694528-era-predictable-improvement-computer-hardware-ending-what-comes-next-future.
Vogelstein, J.T., Amunts, K., Andreou, A., Angelaki, D., Ascoli, G., Bargmann, C., Burns, R., Cali, C., Chance, F., Chun, M., et al. (2016). arXiv, arXiv:1608.06548, https://arxiv.org/abs/1608.06548.
Weber, M., Mickoleit, M., and Huisken, J. (2014). Methods Cell Biol. 123, 193–215.
Neuron, Volume 92

Supplemental Information

To the Cloud! A Grassroots Proposal to Accelerate Brain Science Discovery

Neuro Cloud Consortium

Joshua T. Vogelstein,1,33,34,35,36,* Brett Mensh,2,3,5 Michael Häusser,4 Nelson Spruston,5 Alan C. Evans,6 Konrad Kording,7 Katrin Amunts,8,9,10 Christoph Ebell,10 Jeff Muller,10 Martin Telefont,10 Sean Hill,11 Sandhya P. Koushika,12 Corrado Calì,13 Pedro Antonio Valdés-Sosa,14,15 Peter B. Littlewood,16 Christof Koch,17 Stephan Saalfeld,5 Adam Kepecs,18 Hanchuan Peng,17 Yaroslav O. Halchenko,19 Gregory Kiar,1,33 Mu-Ming Poo,20 Jean-Baptiste Poline,21 Michael P. Milham,22,23 Alyssa Picchini Schaffer,24 Rafi Gidron,25 Hideyuki Okano,26,27 Vince D. Calhoun,28,29 Miyoung Chun,30 Dean M. Kleissas,31 R. Jacob Vogelstein,32 Eric Perlman,33 Randal Burns,34,35 Richard Huganir,36,37 and Michael I. Miller1,33,37

1Department of Biomedical Engineering, Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21218, USA
2Optimize Science, Mill Valley, CA 94941, USA
3UCSF Kavli Institute for Fundamental Neuroscience, San Francisco, CA 94143, USA
4Wolfson Institute for Biomedical Research and Department of Neuroscience, Physiology, and Pharmacology, University College London, Gower Street, London WC1E 6BT, UK
5Janelia Research Campus, Howard Hughes Medical Institute, 19700 Helix Drive, Ashburn, VA 20147, USA
6Montreal Neurological Institute, McGill University, 3801 University Street, Montreal, QC H3A 2B4, Canada
7Departments of Physical Medicine and Rehabilitation, Physiology, Applied Mathematics, and Biomedical Engineering, Northwestern University, 345 East Superior Street, Chicago, IL 60611, USA
8Institute for Neuroscience and Medicine, INM-1, Forschungszentrum Jülich, 52428 Jülich, Germany
9Cécile and Oskar Vogt Institute of Brain Research, University Hospital Duesseldorf, University Duesseldorf, 40225 Düsseldorf, Germany
10Human Brain Project, EPFL, 1202 Geneva, Switzerland
11Blue Brain Project, EPFL, Campus Biotech, 1202 Geneva, Switzerland
12Department of Biological Sciences, Tata Institute of Fundamental Research, Homi Bhabha Road, Navy Nagar, Colaba, Mumbai 400005, India
13Biological and Environmental Science and Engineering, KAUST, Thuwal 23955-6900, Saudi Arabia
14University of Electronic Science and Technology of China, Shahe Campus, Chengdu, Sichuan 610054, PRC
15Cuban Neurosciences Center, Cubanacan, Playa, Havana CP 11600, Cuba
16Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439, USA
17Allen Institute for Brain Science, 615 Westlake Avenue North, Seattle, WA 98109, USA
18Cold Spring Harbor Laboratory, Marks Building, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
19Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
20Institute of Neuroscience, Chinese Academy of Sciences Center for Excellence in Brain Science and Intelligence Technology, 320 Yue Yang Road, Shanghai 200031, China
21Henry H. Wheeler Jr. Brain Imaging Center, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720, USA
22Center for the Developing Brain, Child Mind Institute, 445 Park Avenue, New York, NY 10022, USA
23Nathan S. Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962, USA
24Simons Collaboration on the Global Brain, Simons Foundation, 160 Fifth Avenue, 7th Floor, New York, NY 10010, USA
25Israel Brain Technologies, Precede Building, Hakfar Hayarok, Ramat Hasharon 47800, Israel
26Department of Physiology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-ku, Tokyo 160-8582, Japan
27Laboratory for Marmoset Neural Architecture, RIKEN Brain Science Institute, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
28The Mind Research Network, Albuquerque, NM 87106, USA
29Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131, USA
30The Kavli Foundation, 1801 Solar Drive, Suite #250, Oxnard, CA 93030, USA
31Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Road, Laurel, MD 20723, USA
32Intelligence Advanced Research Projects Activity (IARPA), Maryland Square Research Park, 5850 University Research Court, Riverdale Park, MD 20737, USA
33Center for Imaging Science, 34Department of Computer Science, 35Institute for Data Intensive Engineering and Science, Johns Hopkins University, Baltimore, MD 21218, USA
36Department of Neuroscience, Johns Hopkins University, Baltimore, MD 21205, USA
37Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD 21218, USA
*Correspondence: jovo@jhu.edu