Summary of your 'study carrel'
==============================

This is a summary of your Distant Reader 'study carrel'. The Distant Reader harvested & cached your content into a collection/corpus. It then applied sets of natural language processing and text mining against the collection. The results of this process were reduced to a database file -- a 'study carrel'. The study carrel can then be queried, thus bringing to light specific characteristics of your collection. These characteristics can help you summarize the collection as well as enumerate things you might want to investigate more closely.

This report is a terse narrative report, and when processing is complete you will be linked to a more complete narrative report.

Eric Lease Morgan

Number of items in the collection; 'How big is my corpus?'
----------------------------------------------------------
178

Average length of all items measured in words; "More or less, how big is each item?"
------------------------------------------------------------------------------------
891

Average readability score of all items (0 = difficult; 100 = easy)
------------------------------------------------------------------
47

Top 50 statistically significant keywords; "What is my collection about?"
-------------------------------------------------------------------------
179 datum
24 COVID-19
22 model
14 patient
11 disease
10 system
9 health
9 Fig
8 covid-19
7 social
7 result
7 research
6 network
6 information
6 data
5 trial
5 privacy
5 pandemic
5 machine
5 learning
5 figure
5 dna
5 Twitter
5 Health
5 HIV
4 transmission
4 study
4 SARS
4 Ebola
4 Data
3 time
3 public
3 population
3 method
3 group
3 gene
3 epidemic
3 digital
3 device
3 case
3 big
3 analysis
3 access
3 India
3 ICU
3 Blockchain
2 year
2 woman
2 traffic
2 traceability

Top 50 lemmatized nouns; "What is discussed?"
---------------------------------------------
14962 datum
4939 model
4875 %
4201 patient
3922 health
3596 time
3511 system
3460 study
3413 data
3200 case
3087 information
3059 analysis
3001 method
2869 disease
2830 result
2304 research
2256 number
2103 risk
1924 level
1911 approach
1901 network
1838 group
1800 use
1688 population
1672 user
1560 value
1545 rate
1504 care
1428 example
1425 process
1390 service
1372 technology
1358 year
1337 source
1294 pandemic
1275 type
1269 application
1267 dataset
1256 problem
1244 outbreak
1235 community
1186 learning
1178 effect
1159 country
1158 treatment
1126 hospital
1113 protein
1109 area
1084 individual
1080 set

Top 50 proper nouns; "What are the names of persons or places?"
---------------------------------------------------------------
1722 al
1448 et
1336 ED
1251 COVID-19
993 AI
917 .
746 Health
661 Data
630 Fig
496 •
390 CI
361 US
343 Blockchain
337 HIV
325 SARS
319 ML
310 IoT
296 Background
291 CT
282 Ebola
277 C
264 China
251 Twitter
239 National
235 United
232 States
223 University
212 ABSTRACT
211 Research
210 EM
208 Table
203 Google
198 New
194 GenV
188 CoV-2
183 RFID
181 European
176 CDC
173 EMS
168 Information
165 FL
164 March
162 GS1
161 Coronavirus
160 ICU
158 Facebook
157 Disease
147 sha
146 Center
144 World

Top 50 personal pronouns; "To whom are things referred?"
--------------------------------------------------------
5429 we
4516 it
2208 they
900 them
594 i
251 us
244 one
196 itself
175 you
155 he
143 themselves
50 she
32 me
20 him
19 ourselves
11 's
9 s
9 oneself
9 herself
8 himself
7 her
6 tsne
6 mine
3 yourself
3 pseudonyms
3 ours
2 u
2 em
1 σt
1 ζ
1 |w|
1 y8ck4lo8
1 whither
1 theirs
1 t,2
1 phylogeotool
1 pages''-or
1 o*-orbital
1 myself
1 mi
1 mg
1 logs
1 its
1 ia2-ib2
1 i-[(3'-allyl-2'-hydroxybenzilidene)amino]-3-hydroxyguanidine
1 her|himself
1 f9l
1 enroll
1 d
1 bmi<25

Top 50 lemmatized verbs; "What do things do?"
---------------------------------------------
49061 be
8830 have
7291 use
3439 base
2735 provide
2641 include
2004 do
1769 show
1711 make
1683 identify
1425 develop
1377 require
1355 need
1324 increase
1258 compare
1192 give
1173 relate
1171 find
1166 collect
1064 report
1010 improve
1007 consider
1001 learn
989 present
989 follow
978 determine
976 allow
943 associate
927 perform
925 take
914 see
819 propose
791 assess
788 obtain
785 predict
785 lead
779 share
768 apply
767 estimate
758 evaluate
753 support
753 describe
748 create
743 represent
742 reduce
727 generate
724 help
708 know
704 become
697 define

Top 50 lemmatized adjectives and adverbs; "How are things described?"
---------------------------------------------------------------------
4894 not
3376 such
2992 more
2849 also
2646 -
2585 other
2345 different
2202 high
2062 well
1947 social
1619 only
1596 new
1521 public
1444 large
1423 however
1407 clinical
1403 most
1339 available
1298 many
1259 first
1226 as
1191 low
1130 specific
1116 medical
1051 e.g.
1034 human
999 important
973 real
903 multiple
896 significant
883 then
869 non
847 early
843 same
818 even
791 possible
775 several
755 often
731 further
711 very
692 current
689 digital
677 good
674 infectious
671 global
662 key
656 potential
653 various
651 individual
649 similar

Top 50 lemmatized superlative adjectives; "How are things described to the extreme?"
------------------------------------------------------------------------------------
516 most
357 good
230 least
188 Most
170 high
76 low
76 large
42 late
42 great
30 near
30 big
30 bad
23 strong
23 early
19 short
17 close
13 small
10 easy
9 simple
9 fast
8 new
7 old
7 long
6 wide
6 Least
4 rich
3 tough
3 poor
3 dense
3 -which
2 stiff
2 hard
2 furth
2 fine
2 busy
2 broad
2 #
1 �
1 ~e
1 young
1 wealthy
1 testret
1 sure
1 scary
1 safe
1 quick
1 plotly
1 outermost
1 narrow
1 heavy

Top 50 lemmatized superlative adverbs; "How do things do to the extreme?"
-------------------------------------------------------------------------
887 most
143 least
56 well
3 worst
2 long
2 fast
1 shortest
1 latest
1 highest
1 hard
1 greatest
1 ecommendatio.ns
1 clustalw
1 -generate

Top 50 Internet domains; "What Webbed places are alluded to in this corpus?"
----------------------------------------------------------------------------
44 doi.org
31 github.com
7 orcid.org
6 www.kaggle.com
4 www
4 is.gd
4 covidclinical.net
4 covid19analytics.scinet.utoronto.ca
3 www.rcsb.org
3 www.ncbi.nlm.nih.gov
3 www.mpellert.at
3 plasmodb.org
3 ec.europa.eu
3 creativecommons.org
3 creativecommons
3 cran.r-project.org
2 www.who.int
2 www.scbit.org
2 www.fda.gov
2 www.ebi.ac.uk
2 www.derstandard.at
2 w3id.org
2 spectrum.ieee.org
2 nextstrain.org
2 mcri.figshare.com
2 mc.manuscriptcentral.com
2 eupathdb.org
2 en.wikipedia.org
2 covidclinical
2 covid19watcher.research.cchmc.org
2 covid19kerala.info
2 coronavirus.jhu.edu
2 bigoprogram.eu
2 aws.amazon.com
1 www.yicaiglobal.com
1 www.wwpdb.org
1 www.umass.edu
1 www.transportation.gov
1 www.synapse.org
1 www.stata.com
1 www.shivom.io
1 www.shinyapps.io
1 www.salus
1 www.rsna.org
1 www.pubmed.org
1 www.proteomicsresource
1 www.pewinternet.org
1 www.pestobserver.eu
1 www.perl.org
1 www.next

Top 50 URLs; "What is hyperlinked from this corpus?"
----------------------------------------------------
21 http://doi.org/10.1101/2020.09.28.20203257
7 http://doi.org/10.1101/2020.11.03.20225565
4 http://www
4 http://covid19analytics.scinet.utoronto.ca
3 http://doi.org/10.1101/2020.06.23.20137950
3 http://creativecommons
3 http://covidclinical.net
2 http://www.rcsb.org/pdb/
2 http://www.mpellert.at/covid19
2 http://www.ebi.ac.uk/intact
2 http://mc.manuscriptcentral.com/jamia
2 http://github.com/mponce0/covid19analytics.datasets
2 http://github.com/mponce0/covid19.analytics
2 http://github.com/midas-network/COVID-19
2 http://github.com/florisvb/PyNumDiff
2 http://github.com/evogytis/baltic
2 http://github.com/
2 http://eupathdb.org
2 http://doi.org/10.1101/2020.06.01.20118869
2 http://doi.org/10.1101/2020.02.25.20027433
2 http://doi.org/10.1038/
2 http://doi.org/10
2 http://covidclinical
2 http://covid19watcher.research.cchmc.org/
2 http://covid19kerala.info/
1 http://www.yicaiglobal.com/news/chinese-tech-firm-debuts-five-meter-fever-finding-smart-helmet
1 http://www.wwpdb.org/
1 http://www.who.int/wer/2009/wer8421.pdf?ua=1
1 http://www.who.int/csr/disease/swineflu/en/
1 http://www.umass.edu/microbio/
1 http://www.transportation.gov/smartcity
1 http://www.synapse.org
1 http://www.stata.com/
1 http://www.shivom.io/
1 http://www.shinyapps.io
1 http://www.scbit.org/smiga/index
1 http://www.scbit.org/
1 http://www.salus
1 http://www.rsna.org/COVID-19
1 http://www.rcsb.org
1 http://www.pubmed.org
1 http://www.proteomicsresource
1 http://www.pewinternet.org/fact-sheet/social-media/
1 http://www.pestobserver.eu/
1 http://www.perl.org
1 http://www.next
1 http://www.ncbi.nlm.nih.gov/sra
1 http://www.ncbi.nlm.nih.gov/research/coronavirus/
1 http://www.ncbi.nlm.nih.gov/Taxonomy/
1 http://www.nature.com/

Top 50 email addresses; "Who are you gonna call?"
-------------------------------------------------
1 xuefeng.shao@sydney.edu.au
1 thanh.nguyen@deakin.edu.au
1 info-rdsgofair@go-fair.org
1 hayitg@gmail.com
1 hayit@eng.tau.ac.il
1 4ce@i2b2foundation.org
1 -paul@centiva.heal
1 -gulcin.gumus@eurordis.org

Top 50 positive assertions; "What sentences are in the shape of noun-verb-noun?"
--------------------------------------------------------------------------------
26 data are available
20 data is not
18 data are not
14 data are also
14 data is often
13 % were female
12 systems are not
11 data does not
11 data is available
10 % were male
10 data are often
10 models are very
9 information is available
8 analyses are almost
8 data do not
8 information is not
8 use is not
7 data is critical
7 models do not
7 patients did not
6 data is also
6 data is only
6 data were available
5 data is even
5 data is still
5 data use agreements
5 data was not
5 methods are also
5 model is then
5 models are also
5 models are not
5 models are useful
5 patients do not
5 patients were more
5 risk is more
5 study took place
5 study using data
5 system is not
5 user does not
4 approach is not
4 care is not
4 cases is ideal
4 data are already
4 data are still
4 data are typically
4 data have also
4 data have not
4 data is crucial
4 data is highly
4 data is particularly

Top 50 negative assertions; "What sentences are in the shape of noun-verb-no|not-noun?"
---------------------------------------------------------------------------------------
4 data are not always
4 data are not available
3 systems are not capable
2 approach is not applicable
2 data did not explicitly
2 data is not fully
2 data is not uniformly
2 groups was not statistically
2 information is not always
2 systems are not robust
1 % had no special
1 % showed no difference
1 % were not satisfied
1 analysis are not criminals
1 analysis showed no difference
1 approach has not yet
1 approach is not integral
1 approaches are not applicable
1 approaches are not robust
1 approaches are not yet
1 care are not automatically
1 care is not accurately
1 care is not as
1 care provided no additional
1 care was not easily
1 case are not globally
1 cases are no longer
1 data are not currently
1 data are not exclusively
1 data are not independent
1 data do not only
1 data includes not only
1 data including not only
1 data is not available
1 data is not enough
1 data is not feasible
1 data is not just
1 data is not necessarily
1 data is not necessary
1 data is not possible
1 data is not yet
1 data was not as
1 data was not reliably
1 data was not well
1 data were no less
1 data were not reliable
1 data were not suitable
1 data were not systematically
1 group had no significant
1 group is no inherent

A rudimentary bibliography
--------------------------

id = cord-292475-jrl1fowa
author = Abry, Patrice
title = Spatial and temporal regularization to estimate COVID-19 reproduction number R(t): Promoting piecewise smoothness via convex optimization
date = 2020-08-20
keywords = France; datum; time
summary = The novelty of the proposed approach is twofold: 1) the estimation of the reproduction number is achieved by convex optimization within a proximal-based inverse problem formulation, with constraints aimed at promoting piecewise smoothness; 2) the approach is developed in a multivariate setting, allowing for the simultaneous handling of multiple time series attached to
different geographical regions, together with a spatial (graph-based) regularization of their evolutions in time. In that spirit, the overarching goal of the present work is twofold: (1) proposing a new, more versatile framework for the estimation of R(t) within the semi-parametric model of [8, 10], reformulating its estimation as an inverse problem whose functional is minimized by using non smooth proximal-based convex optimization; (2) inserting this approach in an extended multivariate framework, with applications to various complementary datasets corresponding to different geographical regions.
doi = 10.1371/journal.pone.0237901

id = cord-034545-onj7zpi1
author = Abuelkhail, Abdulrahman
title = Internet of things for healthcare monitoring applications based on RFID clustering scheme
date = 2020-11-03
keywords = RFID; datum; node; tag
summary = The mathematical model optimizes the following objective functions: (1) minimizing the total distance between CHs and CMs to improve positioning accuracy; and (2) minimizing the number of clusters which reduces the signal transmission traffic. Feature 6 (F-6): two-level security is obtained when a node writes data to its RFID tag: the data is signed with a signature, which is a hash value, and the obtained hash is encrypted with an AES 128-bit shared key.
doi = 10.1007/s11276-020-02482-1

id = cord-206145-snkdgpym
author = Ackermann, Klaus
title = Object Recognition for Economic Development from Daytime Satellite Imagery
date = 2020-09-11
keywords = Africa; OSM; datum; image
summary =
doi = nan

id = cord-104486-syirijql
author = Adiga, Aniruddha
title = Data-driven modeling for different stages of pandemic response
date = 2020-09-21
keywords = COVID-19; datum; disease; model; pandemic
summary =
doi = nan

id = cord-312366-8qg1fn8f
author = Adiga, Aniruddha
title = Mathematical Models for COVID-19 Pandemic: A Comparative Analysis
date = 2020-10-30
keywords = COVID-19; Sweden; datum; disease; model; population
summary = As the
pandemic takes hold, researchers begin investigating: (i) various intervention and control strategies; usually pharmaceutical interventions do not work in the event of a pandemic and thus nonpharmaceutical interventions are most appropriate, (ii) forecasting the epidemic incidence rate, hospitalization rate and mortality rate, (iii) efficiently allocating scarce medical resources to treat the patients and (iv) understanding the change in individual and collective behavior and adherence to public policies. Like projection approaches, models for epidemic forecasting can be broadly classified into two broad groups: (i) statistical and machine learning-based data-driven models, (ii) causal or mechanistic models-see 29, 30, 2, 31, 32, 6, 33 and the references therein for the current state of the art in this rapidly evolving field. In the context of COVID-19 case count modeling and forecasting, a multitude of models have been developed based on different assumptions that capture specific aspects of the disease dynamics (reproduction number evolution, contact network construction, etc.).
doi = 10.1007/s41745-020-00200-6

id = cord-223332-51670qld
author = Agrawal, Prashant
title = An operational architecture for privacy-by-design in public service applications
date = 2020-06-08
keywords = India; SGX; access; data; datum; individual; privacy
summary = In this paper, we present an operational architecture for privacy-by-design based on independent regulatory oversight stipulated by most data protection regimes, regulated access control, purpose limitation and data minimisation. an interest in preventing information about the self from being disseminated and controlling the extent of access to information." It would be the role of a future Indian data protection law to create some objective standards for informational privacy to give all actors in society an understanding of the "ground rules" for accessing an individuals' personal information.
The need for early alignment of legal and technical design principles of data systems, such as access controls, purpose limitation and clear liability frameworks under appropriate regulatory jurisdictions are essential to create secure and trustworthy public data infrastructures [5, 6, 7]. We have presented the design sketch of an operational architecture for privacy-by-design [3] based on regulatory oversight, regulated access control, purpose limitation and data minimisation.
doi = nan

id = cord-016140-gvezk8vp
author = Ahonen, Pasi
title = Safeguards
date = 2008
keywords = Article; Data; Directive; European; Member; Protection; RFID; datum; information; privacy
summary = An example is the EC-supported CONNECT project, which aims to implement a privacy management platform within pervasive mobile services, coupling research on semantic technologies and intelligent agents with wireless communications (including UMTS, WiFi and WiMAX) and context-sensitive paradigms and multimodal (voice/graphics) interfaces to provide a strong and secure framework to ensure that privacy is a feasible and desirable component of future ambient intelligence applications. The fast emergence of information and communication technologies and the growth of online communication, e-commerce and electronic services that go beyond the territorial borders of the Member States have led the European Union to adopt numerous legal instruments such as directives, regulations and conventions on ecommerce, consumer protection, electronic signature, cyber crime, liability, data protection, privacy and electronic communication … and many others.
doi = 10.1007/978-1-4020-6662-7_5

id = cord-270721-81axdn0g
author = Allam, Zaheer
title = The Emergence of Voluntary Citizen Networks to Circumvent Urban Health Data Sharing Restrictions During Pandemics
date = 2020-07-24
keywords = Allam; COVID-19; Data; datum
summary = In view of required immediate actions, volunteered geographic information (VGI) and citizen science concept have emerged, where people voluntarily share location and health status data to circumvent data sharing restrictions imposed upon corporations and governments. With all these, in the case of COVID-19, startups engaged in providing more insights are observed to access data from those sources, including airline ticketing and from governments of different countries, and with these, they are able to run simulation and predictive algorithms to come up with conclusions guiding policy orientations. Such were shared by BlueDot and Metabiota, some of the modern startups that use data, and through advanced technologies, such as natural language processing and machine learning, they were able to predict some of the geographical location that the virus would spread next from Wuhan, days before first cases were reported in those regions. On the Coronavirus (COVID-19) Outbreak and the Smart City Network: Universal Data Sharing Standards Coupled with Artificial Intelligence (AI) to Benefit Urban Health Monitoring and Management
doi = 10.1016/b978-0-12-824313-8.00005-x

id = cord-278913-u6vihq3u
author = Allam, Zaheer
title = The Rise of Machine Intelligence in the COVID-19 Pandemic and Its Impact on Health Policy
date = 2020-07-24
keywords = datum; outbreak; virus
summary = For instance, despite the challenges raised earlier, some startup companies were able to use the available data from social media, airline ticketing, and medical institutions to identify that the world is experiencing a new virus outbreak days before those in medical fraternity had made similar findings (Gaille, 2019).
According to Niiler (2020), BlueDot, whose profile is shared in the following, was able to employ the services of AI-driven algorithms, to analyze data gathered from sources such as new reports, air ticketing, and animal disease outbreaks to predict that the world is facing a new type of virus outbreak. In the recent case of COVID-19, Metabiota was in the forefront to analyze the outbreak, and during the analysis of the data, some even sourced from social media, the company was able to predict which neighboring countries were at high risk of being the next target of the virus spread, more so because the panic in Wuhan had started to trigger some fear, forcing people to flee.
doi = 10.1016/b978-0-12-824313-8.00006-1

id = cord-324198-b8f99z8r
author = Allam, Zaheer
title = Underlining the Role of Data Science and Technology in Supporting Supply Chains, Political Stability and Health Networks During Pandemics
date = 2020-07-24
keywords = COVID-19; China; Coronavirus; United; datum
summary = Besides those, even when countries went on lockdown, the use of technology became even more apparent, as devices such as drones, robots, sensors, smart helmets, and thermal detectors were widely used for different purposes such as delivery, identifying potential coronavirus virus cases and other purposes (WHO, 2020b). Going further, even post-COVID-19, the role of computation technologies will continue, especially in reevaluating the policy responses, and hence help different stakeholders to identify areas of weakness and how such could be strengthened in case of similar future major disruptive events. According to The World Bank (2020), data transparency not only would help in reducing political tension and win over the coronavirus but is also prerequisite in weathering down the economic shocks affecting the global economy, especially by helping enhancing trust in governments, hence promoting investments especially post-COVID-19.
On the Coronavirus (COVID-19) Outbreak and the Smart City Network: Universal Data Sharing Standards Coupled with Artificial Intelligence (AI) to Benefit Urban Health Monitoring and Management
doi = 10.1016/b978-0-12-824313-8.00010-3

id = cord-018133-2otxft31
author = Altman, Russ B.
title = Bioinformatics
date = 2006
keywords = datum; dna; information; sequence; structure
summary = Experimentation and bioinformatics have divided the research into several areas, and the largest are: (1) genome and protein sequence analysis, (2) macromolecular structure-function analysis, (3) gene expression analysis, and (4) proteomics. With the completion of the human genome and the abundance of sequence, structural, and gene expression data, a new field of systems biology that tries to understand how proteins and genes interact at a cellular level is emerging. The Entrez system from the National Center for Biotechnology Information (NCBI) gives integrated access to the biomedical literature, protein, and nucleic acid sequences, macromolecular and small molecular structures, and genome project links (including both the Human Genome Project and sequencing projects that are attempting to determine the genome sequences for organisms that are either human pathogens or important experimental model organisms) in a manner that takes advantages of either explicit or computed links between these data resources.
doi = 10.1007/0-387-36278-9_22

id = cord-019050-a9datsoo
author = Ambrogi, Federico
title = Bioinformatics and Nanotechnologies: Nanomedicine
date = 2014
keywords = analysis; cancer; cell; datum; dna; expression; gene
summary = In this chapter we focus on the bioinformatics strategies for translating genome-wide expression analyses into clinically useful cancer markers with a specific focus on breast cancer with a perspective on new diagnostic device tools coming from the field of nanobiotechnology and the challenges related to high-throughput data integration, analysis, and assessment from multiple sources. In particular, DNA microarray-based technology, with the simultaneous evaluation of thousands of genes, has provided researchers with an opportunity to perform comprehensive molecular and genetic profiling of breast cancer able to classify it into some clinically relevant subtypes and in the attempt to predict the prognosis or the response to treatment [32.5-8].
doi = 10.1007/978-3-642-30574-0_32

id = cord-275069-opuwyaiv
author = Amram, Denise
title = Building up the “Accountable Ulysses” model. The impact of GDPR and national implementations, ethics, and health-data research: Comparative remarks
date = 2020-07-31
keywords = GDPR; datum; research
summary = For this reason, considering the new ethical-legal issues emerging from the scientific-technological progress that involves a daily use of health-related data, our comparative analysis will firstly discuss the legal bases for health data processing for research purposes in order to identify the critical profiles as well as possible practical solutions that might help Ulysses 4.0. Some critical profiles emerge from article 9, para 4, GDPR which allows Member States to decide whether or not maintaining the legal bases provided by the EU Regulation or introducing further conditions, including limitations, with regard to the processing of particularly sensitive data, like the genetic data, the biometric ones, or those concerning health. According to the above-discussed system, the data controller (i.e. the university/research institute in person of the legal representative) shall involve the principal investigator in the data management activities, authorizing to data processing under article 29 GDPR, in order to proactively guarantee the adoption of those technical and organizational measures aimed at safeguarding the rights and freedoms of data subjects in her project.
doi = 10.1016/j.clsr.2020.105413

id = cord-226956-n5qwsvtr
author = Arbia, Giuseppe
title = A Note on Early Epidemiological Analysis of Coronavirus Disease 2019 Outbreak using Crowdsourced Data
date = 2020-03-13
keywords = datum; test
summary =
doi = nan

id = cord-310406-5pvln91x
author = Asbury, Thomas M
title = Genome3D: A viewer-model framework for integrating and visualizing multi-scale epigenomic information within a three-dimensional genome
date = 2010-09-02
keywords = datum; genome; model
summary = RESULTS: We have applied object-oriented technology to develop a downloadable visualization tool, Genome3D, for integrating and displaying epigenomic data within a prescribed three-dimensional physical model of the human genome. In addition, in spite of the many recent efforts to measure and model the genome structure at various resolutions and detail [3] [4] [5] [6] [7] [8] [9] [10], little work has focused on combining these models into a plausible aggregate, or has taken advantage of the large amount of genomic and epigenomic data available from new high-throughput approaches. The viewer is designed to display data from multiple scales and uses a hierarchical model of the relative positions of all nucleotide atoms in the cell nucleus, i.e., the complete physical genome. An integrated physical genome model can show the interplay between histone modifications and other genomic data, such as SNPs, DNA methylation, the structure of gene, promoter and transcription machinery, etc. In addition to epigenomic data, the physical genome model also provides a platform to visualize high-throughput gene expression data and its interplay with global binding information of transcription factors.
doi = 10.1186/1471-2105-11-444

id = cord-028802-ko648mzz
author = Asri, Hiba
title = Big Data and Reality Mining in Healthcare: Promise and Potential
date = 2020-06-05
keywords = big; datum; reality
summary = We illustrate the benefits of reality mining analytics that lead to promote patients' health, enhance medicine, reduce cost and improve healthcare value and quality. This paper gives insight on the challenges and opportunities related to analyzing larger amounts of health data and creating value from it, the capability of reality mining in predicting outcomes and saving lives, and the Big Data tools needed for analysis and processing. Reality Mining is about using big data to study our behavior through mobile phone and wearable sensors [4]. Another study uses pregnant woman's mobile phone health data like user's activity, user's sleep quality, user's location, user's age, user's Body Mass Index (BMI) among others, considered as risk factors of miscarriage, in order to make an early prediction of miscarriage and react as early as possible to prevent it. The use of both Big data and reality mining in healthcare industry has the capability to provide new opportunities with respect to patients, treatment monitoring, healthcare service and diagnosis.
doi = 10.1007/978-3-030-51935-3_13

id = cord-002366-t94aufs3
author = Aurrecoechea, Cristina
title = EuPathDB: the eukaryotic pathogen genomics database resource
date = 2017-01-04
keywords = analysis; datum; figure; gene
summary = To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms.
Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. The near-seamless integration of strategy results with tools for functional enrichment analyses and transcript interpretation as well as our new Galaxy workspace and the availability of publicly shared strategies augment the data mining experience in EuPathDB.
doi = 10.1093/nar/gkw1105

id = cord-290003-pmf7aps6
author = Avtar, Ram
title = Assessing sustainable development prospects through remote sensing: A review
date = 2020-09-03
keywords = Remote; Sensing; datum
summary = Remote sensing allows for the measurement, integration, and presentation of useful information for effective decision-making at various temporal and spatial scales. Although several approaches and techniques are available to monitor natural resources and hazards, remote sensing (RS) technology has been particularly popular since the 1970s because of its low acquisition costs and high utility for data collection, interpretation, and management. Based on these RS data, forest fragmentation, land use and cover, and species distributions have been mapped and monitored over time (Kerr et al., 2001; Menon and Bawa, 1997). • Sustainable transportation mapping and analysis in developing countries is greatly affected by the availability, cost, licensing and access to high resolution real-time imageries and image processing software.
With the development of new and improved satellite and airborne sensors, data with increasingly higher spatial, spectral, and/or temporal resolution will become available for researchers, decision-making in many areas of sustainable development.
doi = 10.1016/j.rsase.2020.100402

id = cord-275742-7jxt6diq
author = Batarseh, Feras A.
title = Preventive healthcare policies in the US: solutions for disease management using Big Data Analytics
date = 2020-06-23
keywords = CDC; NHANES; datum; patient
summary = Our work's main objective (hypothesis) is two-tier: through one of the largest and most representative national health datasets for population-based surveillance, data imputations and machine learning models (such as clustering) offer preventive care pointers by grouping patients into heterogeneous clusters, and providing data-driven predictions and policies for healthcare in the US. The Center for Disease Control and Prevention (CDC) reported on those states, and presented multiple cases to help increase public trust in immunizations: "We hope this report is a reminder to healthcare professionals to make a strong vaccine recommendation to their patients at every visit and make sure parents understand how important it is for their children to get all their recommended vaccinations on time" [5, 8]. 2. We aim to collect more CDC data variables to provide more correlations and further tests for imputations, and compare with other NHANES predictive models for specific diseases such as periodontitis [39].
doi = 10.1186/s40537-020-00315-8

id = cord-285522-3gv6469y
author = Bello-Orgaz, Gema
title = Social big data: Recent achievements and new challenges
date = 2015-08-28
keywords = Hadoop; Spark; Twitter; big; datum; network; social
summary = Big data has become an important issue for a large number of research areas such as data mining, machine learning, computational intelligence, information fusion, the semantic Web, and social networks.
The rise of different big data frameworks such as Apache Hadoop and, more recently, Spark, for massive data processing based on the MapReduce paradigm has allowed for the efficient utilisation of data mining methods and machine learning algorithms in different domains. Currently, the exponential growth of social media has created serious problems for traditional data analysis algorithms and techniques (such as data mining, statistics, machine learning, and so on) due to their high computational complexity for large datasets. This section provides a description of the basic methods and algorithms related to network analytics, community detection, text analysis, information diffusion, and information fusion, which are the areas currently used to analyse and process information from social-based sources. doi = 10.1016/j.inffus.2015.08.005 id = cord-130507-baheh8i5 author = Benreguia, Badreddine title = Tracking COVID-19 by Tracking Infectious Trajectories date = 2020-05-12 keywords = datum; person; suspect; system summary = doi = nan id = cord-252984-79jzkdu2 author = Bickman, Leonard title = Improving Mental Health Services: A 50-Year Journey from Randomized Experiments to Artificial Intelligence and Precision Mental Health date = 2020-07-26 keywords = Bickman; Health; Mental; RCT; approach; datum; machine; research; service; study; treatment summary = I describe five principal causes of this failure, which I attribute primarily, but not solely, to methodological limitations of RCTs. Lastly, I make the case for why I think AI and the parallel movement of precision medicine embody approaches that are needed to augment, but probably not replace, our current research and development efforts in the field of mental health services. 
(1) harmonize terminology and specify MBC's core components; (2) develop criterion standard methods for monitoring fidelity and reporting quality of implementation; (3) develop algorithms for MBC to guide psychotherapy; (4) test putative mechanisms of change, particularly for psychotherapy; (5) develop brief and psychometrically strong measures for use in combination; (6) assess the critical timing of administration needed to optimize patient outcomes; (7) streamline measurement feedback systems to include only key ingredients and enhance electronic health record interoperability; (8) identify discrete strategies to support implementation; (9) make evidence-based policy decisions; and (10) align reimbursement structures. doi = 10.1007/s10488-020-01065-8 id = cord-029865-zl0romvl author = Bowe, Emily title = Learning from lines: Critical COVID data visualizations and the quarantine quotidian date = 2020-07-27 keywords = COVID-19; Data; available; datum; map summary = In response to the ubiquitous graphs and maps of COVID-19, artists, designers, data scientists, and public health officials are teaming up to create counter-plots and subaltern maps of the pandemic. Together, the official maps and counter-plots acknowledge that the pandemic plays out differently across different scales: COVID-19 is about global supply chains and infection counts and TV ratings for presidential press conferences, but it is also about local dynamics and neighborhood mutual aid networks and personal geographies of mitigation and care. The widespread availability of consumer-friendly mapping platforms and open data repositories has equipped cartographers and information designers to plot their own charts and graphs, some of which then circulate on social media or appear on slide shows at official public health briefings (Bazzaz, 2020; Mattern, 2020a; "Triplet Kids," 2020).
Available at: www.medium.com/nightingale/covid-19-data-literacy-isfor-everyone-46120b58cec9 Available at: www.expressnews.com/news/local/article/Thousands-h it-hard-by-coronavirus-pandemic-s-15189948 doi = 10.1177/2053951720939236 id = cord-276405-yfvu83r9 author = Brat, Gabriel A. title = International electronic health record-derived COVID-19 clinical course profiles: the 4CE consortium date = 2020-08-19 keywords = COVID-19; datum; laboratory; patient; site summary = Because EHRs are not themselves agile analytic platforms, we have been successfully building upon the open source and free i2b2 (for Informatics for Integrating Biology and the Bedside) toolkit [10] [11] [12] [13] [14] [15] [16] [17] to manage, compute, and share data extracted from EHRs. In response to COVID-19, we have organized a global community of researchers, most of whom are or have been members of the i2b2 Academic Users Group, to rapidly set up an ad hoc network that can begin to answer some of the clinical and epidemiological questions around COVID-19 through data harmonization, analytics, and visualizations. Laboratory value trajectories Our initial data extraction included 14 laboratory markers of cardiac, renal, hepatic, and immune dysfunction that have been strongly associated with poor outcomes in COVID-19 patients in previous publications. doi = 10.1038/s41746-020-00308-0 id = cord-197127-o30tiqel author = Breugel, Floris van title = Numerical differentiation of noisy data: A unifying multi-objective optimization framework date = 2020-09-03 keywords = Fig; datum; derivative; estimate summary = In this work, we take a principled approach and propose a multi-objective optimization framework for choosing parameters that minimize a loss function to balance the faithfulness and smoothness of the derivative estimate. 
To understand the qualities of the derivative estimates resulting from parameters selected by our loss function, we begin by analyzing the derivative estimates of noisy sinusoidal curves using the Savitzky-Golay filter and return to our original metrics, RMSE and error correlation to evaluate the results. To characterize this relationship, we evaluated the performance of derivative estimates achieved by a Savitzky-Golay filter by sweeping through different values of γ for a suite of sinusoidal data with various frequencies (f), noise levels (additive white (zero-mean) Gaussian noise with variance σ²), temporal resolutions (∆t), and dataset lengths (in time steps, L) (Fig. 2A-B). doi = nan id = cord-348244-1py0k53e author = Buyse, Marc title = Central statistical monitoring of investigator-led clinical trials in oncology date = 2020-06-23 keywords = clinical; datum; trial summary = We describe the principles of central statistical monitoring, provide examples of its use, and argue that it could help drive down the cost of randomized clinical trials, especially investigator-led trials, whilst improving their quality. Yet, there is no evidence showing that extensive data monitoring has any major impact on the quality of clinical-trial data, and none of the randomized studies assessing more intensive versus less intensive monitoring has shown any difference in terms of clinically relevant treatment outcomes [18] [19] [20] [21] [22]. Both types of trials may benefit from central statistical monitoring of the data; industry-sponsored trials to target centers that are detected as having potential data quality issues, which may require an on-site audit, and investigator-led trials as the primary method for checking data quality.
An evidence-based study of the cost for data monitoring in clinical trials A statistical approach to central monitoring of data quality in clinical trials doi = 10.1007/s10147-020-01726-6 id = cord-138627-jtyoojte author = Buzzell, Andrew title = Public Goods From Private Data -- An Efficacy and Justification Paradox for Digital Contact Tracing date = 2020-07-14 keywords = datum; dct; privacy; public summary = Privacy-centric analysis treats data as private property, frames the relationship between individuals and governments as adversarial, entrenches technology platforms as gatekeepers, and supports a conception of emergency public health authority as limited by individual consent and considerable corporate influence that is in some tension with the more communitarian values that typically inform public health ethics. They require populations be persuaded to use the DCT app, and that hardware and software vendors cooperate with public health authorities to resolve barriers to adoption and usage, such as the need for software modifications to enable passive RSSI measurement. The privacy preserving model serves vendor interests, allowing them to cooperate with public health authorities, thus avoiding regulatory or coercive measures, by limiting the possibility that the use of DCT apps breaks tacit or contractual agreements with their users that could damage already wavering public trust. doi = nan id = cord-259247-7loab74f author = CAPPS, BENJAMIN title = Where Does Open Science Lead Us During a Pandemic? 
A Public Good Argument to Prioritize Rights in the Open Commons date = 2020-06-05 keywords = datum; open; public; science summary = doi = 10.1017/s0963180120000456 id = cord-327810-kquh59ry author = Canhoto, Ana Isabel title = Leveraging machine learning in the global fight against money laundering and terrorism financing: An affordances perspective date = 2020-10-17 keywords = AML; BANK; datum; financial; machine; money summary = These requirements mean that financial services organisations are wary of adopting technologies where they lack complete control over use of customer data, or whose workings they do not fully understand, as in the case of black-box type of algorithms. In addition to the specific technical and organisational challenges associated with the specific types of algorithms discussed above, there are some generic issues that condition BANK's ability to use machine learning in AML profiling. Machine learning's ability to discover patterns in data, process various types of data and act autonomously promises to enable financial intermediaries to detect money laundering activity in a cost-effective manner (Fernandez, 2019). While financial services organisations may be essential enablers of money laundering and, indirectly, criminal activity, their perspective is limited to the transaction data for their own customers and their own institution. doi = 10.1016/j.jbusres.2020.10.012 id = cord-183016-ajwnihk6 author = Carrillo, Dick title = Containing Future Epidemics with Trustworthy Federated Systems for Ubiquitous Warning and Response date = 2020-10-26 keywords = base; datum; privacy; system summary = In this context, one main factor is to design a special set of incentives that would allow the citizens to provide secured anonymized access to their data while actively participating in the crowd platform to support early disease detection, a public information system, and possible mitigation measures.
2) a federated global epidemiological warning system is proposed based on DLTs. 3) a proof of concept of the integration between DLT and NB-IoT is used to evaluate the wireless network performance on the IoT infrastructure supporting a remote patient monitoring use case. There are three principal sources of epidemic-relevant data acquired through wireless connectivity: (1) online social networks; (2) personal smart phone and mobile data; and (3) sensory and Internet of Things (IoT) devices. In the context of the proposed federated global epidemiological warning system, the remote patient monitoring is a representative use case, in which the integration between DLTs and IoT devices plays a key role. doi = nan id = cord-209932-1lsv7cel author = Challet, Damien title = Predicting financial markets with Google Trends and not so random keywords date = 2013-07-17 keywords = datum; keyword summary = doi = nan id = cord-184194-zdxebonv author = Chen, Lichin title = Using Deep Learning and Explainable Artificial Intelligence in Patients' Choices of Hospital Levels date = 2020-06-24 keywords = datum; hospital; patient; provider summary = doi = nan id = cord-133273-kvyzuayp author = Christ, Andreas title = Artificial Intelligence: Research Impact on Key Industries; the Upper-Rhine Artificial Intelligence Symposium (UR-AI 2020) date = 2020-10-05 keywords = CNN; Fig; ICU; base; datum; feature; figure; learn; model; network; result; robot; system summary = During the literature review it was evident the presence of few works dedicated to evaluating comprehensively the complete cycle of biofeedback, which comprises using the wearable devices, applying Machine Learning patterns detection algorithms, generate the psychologic intervention, besides monitoring its effects and recording the history of events [9, 3] . 
This solution has been proposed by several studies of stress patterns and physiological aspects, but with few results. For this reason, our project will address topics ranging from an experimental study protocol for signal acquisition from patients/participants with wearables to data acquisition and processing; machine learning modeling and prediction will then be applied to biosignal data regarding stress (Fig. 1). We will present first results of the project concerning a new process model for cooperating data scientists and quality engineers, a product testing model as knowledge base for machine learning computing and visual support of quality engineers in order to explain prediction results. doi = nan id = cord-025545-s6t9a7z8 author = Christantonis, Konstantinos title = Using Classification for Traffic Prediction in Smart Cities date = 2020-05-06 keywords = datum; traffic; weather summary = This work focuses on analyzing different approaches regarding data manipulation in order to predict day-ahead traffic loads at random places around cities, based on weather conditions. Based on that, we used weather data collected from sensors installed around carefully chosen specific city spots for predicting the day-ahead traffic volume. To select the most appropriate locations to install sensors that either measure traffic loads or collect weather data, it is crucial to define their objective in advance. Our efforts focus on the question 'How can one exploit sensor data that are not personalized and create meaningful conclusions for the general public?' Deployment of smart city infrastructure requires a deep understanding of the traffic problem. Our approach, besides examining traffic predictability based on weather data, also aims to clarify differences among locations. For example, if a sensor captures information every hour (e.g. at 07:10, 08:10, 09:10 etc.), we computed and assigned the average value for each weather metric and the traffic load for that specific day period.
doi = 10.1007/978-3-030-49161-1_5 id = cord-270703-c8mv2eve author = Christensen, Paul A title = Real-time Communication With Health Care Providers Through an Online Respiratory Pathogen Laboratory Report date = 2018-11-30 keywords = datum; influenza summary = We implemented a real-time report to distribute respiratory pathogen data for our 8-hospital system to anyone with an Internet connection and a web browser. To address these local needs in a major US metropolitan area, our clinical microbiology laboratory implemented an online dashboard to distribute respiratory pathogen data for our 8-hospital system to clinicians, epidemiologists, infection control practitioners, system leadership, and the public. Development of this report began in the Fall 2017, before the respiratory virus season, during which influenza reached an epidemic status across the United States that resulted in supply shortages, testing difficulties, and a widespread public health crisis [4, 5]. In summary, our microbiology laboratory implemented a near real-time Internet report to distribute respiratory pathogen data for our 8-hospital system to clinicians, hospital epidemiologists, infection control committees, system leadership, and the public.
doi = 10.1093/ofid/ofy322 id = cord-343962-12t247bn author = Cori, Anne title = Key data for outbreak evaluation: building on the Ebola experience date = 2017-05-26 keywords = Ebola; african; case; datum; west summary = Here we build on experience gained in the West African Ebola epidemic and prior emerging infectious disease outbreaks to set out a checklist of data needed to: (1) quantify severity and transmissibility; (2) characterize heterogeneities in transmission and their determinants; and (3) assess the effectiveness of different interventions. Dynamic transmission models, which account for saturation effects, can be used to assess the long-term impact of the outbreak such as predicting the timing and magnitude of the epidemic peak or the attack rate (final proportion of population infected) [39, 40] . Estimates of the secondary attack rate have been obtained for the West African Ebola epidemic by reconstructing household data based on information reported by cases, in particular, as part of contact-tracing activities [86, 87] . Such data were widely used during the West African Ebola epidemic to quantify the risk of international spread of the disease, and to assess the potential impact of airport screening and travel restrictions on the outbreak [9,94 -96] . doi = 10.1098/rstb.2016.0371 id = cord-306375-cs4s2o8y author = Costa-Santos, C. title = COVID-19 surveillance - a descriptive study on data quality issues date = 2020-11-05 keywords = covid-19; dataset; datum summary = Nevertheless, to our knowledge, there is no study performing a structured assessment of data quality issues from the datasets provided by National Surveillance Systems for research purposes during the COVID-19 pandemic. 
This updated database had an inconsistent manifest, including some variables presented in a different format (for example, instead of a variable with the outcome of the patient, the second dataset presented two dates: death and recovery date), or with different definitions (for example, variable age was defined as the age at the time of COVID-19 onset or as age at the time of COVID-19 notification, in the first and second datasets, respectively), which raised concerns regarding their use for valid research and replication of the analysis made using the first version of data. The DGSAugust dataset included 38,520 COVID-19 cases diagnosed between March and June, 4,003 fewer cases (9%) than the daily public report provided by the Portuguese Directorate-General of Health. doi = 10.1101/2020.11.03.20225565 id = cord-176677-exej3zwh author = Coveney, Peter V. title = When we can trust computers (and when we can't) date = 2020-07-08 keywords = datum; machine; model; result; science; system summary = doi = nan id = cord-224516-t5zubl1p author = Daubenschuetz, Tim title = SARS-CoV-2, a Threat to Privacy? date = 2020-04-21 keywords = SARS; datum; information; privacy summary = We furthermore discuss the issues with privacy that can occur during a crisis such as this global pandemic and what can be done to ensure information security and hence appropriate data protection. When we are considering the example of doctors treating their patients, we can use the framework of contextual integrity to reason about the appropriate information flow as follows: the patient is both the sender and the subject of the data exchange, the doctor is the receiver, the information type is the patient's medical information, the transmission principle includes, most importantly, doctor-patient confidentiality aside from public health issues.
In Germany, the authority for disease control and prevention, the Robert Koch Institute (RKI), made headlines on March 18, 2020, as it became public that telecommunication provider Telekom had shared an anonymized set of mobile phone movement data to monitor citizens' mobility in the fight against SARS-CoV-2. doi = nan id = cord-320040-h8v6cs5b author = Delaunay, Sophie title = Knowledge sharing during public health emergencies: from global call to effective implementation date = 2016-04-01 keywords = datum summary = To improve epidemic emergency response and to accelerate related research, health authorities in potentially exposed countries must put in place the necessary frameworks for collecting, managing and swiftly making available good-quality, standardized data and for safely securing and sharing biomaterial, such as patient samples, collected during the outbreak. As the Zika outbreak shows, the global public health community is still unprepared to collect good quality, standardized data and biomaterials during emergencies and to share them in ways that provide equitable access to researchers. Together, a virtual biobank and a data repository could provide a global resource for the essential research needed to plan effective outbreak responses. ■ Knowledge sharing during public health emergencies: from global call to effective implementation Sophie Delaunay, a Patricia Kahn, a Mercedes Tatay b & Joanne Liu b doi = 10.2471/blt.16.172650 id = cord-176472-4sx34j90 author = Diou, Christos title = BigO: A public health decision support system for measuring obesogenic behaviors of children in relation to their local environment date = 2020-05-06 keywords = BigO; behavior; datum summary = doi = nan id = cord-326908-l9wrrapv author = Duchêne, David A.
title = Evaluating the Adequacy of Molecular Clock Models Using Posterior Predictive Simulations date = 2015-07-10 keywords = clock; datum; model summary = We test the power of this approach using simulated data and find that the method is sensitive to bias in the estimates of branch lengths, which tends to occur when using underparameterized clock models. 2001) ; uncorrelated beta-distributed rate variation among lineages; misleading node-age priors (i.e., node calibrations that differ considerably from the true node ages); and when data were generated under a strict clock but analyzed with an underparameterized substitution model ( fig. The substitution model was identified as inadequate for the coronavirus data set by the multinomial test statistic estimated using posterior predictive data sets from a clock analysis (P < 0.05); however, it was identified as adequate when using a clock-free method (P = 0.20). In addition, our metric of uncertainty in posterior predictive branch lengths is sensitive to some cases of misspecification of clock models and node-age priors, but not to substitution model misspecification, as shown for our analyses of the coronavirus data set. doi = 10.1093/molbev/msv154 id = cord-032763-cdhu2pfi author = Efroni, Zohar title = Location Data as Contractual Counter-Performance: A Consumer Perspective on Recent EU Legislation date = 2020-06-22 keywords = DCSD; Directive; art; datum; digital; location summary = 38 Therefore, this Regulation should require providers of electronic communications services to obtain end-users' consent to process electronic communications metadata, which should include data on the location of the device generated for the purposes of granting and maintaining access and connection to the service.
The initial Commission's proposal (COM-DCD) included a provision that extended the scope of the Directive to cases where the consumer actively provides, in exchange for digital content, counter-performance other than money in the form of personal data or any other data. 94 It follows that data which qualify as 'metadata' will trigger protection only if the exchange of such data against digital content/services is specifically recognised under domestic law as a 88 COM-DCD, recital 14: 'As regards digital content supplied not in exchange for a price but against counter-performance other than money, this Directive should apply only to contracts where the supplier requests and the consumer actively provides data' (emphasis added). doi = 10.1007/978-3-662-61920-9_13 id = cord-279125-w6sh7xpn author = Egli, Adrian title = Digital microbiology date = 2020-06-27 keywords = datum; machine summary = Making an efficient use of big data, machine learning, and artificial intelligence in clinical microbiology requires a profound understanding of data handling aspects. clinical decision support systems based on machine learning to provide automated feedback regarding empiric antibiotic prescription adapted to specific patient groups 46. As physiology and laboratory parameters can rapidly change during an infection, time-series data greatly impact the predictive values of such algorithms. Similar to a doctor, who observes the patient during disease progression, machine learning algorithms will also follow the patient's data stream. Machine learning algorithms may be used at each step of the microbiological diagnostic process from pre- to post-analytics, helping us to deal with the increasing quantities and complexity of data 113,114 (Table 1). Machine learning radically changes the way we handle healthcare-related data, including data of clinical microbiology and infectious diseases.
doi = 10.1016/j.cmi.2020.06.023 id = cord-301405-7ijaxk4v author = El Mouden, Zakariyaa Ait title = Towards Using Graph Analytics for Tracking Covid-19 date = 2020-12-31 keywords = covid-19; datum; graph summary = The purpose of this paper is to introduce a graph-based approach of communities detection in the novel coronavirus Covid-19 countries' datasets. Recent works combined between spectral methods and deep learning models, such as the case of [24] where the authors presented their deep clustering approach to cluster data using both neural networks and graph analytics. Our proposed approach consists of a SC based communities detection where the objective is to have an unsupervised grouping of countries having similar behaviors of Covid-19 spreading. In this paper, we proposed a graph-based approach for clustering Covid-19 data using spectral clustering. Ongoing work intends to link the different processes of the model, developed with two different programming languages (Java and R) to build a model able to cluster heterogeneous data based on graph analytics and spectral clustering for communities' detection. An application of spectral clustering approach to detect communities in data modeled by graphs doi = 10.1016/j.procs.2020.10.029 id = cord-148109-ql1tthyr author = El-Din, Doaa Mohey title = E-Quarantine: A Smart Health System for Monitoring Coronavirus Patients for Remotely Quarantine date = 2020-05-05 keywords = LSTM; datum; patient summary = doi = nan id = cord-026935-586w2cam author = Fang, Zhichao title = An extensive analysis of the presence of altmetric data for Web of Science publications across subject fields and research topics date = 2020-06-17 keywords = Fig; Twitter; altmetric; datum summary = doi = 10.1007/s11192-020-03564-9 id = cord-347121-5drl3xas author = Farah, I. title = A global omics data sharing and analytics marketplace: Case study of a rapid data COVID-19 pandemic response platform.
date = 2020-09-29 keywords = COVID-19; SARS; September; Shivom; data; datum; patient; platform; research summary = The platform combines patient genomic & omics data sets, a marketplace for AI & bioinformatics algorithms, new diagnostic tools, and data-sharing capabilities to advance virus epidemiology and biomarker discovery. The platform is a proven research ecosystem used by universities, biotech, and bioinformatics organizations to share and analyze omics data and can be used for a variety of use cases; from precision medicine, drug discovery, translational science to building data repositories, and tackling a disease outbreak. Our approach is designed to provide healthcare professionals with an urgently needed platform to find and analyze genetic data, and securely and anonymously share sensitive patient data to fight the disease outbreak. Among other use-cases, the provided platform can be used to rapidly study SARS-CoV-2, including analyses of the host response to COVID-19 disease, establish a multi-institutional collaborative datahub for rapid response for current and future pandemics, characterizing potential co-infections, and identifying potential therapeutic targets for preclinical and clinical development. doi = 10.1101/2020.09.28.20203257 id = cord-329986-sbyu7yuc author = Farrokhi, Aydin title = Using artificial intelligence to detect crisis related to events: Decision making in B2B by artificial intelligence date = 2020-11-30 keywords = Enron; Management; crisis; datum; email; event; organization summary = The study extends the situational crisis communication theory (SCCT) and Attribution theory frameworks built on big data and machine learning capabilities for early detection of crises in the market. This pioneering study is among the first studies that endeavour to use email data and sentiment analysis for extracting meaningful information that helps early detection of a crisis in an organization. 
This study aims to develop a big data analytics framework by deploying artificial intelligence rational agents generated by R/Python programming language capable of collecting data from different sources, such as emails, Tweets, Facebook, weblogs, online communities, databases, and documents, among others (structured, semistructured, and unstructured data). Previous studies have considered the use of network data for situational awareness; however, to the authors' knowledge, none have specifically investigated or analyzed the use of email communication by major organizations for situational assessment of a developing crisis. doi = 10.1016/j.indmarman.2020.09.015 id = cord-297811-8gyejoc5 author = Finnie, Thomas J.R. title = EpiJSON: A unified data-format for epidemiology date = 2015-12-29 keywords = JSON; datum; record summary = We introduce 'EpiJSON', a new, flexible, and standards-compliant format for the interchange of epidemiological data using JavaScript Object Notation. With this and the common morphology of a dataset in mind, we propose a standard for the storage and transmission of data for infectious disease epidemiology: EpiJSON (Epidemiological JavaScript Object Notation). Fundamentally, the structure of an EpiJSON file consists of three levels that we term "metadata", "records" and "events" (Fig. 2). An "attribute" object is used for storing unambiguously a discrete piece of information, recording not only the value of the data but also its name, type and units. It provides a variety of functions that can convert data to each of the levels within EpiJSON (metadata, attributes, records, events and objects). In EpiJSON we provide a well-understood file structure with a verifiable format for storing and exchanging epidemiological data. doi = 10.1016/j.epidem.2015.12.002 id = cord-264994-j8iawzp8 author = Fitzpatrick, Meagan C.
title = Modelling microbial infection to address global health challenges date = 2019-09-20 keywords = Ebola; HIV; datum; disease; model; transmission summary = Epidemiological modelling is a tool that can be used to mitigate this risk by predicting disease spread or quantifying the impact of different intervention strategies on disease transmission dynamics. We illustrate how four decades of methodological advances and improved data quality have facilitated the contribution of modelling to address global health challenges, exemplified by models for the HIV crisis, emerging pathogens and pandemic preparedness. Compartmental models analysing the interplay between vaccine uptake and disease dynamics confirmed the hypothesis that increases in vaccination were a response to the pertussis infection risk 61 , and showed that incorporating this interplay can improve epidemiological forecasts.
doi = 10.1038/s41564-019-0565-8 id = cord-238342-ecuex64m author = Fong, Simon James title = Composite Monte Carlo Decision Making under High Uncertainty of Novel Coronavirus Epidemic Using Hybridized Deep Learning and Fuzzy Rule Induction date = 2020-03-22 keywords = Eqn; FRI; datum; epidemic; model summary = doi = nan id = cord-339440-qu913a8q author = Fonseca, David title = New methods and technologies for enhancing usability and accessibility of educational data date = 2020-10-26 keywords = datum; educational; learning; student summary = • The invited session entitled "Emerging interactive systems for education", in the thematic area "Learning and This special issue focuses on how to improve universal access to educational data, with emphasis on (a) new technologies and associated data in educational contexts: artificial intelligence systems [70] , robotics [71] [72] [73] , augmented [74] [75] [76] and virtual reality (VR) [77] [78] [79] [80] [81] , and educational data integration and management [82] ; (b) the role of data in the digital transformation and future of higher education: Personal Learning Environments (PLE) [83, 84] , mobile PLE [85, 86] , stealth assessment [87] , technology-supported collaboration and teamwork in educational environments [88] , and student''s engagement and interactions [89, 90] ; (c) user and case studies on ICTs in education [91, 92] ; (d) educational data in serious games and gamification: gamification design [93] [94] [95] [96] , serious game mechanics for education [97, 98] , ubiquitous/pervasive gaming [99] , and game-based learning and teaching programming [100, 101] ; and (e) educational data visualization and data mining [102] : learning analytics [103] , knowledge discovery [104] , user experience [105, 106] , social impact [107] , good practices [108] , and accessibility [109, 110] . 
doi = 10.1007/s10209-020-00765-0 id = cord-356353-e6jb0sex author = Fourcade, Marion title = Loops, ladders and links: the recursivity of social and machine learning date = 2020-08-26 keywords = Bourdieu; Facebook; Twitter; datum; learning; machine; medium; people; platform; social; system summary = Both practices rely upon and reinforce a pervasive appetite for digital input or feedback that we characterize as "data hunger." They also share a propensity to assemble insight and make meaning accretively-a propensity that we denote here as "world or meaning accretion." Throughout this article, we probe the dynamic interaction of social and machine learning by drawing examples from one genre of online social contention and connection in which the pervasive influence of machine learning is evident: namely, that which occurs across social media channels and platforms. In such settings, the data accretion upon which machine learning depends for the development of granular insights-and, on social media platforms, associated auctioning and targeting of advertising-compounds the cumulative, sedimentary effect of social data, making negative impressions generated by "revenge porn," or by one''s online identity having been fraudulently coopted, hard to displace or renew. doi = 10.1007/s11186-020-09409-x id = cord-016448-7imgztwe author = Frishman, D. title = Protein-protein interactions: analysis and prediction date = 2009-10-01 keywords = Cytoscape; Fig; PSI; datum; domain; interaction; network; protein summary = In general, investigating the topology of protein interaction, metabolic, signaling, and transcriptional networks allows researchers to reveal the fundamental principles of molecular organization of the cell and to interpret genome data in the context of large-scale experiments. 
The basic principle is fairly simple and rests implicitly on a multigraph representation: several interaction networks to be integrated, each resulting from a specific experimental or predictive method, are defined over the same set of proteins. This software provides functionalities for (i) generating biological networks, either manually or by importing interaction data from various sources, (ii) filtering interactions, (iii) displaying networks using graph layout algorithms, (iv) integrating and displaying additional information like gene expression data, and (v) performing analyses on networks, for instance, by calculating topological network properties or by identifying functional modules. The evidence can be derived from literature mining, functional associations based on Gene Ontology annotations, co-occurrence of transcriptional motifs, correlation of expression data, sequence similarity, common protein domains, shared metabolic pathway membership, and protein-protein interactions. doi = 10.1007/978-3-211-75123-7_17 id = cord-275300-4phjvxat author = Galván‐Casas, C. title = Sars‐CoV‐2 infection: the same virus can cause different cutaneous manifestations: reply from authors date = 2020-06-22 keywords = datum summary = We have reported and included in the supplementary material a few cases that were noticed by their doctors and were the first descriptions of enanthem in COVID-19. Given the low number of cases and their non-systematic acquisition, we avoided any analysis of these data.
All the included patients gave informed consent before incorporating their data in the study. doi = 10.1111/bjd.19317 id = cord-233012-ltbvpv8b author = Garcia-Gasulla, Dario title = Global Data Science Project for COVID-19 Summary Report date = 2020-06-10 keywords = COVID-19; Instagram; March; datum summary = doi = nan id = cord-339886-th1da1bb author = Gardy, Jennifer L. title = Towards a genomics-informed, real-time, global pathogen surveillance system date = 2017-11-13 keywords = Ebola; Health; Zika; datum; genomic; pathogen; surveillance summary = Given that outbreaks of emerging infectious diseases (EIDs) most often occur in settings with minimal laboratory capacity, where routine culture and bench-top sequencing are simply not feasible, the need for a portable diagnostic platform capable of in situ clinical metagenomics and outbreak surveillance is evident. Portable genome sequencing technology and digital epidemiology platforms form the foundation for both real-time pathogen and disease surveillance systems and outbreak response efforts, all of which exist within the One Health context, in which surveillance, outbreak detection and response span the human, animal and environmental health domains. For example, genome sequences from a raccoon-associated variant of rabies virus (RRV), when paired with fine-scale geographic information and data from Canadian and US wildlife rabies vaccination programmes, demonstrated that multiple cross-border incursions were responsible for the expansion of RRV into Canada and sustained outbreaks in several provinces 70 ; this finding led to renewed concern about and action against rabies on the part of public health authorities 71 .
doi = 10.1038/nrg.2017.88 id = cord-025550-nr3goxs5 author = Gizelis, Christos-Antonios title = Towards a Smart Port: The Role of the Telecom Industry date = 2020-05-04 keywords = Smart; datum; port summary = "DataPorts project aims to boost the transition of European seaports from connected and digital to smart and cognitive, by providing a secure environment for the aggregation and integration of data coming from different sources existing in the digital ports and owned by diverse stakeholders, so that the whole port community could benefit from this data in order to improve their processes, offer new services and devise new AI based and data driven business models" [10] . A Telecom/ICT Provider in order to enter this emerging ecosystem and potentially benefit from its growth should firstly address real-life data market use cases in Ports that are related to its areas of operations. DataPorts since January 2020 is planning to implement a data management platform to be operated by Port Authorities in order to provide advanced services (Fig. 1) and create a value-chain between stakeholders, internal and external ones (Fig. 2 ). doi = 10.1007/978-3-030-49190-1_12 id = cord-296208-uy1r6lt2 author = Greenspan, Hayit title = Position paper on COVID-19 imaging and AI: from the clinical needs and technological challenges to initial AI solutions at the lab and national level towards a new era for AI in healthcare date = 2020-08-19 keywords = COVID-19; CXR; datum; disease summary = We focus on three specific use-cases for which AI systems can be built: early disease detection, management in a hospital setting, and building patient-specific predictive models that require the combination of imaging with additional clinical data. 
Many studies have emerged in the last several months from the medical imaging community with many research groups as well as companies introducing deep learning based solutions to tackle the various tasks: mostly in detection of the disease (vs normal), and more recently also for staging disease severity. In Section 2 of this paper we focus on three specific use-cases for which AI systems can be built: detection, patient management, and predictive models in which the imaging is combined with additional clinical features. doi = 10.1016/j.media.2020.101800 id = cord-146850-5x6qs2i4 author = Gupta, Abhishek title = The State of AI Ethics Report (June 2020) date = 2020-06-25 keywords = Ethics; datum; different; example; human; impact; information; lead; like; need; people; social; system; work summary = Another point brought up in the article is that social media companies might themselves be unwilling to tolerate scraping of their users' data to do this sort of vetting, which goes against their terms of use for access to the APIs. Borrowing from the credit reporting world, the Fair Credit Reporting Act in the US offers some insights when it mentions that people need to be provided with a recourse to correct information that is used about them in making a decision and that due consent needs to be obtained prior to utilizing such tools to do a background check.
Given that AI systems operate in a larger socio-technical ecosystem, we need to tap into fields like law and policy making to come up with effective ways of integrating ethics into AI systems, part of which can involve creating binding legal agreements that tie in with economic incentives. While policy making and law are often seen as slow to adapt to fast changing technology, there are a variety of benefits to be had, for example higher customer trust for services that adhere to stringent regulations regarding privacy and data protection. doi = nan id = cord-198180-pwmr3m4o author = Gupta, Deepti title = Future Smart Connected Communities to Fight COVID-19 Outbreak date = 2020-07-20 keywords = IoT; covid-19; datum; device; internet; smart summary = doi = nan id = cord-290251-ihq8gdwj author = Hasell, Joe title = A cross-country database of COVID-19 testing date = 2020-10-08 keywords = datum; source; test summary = The database consists of two parts, provided for each included country: (1) a time series for the cumulative and daily number of tests performed, or people tested, plus derived variables (discussed below); (2) metadata including a detailed description of the source and any available information on data quality or comparability issues needed for the interpretation of the time series. Firstly, for a number of countries, figures reported in official sources (including press releases, government websites, dedicated dashboards, and social media accounts of national authorities) are recorded manually as they are released. The time series for cumulative and daily testing for each country-series is then provided in the covid-testing-all-observations.csv file. In covid-testing-all-observations.csv, for those sources only providing daily testing figures, this field is derived as the running total of the raw daily data, and is also provided per thousand people of the country's 2020 population.
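The derivation described for covid-testing-all-observations.csv (a cumulative figure computed as the running total of raw daily counts, also expressed per thousand people) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the sample daily counts, and the population figure are hypothetical and are not taken from the database or its code.

```python
from itertools import accumulate


def cumulative_tests(daily_counts):
    """Return the running total of raw daily test counts (hypothetical helper)."""
    return list(accumulate(daily_counts))


def per_thousand(values, population):
    """Scale counts to a rate per 1,000 people (hypothetical helper)."""
    return [round(v * 1000 / population, 3) for v in values]


# Made-up daily figures for an imaginary country of 50,000 people.
daily = [100, 250, 0, 400]
population = 50_000

cumulative = cumulative_tests(daily)
print(cumulative)                            # running total of the daily series
print(per_thousand(cumulative, population))  # same series per thousand people
```

A day with no reported tests simply leaves the cumulative figure unchanged, which is one reason a cumulative series can be easier to compare across sources than raw daily counts.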
doi = 10.1038/s41597-020-00688-8 id = cord-137263-mbww0yyt author = Hayashi, Teruaki title = Data Requests and Scenarios for Data Design of Unobserved Events in Corona-related Confusion Using TEEDA date = 2020-09-08 keywords = COVID-19; TEEDA; datum summary = Using TEEDA, we collect data items (data requests and providable data) in the corona-related confusion in the workshop, discuss the characteristics of missing data, and create three scenarios for data design of unobserved events focusing on variables. In this study, this item will be useful for understanding what types of data and variables are needed and for what purpose in regard to corona-related confusion. The aim of the experiment was to understand the characteristics of data requests and providable data in the corona-related confusion and create scenarios for new data design of unobserved events focusing on variables. Subsequently, participants input the information on the data requests and the providable data about corona-related confusion on TEEDA for 45 min via discussion with other participants. In this study, to discuss the data design of unobserved events in corona-related confusion, we used TEEDA to externalize the information about data items from data users and data providers and analyzed their characteristics. doi = nan id = cord-289447-d93qwjui author = Helmy, Mohamed title = Systems biology approaches integrated with artificial intelligence for optimized food-focused metabolic engineering date = 2020-10-09 keywords = annotation; datum; metabolic; model summary = Here, we review the latest attempts of combining systems biology and AI in metabolic engineering research, and highlight how this alliance can help overcome the current challenges facing industrial biotechnology, especially for food-related substances and compounds using microorganisms. 
On the other hand, Jervis et al. implemented an ML algorithm to model the bacterial ribosome binding sites (RBSs) sequence-phenotype relationship and accurately predicted the optimal high-producers, an approach that applies directly to a wide range of metabolic engineering applications [106]. To understand the key regulatory or emergent bottleneck scenarios that limit their industrial applicability, they undertook a large-scale omics-based systems biology approach where they performed time-series proteomics and metabolomics measurements, and analyzed the resultant high-throughput data using statistical analytics and genome-scale modeling. Although genome annotation, both structural and functional, affects most aspects of biomedical research, it has a special impact on metabolic engineering in general and applications in the food industry in particular. doi = 10.1016/j.mec.2020.e00149 id = cord-010406-uwt95kk8 author = Hu, Paul Jen-Hwa title = System for Infectious Disease Information Sharing and Analysis: Design and Evaluation date = 2007-07-10 keywords = BioPortal; IDI; University; datum; system summary = Motivated by the importance of infectious disease informatics (IDI) and the challenges to IDI system development and data sharing, we design and implement BioPortal, a Web-based IDI system that integrates cross-jurisdictional data to support information sharing, analysis, and visualization in public health. In this paper, we discuss general challenges in IDI, describe BioPortal's architecture and functionalities, and highlight encouraging evaluation results obtained from a controlled experiment that focused on analysis accuracy, task performance efficiency, user information satisfaction, system usability, usefulness, and ease of use.
To support the surveillance and detection of infectious disease outbreaks by public health professionals, we design and implement the BioPortal system, a web-based IDI system that provides convenient access to distributed, cross-jurisdictional health data pertaining to several major infectious diseases including West Nile virus (WNV), foot-and-mouth disease (FMD), and botulism. doi = 10.1109/titb.2007.893286 id = cord-021088-9u3kn9ge author = Huberty, Mark title = Awaiting the Second Big Data Revolution: From Digital Noise to Value Creation date = 2015-02-18 keywords = Flu; Google; big; datum; online summary = Instead, today''s successful big data business models largely use data to scale old modes of value creation, rather than invent new ones altogether. Four of these assumptions merit special attention: First, N = all, or the claim that our data allow a clear and unbiased study of humanity; second, that today = tomorrow, or the claim that understanding online behavior today implies that we will still understand it tomorrow; third, offline = online, the claim that understanding online behavior offers a window into economic and social phenomena in the physical world; and fourth, that complex patterns of social behavior, once understood, will remain stable enough to become the basis of new data-driven, predictive products and services in sectors well beyond social and media markets. The rate of change in online commerce, social media, search, and other services undermines any claim that we can actually know that our N = all sample that works today will work tomorrow. 
doi = 10.1007/s10842-014-0190-4 id = cord-030772-swha1e4m author = Huizinga, Tom W J title = Interpreting big-data analysis of retrospective observational data date = 2020-08-21 keywords = datum summary = In The Lancet Rheumatology, Jennifer Lane and colleagues present a study using claims data and electronic medical records (mostly of patients with rheumatoid arthritis) to analyse the long-term risks of cardiovascular complications (among other outcomes) in about 1 000 000 users of hydroxychloroquine compared with more than 300 000 users of sulfasalazine. It has been convincingly shown that most published data are false, 4 and the corollary that the hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true is a relevant consideration given the recent discussions around use of hydroxychloroquine in patients with COVID-19. It is important to note that the authors used state-of-the-art methods to deal with the challenges of studying retrospective electronic medical record data; they did a new-user cohort study and a self-controlled case series to avoid the risk of bias in a case-control design, using propensity scores, fitting models with ten-fold cross validation, and negative control outcome analyses.
doi = 10.1016/s2665-9913(20)30289-7 id = cord-253918-8g3erth8 author = Ienca, Marcello title = On the responsible use of digital data to tackle the COVID-19 pandemic date = 2020-03-27 keywords = COVID-19; datum summary = doi = 10.1038/s41591-020-0832-5 id = cord-302648-16aq6ai4 author = Iovanovici, Alexandru title = A dataset of urban traffic flow for 13 Romanian cities amid lockdown and after ease of COVID19 related restrictions date = 2020-09-17 keywords = datum; traffic summary = Considering the relative scarcity of real-life traffic data, one can use this data set for micro-simulation during development and validation of Intelligent Transportation Solutions (ITS) algorithms, while another facet would be in the area of social and political sciences when discussing the effectiveness and impact of statewide restriction during the COVID19 pandemic. • The main usage of the data, in the field of ITS, is to provide real-life data from a variety of Romanian cities (ranging from small to large in population, area and road network size) useful for training machine learning algorithms for prediction of congestion and for simulation of the impact of traffic incidents over the traffic flow. These are stored in the ./xml.zip archive and follow the naming structure _- For a more in-depth and complete analysis, taking into account the context of the data (the transportation and traffic restrictions imposed on the national level by the SARS-CoV-2/COVID19 pandemic), we present in Table 3 the most important events with impact over the traffic flow.
doi = 10.1016/j.dib.2020.106318 id = cord-025827-vzizkekp author = Jarke, Matthias title = Data Sovereignty and the Internet of Production date = 2020-05-09 keywords = Data; IDS; datum summary = 2006) to the inter-organizational setting by introducing the idea of Industrial Data Spaces as the kernel of platforms in which specific industrial ecosystems could organize their cooperation in a data-sovereign manner (Jarke 2017; Jarke and Quix 2017) . Via numerous use case experiments, the International Data Space (IDS) Association with currently roughly 100 corporate members worldwide has evolved, and agreed on a reference architecture now already in version 3 . In Fig. 1 , we referred to the service-dominant business logic underlying most alliance-driven data ecosystems including the IDS. In this 7-year effort, 27 research groups from production and materials engineering, computer science, business and social sciences cooperate to study not just the sovereign data exchange addressed by the IDS Architecture in a fully globalized setting, but also the question of how to communicate between model-and data-driven approaches of vastly different disciplines and scales. doi = 10.1007/978-3-030-49435-3_34 id = cord-295450-ca7ll1tt author = Jia, Peng title = Early warning of epidemics: towards a national intelligent syndromic surveillance system (NISSS) in China date = 2020-10-26 keywords = China; NISSS; datum summary = The outbreak of the COVID-19 has further advanced the demand for an intelligent disease reporting system, also known as the national intelligent syndromic surveillance system (NISSS), 1 which would be able to analyse these suspected cases on the basis of prior knowledge and real-time information before a disease is confirmed clinically and in the laboratory. 
► Literature databases containing valuable research findings and knowledge and internet activity data reflecting cyber user awareness should be incorporated into the NISSS in a real-time way for warning or fighting the epidemic. ► The International Institute of Spatial Lifecourse Epidemiology (ISLE), a global health collaborative research network, has committed to working with multiple stakeholders to codevelop the NISSS in China. Such data-sharing mechanisms and infrastructures would also facilitate timely spatial epidemiological research on the basis of individual-level infected cases linked with respective location data from mobile service providers and/or smartphone-based apps without violating confidentiality requirements. doi = 10.1136/bmjgh-2020-002925 id = cord-024865-umrlsbh5 author = Jiang, Shufan title = Towards the Integration of Agricultural Data from Heterogeneous Sources: Perspectives for the French Agricultural Context Using Semantic Technologies date = 2020-04-29 keywords = datum; ontology summary = Our objective is to integrate such heterogeneous data into knowledge bases that can support farmers in their activities, and to present global, real-time and comprehensive information to researchers. Indeed, important information related to agriculture can also come from different sources such as official periodic reports and journals like the French Plants Health Bulletins (BSV, for its name in French Bulletin de Santé du Végétal ) 1 , social media such as Twitter and farmers experiences. The French National Institute For Agricultural Research (INRA) has been working towards the publishing of the bulletins as Linked Open Data [12] , where BSV from different regions are centralized, tagged with crop type, region, date and published on the Internet. We have introduced in this paper work relevant to our problem, namely: the integration of several data sources to extract information related to the natural hazards in agriculture. 
doi = 10.1007/978-3-030-49165-9_8 id = cord-225826-bwghyhqx author = Jiang, Zheng title = Combining Visible Light and Infrared Imaging for Efficient Detection of Respiratory Infections such as COVID-19 on Portable Device date = 2020-04-15 keywords = Fig; datum; respiratory summary = doi = nan id = cord-349790-dezauioa author = Johnson, Stephanie title = Ethical challenges in pathogen sequencing: a systematic scoping review date = 2020-06-03 keywords = HIV; datum; public; research summary = Methods: We systematically searched indexed academic literature from PubMed, Google Scholar, and Web of Science from 2000 to April 2019 for peer-reviewed articles that substantively engaged in discussion of ethical issues in the use of pathogen genome sequencing technologies for diagnostic, surveillance and outbreak investigation. We systematically searched indexed academic literature from PubMed, Google Scholar, and Web of Science from 2000 to April 2019 for peer-reviewed articles that substantively engaged in discussion of ethical issues in the use of pathogen genome sequencing technologies for diagnostic, surveillance and outbreak investigation. Implementation science research may also inform best practices for discussing the meaning and limitations of sequence data and cluster membership with community members and help to identify acceptable and evidence-based approaches that impose the least risk to persons within specific contexts. Many noted that there are important reasons to ensure that the public and individuals understand the uses of data collected as part of a sequencing studies, and the potential risks. doi = 10.12688/wellcomeopenres.15806.1 id = cord-144221-ohorip57 author = Kapoor, Mudit title = Authoritarian Governments Appear to Manipulate COVID Data date = 2020-07-19 keywords = Benford; COVID-19; datum summary = First, data on COVID-19 cases and deaths from authoritarian governments show significantly less variation from a 7 day moving average. 
Second, data on COVID-19 deaths from authoritarian governments do not follow Benford's law, which describes the distribution of leading digits of numbers. Figure 2 plots the natural logarithm of the mean of the squared deviation of daily cases and deaths per million people, respectively, from the 7-day moving average against the EIU's overall democracy index score. We investigate whether governments manipulate data by testing whether the COVID-19 data on cumulative cases and deaths across different regimes (authoritarian, hybrid, flawed democracy, and full democracy) conforms to Benford's law. Natural logarithm of the Mean of squared deviations of observed daily cases and deaths per million people from a 7-day centered moving average, by EIU democracy index score. doi = nan id = cord-315510-vtt8wvm1 author = Keogh, John G. title = Optimizing global food supply chains: The case for blockchain and GSI standards date = 2020-10-16 keywords = Blockchain; FSC; GS1; datum; food; standard; technology; traceability summary = This chapter examines the integration of GS1 standards with the functional components of blockchain technology as an approach to realize a coherent standardized framework of industry-based tools for successful food supply chains (FSCs) transformation. A standardized framework will enhance food traceability, drive FSC efficiencies, enable data interoperability, improve data governance practices, and set supply chain identification standards for products and assets (what), exchange parties (who), locations (where), business processes (why), and sequence (when). The technological attributes of Blockchain can combine with smart contracts to enable decentralized and self-organization to create, execute, and manage business transactions (Schaffers, 2018), creating a landscape for innovative approaches to information and collaborative systems.
The adoption of GS1 standards-enabled Blockchain technology has the potential to enable FSC stakeholders to meet the fast-changing needs of the agri-food industry and the evolving regulatory requirements for enhanced traceability and rapid recall of unsafe goods. doi = 10.1016/b978-0-12-818956-6.00017-8 id = cord-351652-y8p3iznq author = Keogh, John G. title = Data and food supply chain: Blockchain and GS1 standards in the food chain: a review of the possibilities and challenges date = 2020-07-10 keywords = Blockchain; FSC; GS1; datum; food; standard; technology; traceability summary = This chapter examines the integration of GS1 standards with the functional components of Blockchain technology as an approach to realize a coherent standardized framework of industry-based tools for successful food supply chain transformation. The technological attributes of Blockchain can combine with smart contracts to enable decentralized and self-organization to create, execute, and manage business transactions (Schaffers, 2018) , creating a landscape for innovative approaches to information and collaborative systems. The adoption of GS1 standards-enabled Blockchain technology has the potential to enable FSC stakeholders to meet the fast-changing needs of the agri-food industry and the evolving regulatory requirements for enhanced traceability and rapid recall of unsafe goods. Closely resembling the role and function of the EHR in the healthcare industry, the creation of a Digital Food Record (DFR) is vital for FSCs to facilitate whole-chain traceability, interoperability, linking the different actors and data creators in the chain, and enhancing trust in the market on each product delivered. 
doi = 10.1016/b978-0-12-818956-6.00007-5 id = cord-016146-2g893c2r author = Kim, Yeunbae title = Artificial Intelligence Technology and Social Problem Solving date = 2019-03-14 keywords = datum; problem; social summary = In this letter, we will present the views on how AI and ICT technologies can be applied to ease or solve social problems by sharing examples of research results from studies of social anxiety, environmental noise, mobility of the disabled, and problems in social safety. In this letter, I introduce research on the informatics platform for social problem solving, specifically based on spatio-temporal data, conducted by Hanyang University and cooperating institutions. The research focuses on social problems that involve spatio-temporal information, and applies social scientific approaches and data-analytic methods on a pilot basis to explore basic research issues and the validity of the approaches. Furthermore, (1) open-source informatics using convergent-scientific methodology and models, and (2) the spatio-temporal data sets that are to be acquired in the midst of exploring social problems for potential resolution are developed. Convergent approaches offer the new possibility of building an informatics platform that can interpret, predict and solve various social problems through the combination of social science and data science. doi = 10.1007/978-981-13-6936-0_2 id = cord-344152-pb1e2w7s author = Kolatkar, Anand title = C-ME: A 3D Community-Based, Real-Time Collaboration Tool for Scientific Research and Training date = 2008-02-20 keywords = 3-d; MOSS; datum summary = Collaborative Molecular Modeling Environment (C-ME) is an interactive community-based collaboration system that allows researchers to organize information, visualize data on a two-dimensional (2-D) or three-dimensional (3-D) basis, and share and manage that information with collaborators in real time. 
These annotations provide additional information about the atomic structure or image data that can then be evaluated, amended or added to by other project members. For example, protein structure/activity data annotations and images may be kept in paper lab notebooks, manuscripts might be stored electronically in Portable Document Format (PDF), and molecular structure coordinate files may be stored on a hard disk to be viewed and analyzed in graphical molecular viewers, to name a few. Most recently we have developed the Collaborative Molecular Modeling Environment (C-ME), a new collaboratory system that integrates many of the key features available on Kinemage, MICE, iSee, and BioCoRE systems into one thin-client Windows application. doi = 10.1371/journal.pone.0001621 id = cord-033721-o1c7m9wy author = Kostovska, Ana title = Semantic Description of Data Mining Datasets: An Ontology-Based Annotation Schema date = 2020-09-19 keywords = datum summary = To semantically describe a DM dataset, we consider three different types of vocabularies/ontologies: (1) vocabularies for annotation of provenance information, such as title, description, license, and format; (2) ontologies for annotation of datasets with DM-specific characteristics, i.e., data mining task, datatypes, and dataset specification; and (3) ontologies for annotation of domain-specific knowledge that helps to contextualize the data originating from a given domain. After describing the four characteristics that govern the modeling of the taxonomies of datatypes, data specification, and tasks, we provide an illustrative example that shows how we can combine them in a single annotation schema for the purpose of semantic annotation of DM datasets. 
To represent the MTR task and MTR dataset specification, we use the classes defined in OntoDM-core, and connect them with the corresponding datatype class from OntoDT (in our case OntoDT: feature-based completely labeled data with record of numeric ordered primitive output) (see Fig. 7 b) . doi = 10.1007/978-3-030-61527-7_10 id = cord-025576-8oqfn4rg author = Kotouza, Maria Th. title = Towards Fashion Recommendation: An AI System for Clothing Data Retrieval and Analysis date = 2020-05-06 keywords = algorithm; cluster; datum summary = The system combines natural language processing (NLP) techniques to analyze the information accompanying the clothing images, computer vision algorithms to extract characteristics from the images and enrich their meta-data, and machine learning techniques to analyze the raw data and to train models that can facilitate the decision-making process. Several research works have been presented in the field of clothing data analysis, most of them involving clothing classification and feature extraction based on images, dataset creation, as well as product recommendation. In this work, apart from proposing an AI system which involves many subsystems as part of the clothing design process that can be combined together in order to help the designers with the decision-making process, we emphasize on the data collection, meta-data analysis and clustering techniques that can be applied to improve recommendations. 
doi = 10.1007/978-3-030-49186-4_36 id = cord-328438-irjo0l4s author = Krittanawong, Chayakrit title = Integration of novel monitoring devices with machine learning technology for scalable cardiovascular management date = 2020-10-09 keywords = ECG; datum; device; monitoring; patient summary = Advances in cardiovascular monitoring technologies, such as the use of ubiquitous mobile devices and the development of novel portable sensors with seamless wireless connectivity and machine learning algorithms that can provide specialist-level diagnosis in near real time, have the potential for more personalized care. Machine learning is a rapidly developing branch of AI that has shown early promise for use in cardiovascular medicine [61] through the extraction of clinically relevant patterns from complex data, such as detecting myocardial ischaemia from cardiac CT images [62] and interpreting arrhythmias from wearable ECG monitors [33]. Machine learning technology ('deep learning') [60] has also been shown to improve the performance of shock advice algorithms in an automated external defibrillator [66], to predict the onset of ventricular arrhythmias with the use of an artificial neural network [67] and to predict the onset of sudden cardiac arrest within 72 h by incorporating heart rate variability parameters with vital sign data [68].
doi = 10.1038/s41569-020-00445-9 id = cord-025506-yoav2b35 author = Kyriazis, Dimosthenis title = PolicyCLOUD: Analytics as a Service Facilitating Efficient Data-Driven Public Policy Management date = 2020-05-06 keywords = data; datum; policy summary = Prominent examples of such standards in different policy areas include: (i) the INSPIRE Data Specifications [15] for the interoperability of spatial data sets and services, which specify common data models, code lists, map layers and additional metadata on the interoperability to be used when exchanging spatial datasets, (ii) the Common European Research Information Format (CERIF) [16] for representing research information and supporting research policies, (iii) the Internet of Things ontologies and schemas, such as the W3C Semantic Sensor Networks (SSN) ontology [17] and data schemas developed by the Open Geospatial Consortium (e.g., SensorML) [18], (iv) the Common Reporting Standard (CRS) that specifies guidelines for obtaining information from financial institutions and automatically exchanging that information in an interoperable way, and (v) standards-based ontologies appropriate for describing social relationships between individuals or groups, such as the "The Friend Of A Friend" (FOAF) ontology [19] and the Socially Interconnected Online Communities (SIOC) ontology [20] . doi = 10.1007/978-3-030-49161-1_13 id = cord-287027-ahoo6j3o author = Lai, Yuan title = Unsupervised Learning for County-Level Typological Classification for COVID-19 Research date = 2020-08-30 keywords = county; covid-19; datum summary = The analysis of county-level COVID-19 pandemic data faces computational and analytic challenges, particularly when considering the heterogeneity of data sources with variation in geographic, demographic, and socioeconomic factors between counties. 
The purpose of this study is to summarize publicly available and relevant COVID-19 data sources, to address the benchmarking challenge from the data heterogeneity through clustering, and to classify counties based on their underlying variations. Particularly at the county-level, previous studies have implemented clustering techniques to analyze various data sources relating to demographic, geographic, environment, and socioeconomic determinants of health and disease. While previous findings reveal possible geographical clusters of COVID-19 cases at the county-level, our study indicates this is from the underlying typology based on high-dimensional variables. doi = 10.1016/j.ibmed.2020.100002 id = cord-352522-qnvgg2e9 author = Langille, Morgan G. I. title = BioTorrents: A File Sharing Service for Scientific Data date = 2010-04-14 keywords = BioTorrents; datum summary = In this study we present BioTorrents, a website that allows open access sharing of scientific data and uses the popular BitTorrent peer-to-peer file sharing technology. A BitTorrent software client (see Table 1) uses the data in the torrent file to contact the tracker and allow transferring of the data between computers containing either full or partial copies of the dataset. Information about each dataset on BioTorrents is supplied on a details page giving a description of the data, number of files, date added, user name of the person who created the dataset, and various other details including a link to the actual torrent file. As the number of datasets and users of BioTorrents increases, and to improve on transfer speeds on a geospatial scale (i.e. across countries and continents), we would encourage other institutions to automatically download and share all or some of the data on BioTorrents.
doi = 10.1371/journal.pone.0010071 id = cord-267485-1fu1blu0 author = Lazarus, Ross title = Distributed data processing for public health surveillance date = 2006-09-19 keywords = datum; health; phi summary = All PHI in this system is initially processed within the secured infrastructure of the health care provider that collects and holds the data, using uniform software distributed and supported by the NDP. In the more traditional type of system, individual patient records, often containing potentially identifiable information, such as date of birth and exact or approximate home address, are transferred, usually in electronic form, preferably through some secured method, to a central secured repository, where statistical tools can be used to develop and refine surveillance procedures. These standard line lists are used most often to support requests by public health agencies for additional information about the individual cases that contribute to clusters identified in the aggregate data. In our experience, such requests involve only a tiny fraction of the data that would be transferred in a centralized surveillance model, providing adequate support for public health with minimal risk of inadvertent disclosure of identifiable PHI. doi = 10.1186/1471-2458-6-235 id = cord-338207-60vrlrim author = Lefkowitz, E.J. title = Virus Databases date = 2008-07-30 keywords = NCBI; database; datum; information; sequence summary = (Each arrow points to the table containing the primary key.) Tables are color-coded according to the source of the information they contain: yellow, data obtained from the original GenBank sequence record and the ICTV Eighth Report; pink, data obtained from automated annotation or manual curation; blue, controlled vocabularies to ensure data consistency; green, administrative data. 
While most of us store our BLAST search results as files on our desktop computers, it is useful to store this information within the database to provide rapid access to similarity results for comparative purposes; to use these results to assign genes to orthologous families of related sequences; and to use these results in applications that analyze data in the database and, for example, display the results of an analysis between two or more types of viruses showing shared sets of common genes. doi = 10.1016/b978-012374410-4.00719-6 id = cord-266626-9vn6yt8m author = Lei, Howard title = Agile Clinical Research: A Data Science Approach to Scrumban in Clinical Medicine date = 2020-10-22 keywords = datum; model; predictive summary = doi = 10.1016/j.ibmed.2020.100009 id = cord-024866-9og7pivv author = Lepenioti, Katerina title = Machine Learning for Predictive and Prescriptive Analytics of Operational Data in Smart Manufacturing date = 2020-04-29 keywords = analytic; datum; model summary = The current paper exploits the recent advancements of (deep) machine learning for performing predictive and prescriptive analytics on the basis of enterprise and operational data aiming at supporting the operator on the shopfloor. In this direction, the recent advancements of machine learning can have a substantial contribution in performing predictive and prescriptive analytics on the basis of enterprise and operational data aiming at supporting the operator on the shopfloor and at extracting meaningful insights. The current paper proposes an approach for predictive and prescriptive analytics on the basis of enterprise and operational data for smart manufacturing. 2 presents the background, the challenges and prominent methods for predictive and prescriptive analytics of enterprise and operational data for smart manufacturing. 
doi = 10.1007/978-3-030-49165-9_1 id = cord-330148-yltc6wpv author = Lessler, Justin title = Trends in the Mechanistic and Dynamic Modeling of Infectious Diseases date = 2016-07-02 keywords = Ebola; datum; disease; dynamic; model summary = Uncertainty was largely addressed through scenario-based approaches (e.g., different future epidemic trajectories were presented for different plausible sets of parameters), and for the most part, different aspects of the transmission dynamics were derived from independent studies, with only the growth rate (i.e., doubling time) estimated from incidence data. These recent attempts to quickly characterize the properties of emerging diseases are emblematic of an increasing focus on developing statistical methods, grounded in dynamical models, to estimate key epidemic parameters based on diverse data sources. High-resolution geographic data can gain additional power when paired with mechanistic models that capture changes in disease risk, as in recent analyses that accounted for the effect of birth, natural infection, and vaccine disruptions driving increases in measles susceptibility and epidemic risk in the wake of the Ebola outbreak [63]. The formal statistical integration of population genetic and epidemic models allows us to estimate the critical epidemiological parameters such as the basic reproductive number directly from pathogen sequence data [75] [76] [77]. doi = 10.1007/s40471-016-0078-4 id = cord-004464-nml9kqiu author = Lhommet, Claire title = Predicting the microbial cause of community-acquired pneumonia: can physicians or a data-driven method differentiate viral from bacterial pneumonia at patient presentation? date = 2020-03-06 keywords = datum; patient; pneumonia summary =
Whether the etiology of CAP is viral or bacterial should be determined based on the patient interview, clinical symptoms and signs, biological findings and radiological data from the very first hours of the patient's presentation (a time when microbiological findings are typically not yet available). The aim of our study was to evaluate and compare the abilities of experienced physicians and a data-driven approach to answer this simple question within the first hours of a patient's admission to the ICU for CAP: is it a viral or a bacterial pneumonia? Step 2: clinician and data-driven predictions of microbial etiology. Clinicians and a mathematical algorithm were tasked with predicting the microbial etiology of pneumonia cases based on all clinical (43 items), and biological or radiological (17 items) information available in the first 3-h period after admission except for any microbiological findings (Supplementary Table 1). doi = 10.1186/s12890-020-1089-y id = cord-315610-ihh521ur author = Lu, Qiang title = KDE Bioscience: Platform for bioinformatics analysis workflows date = 2005-10-11 keywords = Bioscience; KDE; datum; workflow summary = KDE Bioscience is a Java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. In this paper, we present a significant integrative informatics platform, Knowledge Discovery Environment of Bioscience (KDE Bioscience), which is supposed to provide a solution of integration of biological data, algorithms, computing hardware, and biologist intelligence for bioinformatics. Providing biologists with an easy-to-use bioinformatics platform requires the integration of sequence and annotation data in different formats from DBMS, flat files, and web pages. KDE Bioscience provides a mechanism for metadata processing that executes before the workflow operates on the actual data.
KDE Bioscience has so far collected more than 60 commonly used bioinformatics programs covering the analysis and alignment of nucleotide and protein sequences. KDE Bioscience, which adopts workflow and J2EE, provides an integrative platform for biologists to collaborate and use distributed computing resources in a simple manner. doi = 10.1016/j.jbi.2005.09.001 id = cord-344307-541hu7so author = Marsch, Lisa A. title = Digital health data-driven approaches to understand human behavior date = 2020-07-12 keywords = behavior; datum; digital; health summary = It provides a synthesis of the scientific literature evaluating how digitally derived empirical data can inform our understanding of health behavior, with a particular focus on understanding the assessment, diagnosis and clinical trajectories of psychiatric disorders. Finally, it concludes with a discussion of future directions and timely opportunities in this line of research and its clinical application, including the development of personalized digital interventions (e.g., behavior change interventions) informed by digital health assessment. Overview of the scientific literature on the application of digitally derived empirical data to understand health behavior and psychopathology. A robust and rapidly growing scientific literature is increasingly demonstrating the potential utility of digital assessment in revealing new insights into human behavior, including psychological and psychiatric disorders. And, the real-world precision assessment that digital health methods enable is providing unprecedented insights into human behavior and psychiatric disorders and can inform interventions that are personalizable and adaptive to individuals' changing needs and preferences over time. doi = 10.1038/s41386-020-0761-5 id = cord-315531-2gc2dc46 author = McGarvey, Peter B.
title = Systems Integration of Biodefense Omics Data for Analysis of Pathogen-Host Interactions and Identification of Potential Targets date = 2009-09-25 keywords = Bacillus; MPD; Proteomics; datum; protein summary = (1) The identification of a hypothetical protein with differential gene and protein expressions in two host systems (mouse macrophage and human HeLa cells) infected by different bacterial (Bacillus anthracis and Salmonella typhimurium) and viral (orthopox) pathogens suggesting that this protein can be prioritized for additional analysis and functional characterization. The centers have generated a heterogeneous set of experimental data using various technologies loosely defined as proteomic, but encompassing genomic, structural, immunology and protein interaction technologies, as well as more standard cell and molecular biology techniques used to validate potential targets identified via high-throughput methods. Here we describe in detail a protein-centric approach for systems integration of such a large and heterogeneous set of data from the NIAID Biodefense Proteomics program, and present scientific case studies to illustrate its application to facilitate the basic understanding of pathogen-host interactions and for the identification of potential candidates for therapeutic or diagnostic targets. doi = 10.1371/journal.pone.0007162 id = cord-160526-27kmder5 author = Meyer, R. 
Daniel title = Statistical Issues and Recommendations for Clinical Trials Conducted During the COVID-19 Pandemic date = 2020-05-21 keywords = covid-19; datum; ice; pandemic; trial summary = doi = nan id = cord-003243-u744apzw author = Michael, Edwin title = Quantifying the value of surveillance data for improving model predictions of lymphatic filariasis elimination date = 2018-10-08 keywords = MDA; datum; model summary = METHODOLOGY AND PRINCIPAL FINDINGS: We report on the development of an analytical framework to quantify the relative values of various longitudinal infection surveillance data collected in field sites undergoing mass drug administrations (MDAs) for calibrating three lymphatic filariasis (LF) models (EPIFIL, LYMFASIM, and TRANSFIL), and for improving their predictions of the required durations of drug interventions to achieve parasite elimination in endemic populations. doi = 10.1371/journal.pntd.0006674 id = cord-282724-zzkqb0u2 author = Moore, Jason H. title = Ideas for how informaticians can get involved with COVID-19 research date = 2020-05-12 keywords = COVID-19; SARS; datum; health; patient; research summary = Some key considerations and targets of research include: (1) feature engineering, transforming raw data into features (i.e.
variables) that ML can better utilize to represent the problem/target outcome, (2) feature selection, applying expert domain knowledge, statistical methods, and/or ML methods to remove 'irrelevant' features from consideration and improve downstream modeling, (3) data harmonization, allowing for the integration of data collected at different sites/institutions, (4) handling different outcomes and related challenges, e.g. binary classification, multi-class, quantitative phenotypes, class imbalance, temporal data, multi-labeled data, censored data, and the use of appropriate evaluation metrics, (5) ML algorithm selection for a given problem can be a challenge in itself, thus strategies to integrate the predictions of multiple machine learners as an ensemble are likely to be important, (6) ML modeling pipeline assembly, including critical considerations such as hyper-parameter optimization, accounting for overfitting, and clinical interpretability of trained models, and (7) considering and accounting for covariates as well as sources of bias in data collection, study design, and application of ML tools in order to avoid drawing conclusions based on spurious correlations. doi = 10.1186/s13040-020-00213-y id = cord-027431-6twmcitu author = Mukhina, Ksenia title = Spatiotemporal Filtering Pipeline for Efficient Social Networks Data Processing Algorithms date = 2020-05-25 keywords = Twitter; datum; location; user summary = To do that we propose a spatiotemporal data processing pipeline that is general enough to fit most of the problems related to working with LBSNs. The proposed pipeline includes four main stages: an identification of suspicious profiles, a background extraction, a spatial context extraction, and a fake transitions detection.
Efficiency of the pipeline is demonstrated on three practical applications using different LBSN: touristic itinerary generation using Facebook locations, sentiment analysis of an area with the help of Twitter and VK.com, and multiscale events detection from Instagram posts. Thus, all studies based on social networks as a data source face two significant issues: wrong location information stored in the service (wrong coordinates, incorrect titles, duplicates, etc.) and false information provided by users (to hide an actual position or to promote their content). doi = 10.1007/978-3-030-50433-5_7 id = cord-295013-ew9n9i7z author = Nambiar, Devaki title = Field-testing of primary health-care indicators, India date = 2020-11-01 keywords = Health; India; Kerala; datum; indicator summary = Objective: To develop a primary health-care monitoring framework and health outcome indicator list, and field-test and triangulate indicators designed to assess health reforms in Kerala, India, 2018-2019. As already observed in India and other low- and middle-income countries [29], our results indicate that any approach to improving or monitoring the quality of health-care must be adaptable to local methods of data production and reporting, while ensuring that emerging concerns of local staff are considered. The Every Newborn-BIRTH study was a triangulation of maternal and newborn healthcare data in low- and middle-income countries [47], and some smaller-scale primary-care indicator triangulation exercises have been undertaken by India's National Health Systems Resource Centre.
doi = 10.2471/blt.19.249565 id = cord-154587-qbmm5st9 author = Nguyen, Thanh Thi title = Artificial Intelligence in the Battle against Coronavirus (COVID-19): A Survey and Future Research Directions date = 2020-07-30 keywords = COVID-19; datum; deep; learning summary = doi = nan id = cord-266898-f00628z4 author = Nikitenkova, S. title = It's the very time to learn a pandemic lesson: why have predictive techniques been ineffective when describing long-term events? date = 2020-06-03 keywords = datum; epidemic summary = Processing statistical data of countries that have reached an epidemic peak has shown that this regular monitoring obeys a simple analytical regularity which allows us to answer the question: is this or that country that has already passed the threshold of the epidemic close to its peak or is still far from it? To achieve this goal, it is necessary to identify, evaluate and study the mentioned regular component of the error, using the statistics of those countries that have already reached a peak - the stationary level of the epidemic dynamics. This regular error component explains the reason for the failure of a priori mathematical modelling of probable epidemic events in different countries of the world. doi = 10.1101/2020.06.01.20118869 id = cord-305542-zyxqcfa3 author = Oliver, Nuria title = Mobile phone data for informing public health actions across the COVID-19 pandemic life cycle date = 2020-06-05 keywords = covid-19; datum; mobile; phone summary = In the following sections, we outline the ways in which different types of mobile phone data can help to better target and design measures to contain and slow the spread of the COVID-19 pandemic.
Government and public health authorities broadly raise questions in at least four critical areas of inquiry for which the use of mobile phone data is relevant. Furthermore, around the world, public opinion surveys, social media, and a broad range of civil society actors including consumer groups and human rights organizations have raised legitimate concerns around the ethics, potential loss of privacy, and long-term impact on civil liberties resulting from the use of individual mobile data to monitor COVID-19. Governments should be aware of the value of information and knowledge that can be derived from mobile phone data analysis, especially for monitoring the necessary measures to contain the pandemic. doi = 10.1126/sciadv.abc0764 id = cord-292835-zzc1a7id author = Otoom, Mwaffaq title = An IoT-based Framework for Early Identification and Monitoring of COVID-19 Cases date = 2020-08-15 keywords = COVID-19; datum; internet summary = The proposed system would employ an Internet of Things (IoT) framework to collect real-time symptom data from users to identify suspected coronavirus cases early, to monitor the treatment response of those who have already recovered from the virus, and to understand the nature of the virus by collecting and analyzing relevant data. To quickly identify potential coronavirus cases from this real-time symptom data, this work proposes eight machine learning algorithms, namely Support Vector Machine (SVM), Neural Network, Naïve Bayes, K-Nearest Neighbor (K-NN), Decision Table, Decision Stump, OneR, and ZeroR. Based on these results we believe that real-time symptom data would allow these five algorithms to provide effective and accurate identification of potential cases of COVID-19, and the framework would then document the treatment response for each patient who has contracted the virus.
The proposed framework consists of five main components: (1) real-time symptom data collection (using wearable devices), (2) treatment and outcome records from quarantine/isolation centers, (3) a data analysis center that uses machine learning algorithms, (4) healthcare physicians, and (5) a cloud infrastructure. doi = 10.1016/j.bspc.2020.102149 id = cord-317602-ftcs7fvq author = O’Reilly-Shah, Vikas N. title = The COVID-19 Pandemic Highlights Shortcomings in US Health Care Informatics Infrastructure: A Call to Action date = 2020-05-12 keywords = COVID-19; EHR; datum; health summary = Although it appears that there is general consensus on the use of the Substitutable Medical Apps, Reusable Technologies on Fast Healthcare Interoperability Resources (SMART on FHIR) standard developed by the nonprofit Health Level Seven International (HL7) for the interchange of data, the standard is not specific enough to ensure, and regulators have failed to require, that different vendors implement the specification in compatible ways. To briefly recap, if hospitals across the country were able to observe and interpret data being gathered at other institutions in real time and to contribute their own data to the shared repository, the health care system could be learning about and improving its care of COVID-19 patients continuously and collaboratively, based on the sum total of available information rather than incrementally in silos. The public has a pressing interest in ensuring that data standards (eg, OMOP, FHIR) are rapidly developed, adopted by appropriate international standards organizations (eg, HL7), and implemented by EHR vendors in a manner that facilitates interoperability for individual patient care, public health, and research purposes. 
doi = 10.1213/ane.0000000000004945 id = cord-185121-f6vjm4j4 author = Paiva, Henrique Mohallem title = A computational tool for trend analysis and forecast of the COVID-19 pandemic date = 2020-10-20 keywords = COVID-19; datum; figure; value summary = Country-wise data from the European Centre for Disease Prevention and Control (ECDC) concerning the daily number of cases and demises around the world are used, as well as detailed data from Johns Hopkins University and from the Brasil.io project describing individually the occurrences in United States counties and in Brazilian states and cities, respectively. Conclusion: The main contributions of this work lie in (i) a straightforward model of the curves to represent the data, which allows automation of the process without requiring interventions from experts; (ii) an innovative approach for trend analysis, whose results provide important information to support authorities in their decision-making process; and (iii) the developed computational tool, which is freely available and allows the user to quickly update the COVID-19 analyses and forecasts for any country, United States county or Brazilian state or city present in the periodic reports from the authorities. doi = nan id = cord-004647-0fuy5tlp author = Patson, Noel title = Systematic review of statistical methods for safety data in malaria chemoprevention in pregnancy trials date = 2020-03-20 keywords = analysis; datum; safety; trial summary = METHODS: The search included five databases (PubMed, Embase, Scopus, Malaria in Pregnancy Library and Cochrane Central Register of Controlled Trials) to identify original English articles reporting Phase III randomized controlled trials (RCTs) on anti-malarial drugs for malaria prevention in pregnancy published from January 2010 to July 2019. 
This review, therefore, aims at identifying applied statistical methods and their appropriateness in the analysis of safety data in clinical trials of anti-malarial drugs for malaria prevention during pregnancy. This review sought to provide a detailed overview of the actual practice of the statistical analysis of safety data in the unique setting of drug trials for the prevention of malaria in pregnancy as reflected in published literature. Advantageously, methods based on causal inference framework, such as mediation analysis [28] [29] [30] [31] could be adapted/extended to assess the influence of the AEs on non-adherence in RCTs. Despite about three-quarters of the trials reporting p-values after comparing safety outcomes by treatment arms, only about half of the reviewed trials adhered to International Harmonisation Conference Guideline E9 in reporting of confidence intervals in quantifying the safety effect size [3, 4]. doi = 10.1186/s12936-020-03190-z id = cord-169484-mjtlhh5e author = Pellert, Max title = Dashboard of sentiment in Austrian social media during COVID-19 date = 2020-06-19 keywords = Austria; COVID-19; Twitter; datum summary = To track online emotional expressions of the Austrian population close to real-time during the COVID-19 pandemic, we build a self-updating monitor of emotion dynamics using digital traces from three different data sources. The interactive dashboard showcasing our data is available online under http://www.mpellert.at/covid19_monitor_austria/. We gather these data in the form of text from platforms such as Twitter and news forums, where large groups of users discuss timely issues. To fill a gap, we build a dashboard with processed data from three different sources to track the sentiment in Austrian social media during COVID-19.
In addition, measures that strongly affect people's daily lives over a long period of time, as well as high level of uncertainty, likely contribute to the unprecedented changes of collective emotional expression in online social media. doi = nan id = cord-354833-vvlsqy36 author = Peters, Bjoern title = Integrating epitope data into the emerging web of biomedical knowledge resources date = 2007 keywords = IEDB; datum; epitope summary = As described in this Innovation article, the Immune Epitope Database and Analysis Resource aims to achieve the same for the more complex and context-dependent information on immune epitopes, and to integrate this data with existing and emerging knowledge resources. With the emergence and consolidation of new databases, this information will expand to include single-nucleotide polymorphisms (SNPs), biomedical imaging and disease association, as well as immune epitope data, such as in the Immune Epitope Database and Analysis Resource (IEDB), which is the focus of this article. We accomplished this by using several hundred different fields encompassing the database, grouped into several main classes or categories, such as the literary reference, the structure of the epitope, the source organism of the epitope and information on the context of epitope recognition, such as the host species, immunization strategy and the type of assay used to detect a response. doi = 10.1038/nri2092 id = cord-001470-hn288o97 author = Pivette, Mathilde title = Drug sales data analysis for outbreak detection of infectious diseases: a systematic literature review date = 2014-11-18 keywords = datum; drug; sale summary = CONCLUSIONS: Drug sales data analyses appear to be a useful tool for surveillance of gastrointestinal and respiratory disease, and OTC drugs have the potential for early outbreak detection.
Published articles were searched for on electronic databases (Pubmed, Embase, Scopus, LILACS, African Index Medicus, Cochrane Library), using combinations of the following key words: ("surveillance" OR outbreak detection OR warning system) AND (over-the-counter OR "prescription drugs" OR pharmacy OR (pharmaceutical OR drug OR medication) sales). Articles excluded based on full-text review (no drug sales data, no infectious disease, no outbreak detection): N = 85. Figure 1: Flow chart of study selection process in a systematic review of drug sales data analysis for syndromic surveillance of infectious diseases. Nineteen of the 27 studies were descriptive retrospective studies assessing the strength of the correlation between drug sales and reference surveillance data of the corresponding disease or evaluating outbreak-detection performance [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30]. doi = 10.1186/s12879-014-0604-2 id = cord-131678-rvg1ayp2 author = Ponce, Marcelo title = covid19.analytics: An R Package to Obtain, Analyze and Visualize Data from the Corona Virus Disease Pandemic date = 2020-09-02 keywords = PPE; SIR; case; datum; function summary = This paper is organized as follows: in Sec. 2 we describe the covid19.analytics package, in Sec. 3 we present some examples of data analysis and visualization, in Sec. 4 we describe in detail how to deploy a web dashboard employing the capabilities of the covid19.analytics package providing full details on the implementation so that this procedure can be repeated and followed by interested users in developing their own dashboards. As the amount of data available for the recorded cases of CoViD19 can be overwhelming, and in order to get a quick insight on the main statistical indicators, the covid19.analytics package includes the report.summary function, which will generate an overall report summarizing the main statistical estimators for the different datasets.
The covid19.analytics package provides three different functions to visualize the trends in daily changes of reported cases from time series data. doi = nan id = cord-339491-lyld3up2 author = Prakash, A. title = Using Machine Learning to assess Covid-19 risks date = 2020-06-23 keywords = cluster; covid; datum summary = A dataset based on these statistics was generated and then fed into an unsupervised learning algorithm to reveal patterns and identify similar groups of people in the population. PARTICIPANTS: The adult population was considered for the analysis, development and validation of the model. RESULTS: Of 1 million observations generated, 20% of them exhibited Covid symptoms and patterns, and 80% of them belonged to the asymptomatic and non-infected group of people. Using this, our proposed method captures these statistics along with some clinical background and generates a dataset on which we intend to apply an unsupervised learning algorithm to identify patterns and classify them into risk cohorts. Covid-based research has evidently increased since the pandemic struck and related resources are available extensively today, and this method has tried to capture these studies into an interpretable form for analysis and categorization of different risk cohorts that were validated against current data.
doi = 10.1101/2020.06.23.20137950 id = cord-347952-k95wrory author = Prieto, Diana M title = A systematic review to identify areas of enhancements of pandemic simulation models for operational use at provincial and local levels date = 2012-03-30 keywords = datum; model; pandemic; parameter summary = Conclusions: To adequately address the concerns of the policymakers, we need continuing model enhancements in critical areas including: updating of epidemiological data during a pandemic, smooth handling of large demographical databases, incorporation of a broader spectrum of social-behavioral aspects, updating information for contact patterns, adaptation of recent methodologies for collecting human mobility data, and improvement of computational efficiency and accessibility. Of the existing computer simulation models addressing PHP, those focused on disease spread and mitigation of pandemic influenza (PI) have been recognized by the public health officials as useful decision support tools for preparedness planning [1] .
doi = 10.1186/1471-2458-12-251 id = cord-261809-ccc8wzne author = Ram, Natalie title = Mass Surveillance in the Age of COVID-19 date = 2020-05-08 keywords = Amendment; Fourth; datum summary = doi = 10.1093/jlb/lsaa023 id = cord-301888-f1drinpl author = Raoult, Didier title = Lancet gate: A matter of fact or a matter of concern date = 2020-09-22 keywords = datum summary = This shows that hic et nunc (here and now), there is not a single truth, but at this stage there are opinions, each one having data that it analyzes in the most appropriate way with the method considered best to answer yes to the hypothesis (3). In fact, the studies reported by the physicians themselves may correct dubious data by their own experience, the computer will not. In practice, under these conditions, nothing is verifiable anymore and a painful experience has just shown us this with the episode of Surgisphere who managed to publish in the two best journals of the medical world, series whose sources are unknown, whose methods are unknown and were retracted. The most extreme case was recently revealed in London, where the most rated restaurant on TripAdvisor called "The Shed at Dulwich" did not exist, and which was, in fact, pure farce fuelled by false comments placed on TripAdvisor. doi = 10.1016/j.nmni.2020.100758 id = cord-327651-yzwsqlb2 author = Ray, Bisakha title = Network inference from multimodal data: A review of approaches from infectious disease transmission date = 2016-09-06 keywords = bayesian; datum; method; network; transmission summary = In infectious disease transmission network inference, Bayesian inference frameworks have been primarily used to integrate data such as dates of pathogen sample collection and symptom report date, pathogen genome sequences, and locations of patients [24] [25] [26] .
Pathogen genomic data can capture within-host pathogen diversity (the product of effective population size in a generation and the average pathogen replication time [25, 26] ) and dynamics or provide information critical to understanding disease transmission such as evidence of new transmission pathways that cannot be inferred from epidemiological data alone [40, 41] . As molecular epidemiology and infectious disease transmission are areas in which network inference methods have been developed for bringing together multimodal data we use this review to investigate the foundational work in this specific field. In this section we briefly review multimodal integration methods for combining pathogen genomic data and epidemiological data in a single analysis, for inferring infection transmission trees and epidemic dynamic parameters. doi = 10.1016/j.jbi.2016.09.004 id = cord-346309-hveuq2x9 author = Reis, Ben Y title = An Epidemiological Network Model for Disease Outbreak Detection date = 2007-06-26 keywords = datum; model; network; stream summary = CONCLUSIONS: The integrated network models of epidemiological data streams and their interrelationships have the potential to improve current surveillance efforts, providing better localized outbreak detection under normal circumstances, as well as more robust performance in the face of shifts in health-care utilization during epidemics and major public events. In order to both improve overall detection performance and reduce vulnerability to baseline shifts, we introduce a general class of epidemiological network models that explicitly capture the relationships among epidemiological data streams. In order to evaluate the practical utility of this approach for surveillance, we constructed epidemiological network models based on real-world historical health-care data and compared their outbreak-detection performance to that of standard historical models. 
In this study, the researchers developed a new class of surveillance systems called "epidemiological network models." These systems aim to improve the detection of disease outbreaks by monitoring fluctuations in the relationships between information detailing the use of various health-care resources over time (data streams). doi = 10.1371/journal.pmed.0040210 id = cord-159103-dbgs2ado author = Rieke, Nicola title = The Future of Digital Health with Federated Learning date = 2020-03-18 keywords = datum; learning; model; training summary = The medical FL use-case is inherently different from other domains, e.g. in terms of number of participants and data diversity, and while recent surveys investigate the research advances and open questions of FL [14, 11, 15] , we focus on what it actually means for digital health and what is needed to enable it. Transfer Learning, for example, is a well-established approach of model-sharing that makes it possible to tackle problems with deep neural networks that have millions of parameters, despite the lack of extensive, local datasets that are required for training from scratch: a model is first trained on a large dataset and then further optimised on the actual target data. To adopt this approach into a form of collaborative learning in a FL setup with continuous learning from different institutions, the participants can share their model with a peer-to-peer architecture in a "round-robin" or parallel fashion and train in turn on their local data.
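The "round-robin" sharing described in the summary above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a one-parameter model y = w * x, plain gradient descent, and three hypothetical institutions; the names local_train, round_robin and sites are invented for illustration. Only the model weight moves between sites, and each site's data stays local.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one site


def local_train(w: float, data: Dataset, lr: float = 0.01, steps: int = 20) -> float:
    """Run a few gradient-descent steps for the model y = w * x on one site's data."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def round_robin(sites: List[Dataset], rounds: int = 10, w0: float = 0.0) -> float:
    """Pass the model from site to site; only the weight travels, never the data."""
    w = w0
    for _ in range(rounds):
        for data in sites:
            w = local_train(w, data)
    return w


# Three hypothetical institutions whose local data all follow y = 3x.
sites = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5)],
    [(3.0, 9.0)],
]
w_final = round_robin(sites)
print(w_final)
```

The parallel alternative mentioned in the same summary would instead have every site train from the same starting weight and then combine the results, for example by averaging; that variant is not shown here.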
doi = nan id = cord-327784-xet20fcw author = Rieke, Nicola title = The future of digital health with federated learning date = 2020-09-14 keywords = Federated; data; datum; model summary = We envision a federated future for digital health and with this perspective paper, we share our consensus view with the aim of providing context and detail for the community regarding the benefits and impact of FL for medical applications (section "Data-driven medicine requires federated efforts"), as well as highlighting key considerations and challenges of implementing FL for digital health (section "Technical considerations"). FL addresses this issue by enabling collaborative learning without centralising data (subsection "The promise of federated efforts") and has already found its way to digital health applications (subsection "Current FL efforts for digital health"). Current FL efforts for digital health Since FL is a general learning paradigm that removes the data pooling requirement for AI model development, the application range of FL spans the whole of AI for healthcare. Multi-institutional deep learning modeling without sharing patient data: a feasibility study on brain tumor segmentation doi = 10.1038/s41746-020-00323-1 id = cord-274019-dao10kx9 author = Rife, Brittany D title = Phylodynamic applications in 21(st) century global infectious disease research date = 2017-05-08 keywords = Bayesian; HIV-1; datum; disease; population summary = These innovative tools have greatly enhanced scientific investigations of the temporal and geographical origins, evolutionary history, and ecological risk factors associated with the growth and spread of viruses such as human immunodeficiency virus (HIV), Zika, and dengue and bacteria such as Methicillin-resistant Staphylococcus aureus.
CONCLUSIONS: Capitalizing on an extensive review of the literature, we discuss the evolution of the field of infectious disease epidemiology and recent accomplishments, highlighting the advancements in phylodynamics, as well as the challenges and limitations currently facing researchers studying emerging pathogen epidemics across the globe. The reliance on phylodynamic methods for estimating a pathogen's population-level characteristics (e.g., effective population size) and their relationships with epidemiological data suffers from a high cost: increasing the number of inference models, and thus parameters associated with these models, requires an even greater increase in the information content, or phylogenetic resolution, of the sequence alignment and associated phenotypic data. doi = 10.1186/s41256-017-0034-y id = cord-351454-mc7pifep author = Rowhani-Farid, Anisa title = What incentives increase data sharing in health and medical research? A systematic review date = 2017-05-05 keywords = datum; health; incentive; research; sharing summary = METHODS: A systematic review (registration: 10.17605/OSF.IO/6PZ5E) of the health and medical research literature was used to uncover any evidence-based incentives, with pre- and post-empirical data that examined data sharing rates. This review considered published journal articles with empirical data that trialed any incentive to increase data sharing in health and medical research. Articles must have tested an incentive that could increase data sharing in health and medical research. These articles did not fit the inclusion criteria, but based on the abstracts they were mostly concerned with observing data sharing patterns in the health and medical research community, using quantitative and qualitative methods. Given that the systematic review found only one incentive, we classified the data sharing strategies tested in the health and medical research community.
This systematic review verified that there are few evidence-based incentives for data sharing in health and medical research. doi = 10.1186/s41073-017-0028-9 id = cord-347199-slq70aou author = Safta, Cosmin title = Characterization of partially observed epidemics through Bayesian inference: application to COVID-19 date = 2020-10-07 keywords = COVID-19; Fig; Sect; datum; model summary = The method is cast as one of Bayesian inference of the latent infection rate (number of people infected per day), conditioned on a time-series of Developing a forecasting method that is applicable in the early epoch of a partially-observed outbreak poses some peculiar difficulties. This infection rate curve is convolved with the Probability Density Function (PDF) of the incubation period of the disease to produce an expression for the time-series of newly symptomatic cases, an observable that is widely reported as "daily new cases" by various data sources [2, 5, 6] . 2, with postulated forms for the infection rate curve and the derivation of the prediction for daily new cases; we also discuss a filtering approach that is applied to the data before using it to infer model parameters. doi = 10.1007/s00466-020-01897-z id = cord-024870-79hf7q2r author = Salierno, Giulio title = An Architecture for Predictive Maintenance of Railway Points Based on Big Data Analytics date = 2020-04-29 keywords = datum; layer; railway summary = In this paper, we propose a four-layer big data architecture with the goal of establishing a data management policy to manage massive amounts of data produced by railway switch points and perform analytical tasks efficiently. The goal of our work is to design a big data architecture for enabling analytical tasks typically required by the railway industry as well as enabling an effective data management policy to allow end-users to manage huge amounts of data coming from railway lines efficiently.
As already mentioned, we considered predictive maintenance as the main task of our architecture; hence to show the effectiveness of the proposed architecture, we use real data collected from points placed over the Italian railway line (Milano -Monza -Chiasso). These log files are heterogeneous in type and contain different information resumed as: Data 3 and 4 are considered to train and evaluate the proposed model to estimate the health status of the points, thus to estimate its RUL (see Sect. doi = 10.1007/978-3-030-49165-9_3 id = cord-145831-ag0xt2nj author = Schmidt, Lena title = Data Mining in Clinical Trial Text: Transformers for Classification and Question Answering Tasks date = 2020-01-30 keywords = BERT; PICO; datum; sentence summary = doi = nan id = cord-103813-w2sb6h94 author = Schumacher, Garrett J. title = Genetic information insecurity as state of the art date = 2020-07-10 keywords = datum; dna; genetic; information; security summary = Therefore, human genetic information is a uniquely confidential form of data that requires increased security controls and scrutiny. Sensitive genetic information, which includes both biological material and digital genetic data, is the primary asset of concern, and associated assets, such as metadata, electronic health records and intellectual property, are also vulnerable within this ecosystem. ❖ Private Sensitive Genetic Information can be expected to cause a moderate level of risk to a nation, ethnic group, individual, or stakeholder if it is disclosed, modified, or destroyed without authorization. The genetic information ecosystem is a distributed cyber-physical system containing numerous stakeholders (Supplementary Material, Appendix 1), personnel, and devices for computing and networking purposes. Genetic information security is a shared responsibility between sequencing laboratories and device vendors, as well as all other involved stakeholders. 
Examples include biorepositories, DNA sequencing laboratories, researchers, cloud and other service providers, and supply chain entities responsible for devices, software and materials. doi = 10.1101/2020.07.08.192666 id = cord-162326-z7ta3pp9 author = Shahi, Gautam Kishore title = AMUSED: An Annotation Framework of Multi-modal Social Media Data date = 2020-10-01 keywords = datum; medium; news; social summary = AMUSED can be applied in multiple application domains, as a use case, we have implemented the framework for collecting COVID-19 misinformation data from different social media platforms. To present a use case, we apply the proposed framework to gather data on COVID-19 misinformation on multiple social media platforms. In the following sections, we discuss the related work, different types of data circulated and its restrictions on social media platforms, current annotation techniques, proposed methodology and possible application domain; then we discuss the implementation and result. Nowadays, the journalists cover some of the common issues like misinformation, mob lynching, hate speech, and they also link the social media post in the news articles Cui and Liu (2017) . Step 5: Social Media Link From the crawled data, we fetch the anchor tag( a ) mentioned in the news content, then we filter the hyperlinks to identify social media platforms like Twitter and YouTube. doi = nan id = cord-269693-9tsy79lt author = Shao, Xue-Feng title = Multistage implementation framework for smart supply chain management under industry 4.0 date = 2020-10-06 keywords = chain; datum; industry; supply summary = Industry 4.0, or smart manufacturing, are the terms that are being used for digital transformation, using technologies such as the Internet of Things (IoT), Artificial Intelligence (AI), Cloud Computing (CC), Machine Learning (ML), and Data Analytics (DA), etc. 
Many researchers have explained the phenomena of smart manufacturing, or industry 4.0 technologies, in terms of an augmented and virtual reality (Wu et al., 2013; Rüßmann et al., 2015; Kolberg and Zühlke, 2015) , additive manufacturing (Huang et al., 2013; Chan et al., 2018) , internet of things (Wu et al., 2017) , big data analytics (De Mauro et al., 2015; Addo-Tenkorang and Helo, 2016; Lenz et al.,2018) , and cyber-physical systems (Monostori, 2014; Lee et al., 2015; Zhong and Nof, 2015) . The organization for this study was selected using the theoretical sampling, as it provided the opportunity to capture the evolution of the industry 4.0 implementation across a supply chain that included the focal firm, along with its supplier and a downstream customer (Eisenhardt, 1989 , Siggelkow, 2007 . doi = 10.1016/j.techfore.2020.120354 id = cord-319828-9ru9lh0c author = Shi, Shuyun title = Applications of Blockchain in Ensuring the Security and Privacy of Electronic Health Record Systems: A Survey date = 2020-07-15 keywords = EHR; access; blockchain; datum; healthcare; system summary = The potential benefits associated with EHR systems (e.g. public healthcare management, online patient access, and patients medical data sharing) have also attracted the interest of the research community [1, 2, 3, 4, 5, 6, 7, 8, 9] . In theory, EHR systems should ensure the confidentiality, integrity and availability of the stored data, and data can be shared securely among authorized users (e.g. medical practitioners with the right need to access particular patient's data to facilitate diagnosis). 2.
all of data will be exposed once the corresponding symmetric key is lost Table 2 : systems requirements that have been met in Table 1 paper security privacy anonymity integrity authentication controllability auditability accountability [48] designed a system that integrates smart contract with IPFS to improve decentralized cloud storage and controlled data sharing for better user access management. Secure and efficient data accessibility in blockchain based healthcare systems doi = 10.1016/j.cose.2020.101966 id = cord-016889-7ih6jdpe author = Shibuya, Kazuhiko title = Identity Health date = 2019-12-03 keywords = Fukushima; Japan; Shibuya; datum; nuclear; social summary = These are a kind of mental illnesses and conditions as a maladaptation of gaming and social withdrawals from actual society, or they are overadaptation in somewhat online communities rather than physical environment. Those assessed data might intend to statistically reveal our strength of mental health and degree of adaptation in social relations, and then automatic prediction for those who answered personality tests enables to trustfully measure financial limitations for loans and transactions in actual contexts. (1973) and Giddens (1991) , they commonly argued that western post-modernizations could reconstruct mindsets on reality and social identification ways among citizens during achieving industrial progresses, if above severe incidents of nuclear power plants and those systems failures could be regarded as malfunctions as a symbol of modernity, above consequences of nuclear crisis on the Fukushima case (and other human-made disasters) might be contextualized to reexamine social adaptation and consciousness among Fukushima citizens by sociological verifications. As social networking services clearly indicate a part of human relationships online (Lazakidou 2012) , it can consider that their relations itself still have sharing illness personalities and depressed mental health. 
doi = 10.1007/978-981-15-2248-2_11 id = cord-317853-vd35a2eq author = Shu, Yuelong title = GISAID: Global initiative on sharing all influenza data – from vision to reality date = 2017-03-30 keywords = GISAID; datum summary = In 2006, the reluctance of data sharing, in particular of avian H5N1 influenza viruses, created an emergency bringing into focus certain limitations and inequities, such that the World Health Organization (WHO)'s Global Influenza Surveillance Network (now the Global Influenza Surveillance and Response System (GISRS) [5] ) was criticised on several fronts, including limited global access to H5N1 sequence data that were stored in a database hosted by the Los Alamos National Laboratories in the United States (US) [6, 7] . Scientists charged with the day to day responsibilities of running WHO Collaborating Centres (CCs) for Influenza, National Influenza Centres and the World Organisation for Animal Health (OIE)/ Food and Agriculture Organization of the United Nations (FAO) [8] reference laboratories, were therefore eager to play a key role and provide scientific oversight in the creation and development of GISAID's data sharing platform that soon became essential for our work. doi = 10.2807/1560-7917.es.2017.22.13.30494 id = cord-032383-2dqpxumn author = Shuja, Junaid title = COVID-19 open source data sets: a comprehensive survey date = 2020-09-21 keywords = covid-19; data; datum; set summary = Our survey is motivated by the open source efforts that can be mainly categorized as (a) COVID-19 diagnosis from CT scans, X-ray images, and cough sounds, (b) COVID-19 case reporting, transmission estimation, and prognosis from epidemiological, demographic, and mobility data, (c) COVID-19 emotional and sentiment analysis from social media, and (d) knowledge-based discovery and semantic analysis from the collection of scholarly articles covering COVID-19.
Automated CT scan based COVID-19 detection techniques work with training the learning model on existing CT scan data sets that contain labeled images of COVID-19 positive and normal cases. Triggered by this challenge limiting the adoption of AI/ML-powered COVID-19 diagnosis, forecasting, and mitigation, we make the first effort in surveying research works based on open source data sets concerning COVID-19 pandemic. The authors enlist the application of deep and transfer learning on their extracted data set for identification of COVID-19 while utilizing motivation from earlier studies that learned the type of pneumonia from similar images [47] . doi = 10.1007/s10489-020-01862-6 id = cord-291975-y8ck4lo8 author = Simon, Perikles title = Robust Estimation of Infection Fatality Rates during the Early Phase of a Pandemic date = 2020-04-10 keywords = CFR; IFR; Iceland; datum summary = The estimation of an IFR is based on two different and, regarding the influence of selection bias, divergent procedures to calculate a CFR from infection-related population data. This formula is not relying anymore on cases reported in the official databases of JH or ECDC and it served as a cross-validation figure for the IFR and the CFRs, which are solely based on these data and the population data of Iceland in the validation part of the results section. The IFRdeCode is the figure derived from testing the general population of Iceland and served to cross validate the mortality figures CFR and classic CFR that have been calculated from the data repositories of JH and the IFR that used this repository in conjunction with the test data published by Iceland's Department of Public Health. doi = 10.1101/2020.04.08.20057729 id = cord-008584-4eylgtbc author = Singh, David E.
title = Evaluating the impact of the weather conditions on the influenza propagation date = 2020-04-05 keywords = SISSS; datum; influenza summary = Our goal is to estimate the effects of changes in temperature and relative humidity on the patterns of epidemic influenza based on data provided by the Spanish Influenza Sentinel Surveillance System (SISSS) and the Spanish Meteorological Agency (AEMET). In this work we use the same data sources (SISSS and AEMET agencies) following a different approach: we study some of these relationships from a simulation perspective, considering not only the existing influenza distributions but also the ones related to the climate change. Fig. 10 Effect of short-term changes in the temperature on the influenza propagation for the different communities considered in the simulation One important thing to underline is that the data that the study [5] (whose model we adopt) is based on is of real cases and spans 30 years. doi = 10.1186/s12879-020-04977-w id = cord-301300-nfl9z8c7 author = Slavova, Svetla title = Operationalizing and selecting outcome measures for the HEALing Communities Study date = 2020-10-02 keywords = HCS; datum; death; opioid summary = Three secondary outcome measures will support hypothesis testing for specific evidence-based practices known to decrease opioid overdose deaths: (1) number of naloxone units distributed in HCS communities; (2) number of unique HCS residents receiving Food and Drug Administration-approved buprenorphine products for treatment of opioid use disorder; and (3) number of HCS residents with new incidents of high-risk opioid prescribing. The Helping to End Addiction Long-term (HEALing) Communities Study (HCS) is a multisite, parallel-group, cluster randomized wait-list controlled trial evaluating the impact of the Communities That HEAL intervention to reduce opioid overdose deaths and other associated adverse outcomes (Walsh et al., in press) . 
The research site teams established multiple data use agreements with data owners to support the calculation for more than 80 study measures based on administrative data collections, such as death certificates, emergency medical services data, inpatient and emergency department discharge billing records, Medicaid claims, syndromic surveillance data, PDMP data, Drug Enforcement Administration data on drug take back collection sites and events, DATA 2000 waivered prescriber data, HIV registry, naloxone distribution and dispensed prescription data. doi = 10.1016/j.drugalcdep.2020.108328 id = cord-285379-ljg475sj author = Slotwiner, David J. title = Digital Health in Electrophysiology and the COVID-19 Global Pandemic date = 2020-10-03 keywords = datum; health summary = The tools of digital health are facilitating a much needed paradigm shift to a more patient-centric health care delivery system, yet our healthcare infrastructure is firmly rooted in a 20th Century model which was not designed to receive medical data from outside the traditional medical environment. In this article, we describe the present state of heart rhythm digital health tools highlighting some of the effects of the COVID-19 pandemic and propose ways to develop innovative workflows and technological solutions that will make it possible for practices to efficiently process and manage information.
doi = 10.1016/j.hroo.2020.09.003 id = cord-032403-9c1xeqg1 author = Sokolov, Michael title = Decision Making and Risk Management in Biopharmaceutical Engineering—Opportunities in the Age of Covid-19 and Digitalization date = 2020-09-08 keywords = available; datum; decision; process summary = 10 The main engineering challenges 9,11−13 are to (1) robustly control the behavior of the living organism involved in the process, (2) efficiently align the often heterogeneous data generated across different process units and scales, (3) include all available prior know-how and experience into the decision process, (4) reduce human errors and introduced inconsistency, and (5) enable an automated and adaptive procedure to assess the critical process characteristics. Because of significant time pressure in development and risk mitigation pressure in manufacturing, decisions are often made on an ad hoc basis involving expert meetings where all readily available data, analysis results, and experience sources are taken into account without ensuring consideration of all possible available information hidden in the databases or inside the potential of (not automatedly retrained or connected) predictive models. However, in manufacturing operations which are based on Industrial & Engineering Chemistry Research pubs.acs.org/IECR Commentary decisions either actively introduced or supported by such models, a detailed assessment of these smart digital solutions is required. doi = 10.1021/acs.iecr.0c02994 id = cord-102490-yvcrv94c author = Souza, Jonatas S. 
de title = The General Law Principles for Protection the Personal Data and their Importance date = 2020-09-29 keywords = Brazil; LGPD; Law; datum summary = The purpose of this paper is to emphasize the principles of the General Law on Personal Data Protection, informing real cases of leakage of personal data and thus obtaining an understanding of the importance of gains that meet the interests of Internet users on the subject and its benefits to the entire Brazilian society. On April 23rd, 2014, Law No. 12,965, now known as Marco Civil da Internet [1] , was approved, establishing principles, guarantees, rights, and duties for the use of the Internet in Brazil, and has the guarantee of privacy and protection of personal data, and will only make such data available through a court order. Dispõe sobre a proteção de dados pessoais e altera a Lei nº 12.965, de 23 de abril de 2014 (Marco Civil da Internet) doi = 10.5121/csit.2020.101110 id = cord-286288-gduhterq author = Spitzer, Ernest title = Cardiovascular Clinical Trials in a Pandemic: Immediate Implications of Coronavirus Disease 2019 date = 2020-05-01 keywords = datum; pandemic; trial summary = Nevertheless, new or ongoing clinical trials, not related to the disease itself, remain important for the development of new therapies, and require interactions among patients, clinicians and research personnel, which is challenging, given isolation measures. Trials in patient populations with acute presentations (e.g. ST-elevation MI [STEMI]) may identify potentially suitable trial candidates; however, the capacity to comply with study procedures needs to be assessed, as well as considerations related to patient safety during follow-up. 
Participants in the follow-up phase (when they are generally at home) constitute a higher-risk population in the Reduced capacity at investigational sites will impact on availability to perform study visits (or phone calls) to assess and confirm eligibility, enter data in electronic case report forms (eCRFs), to report (serious) adverse events and to follow the protocol in general. The participation of several committees in clinical trials ensures proper scientific and operational oversight, data integrity and quality, as well as patient safety. doi = 10.15420/cfr.2020.07 id = cord-259929-02765q5j author = Stanley, Philip M. title = Decoding DNA data storage for investment date = 2020-09-28 keywords = datum; dna; storage summary = doi = 10.1016/j.biotechadv.2020.107639 id = cord-017634-zhmnfd1w author = Straif-Bourgeois, Susanne title = Infectious Disease Epidemiology date = 2005 keywords = CDC; case; datum; disease; infectious; outbreak; program; surveillance summary = Use of additional clinical, epidemiological and laboratory data may enable a physician to diagnose a disease even though the formal surveillance case definition may not be met. Another way to detect an increase of cases is if the surveillance system of reportable infectious diseases reveals an unusually high number of people with the same diagnosis over a certain time period at different health care facilities. On the other hand, however, there should be no time delay in starting an investigation if there is an opportunity to prevent more cases or the potential to identify a system failure which can be caused, for example, by poor food preparation in a restaurant or poor infection control practices in a hospital or to prevent future outbreaks by acquiring more knowledge of the epidemiology of the agent involved. 
In developing countries, surveys are often necessary to evaluate health problems since data collected routinely (disease surveillance, hospital records, case registers) are often incomplete and of poor quality. doi = 10.1007/978-3-540-26577-1_34 id = cord-204835-1yay69kq author = Sun, Chenxi title = A Review of Deep Learning Methods for Irregularly Sampled Medical Time Series Data date = 2020-10-23 keywords = ISMTS; RNN; datum; series; time summary = title: A Review of Deep Learning Methods for Irregularly Sampled Medical Time Series Data Irregularly sampled time series (ISTS) data has irregular temporal intervals between observations and different sampling rates between sequences. Recurrent neural networks (RNNs) [25, 26, 27] , auto-encoder (AE) [28, 29] and generative adversarial networks (GANs) [30, 31] have achieved good performance in medical data imputation and medical prediction thanks to their abilities of learning and generalization obtained by complex nonlinearity. End-to-end approaches process the downstream tasks directly based on modeling the time series with missing data. According to the analysis of technologies and experiment results, in this section, we will discuss ISMTS modeling task from three perspectives -1) imputation task with prediction task, 2) intra-series relation with inter-series relation / local structure with global structure and 3) missing data with raw data. Thus, of particular interest are irregularity-based methods that can learn directly by using multivariate sparse and irregularly sampled time series as input without the need for other imputation. 
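The idea in the summary above, feeding sparse and irregularly sampled multivariate series to a model without a separate imputation step, is often realized by passing two companion arrays alongside the raw values: an observation mask and the time elapsed since each variable was last observed. The following is a minimal sketch of that representation, not the reviewed paper's own method; the function name `mask_and_gaps` and the toy vital-sign values are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def mask_and_gaps(
    times: List[float],
    values: List[List[Optional[float]]],
) -> Tuple[List[List[int]], List[List[float]]]:
    """Build an observation mask and per-variable time gaps.

    times  : sorted timestamps, one per row of `values`
    values : rows of measurements; None marks a missing entry
    Returns (mask, delta) where mask[i][j] is 1 if variable j was
    observed at times[i], and delta[i][j] is the time elapsed since
    variable j was last observed (measured from the first timestamp
    if it has not been observed yet).
    """
    n_vars = len(values[0])
    last_seen = [times[0]] * n_vars  # hypothetical convention: start the clock at t0
    mask: List[List[int]] = []
    delta: List[List[float]] = []
    for t, row in zip(times, values):
        mask_row, delta_row = [], []
        for j, v in enumerate(row):
            mask_row.append(0 if v is None else 1)
            delta_row.append(t - last_seen[j])
            if v is not None:
                last_seen[j] = t  # reset the gap once the variable is observed
        mask.append(mask_row)
        delta.append(delta_row)
    return mask, delta


# Toy example: two variables (say, temperature and heart rate) sampled unevenly.
times = [0.0, 1.0, 4.0, 4.5]
values = [[36.6, None], [None, 80.0], [37.1, None], [None, None]]
mask, delta = mask_and_gaps(times, values)
```

A downstream model would typically receive the values (with missing entries zero-filled or carried forward), the mask, and the gaps together, so that it can learn how much to trust a stale measurement.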
doi = nan id = cord-299254-kqpnwkg5 author = Sun, Yingcheng title = INSMA: An integrated system for multimodal data acquisition and analysis in the intensive care unit date = 2020-04-28 keywords = ICU; INSMA; datum; patient summary = In this paper, we proposed a multimodal data acquisition and analysis system called INSMA, with the ability to acquire, store, process, and visualize multiple types of data from the Philips IntelliVue patient monitor. Enormous volumes of multimodal physiological data are generated including physiological waveform signals, patient monitoring alarm messages, and numerics and if acquired, synchronized and analyzed, this data can been effectively used to support clinical decision-making at the bedside [10, 18] . We have been working on building the Integrated Medical Environment (tIME) [10] to address this critical opportunity and in this paper, we discuss an integrated system (INSMA) that supports multimodal data acquisition, parsing, real-time data analysis and visualization in the ICU. Advances in informatics, whether through data acquisition, physiologic alarm detection, or signal analysis and visualization for decision support have the potential to markedly improve patient treatment in ICUs. Clinical monitors have the ability to collect and visualize important numerics or waveforms, but more work is needed to interface to the monitors and acquire and synchronize multimodal physiological data across a diverse set of clinical devices. doi = 10.1016/j.jbi.2020.103434 id = cord-273163-xm6qvhn1 author = Tarkoma, Sasu title = Fighting pandemics with digital epidemiology date = 2020-08-25 keywords = datum; digital summary = Digital epidemiologists conduct traditional epidemiological studies and health-related research using new data sources and digital methods from data collection to analysis [1, 2] . 
Digital epidemiology and digital tools have had a profound role in understanding and mitigating the COVID-19 pandemic through analysis of diverse digital data sources such as smartphone, health register, and environmental monitoring data. Combining aggregate and privacy-protected diverse data sources such as mobility, health, environmental, and city data is expected to help understand and mitigate the consequences of pandemics. The digital epidemiology toolkit is likely to be supported by advances in ML, privacy-enhancing technologies, data/ model validation and explainability, and national and transnational policy measures. Increasing data availability and access combined with advances in open source data processing and analysis pave the way for scalable digital epidemiology supporting world health security. doi = 10.1016/j.eclinm.2020.100512 id = cord-343944-nm4dx5pq author = Theys, Kristof title = Advances in Visualization Tools for Phylogenomic and Phylodynamic Studies of Viral Diseases date = 2019-08-02 keywords = bayesian; datum; figure; phylogenetic; tree; visualization summary = As a first example, we illustrate the development of innovative visualization software packages on the output of a Bayesian phylodynamic analysis of a rabies virus (RABV) data set consisting of time-stamped genetic data along with two discrete trait characteristics per sequence, i.e., the sampling location-in this case the state within the United States from which the sample originated-and the bat host type. Coalescent-based phylodynamic models that connect population genetics theory to genomic data can infer the demographic history of viral populations (65) , and plots of FIGURE 4 | The PhyloGeoTool offers a visual approach to explore large phylogenetic trees and to depict characteristics of strains and clades-including for example the geographic context and distribution of sampling dates-in an interactive way (17) . 
doi = 10.3389/fpubh.2019.00208 id = cord-303651-fkdep6cp author = Thompson, Robin N. title = Key questions for modelling COVID-19 exit strategies date = 2020-08-12 keywords = COVID-19; SARS; datum; epidemic; estimate; model; transmission summary = This leads to a roadmap for future research (figure 1) made up of three key steps: (i) improve estimation of epidemiological parameters using outbreak data from different countries; (ii) understand heterogeneities within and between populations that affect virus transmission and interventions; and (iii) focus on data needs, particularly data collection and methods for planning exit strategies in low-to-middle-income countries (LMICs) where data are often lacking. Three key steps are required: (i) improve estimates of epidemiological parameters (such as the reproduction number and herd immunity fraction) using data from different countries ( §2a-d); (ii) understand heterogeneities within and between populations that affect virus transmission and interventions ( §3a-d); and (iii) focus on data requirements for predicting the effects of individual interventions, particularly-but not exclusively-in data-limited settings such as LMICs ( §4a-c). doi = 10.1098/rspb.2020.1405 id = cord-026356-zm84yipu author = Tzouros, Giannis title = Fed-DIC: Diagonally Interleaved Coding in a Federated Cloud Environment date = 2020-05-15 keywords = DIC; Fed; datum summary = In this paper we present Fed-DIC, a framework which combines Diagonally Interleaved Coding on client devices at the edge of the network with organized storage of encoded data in a federated cloud system comprised of multiple independent storage clusters. 
Yet the most critical challenge with erasure coding is that it suffers from high reconstruction cost as it needs to access multiple blocks stored across different sets of storage nodes or racks (groups of nodes inside a distributed system) in order to retrieve lost data [7], leading to high read access and network bandwidth latency. Fed-DIC's topology in terms of the stored data among the clusters of the federated cloud, combined with the reduced storage size of the data chunks generated from its encoding process, provide significantly smaller read access costs and transfer bandwidth overhead for nodes in the cloud. doi = 10.1007/978-3-030-50323-9_4 id = cord-288264-xs08g2cy author = Ulahannan, Jijo Pulickiyil title = A citizen science initiative for open data and visualization of COVID-19 outbreak in Kerala, India date = 2020-08-06 keywords = Kerala; covid-19; datum summary = MATERIALS AND METHODS: Through a citizen science initiative, we leveraged publicly available and crowd-verified data on COVID-19 outbreak in Kerala from the government bulletins and media outlets to generate reusable datasets. RESULTS: From the sourced data, we provided real-time analysis, and daily updates of COVID-19 cases in Kerala, through a user-friendly bilingual dashboard (https://covid19kerala.info/) for non-specialists. CONCLUSION: We reported a citizen science initiative on the COVID-19 outbreak in Kerala to collect and deposit data in a structured format, which was utilized for visualizing the outbreak trend and describing demographic characteristics of affected individuals. Here, we report a citizen science initiative to leverage publicly available data on COVID-19 cases in Kerala from the daily bulletins released by the DHS, Government of Kerala, and various news outlets.
The multi-sourced data was refined to make a structured live dataset to provide real-time analysis and daily updates of COVID-19 cases in Kerala through a bilingual (English and Malayalam) user-friendly dashboard (https://covid19kerala.info/). doi = 10.1093/jamia/ocaa203 id = cord-027712-2o4svbms author = Urošević, Vladimir title = Baseline Modelling and Composite Representation of Unobtrusively (IoT) Sensed Behaviour Changes Related to Urban Physical Well-Being date = 2020-05-31 keywords = activity; datum; indicator; pulse summary = We present the grounding approach, deployment and preliminary validation of the elementary devised model of physical well-being in urban environments, summarizing the heterogeneous personal Big Data (on physical activity/exercise, walking, cardio-respiratory fitness, quality of sleep and related lifestyle and health habits and status, continuously collected for over a year mainly through wearable IoT devices and survey instruments in 7 global testbed cities) into 5 composite domain indicators/indexes convenient for interpretation and use in predictive public health and preventive interventions. In the first approach, daily and intra-daily underlying measurements (Table 1 ) are used to estimate levels of adherence to rule-and range-based recommendations matured from institutional knowledge of relevant authorities and population-significant studies in the field, accumulated for over decades in the stated four example domains of motility, physical activity, sleep quality and cardio-respiratory fitness [8, 10, 11] . 
doi = 10.1007/978-3-030-51517-1_13 id = cord-009797-8mdie73v author = Valle, Denis title = Extending the Latent Dirichlet Allocation model to presence/absence data: A case study on North American breeding birds and biogeographical shifts expected from climate change date = 2018-08-26 keywords = LDA; datum; group; number summary = title: Extending the Latent Dirichlet Allocation model to presence/absence data: A case study on North American breeding birds and biogeographical shifts expected from climate change The Latent Dirichlet Allocation (LDA) model is a mixed‐membership method that can represent gradual changes in community structure by delineating overlapping groups of species, but its use has been limited because it requires abundance data and requires users to a priori set the number of groups. Furthermore, by comparing the estimated proportion of each group for two time periods (1997–2002 and 2010–2015), our results indicate that nine (of 18) breeding bird groups exhibited an expansion northward and contraction southward of their ranges, revealing subtle but important community‐level biodiversity changes at a continental scale that are consistent with those expected under climate change. It is important to note that even in the absence of MM sampling units, LDA can still estimate well the true number of groups and has similar fit to the data as the other clustering approaches (results not shown). 
doi = 10.1111/gcb.14412 id = cord-027704-zm1nae6h author = Vito, Domenico title = The PULSE Project: A Case of Use of Big Data Uses Toward a Cohomprensive Health Vision of City Well Being date = 2020-05-31 keywords = datum; health; pulse summary = In the year 2015 ITU and the United Nations Economic Commission for Europe (UNECE) gave the definition of smart and sustainable city as "an innovative city that uses information and communication technologies (ICTs) and other means to improve quality of life, efficiency of urban operation and services, and competitiveness, while ensuring that it meets the needs of present and future generations with respect to economic, social, environmental as well as cultural aspects". The project is currently active in eight pilot cities, Barcelona, Birmingham, New York, Paris, Singapore, Pavia, Keelung and Taiwan, following a participatory approach where citizen provide data through personal devices and the PulsAIR app, that are integrated with information from heterogeneous sources: open city data, health systems, urban sensors and satellites. The clinical is on asthma and Type 2 Diabetes in adult populations: the project has been pioneer in the development of dynamic spatiotemporal health impact assessments through exposure-risk simulation model with the support of WebGis for geolocated population-based data. doi = 10.1007/978-3-030-51517-1_39 id = cord-272276-83f0ruku author = Wagner, Joseph E. 
title = A computer based system for collection, storage, retrieval and reporting accession information in a veterinary medical diagnostic laboratory date = 1984-12-31 keywords = Fig; datum summary = Abstract Substantial data collected from large numbers of accessions, the need for comprehensive reporting of negative as well as positive laboratory findings, and the necessity for obtaining rapid diagnostic correlations prompted the development of a computer based system of accession data management for collection, storage, rapid retrieval, reporting, concording, and administrative compiling in a state-university Veterinary Medical Diagnostic Laboratory. Demographic-zoographic panel (Fig. 1): When an accession is presented to the RADIL section of the Veterinary Medical Diagnostic Laboratory, demographic and zoographic information is immediately entered by a data controller or data entry operator from information on a form submitted with the accession. Reports of negative findings and normal necropsy observations, as well as reports of the kinds of techniques used (such as the kind of blood collection method used, arrow, Fig. 2, line 8) can be entered by a code number, thus reducing data entry time.
doi = 10.1016/0010-4825(84)90033-7 id = cord-032607-bn8g02gi author = Wake, Melissa title = Integrating trials into a whole-population cohort of children and parents: statement of intent (trials) for the Generation Victoria (GenV) cohort date = 2020-09-24 keywords = Fig; GenV; Victoria; cohort; datum; trial summary = Keywords: Research methodology, Randomization, Registry trials, Multiple baseline randomized trials, Trials within cohorts, Population studies, Generation Victoria (GenV), Clinical trial as topic, Children, Intervention Background Randomized controlled trials (RCT) provide high-quality evidence with regards to the effectiveness of therapies and prevention and are critical to guide translation and optimal resource allocation. If feasibility (potentially demonstrated through pilot studies) and mutual alignment appear likely [29] , the trial would proceed to a partnering agreement that defines at least the following 8 items: 1) Which GenV trial model is being followed; 2) Design and high-level (or draft) protocol; 3) Timelines; 4) Data sharing and governance plans; 5) Status of ethical approval; 6) Communication with participants, including information statement and consent; 7) Trial oversight and 8) Capacity assessment, including trial quality, human resource and funding. doi = 10.1186/s12874-020-01111-x id = cord-016528-j7lflryj author = Waller, Anna E. title = Using Emergency Department Data For Biosurveillance: The North Carolina Experience date = 2010-07-27 keywords = Carolina; North; datum summary = The benefits and challenges of using Emergency Department data for surveillance are described in this chapter through examples from one biosurveillance system, the North Carolina Disease Event Tracking and Epidemiologic Collection Tool (NC DETECT). 
With electronic health information systems, these data are available in near real-time, making them particularly useful for surveillance and situational awareness in rapidly developing public health outbreaks or disasters. Biosurveillance is an emerging field that provides early detection of disease outbreaks by collecting and interpreting data on a variety of public health threats, including emerging infectious diseases (e.g., avian influenza), vaccine preventable diseases (e.g., pertussis) and bioterrorism (e.g., anthrax). NC DETECT has since grown to incorporate ED visit data from 98% of 24/7 acute care hospital EDs in the state of North Carolina and has developed and implemented many innovative surveillance tools, including the Emergency Medicine Text Processor (EMT-P) for ED chief complaint data and research-based syndrome definitions. doi = 10.1007/978-1-4419-6892-0_3 id = cord-282938-1if7bl2u author = Wang, Yanxin title = Using Mobile Phone Data for Emergency Management: a Systematic Literature Review date = 2020-09-16 keywords = datum; mobile; phone summary = Three research objectives are undertaken to achieve the goal of synthesizing the fragmented knowledge and providing research guidance: (i) extract basic knowledge (e.g. types of mobile phone data, situations) of EM from the selected studies; (ii) break the boundaries of different disciplines and aggregate each analysis perspective; and (iii) study the identified knowledge and integrate it into a single framework that draws a comprehensive map of existing findings under this subject, and provides future implications. 
Two iterations were processed: (1) searching 26 terms in the keywords list ("mobile phone data" OR "short message service" OR "call detail record" OR "phone GPS data" OR "cellular network data" OR "app data" OR "application data" OR "Bluetooth data") AND ("emergency" OR "extreme situation" OR "extreme event" OR "large-scale event" OR "special event" OR "special situation" OR "anomalous event" OR "anomalous situation" OR "unusual event" OR "unusual situation" OR "crisis" OR"disaster" OR "catastrophe" OR "traffic accident" OR "epidemics" OR "infectious disease") AND (2013 < PUBYEAR<2019); (2) searching papers in the reference list of the five previously identified review articles and including additional studies. doi = 10.1007/s10796-020-10057-w id = cord-199267-cm6tqbzk author = Wang, Zijie title = Survive the Schema Changes: Integration of Unmanaged Data Using Deep Learning date = 2020-10-15 keywords = GPT-2; cell; data; datum; schema summary = In this work, we propose to use deep learning to automatically deal with schema changes through a super cell representation and automatic injection of perturbations to the training data to make the model robust to schema changes. The contributions of this work include: (1) As to our best knowledge, we are the first to systematically investigate the application of deep learning and adversarial training techniques to automatically handle schema changes occurring in the data sources. A deep learning model, once trained, can handle most schema evolution without any human intervention, and does not require any data migration, or version management overhead. Our work has a potential to integrate data discovery and schema matching into a deep learning model inference process. 
doi = nan id = cord-229198-aju7xkel author = Wei, Viska title = Sketch and Scale: Geo-distributed tSNE and UMAP date = 2020-11-11 keywords = Count; Sketch; UMAP; datum summary = doi = nan id = cord-102238-g6dsnhmm author = Wescoat, Ethan title = Frequency Energy Analysis in Detecting Rolling Bearing Faults date = 2020-12-31 keywords = bearing; datum; fft summary = doi = 10.1016/j.promfg.2020.05.137 id = cord-102634-0n42h72w author = Willforss, Jakob title = OmicLoupe: Facilitating biological discovery by interactive exploration of multiple omic datasets and statistical comparisons date = 2020-10-22 keywords = dataset; datum; feature; figure summary = Use cases are, for example, (1) Biomarker studies where an initial set of candidates is to be validated (2) Time-series experiment where the global expression is inspected, for instance, at different times after infection (3) Multiomics experiments where multiple types of data are produced for the same or similar biological systems and (4) Detailed studies of comparisons between methods or software approaches. We thus investigated how OmicLoupe can be used for direct comparisons of different data types taken from the same set of samples, to reveal features only detected in certain conditions, and common patterns of observed abundance level changes. To study the similarity of the statistical comparisons across the two data types, features with positive abundance change and with low p-values were highlighted in the RNA-seq contrast (by dragging directly in the figure) between CNV high and CNV low to see how these distribute in the corresponding contrast in the proteomics dataset ( Figure 4B ). doi = 10.1101/2020.10.22.349944 id = cord-328826-guqc5866 author = Wissel, Benjamin D title = An Interactive Online Dashboard for Tracking COVID-19 in U.S. 
Counties, Cities, and States in Real Time date = 2020-04-25 keywords = COVID-19; datum summary = MATERIALS AND METHODS: This R Shiny application aggregates data from multiple resources that track COVID-19 and visualizes them through an interactive, online dashboard. It displays COVID-19 data from every county and 188 metropolitan areas in the U.S. Features include rankings of the worst affected areas and auto-generating plots that depict temporal changes in testing capacity, cases, and deaths. Our team developed a methodology to aggregate county-level COVID-19 data into metropolitan areas and display these data in an interactive dashboard that updates in real-time. To track the proportion of each area's residents that became infected or died of COVID-19, we used the U.S. Census Bureau's 2019 population estimate for each county to normalize data to tests, cases, and deaths per 10,000 residents. Users can view COVID-19 cases and deaths from The NYT at the county, city, state, or national level, and the total number of tests reported by the COVID Tracking Project, including the breakdown between positive and negative tests, is shown for each state. doi = 10.1093/jamia/ocaa071 id = cord-014833-ax09x6gk author = Wu, Jia title = Data Decision and Transmission Based on Mobile Data Health Records on Sensor Devices in Wireless Networks date = 2016-06-20 keywords = datum; device; patient summary = History data, collection data, and doctor-analyzed data could be computed and transmitted to patients using sensor devices. This study establishes a new method that can decide and transmit effective data based on sensor device mobile health in wireless networks.
According to an established mobile health system, patients can obtain timely treatment from doctors or hospitals by using wireless sensor devices. In mobile health, sensor devices and mobile device are the cheapest and most convenient means of data collection and transmission among doctors, patients, and hospitals. Formula (8) assumes that a = 0.15, b = 0.35, c = 0.5. Sensor devices may calculate the probability and transmit diagnosis data to the mobile APP to be evaluated by patients and doctors. doi = 10.1007/s11277-016-3438-y id = cord-103310-qtrquuvv author = Wu, Tianzhi title = Open-source analytics tools for studying the COVID-19 coronavirus outbreak date = 2020-02-27 keywords = datum summary = To provide convenient access to epidemiological data on the coronavirus outbreak, we developed an R package, nCov2019 (https://github.com/GuangchuangYu/nCov2019). Besides detailed real-time statistics, it offers access to three data sources with detailed daily statistics from December 1, 2019, for 43 countries and more than 500 Chinese cities. We also developed a web app (http://www.bcloud.org/e/) with interactive plots and simple time-series forecasts. [3], our web app enables users to select their regions of interest and check both the historical and real-time data. Generated by the app on February 25, 2020, Figure 2 shows that the total confirmed cases in the provinces outside Hubei are stabilizing, following a similar trend. Interestingly, daily percent changes in both confirmed cases and deaths in China are decreasing linearly except for a few outliers (see Figure 16 and 18 in Supplementary Document 2). doi = 10.1101/2020.02.25.20027433 id = cord-035388-n9hza6vm author = Xu, Jie title = Federated Learning for Healthcare Informatics date = 2020-11-12 keywords = Federated; client; datum; learning; model summary = This creates a big barrier for developing effective analytical approaches that are generalizable, which need diverse, "big data."
Federated learning, a mechanism of training a shared global model with a central server while keeping all the sensitive data in local institutions where the data belong, provides great promise to connect the fragmented healthcare data sources with privacy-preservation. For both provider (e.g., building a model for predicting the hospital readmission risk with patient Electronic Health Records (EHR) [71] ) and consumer (patient)-based applications (e.g., screening atrial fibrillation with electrocardiograms captured by smartwatch [79] ), the sensitive patient data can stay either in local institutions or with individual consumers without going out during the federated model learning process, which effectively protects the patient privacy. Federated learning is a problem of training a high-quality shared global model with a central server from decentralized data scattered among large number of different clients (Fig. 1) . doi = 10.1007/s41666-020-00082-4 id = cord-287884-qxk1wfk8 author = Yamin, Mohammad title = Information technologies of 21st century and their impact on the society date = 2019-08-16 keywords = Blockchain; Cloud; datum summary = Some of these technologies are Big Data Analytics, Internet of Things (IoT), Sensor networks (RFID, Location based Services), Artificial Intelligence (AI), Robotics, Blockchain, Mobile digital Platforms (Digital Streets, towns and villages), Clouds (Fog and Dew) computing, Social Networks and Business, Virtual reality. Accordingly, things (technologies, devices and tools) used together in internet based applications to generate data to provide assistance and services to the users from anywhere, at any time. IoT is providing some amazing applications in tandem with wearable devices, sensor networks, Fog computing, and other technologies to improve some the critical facets of our lives like healthcare management, service delivery, and business improvements. 
Some of the key devices and technologies associated with IoT include RFID Tags [25], Internet, computers, cameras, RFID, Mobile Devices, coloured lights, RFIDs, Sensors, Sensor networks, Drones, Cloud, Fog and Dew. Blockchain is usually associated with Cryptocurrencies like Bitcoin (Currently, there are over one and a half thousand cryptocurrencies and the numbers are still rising). doi = 10.1007/s41870-019-00355-1 id = cord-330503-w1m1ci4i author = Yamin, Mohammad title = IT applications in healthcare management: a survey date = 2018-05-31 keywords = datum; medical; system summary = Advanced data transfer and management techniques have improved disease diagnostics and have played a critical role in national health planning and efficient record keeping. In particular, the medical profession has undergone substantial changes through the capabilities of database management, which has given rise to the Healthcare Information Systems (HIS). According to [1], many programs are developed with the help of AI to perform specific tasks which make use of many activities including medical diagnostics, time sharing, interactive interpreters, graphical user interfaces and the computer mouse, rapid development environments, the linked list data structure, automatic storage management, symbolic, functional, dynamic, and object-oriented programming. Thus the first phase of the usage of information technology and systems in hospital and healthcare management was to transform paper-based records to database systems. AI, Robots, VR, AR, MR, IoMT, ubiquitous medical services, and big data analytics are all directly or indirectly related to IT.
doi = 10.1007/s41870-018-0203-3 id = cord-226263-ns628u21 author = Ye, Yanfang title = $alpha$-Satellite: An AI-driven System and Benchmark Datasets for Hierarchical Community-level Risk Assessment to Help Combat COVID-19 date = 2020-03-27 keywords = COVID-19; Malware; area; datum summary = doi = nan id = cord-219107-klpmipaj author = Zachreson, Cameron title = Risk mapping for COVID-19 outbreaks using mobility data date = 2020-08-14 keywords = COVID-19; Facebook; datum; transmission summary = For community transmission scenarios, our results demonstrate that mobility data adds the most value to risk predictions when case counts are low and spatially clustered. In each case, we use the Facebook mobility data that was available during the early stages of the outbreak to estimate future spatial patterns of relative transmission risk. For each of the three outbreak scenarios, we present the mobility-based estimates of the relative transmission risk distribution, and a time-varying correlation between our estimate and the case numbers ascertained through contact tracing and testing programs. Our results indicate that aggregate mobility data can be a useful tool in estimation of COVID-19 transmission risk diffusion from locations where active cases have been identified.
A heat map (Supplemental Figure S1) of the average number of Facebook users present during the nighttime period (2am to 10am) as a proportion of the estimated resident population reported by the ABS (2018 [32]) shows qualitative similarity to the spatial distributions of active cases and relative risk shown in Figure 5. doi = nan id = cord-102760-5tkdwtc0 author = Zambetti, Michela title = Enabling servitization by retrofitting legacy equipment for Industry 4.0 applications: benefits and barriers for OEMs date = 2020-12-31 keywords = datum; equipment; industry summary = In this context, solutions mostly result in the development of low-cost retrofit or upgrade kits that allow integrating legacy equipment into an Industry 4.0 environment and thus enable digital servitization. This challenge, however, provides the OEMs with an opportunity to create and capture unique value by upgrading and retrofitting the legacy equipment and then provisioning data-driven value-added services for the manufacturers (equipment users) [5]. In section four we put a special focus on the servitization potential and challenges of the OEMs in supporting the Industry 4.0 transition by means of retrofitting legacy equipment and provisioning data-driven services. Given the fact that the existing literature on upgradability and retrofitting solutions towards Industry 4.0 does not include the OEM and the service perspectives at this point, this research investigated OEMs' potential in providing connectivity and data analytics services to the manufacturers of end products.
doi = 10.1016/j.promfg.2020.05.144 id = cord-025289-lhjn97f7 author = Zehnder, Philipp title = StreamPipes Connect: Semantics-Based Edge Adapters for the IIoT date = 2020-05-07 keywords = adapter; datum; source summary = To mitigate these challenges, we present StreamPipes Connect, targeted at domain experts to ingest, harmonize, and share time series data as part of our industry-proven open source IIoT analytics toolbox StreamPipes. Our main contributions are (i) a semantic adapter model including automated transformation rules for pre-processing, and (ii) a distributed architecture design to instantiate adapters at edge nodes where the data originates. The goal of this paper is to simplify the process of connecting new sources, harmonize data, as well as to utilize semantic meta-information about its meaning, by providing a system with a graphical user interface (GUI). Based on this model, adapters are instantiated, to connect and harmonize data according to pre-processing rules applied to each incoming event. Generated adapters connect to the configured data sources and pre-process data directly at the edge by applying pipelines consisting of user-defined transformation rules. doi = 10.1007/978-3-030-49461-2_39 id = cord-351065-nyfnwrtm author = Zhang, Tenghao title = Integrating GIS technique with Google Trends data to analyse COVID-19 severity and public interest date = 2020-09-16 keywords = datum summary = Some studies suggest that health related issues can cause anxiety which may lead to increased public attention, typically manifested by online information search. Adams et al.'s (2020) GIS-based study points out the shortcomings of using unnormalized COVID-19 demographic data in choropleth mapping, and their use of the normalized data (confirmed cases per 100,000 people) presents a more accurate visualisation of pandemic severity.
The COVID-19 case data were retrieved from the US health authority (https://cdc.gov/covid-datatracker). Public interest was captured by people's Google search data in each state. The data were acquired from the Google Trends service, which uses a normalized relative search volume. doi = 10.1016/j.puhe.2020.09.005 id = cord-025519-265qdtw6 author = Zouinina, Sarah title = A Two-Levels Data Anonymization Approach date = 2020-05-06 keywords = TCA; anonymization; datum summary = Consequently, privacy-preserving machine learning algorithms were designed based on cryptography, statistics, database modeling and data mining. To this purpose, we revisited all the previously proposed approaches, and we added a second level of anonymization by incorporating the discriminative information and using Adaptive Weighting of Features to improve the quality of the anonymized data. The paper is organised into four sections: the first surveys the different approaches to privacy preservation using machine learning, the second sums up the previously proposed approaches, the third discusses the introduction of the discriminative information and the fourth validates the method experimentally on six different datasets. The two models propose an algorithm that relies on the classical Self Organizing Maps (SOMs) [10] and collaborative Multiview clustering in order to provide useful anonymous datasets [9]. As shown in Table 5, the introduction of the discriminant information improves the utility of the anonymized datasets for all of the methods proposed.
doi = 10.1007/978-3-030-49161-1_8 id = cord-002774-tpqsjjet author = nan title = Section II: Poster Sessions date = 2017-12-01 keywords = AIDS; Canada; Centre; City; Community; HCV; HIV; Health; India; MSM; National; New; Toronto; Vancouver; York; access; african; age; care; child; datum; drug; group; high; introduction; method; need; patient; population; poster; program; research; result; service; session; social; study; urban; woman; year summary = Results: The CHIP Framework The CHIP framework aims to improve the health and wellness of the urban communities served by St. Joseph's Health Centre through four intersecting pillars: • Raising Community Voices provides an infrastructure and process that supports community stakeholder input into health care service planning, decision-making, and delivery by the hospital and across the continuum of care; • Sharing Reciprocal Capacity promotes healthy communities through the sharing of our intellectual and physical capacity with our community partners; • Cultivating Integration Initiatives facilitates vertical, horizontal, and intersectoral integration initiatives in support of community-identified needs and gaps; and • Facilitating Healthy Exchange develops best practices in community integration through community-based research, and facilitates community voice in informing public policy.
doi = 10.1093/jurban/jti137 id = cord-004894-75w35fkd author = nan title = Abstract date = 2006-06-14 keywords = ABSTRACT; BMI; Background; CHD; CVD; Germany; Health; Methods; Netherlands; age; cancer; conclusion; datum; discussion; dutch; european; factor; high; increase; objective; patient; result; risk; study; woman; year summary = The unadjusted median (25-75% percentile) sperm concentration in the non-exposed group (n = 90) is 49 (23-86) mill/ml compared to 33 (12-63) mill/ml among men exposed to >19 cigarettes per day in fetal life (n = 26 Aim: To estimate the prevalence of overweight and obesity, and their effects on physical activity (PA) levels of Portuguese children and adolescents aged 10-18 years. Objectives: a) To estimate the sex- and age-adjusted annual rate of tuberculosis infection (ARTI) (per 100 person-years [%py]) among the HCWs, as indicated by tuberculin skin test (TST) conversion, b) to identify occupational factors associated with significant variations in the ARTI, c) to investigate the efficacy of the regional preventive guidelines. Objectives: We assessed the total burden of adverse events (AE), and determined treatment-related risk factors for the development of various AEs. Methods: The study cohort included 1362 5-year survivors, treated in the Emma Children's Hospital AMC in the Netherlands between 1966-1996. doi = 10.1007/s10654-006-9021-1 id = cord-010310-jqh75340 author = nan title = Next Generation Technology for Epidemic Prevention and Control: Data-Driven Contact Tracking date = 2018-12-24 keywords = GPS; HIV; contact; datum; individual summary = Furthermore, the transmission networks of infectious diseases established using contact tracking technology can aid in the visualization of actual virus transmission paths, which enables simulations and predictions of the transmission process, assessment of the outbreak trend, and further development and deployment of more effective prevention and control strategies.
Tracking the contact interactions of individuals can effectively restore the "invisible" virus transmission paths, quickly locate and isolate high-risk individuals who were in contact with infected persons, and can aid in quantitative analysis of the transmission paths, processes, and trends of the infectious diseases, all leading to the development of corresponding effective epidemic control strategies. With the aim to collect dynamic, complete, and accurate individual contact information, some researchers began to use mobile phones, wireless sensors, RFID, and GPS devices to track individual contact behaviors. Although detailed individual contact information can be collected through non-automatic methods, e.g., offline and online questionnaires, and automatic methods, e.g., mobile phones, wearable wireless sensors, RFID, and GPS devices. doi = 10.1109/access.2018.2882915 id = cord-022633-fr55uod6 author = nan title = SAEM Abstracts, Plenary Session date = 2012-04-26 keywords = ACS; AED; Background; COPD; CPR; EMS; ETCO; Emergency; HIV; Hospital; ICU; IQR; LOS; MDD; OHCA; TBI; University; conclusion; datum; group; level; method; objective; patient; rate; result; study; time summary = Staff satisfaction was evaluated through pre/post-shift and study surveys; administrative data (physician initial assessment (PIA), length of stay (LOS), patients leaving without being seen (LWBS) and against medical advice [LAMA]) were collected from an electronic, real-time ED information system.
Communication Background: The link between extended shift lengths, sleepiness, and occupational injury or illness has been shown, in other health care populations, to be an important and preventable public health concern but heretofore has not been fully described in emergency medical services (EMS Objectives: To assess the effect of an ED-based computer screening and referral intervention for IPV victims and to determine what characteristics resulted in a positive change in their safety. Objectives: Using data from longitudinal surveys by the American Board of Emergency Medicine, the primary objective of this study was to evaluate if resident self-assessments of performance in required competencies improve over the course of graduate medical training and in the years following. doi = 10.1111/j.1553-2712.2012.01332.x id = cord-023284-i0ecxgus author = nan title = Abstracts of publications related to QASR date = 2006-09-19 keywords = Fig; NMR; chemical; compound; datum; model; molecular; result; structure summary = doi = 10.1002/qsar.19900090309 id = cord-024058-afgvztwo author = nan title = Engineering a Global Response to Infectious Diseases: This paper presents a more robust, adaptable, and scalable engineering infrastructure to improve the capability to respond to infectious diseases.Contributed Paper date = 2015-02-17 keywords = datum; disease; dna; health; infectious summary = Examples of innovative leveraging of infrastructure, technologies to enhance existing disease management strategies, engineering approaches to accelerate the rate of discovery and application of scientific, clinical, and public health information, and ethical issues that need to be addressed for implementation are presented. Because engineers contribute to the design and implementation of infrastructure, there are opportunities for innovative solutions to infectious disease response within existing systems that have utility, and therefore resources, before a public health emergency. 
Moving forward, addressing privacy issues will be critical so that geographic tracking of a phone's location could be used to help inform an individual of potential contact with infected persons or animals and support automated, anonymous, electronic integration of those data to accelerate the epidemiological detective work of identifying and surveying those same individuals for public health benefit. doi = 10.1109/jproc.2015.2389146 id = cord-035030-ig4nwtmi author = nan title = 10th European Conference on Rare Diseases & Orphan Products (ECRD 2020) date = 2020-11-09 keywords = AHP; European; datum; disease; health; patient; rare; result summary = Conclusion: With this survey Endo-ERN is provided with a large sample of responses from European patients with a rare endocrine condition, and those patients experience unmet needs in research, though these needs differ between the disease groups. Various factors compound the development of treatments for paediatric rare diseases, including the need for new Clinical Outcome Assessments (COAs), as conventional endpoints such as the 6 Minute Walking Test (6MWT) have been shown to not be applicable in all paediatric age subsets, [3] and therefore may not be useful in elucidating patient capabilities. S18 Background: To help inform cross-national development of genomic care pathways, we worked with families of patients with rare diseases and health professionals from two European genetic services doi = 10.1186/s13023-020-01550-1 id = cord-007708-hr4smx24 author = van Kampen, Antoine H. C.
title = Taking Bioinformatics to Systems Medicine date = 2015-08-13 keywords = datum; disease; expression; gene; network; system summary = Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. To enable systems medicine it is necessary to characterize the patient at various levels and, consequently, to collect, integrate, and analyze various types of data including not only clinical (phenotype) and molecular data, but also information about cells (e.g., disease-related alterations in organelle morphology), organs (e.g., lung impedance when studying respiratory disorders such as asthma or chronic obstructive pulmonary disease), and even social networks. Bioinformatics covers many types of analyses including nucleotide and protein sequence analysis, elucidation of tertiary protein structures, quality control, pre-processing and statistical analysis of omics data, determination of genotype-phenotype relationships, biomarker identification, evolutionary analysis, analysis of gene regulation, reconstruction of biological networks, text mining of literature and electronic patient records, and analysis of imaging data. doi = 10.1007/978-1-4939-3283-2_2