key: cord-1039660-iwvcskhh authors: Park, Amber; Harris, Laura K. title: Gene Expression Meta-Analysis Identifies Molecular Changes Associated with SARS-CoV Infection in Lungs date: 2020-11-16 journal: bioRxiv DOI: 10.1101/2020.11.14.382697 sha: d504fff7eea5a7a41552d2cca1a0e6a474abcd2d doc_id: 1039660 cord_uid: iwvcskhh Background Severe Acute Respiratory Syndrome (SARS) corona virus (SARS-CoV) infections are a serious public health threat because of their pandemic-causing potential. This work uses mRNA expression data to predict genes associated with SARS-CoV infection through an innovative meta-analysis examining gene signatures (i.e., gene lists ranked by differential gene expression between SARS and mock infection). Methods This work defines 29 gene signatures representing SARS infection across seven strains with established mutations that vary virulence (infectious clone SARS (icSARS), Urbani, MA15, ΔORF6, BAT-SRBD, ΔNSP16, and ExoNI) and host (human lung cultures and/or mouse lung samples) and examines them through Gene Set Enrichment Analysis (GSEA). To do this, first positive and negative icSARS gene panels were defined from GSEA-identified leading-edge genes between 500 genes from positive or negative tails of the GSE47960-derived icSARSvsmock signature and the GSE47961-derived icSARSvsmock signature, both from human cultures. GSEA then was used to assess enrichment and identify leading-edge icSARS panel genes in the other 27 signatures. Genes associated with SARS-CoV infection are predicted by examining membership in GSEA-identified leading-edges across signatures. Results Significant enrichment (GSEA p<0.001) was observed between GSE47960-derived and GSE47961-derived signatures, and those leading-edges defined the positive (233 genes) and negative (114 genes) icSARS panels. 
Non-random (null distribution p<0.001), significant enrichment (p<0.001) was observed between icSARS panels and all verification icSARSvsmock signatures derived from human cultures, from which 51 over- and 22 under-expressed genes were shared across leading-edges, with 10 over-expressed genes already associated with icSARS infection. For the icSARSvsmock mouse signature, significant, non-random enrichment (both p<0.001) held for only the positive icSARS panel, from which nine genes were shared with icSARS infection in human cultures. Considering other SARS strains, significant (p<0.01), non-random (p<0.002) enrichment was observed across signatures derived from other SARS strains for the positive icSARS panel. Five positive icSARS panel genes, CXCL10, OAS3, OASL, IFIT3, and XAF1, were found in both mouse and human signatures. Conclusion The GSEA-based meta-analysis approach used here identified genes with and without reported associations with SARS-CoV infections, highlighting this approach’s predictive value and usefulness in identifying genes that have potential as therapeutic targets to preclude or overcome SARS infections.

Background

Human β-coronaviruses (CoV) are enveloped, positive-sense RNA viruses that infect humans and a variety of animal species (Li et al., 2019). Human CoV infections typically cause mild upper respiratory distress, referred to as the common cold, and are generally non-lethal (Li et al., 2019). However, in 2002 a novel CoV was found to cause the potentially life-threatening disease severe acute respiratory syndrome (SARS), with the initial outbreak of SARS-CoV infecting an estimated 8,400 people with a mortality rate over 9% (Boulos, 2004; Lai, 2005; Yount et al., 2005; Sims et al., 2013). Limited therapeutic options to treat SARS infections were available at the time.
Ribavirin and corticosteroids were the cornerstones of treatment during the SARS outbreak despite Ribavirin showing a lack of efficacy in several studies and corticosteroid efficacy requiring further investigation in controlled trials (Lai, 2005). In 2012, another highly lethal CoV causing Middle East respiratory syndrome (MERS) emerged with a mortality rate over 30% (Bahadur et

Hypergeometric test-based approaches are known to be limited because only genes that meet an established cut-off (e.g., T-test p-value<0.05) are considered (Subramanian et al., 2005). To overcome this limitation by considering all genes in an expression dataset, other studies utilized Gene Set Enrichment Analysis (GSEA) to calculate the enrichment of an established gene set from a public knowledgebase (e.g., MSigDB, Blood Transcriptional Modules, and/or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways) in a gene signature, which is defined as a gene list ranked by differential expression between mock- and SARS-infected samples by an appropriate statistical method (Mitchell et al., 2013; Gardinassi et al., 2020). Findings from these studies included positively enriched modules associated with antiviral IFN, cell cycle and proliferation, and monocytes and dendritic cells in peripheral blood mononuclear cells from SARS patients (Gardinassi et al., 2020). Further, by combining GSEA with a network analysis approach, regulatory genes associated with the pathogenicity of SARS have been identified (Mitchell et al., 2013), demonstrating that a GSEA-based approach is also useful for gene candidate identification. While reported studies have identified molecular changes associated with SARS infection, the lack of available treatment options suggests that further examinations are needed.
In a prior study to identify novel genes associated with macrolide resistance in Streptococcus pneumoniae, we demonstrated a GSEA approach for gene identification that compared differential gene expression between mRNA expression datasets (Goad and Harris, 2018

provided Entrez ID were removed from analysis. If multiple probes with the same Entrez ID existed, the probe with the highest coefficient of variation across duplicate probes was selected.

From these datasets, we measured differential gene expression as the Welch's two-sample T-test score of z-score or log2 normalized values. We then defined 29 gene signatures (i.e., gene lists ranked by differential gene expression between SARS- and mock-infected samples), which included 17 gene signatures from human lung cultures and 12 signatures from mouse lung samples (

SARS- compared to mock-infected samples (e.g., positive T-score) fall within the positive tail of the gene signature while under-expressed genes (e.g., negative T-score) fall in the negative tail (Figure 1A). Genes with no substantial change in expression between SARS- and mock-infected samples (e.g., T-score around 0) are located toward the middle of the gene signature. Therefore, genes that fall within the tails of a gene signature likely change in response to a specific SARS infection. We note that three signatures were substantially skewed (i.e., the rank at which the T-score crosses from positive to negative falls in the top or bottom quartile of the signature). For substantially skewed signatures, we adjusted all T-scores in the signature by the T-score of the gene at the mid-point so that genes in the signature are balanced between positive and negative T-scores.

Gene Set Enrichment Analysis (GSEA) is a statistical method that estimates enrichment between a query gene set (i.e., unranked list of genes) and a reference gene signature (Subramanian et al., 2005).
GSEA uses the statistical metric used to rank genes in the reference signature (e.g., T-score) to calculate a running summation enrichment score in which hits (i.e., matches between query set and reference signature) increase the enrichment score in proportion to the ranking statistical metric (e.g., T-score) and misses (i.e., non-matches between query set and reference signature) decrease the enrichment score. From this, GSEA determines a maximum enrichment score for the specific query set and reference signature. Leading-edge genes contribute to reaching the maximum enrichment score, suggesting that leading-edge genes are associated with cellular response to a specific β-coronavirus infection. Further, GSEA calculates a normalized enrichment score (NES) from 1000 permutations of the reference signature to estimate the significance of enrichment between a specific query set and reference signature. This work used release 3.0 of the javaGSEA Desktop Application available from the Broad Institute.

To identify gene expression changes associated with icSARS infection, we generated two icSARS gene panels (Figure 1B). To do this, we selected 500 genes from each of the positive and negative tails of the GSE47960-derived icSARSvsmock gene signature and used them to form two individual query gene sets. GSEA compared each query gene set to the GSE47961-derived icSARSvsmock gene signature (reference). Leading-edge genes from each analysis were used to define the two icSARS gene panels, one panel per tail. Pathway enrichment analysis was performed on both icSARS gene panels using the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8.

To verify the icSARS gene panels, we performed GSEA between icSARS gene panels and GSE47962-derived, GSE37827-derived, GSE48142-derived, and GSE33267-derived icSARSvsmock signatures (Figure 2A).
To assess if results generated from GSEA could be achieved randomly, we randomly selected 1000 gene panels consisting of either 233 or 114 genes, to match the number of genes in the positive and negative icSARS panels, respectively, from the GPL6480 platform used to define gene signatures, and used them as queries for GSEA against the GSE47962-derived, GSE37827-derived, GSE48142-derived, and GSE33267-derived icSARSvsmock signatures (references). These analyses generated a null distribution of NES to which we compared the NES achieved by icSARS gene panels for each reference gene signature, and we counted the number of equal or better NES to estimate significance (i.e., distribution p-value). Histogram data and associated graphs (e.g., distribution curves and box-and-whisker plots) were calculated using XLStat version 2020.3 (XLSTAT, 2013; Addinsoft, 2019).

This is a provisional file, not the final typeset article

To examine differential gene expression of genes from the icSARS gene panels in mice, we performed GSEA between icSARS gene panels (queries) and the GSE50000-derived icSARSvsmock signature from mock- or icSARS-infected mouse lung samples (reference, Figure 2B). To compare gene signatures across infections caused by other SARS strains, we performed GSEA between icSARS gene panels (queries) and 22 signatures from samples from cells infected with strains other than icSARS, such as Urbani, MA15, SARS-BATSRBD, and mutants in ORF6, ExoNI, or NSP16 (references, Figure 2B).
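The signature definition and tail selection described in the Methods can be illustrated with a minimal Python sketch. It assumes expression values are already normalized and stored per gene as plain lists; the function names (`welch_t`, `build_signature`, `tail_query_sets`) and the dictionary layout are hypothetical and are not taken from the study's own pipeline.

```python
from math import sqrt
from statistics import mean, variance
from typing import Dict, List, Sequence, Tuple


def welch_t(infected: Sequence[float], mock: Sequence[float]) -> float:
    """Welch's two-sample T-score (unequal variances) for one gene."""
    # Standard error uses each group's own sample variance and size.
    se = sqrt(variance(infected) / len(infected) + variance(mock) / len(mock))
    return (mean(infected) - mean(mock)) / se


def build_signature(
    infected: Dict[str, Sequence[float]],
    mock: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Rank genes from most over- to most under-expressed by T-score."""
    scores = [(g, welch_t(infected[g], mock[g])) for g in infected if g in mock]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def tail_query_sets(
    signature: List[Tuple[str, float]], size: int = 500
) -> Tuple[List[str], List[str]]:
    """Return the gene names in the positive and negative tails."""
    positive = [gene for gene, _ in signature[:size]]
    negative = [gene for gene, _ in signature[-size:]]
    return positive, negative
```

A gene whose values are constant in both groups has zero variance and would raise a division error here; a real pipeline would need to filter or handle such genes, which this sketch does not attempt.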
Results

To identify genes associated with response to an icSARS infection, we began by defining the GSE47960-derived icSARSvsmock and GSE47961-derived icSARSvsmock gene signatures (Table 2). We used GSE47960-derived icSARSvsmock to generate two gene sets containing the 500 most differentially expressed genes from the positive and negative tails of GSE47960-derived icSARSvsmock (T-score >2.9 and <-3.2 for positive and negative tails, respectively), capturing the maximum coverage of the signature allowable by GSEA (Subramanian et al., 2005). To assess similarity between these two gene signatures, we calculated enrichment using GSEA between either GSE47960-derived icSARSvsmock positive or negative tail gene sets (individual queries) and the GSE47961-derived icSARSvsmock (reference), and achieved NES=3.28 and NES=-1.87 for positive and negative tail query gene sets, respectively, both with a GSEA p-value<0.001. We

icSARS infection. We also identified genes with no prior association with icSARS infection. In the positive icSARS gene panel, these included 12 zinc finger genes in addition to the two that were already reported (5.2% of the panel), six genes encoding interferon-induced proteins (2.6%), five genes from the solute carrier family (2.1%), and six genes encoding uncharacterized proteins (2.6%). In the negative icSARS gene panel, we found three zinc finger protein genes (2.6%), three genes from the solute carrier family (2.6%), and five genes encoding uncharacterized proteins (4.4%). We speculate that these identified genes without previously reported associations with icSARS infections in human lung epithelium cultures are also associated with an icSARS infection.
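The running-sum enrichment score and leading-edge subset that define the panels above can be sketched as follows. This is a simplified illustration of the weighted running-sum statistic described by Subramanian et al. (2005), not the javaGSEA implementation used in this study: it omits the permutation step that produces the NES and p-value, and the function name `enrichment_score` and its `weight` parameter are assumptions made for the example.

```python
from typing import List, Set, Tuple


def enrichment_score(
    signature: List[Tuple[str, float]],
    query: Set[str],
    weight: float = 1.0,
) -> Tuple[float, List[str]]:
    """Return (enrichment score, leading-edge genes).

    `signature` is a list of (gene, T-score) pairs sorted from the most
    positive to the most negative T-score.
    """
    hit_weights = [abs(t) ** weight if g in query else 0.0 for g, t in signature]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for g, _ in signature if g not in query)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("query must overlap the signature without covering it")

    running, best, best_idx = 0.0, 0.0, -1
    for i, (gene, _) in enumerate(signature):
        if gene in query:
            running += hit_weights[i] / total_hit  # hit: step up by its weight
        else:
            running -= 1.0 / n_miss                # miss: fixed step down
        if abs(running) > abs(best):
            best, best_idx = running, i

    if best >= 0:
        # Positive enrichment: hits ranked at or before the peak.
        leading = [g for g, _ in signature[: best_idx + 1] if g in query]
    else:
        # Negative enrichment: hits ranked at or after the trough.
        leading = [g for g, _ in signature[best_idx:] if g in query]
    return best, leading
```

Under this sketch, a positive-tail query set scored against a second signature yields a positive score whose leading-edge list plays the role of the positive panel, and likewise for the negative tail.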
Independent Datasets

To verify that our icSARS gene panels are associated with response to an icSARS infection, we used GSEA to calculate enrichment between our icSARS panels (individual queries) and four verification gene signatures, GSE47962-derived icSARSvsmock, GSE37827-derived icSARSvsmock, GSE48142-derived icSARSvsmock, and GSE33267-derived icSARSvsmock (Table 2), derived from independent datasets (individual references). We found significant similarity between positive and negative icSARS panels and GSE47962-derived icSARSvsmock (NES=3.51, Figure 3A, and NES=-3.05, Figure 3B, for positive and negative icSARS panels, respectively, both GSEA p-value<0.001). To determine how likely the NES achieved for icSARS panels would be achieved by random chance, we generated 1000 randomly selected 233- or 114-gene panels from the GSE47960-derived icSARSvsmock gene signature to match the size and potential composition of our icSARS panels. We then repeated GSEA using these randomly generated gene panels (individual queries) and the GSE47962-derived icSARSvsmock (reference) to generate a null distribution of NES achieved via random chance. From this, we found random NES ranged from 1.47 to -1.66 and 1.56 to -1.95 for the 233- and 114-gene panels, respectively (Figure 3C), illustrating that NES achieved by our icSARS panels are non-random (null distribution p-value<0.001). For GSE37827-derived icSARSvsmock, we again observed similarity between positive and negative icSARS panels and GSE37827-derived icSARSvsmock (NES=2.96, Figure 3D, and NES=-2.16, Figure

icSARS infection in human lung cultures and support the hypothesis that identified genes without previously reported associations are also associated with icSARS infection in human lung cultures.
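The random-panel comparison used in this section amounts to an empirical p-value: the fraction of null NES values that are equal to or better than the observed NES. A minimal sketch is given below, assuming the NES values have already been produced by GSEA; the helper names `random_panels` and `null_distribution_p` are hypothetical.

```python
import random
from typing import List, Sequence


def random_panels(
    gene_pool: Sequence[str], panel_size: int, n_panels: int = 1000, seed: int = 0
) -> List[List[str]]:
    """Draw size-matched random gene panels without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [rng.sample(gene_pool, panel_size) for _ in range(n_panels)]


def null_distribution_p(observed_nes: float, null_nes: Sequence[float]) -> float:
    """Fraction of random NES equal to or better than the observed NES.

    "Better" is taken in the direction of the observed score: larger for
    a positive NES, smaller (more negative) for a negative NES.
    """
    if observed_nes >= 0:
        extreme = sum(1 for nes in null_nes if nes >= observed_nes)
    else:
        extreme = sum(1 for nes in null_nes if nes <= observed_nes)
    return extreme / len(null_nes)
```

With 1000 random panels, a result of zero equal-or-better values corresponds to a reported p-value<0.001, while two equal-or-better values would give 0.002.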
Positive icSARS Gene Panel and Mouse-derived icSARS Gene Signature

To determine if our icSARS gene panels are also associated with response to an icSARS infection in a mouse model, we examined the GSE50000-derived icSARSvsmock gene signature (Table 2), representing differential gene expression associated with icSARS infection in a mouse model. We used GSEA to calculate enrichment between our icSARS panels (individual queries) and the GSE50000-derived icSARSvsmock signature (reference) and observed significant enrichment with the positive icSARS panel (NES=1.54, GSEA p-value<0.001, Figure 4A) but not the negative icSARS panel (NES=0.81, p-value=0.848, Figure 4B). Achieved NES were non-random for the positive icSARS panel (NES range: -1.65 to 1.57, null distribution p-value=0.002, Figure 4C

We also noted that several icSARS panel genes were not included in this dataset's platform (Supplemental Material STable 4), with 11 of these genes contained in leading-edges for all icSARS gene signatures derived from human cell cultures, so we did not exclude them from further consideration. A heat map of differential gene expression (i.e., T-scores) for these 20 genes across all six icSARS gene signatures revealed that only XAF1 was statistically significant (Welch's two-sample, two-sided T-test p-value<0.05) in all icSARS gene signatures (Figure 3E), highlighting the advantage of using a GSEA-based rather than single-gene (e.g., T-score only) meta-analysis, which would likely have missed the other genes identified here because of borderline significance in at least one signature.

Figure 5A), the GSE47961-derived signature from human cultures (NES=-0.95, p-value=0.569) and the GSE50000-derived signature from mouse samples (NES=-0.81, p-value=0.846). These data support our previous findings that enrichment of the negative icSARS panel was inconsistent in icSARS signatures (Figure 3B).
We confirmed that achieved significant positive NES were non-random (null distribution p-value<0.002, Figure 5B and Figure 5C) via random modelling as previously described for all signatures except for the GSE33266-derived MA15(10^2)vsmock signature, which is the lowest infectious dose examined in this study. Due to this finding, we removed the MA15(10^2)vsmock signature from further analysis.

Top Gene Candidates

To determine which positive icSARS panel genes may make good therapeutic targets for SARS infections caused by specific strains with varying virulence levels and mechanisms, we compared inclusion of leading-edge genes identified by GSEA across 22 gene signatures derived from six SARS strains (Supplemental Material STables 5-13). We began by examining leading-edge gene inclusion in each SARS strain specifically (Figure 6A). Genes identified through leading-edge intersections represent genes associated with infection of that specific SARS strain. We identified several genes with known associations to infection with specific SARS strains, such as fos and jun, which were found in leading-edges from signatures derived from dORF6 (Sims et al., 2013) and Urbani (Yoshikawa et al., 2010) infections in human lung cell cultures. We also found several genes with no previously reported associations to that specific SARS infection. Taken together, these findings parallel those from our earlier meta-analysis of icSARS infection and demonstrate the ability of our meta-analysis approach to identify genes associated with specific SARS infections and predict new genes associated with infections of specific SARS strains.

To determine which positive icSARS panel genes may be the best therapeutic targets to preclude or overcome SARS infections regardless of strain, we analyzed the intersection of common leading-edge genes across all six SARS strains examined in this study.
We found five positive icSARS panel genes, C-X-C motif chemokine ligand 10 (CXCL10), 2'-5'-oligoadenylate synthetase 3 (OAS3), 2'-5'-oligoadenylate synthetase-like (OASL), interferon-induced protein with tetratricopeptide repeats 3 (IFIT3), and XIAP associated factor 1 (XAF1), in all 22 positive icSARS panel leading-edges (Figure 6B). Differential gene expression (i.e., T-scores) heat maps illustrated the strong consistency and extent of expression changes observed across gene signatures (Figures 4C and 6C

among identified top gene candidates. To the best of our knowledge, this paper is the first study to report an association between XAF1 and SARS infections for any strain, though a recent report examining differential gene expression associated with SARS-CoV-2 infection identified increased expression of CXCL10, OAS3, IFIT3, and XAF1 in human epithelial lung cells and lung samples from cynomolgus macaques and mice (Hachim et al., 2020). We speculate that applying our GSEA-based meta-analysis approach to gene expression datasets across β-coronavirus infections may produce novel insights into molecular changes associated with all β-coronavirus infections, which could contribute to the identification of therapeutic targets.

Defining a gene signature with the most unique genes possible is critical in optimizing achievable results from GSEA (Subramanian et al., 2005). There are several factors that impact how a gene signature is defined and thus impact the overall outcome of the GSEA-based meta-analysis approach. Such factors include gene presence in a platform and gene annotation. For example, if a gene is not included in a dataset's platform (e.g., XAF1 in GSE33266-derived signatures, Figure 6C), it cannot be analyzed.
We also noticed a substantial loss of platform genes (Table 1) due to incomplete or incompatible genome annotation (i.e., probe has no gene symbol and/or Entrez ID). This can become more problematic when analyzing genes across different species since not all human genes have homologs in other species. While both issues can be avoided if all datasets selected for analysis are profiled on the same platform, these issues must be considered when comparing signatures across platforms and species as done here. We could not improve initial gene inclusion in a platform or gene annotation to recover lost genes, but when considering genes associated with infection with a specific SARS strain, we included icSARS panel genes that were not represented in a given signature alongside identified leading-edge genes to ensure no genes were lost during our meta-analysis. As a result, not all signatures have expression data for our final gene candidates (Figures 4 and 6).

Signature skew can substantially affect obtainable results from GSEA. In strongly skewed signatures (Table 2), such as the icSARSvsmock validation signature derived from GSE33267 (T-score range: 12.2 to -12.2 with a midpoint around 1996 of 19751 genes), GSEA was unable to calculate enrichment scores. To overcome this problem but maintain gene ranking, we adjusted all T-scores in the strongly skewed signature by the T-score of the gene at that signature's actual midpoint. We did not expect this adjustment to affect the overall results of this study since there were similarities in leading-edge gene membership between adjusted and non-adjusted signatures. We recommend caution when interpreting leading-edge gene membership from skewed signatures as membership does not necessarily reflect the true expression relationship (e.g., membership in a positive leading-edge may arise from a raw negative T-score in that signature).
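The skew check and midpoint adjustment described above can be sketched as follows. The quartile rule follows the definition of a substantially skewed signature given in the Methods; the function names are hypothetical, and the sketch assumes the signature is already sorted from the most positive to the most negative T-score.

```python
from typing import List, Tuple

Signature = List[Tuple[str, float]]  # (gene, T-score), sorted descending


def cross_rank(signature: Signature) -> int:
    """Rank at which the T-score crosses from positive to non-positive."""
    return sum(1 for _, t in signature if t > 0)


def is_substantially_skewed(signature: Signature) -> bool:
    """True when the cross falls in the top or bottom quartile of ranks."""
    n = len(signature)
    cross = cross_rank(signature)
    return cross < n / 4 or cross > 3 * n / 4


def center_on_midpoint(signature: Signature) -> Signature:
    """Subtract the T-score of the gene at the middle rank from every gene.

    Every score shifts by the same amount, so gene order is unchanged.
    """
    midpoint_t = signature[len(signature) // 2][1]
    return [(gene, t - midpoint_t) for gene, t in signature]
```

Under this rule, a cross near rank 1996 of 19751 genes would be flagged because 1996 is below one quarter of the signature length. Because the shift changes the sign of some T-scores, the caution above about interpreting leading-edge membership in adjusted signatures still applies.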
Another important consideration is signature inclusion in our meta-analysis. We included MA15-infected mouse signatures over a range of inoculation doses (Figure 6) to mimic the range of infection severity that would be encountered clinically (i.e., asymptomatic to severe patient presentation). Including these signatures allowed us to identify genes across inoculation doses with the best therapeutic potential regardless of infection severity. We also removed the GSE33266 MA15(10^2)vsmock signature from our meta-analysis because its observed enrichment was not statistically different from NES achieved randomly (Figure 6). However, we noted that IFIT3 was the only top candidate in the leading-edge of the MA15(10^2)vsmock signature and it was not statistically significant individually (Welch's T-test p-value=0.160). These findings suggest there is a lower inoculation dose limit to our approach's ability to detect relevant genes, which should be considered when applying our meta-analysis to future research.

There are several factors to consider with respect to how this study was conducted, particularly icSARS panel generation (i.e., query gene set size and signature selection) and time point selection. To determine how query gene set sizes might impact our overall results, we ran GSEA between GSE47961-derived icSARSvsmock (reference) and GSE47960-derived icSARSvsmock gene sets ranging in size (queries) and found significant enrichment (p-value<0.05) with gene set sizes down to 100 genes (Supplemental Material SFig 1), again showing the high degree of similarity between GSE47960-derived icSARSvsmock and GSE47961-derived icSARSvsmock gene signatures. However, we noted that four of the five gene candidates identified here were only included in query gene sets of 400 genes or more, and CXCL10 was only included in the 500-gene query set size used in this study.
Since 500 is the largest recommended query set size for accurate GSEA calculations (Subramanian et al., 2005), this analysis supports our choice of 500 as the query gene set size for this study.

Further, we noticed that enrichment did not substantially change if positive and negative tail query gene sets were generated from GSE47961-derived icSARSvsmock and compared to GSE47960-derived icSARSvsmock as reference (Supplemental Material SFig 1). This highlights the similarity between these signatures. When examining identified positive leading-edge genes, we noted that the size of the positive icSARS panel did not substantially change from 233 genes (Table 1)

Gene signatures are ranked lists of genes from high (red) to low (blue) differential mRNA expression between groups. B) Generation of icSARS gene panels for use in this study. To identify differentially expressed genes associated with icSARS infection in human airway epithelial cell cultures, query gene sets containing either the 500 most over- or under-expressed genes from positive or negative tails were generated from the gene signature derived from the Gene Expression Omnibus (GEO) accession number GSE47960 mRNA expression dataset. The positive and negative tail query sets were compared individually to the gene signature generated from the GEO GSE47961 dataset, which was used as reference for Gene Set Enrichment Analysis (GSEA). From this, GSEA computed two enrichment plots, one for each query set, and their associated normalized enrichment score (NES) and p-value, which represent the extent of enrichment between query set and reference signature. GSEA also identified leading-edge genes, which are genes that contribute most to achieving maximum enrichment. Two gene panels were defined from leading-edge genes identified in each query set.
747 These gene panels were used in this study for three purposes: 1) identification of gene expression 2 These values define the range of Welch's two-sample T-scores that were used to rank genes in the 843 signature. 844 3 Values in the Cross column reflect where (i.e., which gene) in the signature the T-score crosses 0. 845 4 T-scores in gene signature adjusted so Cross (T-score=0) occurs near center of the signature. 846 C, comparison; I, identification; N, number of samples; V, verification. 847 XLSTAT statistical and data analysis solution Use of hydroxychloroquine for pre-557 exposure prophylaxis in COVID 19: debate and suggested future course Hydroxychloroquine for 560 COVID-19: A review and a debate based on available clinical trials/case studies Human coronaviruses with emphasis on the COVID-563 19 outbreak. Virusdisease, 1-5 BET 565 bromodomain inhibition as a novel strategy for reactivation of HIV-1 NCBI 568 GEO: archive for functional genomics data sets--10 years on This is a provisional file, not the final typeset article archive for functional genomics data sets--update Synthetic recombinant bat SARS-like coronavirus is infectious in cultured cells and in mice In silico functional annotation of 577 a hypothetical protein from Staphylococcus aureus Descriptive review of geographic mapping of severe acute respiratory 580 syndrome (SARS) on the Internet BET-582 Inhibitors Disrupt Rad21-Dependent Conformational Control of KSHV Latency BET inhibition as a single or combined therapeutic approach in primary paediatric B-586 precursor acute lymphoblastic leukaemia The 588 Comparative Toxicogenomics Database: update 2019 Gene Expression Omnibus: NCBI gene 591 expression and hybridization array data repository Gene expression analysis of precision-cut human liver slices indicates stable 594 expression of ADME-Tox related genes Human Coronavirus: Host-Pathogen Interaction Immune and 599 Metabolic Signatures of COVID-19 Revealed by Transcriptomics Data Reuse. 
Front 600 Immunol 11 Identification and prioritization of macrolide resistance genes with 602 hypothetical annotation in Streptococcus pneumoniae Mechanisms of severe acute respiratory syndrome coronavirus-induced acute lung injury Complement Activation Contributes to Severe Acute Respiratory Syndrome 608 Effect of 610 Hydroxychloroquine in Hospitalized Patients with Covid-19 Interferon-Induced Transmembrane Protein (IFITM3) Is Upregulated Explicitly in 614 SARS-CoV-2 Infected Lung Epithelial Cells Annotation and 617 curation of uncharacterized proteins-challenges In Silico Structural and 620 Functional Annotation of Hypothetical Proteins of Vibrio cholerae O139 General mechanism of JQ1 in inhibiting various 623 types of cancer Increased mitochondrial ROS formation by acetaminophen in human hepatic cells is 626 associated with gene expression changes suggesting disruption of the mitochondrial electron 627 transport chain Identification and functional analysis of 'hypothetical' genes expressed in Haemophilus 630 influenzae Molecular mechanisms of environmental organotin toxicity in mammals Treatment of severe acute respiratory syndrome Interactome Rewiring Following Pharmacological Targeting of BET Bromodomains Human Coronaviruses: General Features Biomedical Sciences BET bromodomain inhibitors--a novel epigenetic 642 approach in castration-resistant prostate cancer The Comparative Toxicogenomics 645 Database (CTD) CCAT1 is an enhancer-templated RNA that predicts BET sensitivity in colorectal cancer Combination Attenuation Offers Strategy for Live Attenuated Coronavirus 651 Vaccines Efficacy of chloroquine and hydroxychloroquine in 653 the treatment of COVID-19 A network integration approach to predict conserved regulators related to 657 pathogenicity of influenza and SARS-CoV respiratory viruses Phosphorylation of p38 660 MAPK and its downstream targets in SARS coronavirus-infected cells Treatment strategies for Middle East respiratory syndrome coronavirus 
Computational structural and functional analysis of 665 hypothetical proteins of Staphylococcus aureus In silico approach for mining of potential drug 668 targets from hypothetical proteins of bacterial proteome COVID-19, SARS 671 and MERS: are they closely related? The current understanding and potential 674 therapeutic options to combat COVID-19 Keratinocyte-derived IL-36gamma 677 plays a role in hydroquinone-induced chemical leukoderma through inhibition of 678 melanogenesis in human epidermal melanocytes Hydroxychloroquine Proves Ineffective in Hamsters and Macaques Infected with SARS Predictive characterization of hypothetical 684 proteins in Staphylococcus aureus NCTC 8325 JQ1: a novel potential therapeutic 687 target Disturbed 689 XIAP and XAF1 expression balance is an independent prognostic factor in gastric 690 adenocarcinomas Release of severe acute respiratory syndrome coronavirus nuclear import block enhances host 693 transcription in human lung cells Functional annotation of hypothetical proteins -A 695 review Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles Receptor 3 Signaling via TRIF Contributes to a Protective Innate Immune 702 Response to Severe Acute Respiratory Syndrome Coronavirus Infection Hydroxychloroquine-inhibited dengue virus is associated with host defense machinery Molecular 708 Insights Into SARS COV-2 Interaction With Cardiovascular Disease: Role of RAAS and 709 MAPK Signaling COVID-19 prevention and treatment: A critical analysis of chloroquine and 712 hydroxychloroquine clinical pharmacology Xaf1 can cooperate with 715 TNFalpha in the induction of apoptosis, independently of interaction with XIAP Data analysis and statistics software for Microsoft Excel COVID-19: what has been learned and 720 to be learned about the novel coronavirus disease Dynamic innate immune responses of human bronchial epithelial cells to severe acute 724 respiratory syndrome-associated coronavirus 
infection Severe acute respiratory syndrome coronavirus group-specific open reading frames encode nonessential functions for replication in cell cultures and mice

Four verification signatures were defined from datasets containing cultures of mock- and icSARS-infected human airway epithelial cells (GSE47962) or lung adenocarcinoma cells (GSE37827, GSE48142, and GSE33267). (B) Gene signatures used for comparison to icSARS panels. Twenty-three comparison signatures were defined from SARS strains varying in virulence (icSARS, Urbani, MA15, BAT-SRBD, ORF6, NSP16, and ExoNI) with 11 comparison signatures derived from human lung adenocarcinoma cultures

Figure 3. Verification of icSARS Gene Panels in Independent Datasets
GSEA detected significant enrichment between the

negative icSARS panel and the negative tail of the GSE47962-derived signature. C) Distribution plot of NES from 1000 randomly generated gene panels (individual queries) compared to the GSE47962-derived signature (reference) shows that NES achieved in (A) and (B) are non-random. D) GSEA found significant enrichment between the positive icSARS panel and the positive tail of the GSE37827-derived gene signature. E) GSEA detected significant enrichment between the negative icSARS panel and the negative tail of the GSE37827-derived signature. F) Distribution plot of 1000 randomly generated NES from the GSE37827-derived signature (reference) illustrated that NES achieved in (D) and (E) are non-random. G) GSEA detected significant enrichment between the positive icSARS gene panel and the positive tail of the GSE48142-derived gene signature. H) GSEA detected significant enrichment between the negative icSARS panel and the negative tail of the GSE48142-derived signature.
I) Distribution plot of 1000 randomly generated NES from the GSE48142-derived signature (reference) illustrated that NES achieved in (G) and (H) are non-random

L) Distribution plot of 1000 randomly generated NES from the GSE33267-derived signature (reference) shows NES achieved in (J) and (K) are non-random

Positive icSARS Panel Enrichment in icSARS Infected Mouse Model Revealed Genes Associated with icSARS Infection
Set Enrichment Analysis (GSEA) detected significant enrichment, as determined by normalized enrichment score (NES), between the positive icSARS gene panel and the positive tail of the GSE50000-derived gene signature. B) GSEA did not detect enrichment between the negative icSARS panel and the negative tail of the GSE50000-derived signature. C) Distribution plot of 1000 randomly generated NES from gene panels

The authors would like to thank Jared Geller for the pathway analysis and Terri Pulice for the graphic design assistance.

Genes column reflect the number of genes in gene signatures defined for this study.