key: cord-0746706-bccrgxd1 authors: Cillo, Anthony R.; Somasundaram, Ashwin; Shan, Feng; Cardello, Carly; Workman, Creg J.; Kitsios, Georgios D.; Ruffin, Ayana; Kunning, Sheryl; Lampenfeld, Caleb; Onkar, Sayali; Grebinoski, Stephanie; Deshmukh, Gaurav; Methe, Barbara; Liu, Chang; Nambulli, Sham; Andrews, Lawrence; Duprex, W. Paul; Joglekar, Alok V.; Benos, Panayiotis V.; Ray, Prabir; Ray, Anuradha; McVerry, Bryan J.; Zhang, Yingze; Lee, Janet S.; Das, Jishnu; Singh, Harinder; Morris, Alison; Bruno, Tullia C.; Vignali, Dario A.A. title: Critically ill COVID-19 patients exhibit peripheral immune profiles predictive of mortality and reflective of SARS-CoV-2 viral burden in the lung date: 2021-12-02 journal: Cell Rep Med DOI: 10.1016/j.xcrm.2021.100476 sha: 6b3d28321f2c1fb6ed942795c06accc0c0e8c27e doc_id: 746706 cord_uid: bccrgxd1 Despite extensive analyses, there remains an urgent need to delineate immune cell states that contribute to mortality in critically ill Coronavirus disease 2019 (COVID-19) patients. Here, we present high-dimensional profiling of blood and respiratory samples in severe COVID-19 patients to examine the association between cell-linked molecular features and mortality outcomes. Peripheral transcriptional profiles by single-cell RNAseq based deconvolution of immune states are associated with COVID-19 mortality. Further, persistently high levels of an interferon signaling module in monocytes over time leads to subsequent concerted upregulation of inflammatory cytokines. SARS-CoV-2 infected myeloid cells in the lower respiratory tract upregulate CXCL10, leading to a higher risk of death. Our analysis suggests a pivotal role for viral infected myeloid cells and protracted interferon signaling in severe COVID-19. The emergence of SARS-CoV-2 led to a worldwide pandemic 1 and subsequent clinical, 64 translational, and basic research efforts to combat this global health crisis. Despite the 65 rapid development and deployment of vaccines 2,3 , COVID-19 is projected to be a major 66 cause of critical illness and hospitalizations throughout 2021 and beyond before world-67 wide vaccine campaigns can control the pandemic. Furthermore, the emergence of highly 68 transmissible viral variants (e.g. alpha, beta, delta and gamma variants) are contributing 69 to infections and hospitalizations 4 , and pose a threat to effective immunity following 70 vaccination. Clinical trials examining pharmacologic interventions in severe have highlighted the efficacy of non-specific immunosuppression with corticosteroids 5-7 72 and targeted IL-6 blockade in patients treated soon after ICU admission 8 . Despite these 73 therapeutics, severe COVID-19 continues to carry a high mortality risk. Thus, in depth 74 analyses of cellular and molecular states, particularly of the immune system, that are 75 associated with survival or death in severe COVID-19 disease are urgently needed. 76 SARS-CoV-2 infection can cause acute lung injury leading to acute respiratory 77 distress syndrome (ARDS) 9 , dysregulated inflammation 10 and hyper-coagulability 11 . 78 Studies dissecting the immunopathology of COVID-19 have evaluated patients across 79 the spectrum of disease severity (mild to critical), and noted lymphopenia 9 with high levels 80 12 mortality. To distinguish the activation states of canonical monocyte subsets, we 247 performed gene set enrichment analysis 34 (Supplementary Data S1). Gene sets for 248 interferon signaling were highly enriched in intermediate monocytes, consistent with viral 249 sensing. We also utilized flow cytometry to analyze the expression levels of the SARS-250 CoV-2 entry receptor ACE2 (which has recently been shown to be an interferon 251 stimulated gene 40 ) and cofactor neuropilin-1 (NRP1) 41, 42 Our findings that monocyte inflammatory states may play an important role in 261 COVID-19 patient outcome prompted us to evaluate whether the Mono_cells_NOC2L and 262 Mono_cells_PDE6H gene module scores were correlated with levels of inflammatory 263 cytokines in plasma. Briefly, we measured levels of 44 cytokines on days 1, 5 and 10 from 264 COVID-19 patients using the Meso Scale Discovery platform (Methods). The 265 Mono_cells_PDE6H module trended towards a significant correlation with MIP-1β in 266 plasma (Fig. 3D) , consistent with the inclusion of the gene for MIP-1β (CCL4) in this 267 module. Conversely, the Mono_cells_NOC2L module correlated with CXCL10 in plasma 268 on day 1 (Fig 3E) , consistent with CXCL10 being strongly associated with response to 269 J o u r n a l P r e -p r o o f 13 interferon in monocytes and suggesting that the Mono_cells_NOC2L module is reflective 270 of an interferon response in monocytes. We next evaluated the extent to which this 271 chemokine was associated with survival. Using Cox proportional hazards regression 272 (Methods), we found that higher levels of plasma MIP-1β on day 1 were associated with 273 improved survival (Fig. 3E ). We also found that higher levels of CXCL10 in plasma on 274 day 1 trended towards an association with greater risk of death (Fig. 3G ). The patient 275 outcome trends associated with these cytokines and their molecular linkages with the 276 monocyte gene modules strengthened the overall interpretation of divergent monocytic 277 states that could underlie disease trajectories. Importantly, these measures of monocytes 278 states and plasma cytokines were not correlated with age ( Fig. S6A -D) or associated with 279 whether patients received glucocorticoids ( Fig. S6E-H) . Therefore, the monocyte gene 280 modules capture key aspects of divergent systemic inflammatory states in severe COVID-281 19 disease that are predictive of mortality. 282 We next used longitudinal data from COVID-19 patient samples to investigate the 283 stability of the initial monocytic cell states (day 1 post-enrollment) and their temporal 284 relationships with inflammatory cytokines. We first used a rank sum test to determine if 285 monocyte gene modules on days 5 and 10 were associated with subsequent mortality 286 and found that there were 60 monocyte modules nominally associated with outcome on 287 day 5 but none on day 10 (using a two-sided alpha of 10% as a cutoff; Supplementary 288 Data S1). We next evaluated which monocyte modules on day 5 were correlated with 289 Mono_cells_NOC2L or Mono_cells_PDE6H monocyte modules on day 1 (Supplementary 290 Data S1). This analysis revealed the Mono_cells_NOC2L module to be strongly 291 correlated with itself at days 1 and 5 (Fig. 3H) . There was also a strong negative 292 J o u r n a l P r e -p r o o f 14 correlation between Mono_cells_PDE6H on day 1 and Mono_cells_NOC2L on day 5 (Fig. 293 3I). These data demonstrate that monocyte states on day 1, as measured by NOC2L or 294 PDE6H gene module scores, are dominated by one signature or the other and remain 295 reflective of monocyte states on day 5. 296 Finally, we sought to evaluate the significance of protracted elevation of the 297 Mono_cells_NOC2L gene module. To this end, we evaluated the relationships between 298 Mono_cells_NOC2L levels and inflammatory cytokines levels in plasma on day 5 and 299 found that Mono_cells_NOC2L was strongly correlated with inflammatory cytokines such 300 as CXCL10, IL-6, TNFα and IL-8 on day 5 (Fig. 3J ). Taken together, these data suggest 301 that sustained viral sensing and interferon signaling by monocytes (inferred by high 302 Mono_cells_NOC2L levels) promote an inflammatory immune state involving a network 303 of cytokines leading to immunopathogenesis and organ damage. Importantly, high levels 304 of Mono_cells_PDE6H on day 1 appear to dictate a disease course in which inflammatory 305 cytokines do not become correlated at day 5. 306 We next aimed to characterize immune transcriptional states by scRNAseq in 309 endotracheal aspirate (ETA) samples from mechanically ventilated patients. Like our 310 PBMC analysis, we visualized cells using UMAP and identified cell types by canonical 311 gene expression profiles (Fig. 4A ). Compared to PBMC, we identified cells with unique 312 expression profiles consistent with epithelia and CD14+CD16+ macrophages in addition 313 to populations found in PBMC. Graph-based clustering revealed 5 distinct clusters of 314 myeloid cells (clusters 1, 2, 4, 5, and 6), with the CD14+CD16+ macrophages having two 315 J o u r n a l P r e -p r o o f 15 distinct cell states (clusters 1 and 4; Fig. 4B ). Next, we evaluated expression of SARS-316 CoV-2 transcripts by scRNAseq (Methods). Briefly, inclusion of the SARS-CoV-2 genome 317 in the GRCh38 reference sequence permitted mapping of viral sequences to individual 318 cells 38, 43, 44 . We found that CD14+CD16+ macrophages had the highest levels of SARS-319 CoV-2 transcript expression (as measured by the viral N gene; Fig. 4C ). When evaluated 320 by graph-based clustering, we found that the CD14+CD16+ macrophage cluster 4 had 321 the highest levels of viral transcripts, compared with a paucity of transcripts from 322 CD14+CD16+ macrophage cluster 1 (Fig. 4D ). This suggested that infected myeloid cells 323 may have a unique expression profile. 324 We next evaluated differentially expressed genes between myeloid cell clusters. 325 Consistent with peripheral blood immune signatures, we found that intermediate 326 monocytes had a strong interferon response signature (Fig. 4E) . Interestingly, infected 327 CD14+CD16+ macrophages had high expression levels of specific cytokines including 328 CXCL10 (Fig. 2G ). This finding was suggestive of a role of infected myeloid cells in 329 COVID-19 immunopathogenesis and motivated further interrogation in external cohorts. 330 We therefore performed a meta-analysis across datasets to determine if CXCL10 331 was consistently upregulated in infected myeloid cells in the lower respiratory tract 332 (Methods). We identified 4 studies in which scRNAseq was performed on lower 333 respiratory tract cells 38, 43, 44 contributing to release of inflammatory cytokines from macrophages 50 , thereby promoting 379 further T cell activation 44 . In the context of our findings that elevated CXCL10 levels in 380 plasma are predictive of death, we propose that higher levels of CXCL10 in peripheral 381 blood are reflective of a higher degree of viral burden (and correspondingly greater 382 immunopathogenesis) in the lung. 383 Our combined analysis of peripheral blood immune cell states, lower respiratory 384 tract cells, and plasma cytokines provides important insight into the pathogenesis of 385 severe COVID-19. A specific peripheral blood monocyte state characterized by the 386 Mono_cells_NOC2L gene module was correlated with plasma levels of CXLC10, 387 suggesting that peripheral blood immune signatures are reflective of the degree of 388 infection in the lower respiratory tract (given that infected myeloid cells appear to be a 389 dominant source of CXCL10). Interestingly, the Mono_cells_NOC2L was governed by 390 genes associated with viral sensing and type I interferon response and activated 391 monocytes states were associated with elevated expression of ACE2 and NRP1. Plasma 392 viremia is also associated with increased risk of mortality in COVID-19 patients 51 , 393 suggesting that peripheral monocytes may become activated by inflammatory cytokines, 394 express entry receptors such as ACE2, and become further activated by viral entry or 395 opsonization, and triggering innate viral sensors such as RIG-I in monocytes. This 396 pathogenic process would not require productive viral infection of monocytes but could 397 nevertheless contribute to viral pathogenesis by driving increasing immune-related 398 inflammation. 399 The finding that CCL4 (MIP-1) was associated with survival is consistent with its 400 positive prognostic role in dengue infection 52 . In hepatitis C infection, higher MIP-1 was 401 associated with viral control following treatment with antiviral therapy 53 . Interestingly, MIP-402 1 is a type I interferon dependent gene but does not directly inhibit viral replication. 403 Instead, it promotes the recruitment of monocytes to infected tissue to prevent viral 404 spread throughout the tissue 54 . In this context, our findings suggest a model in which 405 J o u r n a l P r e -p r o o f 19 preferential expression of MIP-1 in relation to other IFN response genes by monocytes 406 protects infected tissue in the lung during severe High levels of inflammatory cytokines are associated with increased risk of death 408 in COVID-19 patients 14 , but the immunologic drivers of these cytokine levels remain 409 incompletely understood. Type I IFN signaling has been proposed to play both a 410 protective and pathogenic role in COVID-19 disease depending on its kinetics 18-20 . It is 411 essential early in infection in moderating COVID-19 disease, as patients with either IFN 412 auto-antibodies 16 or inborn errors of type I interferon 17 production have a much higher risk 413 of severe COVID-19. However dysregulated interferon signaling at later times during 414 COVID-19 disease progression appears to be pathogenic 55,56 . Our analysis revealed that 415 monocyte gene modules on day 1 dictate distinct disease trajectories, with the high levels 416 of the interferon response associated Mono_cells_NOC2L module on day 1 correlating 417 with its levels on day 5 in individual patients. Additionally, the Mono_cells_NOC2L module 418 on day 5 is correlated with numerous cytokines associated with disease severity and 419 death, including CXCL10, IL-6, TNFα and IL-8. In support of this association, a late 420 immune juncture beginning at 17 days following symptom onset (and corresponding to 421 day 10 in our cohort) was recently described as associated with death in COVID-19 422 patients 37 . Thus, we infer that protracted interferon signaling (as reflected by consistently 423 high Mono_cells_NOC2L module scores, ICU day 1-5) promotes an immune state leading 424 to immunopathogenesis, organ damage, and death. 425 Our study (which is supported by validation studies with external patient cohorts) 426 adds important details of immunopathogenesis. The high-dimensional approaches we 427 applied to our patient cohort allowed us to not only characterize peripheral biomarkers 428 Lead Contact 558 Further information and requests for resources and reagents should be directed to and 559 will be fulfilled by the Lead Contact, Dario A.A. Vignali (dvignali@pitt.edu). 560 This study did not generate new unique reagents. 562  Data: Both raw and processed transcriptomic are available through the Gene 564 Expression Omnibus (GSE180578). Supplementary Data S1 is available on 565 Mendeley Data at http://dx.doi.org/10.17632/r83csstphc.1. 566  Code: The Seurat package was used for scRNAseq normalization, scaling, 567 dimensionality reduction, UMAP visualization, clustering, and differential gene 568 expression analysis. Code for these steps is available through Seurat's website 569 (https://satijalab.org/seurat/). Code for gene set enrichment analysis is available at 570 www.github.com/arc85/singleseqgset. Code is provided at 571 https://github.com/arc85/covid_analyses files to demonstrate gene module 572 discovery, training of machine learning algorithms and leave one out cross 573 validation analysis, and meta-analysis of infected myeloid cells. 574  General statement: Any additional information required to reanalyze the data 575 reported in this paper is available from the Lead Contact upon request. at least one negative SARS-CoV-2 qPCR test and were diagnosed with a non-COVID 590 etiology of acute respiratory illness (non-COVID group). For comparisons against healthy 591 controls, we also included a single blood biospecimen from 10 healthy donors. 592 From enrolled critically ill patients, we collected blood specimens upon enrollment 593 (day 1), and then if the patients remained in the ICU, we collected follow-up blood samples 594 on days 5 and 10 post-enrollment. From intubated patients, we also collected 595 endotracheal aspirate (ETA) samples for profiling of the lower respiratory tract. 596 Demographics information include age, gender, and race, history of chronic 597 diseases such as diabetes and chronic obstructive pulmonary disease (COPD), timing of 598 disease onset and whether patients received mechanical ventilation, and outcome data 599 are all presented in Table S1 . Based on the modules scores, a median module scores were assigned to each patient 728 for a given immune lineage. If less than 20 cells were present in a given patient sample, 729 the median module score was imputed as the median across all patients. 730 We sought to leverage machine learning algorithms in conjunction with gene modules 732 scores on day 1 post-enrollment in the ICU to predict outcome for critically ill patients with 733 COVID-19. To do this, we utilized our cohort from the University of Pittsburgh as a 734 discovery cohort to train a support vector machine (with a linear kernel), random forest, 735 and neural net. To reduce the dimensionality prior to input into the machine learning 736 J o u r n a l P r e -p r o o f 34 algorithms, we first performed PCA using all scaled and centered gene modules scores 737 from day 1 that were significantly associated with outcome (p<0.05 by Wilcoxon rank sum 738 test) as input. If a patient did not have a gene module score for a particular lineage due 739 to too few cells present, the median gene module score from all samples was used as the 740 interpolated value. We then used the embeddings for each patient in the first two principal 741 components from the PCA as input for the machine learning algorithms. We implemented 742 these algorithms through the R package caret 60 . 743 Cross-validation for the discovery cohort was performed using a leave-one-out analysis. 744 To prevent data leakage between folds of cross validation, gene modules from day 1 745 samples that had a p<0.05 for a Wilcoxon rank sum test between patients who survived 746 versus those that died were selected within each fold. Then, modules were scaled and 747 PCA was performed for dimensionality reduction in each fold, once again using PC1 and 748 PC2 were used as predictive variables. Tuning parameters for each algorithm were D CCL20 CXCL3 IL1B CXCL8 CXCL2 CCL3 TIMP1 CCL3L1 CCL4L2 CCL4 PI3 HES4 IFIT2 IFIT3 IL1RN G0S2 IFITM2 ISG15 IFIT1 RSAD2 CCL8 CTSL CCL2 CCL7 Coronavirus Resource Center FDA Takes Key Action in Fight Against Action 848 Follows Thorough Evaluation of Available Safety, Effectiveness, and 849 Manufacturing Quality Information by FDA Career Scientists Safety and 853 Efficacy of the BNT162b2 mRNA Covid-19 Vaccine Genetic Variants of SARS-CoV-2-What 856 Do They Mean? Dexamethasone in 859 Hospitalized Patients with Covid-19 -Preliminary Report Association 863 Between Administration of Systemic Corticosteroids and Mortality Among Critically 864 Ill Patients With COVID-19: A Meta-analysis Repurposed Antiviral Drugs for Covid-19 -Interim 869 WHO Solidarity Trial Results Association 872 Between Administration of IL-6 Antagonists and Mortality Among Patients 873 Hospitalized for COVID-19: A Meta-analysis Clinical Characteristics of Coronavirus Disease 876 2019 in China course, and outcomes of critically ill adults with COVID-19 880 in New York City: a prospective cohort study Haematological characteristics and risk factors in the 884 classification and prognosis evaluation of COVID-19: a retrospective cohort study. 885 The Lancet Eosinopenia and elevated C-reactive protein facilitate triage of COVID-19 patients in fever clinic: A retrospective case-control 889 study A dynamic COVID-19 immune signature includes associations with poor 893 prognosis An 896 inflammatory cytokine signature predicts COVID-19 severity and survival Cytokine 900 elevation in severe and critical COVID-19: a rapid systematic review, meta-901 analysis, and comparison with other inflammatory syndromes. The Lancet. 902 Respiratory medicine Autoantibodies 905 against type I IFNs in patients with life-threatening COVID-19 Inborn errors of type I IFN 909 immunity in patients with life-threatening COVID-19 Immunophenotyping of COVID-19 and 913 influenza highlights the role of type I interferons in development of severe COVID-914 19 Type I and III interferons disrupt lung epithelial 917 repair during recovery from viral infection Mouse model of SARS-CoV-2 reveals 921 inflammatory role of type I interferon signaling A human circulating immune cell landscape in aging and 925 COVID-19 Severe 928 COVID-19 Is Marked by a Dysregulated Myeloid Cell Compartment Frontline Science: COVID-19 infection induces readily 932 detectable morphologic and inflammation-related phenotypic changes in monocytes Major alterations 937 in the mononuclear phagocyte landscape associated with COVID-19 severity Longitudinal profiling of respiratory and systemic immune responses reveals 942 myeloid cell-driven lung inflammation in severe COVID-19 COVID-19 versus Non-946 COVID-19 Acute Respiratory Distress Syndrome: Comparison of Demographics Physiologic Parameters, Inflammatory Biomarkers, and Clinical Outcomes. Ann 948 Am Thorac Soc Deep 951 immune profiling of COVID-19 patients reveals distinct immunotypes with 952 therapeutic implications Comprehensive mapping of immune perturbations associated with severe COVID-956 19 Pathogenic T cells and inflammatory monocytes incite 959 inflammatory storm in severe COVID-19 patients Dengue virus infection induces expansion of a CD14(+)CD16(+) 964 monocyte population that stimulates plasmablast differentiation Prothrombotic autoantibodies 968 in serum from patients hospitalized with COVID-19 Diverse Functional Autoantibodies in Patients 972 with COVID-19. medRxiv New-Onset IgG 976 Autoantibodies in Hospitalized Patients with COVID-19. medRxiv Immune Landscape of Viral-and 980 Carcinogen-Driven Head and Neck Cancer Dimensionality reduction for visualizing single-cell 984 data using UMAP A scalable 987 SCENIC workflow for single-cell gene regulatory network analysis Time-resolved 991 systems immunology reveals a late juncture linked to fatal COVID-19 COVID-19 immune features revealed by a large-scale single-cell 995 transcriptome atlas Single-cell multi-omics analysis of the immune response in COVID-19 Receptor ACE2 Is an Interferon-Stimulated Gene in Human Airway Epithelial Cells 1003 and Is Detected in Specific Cell Subsets across Tissues Neuropilin-1007 1 facilitates SARS-CoV-2 cell entry and infectivity Neuropilin-1 is a host factor for SARS-CoV-2 infection Single-cell landscape of bronchoalveolar immune cells in 1015 patients with COVID-19 Circuits between infected macrophages and T cells in SARS-CoV-2 1019 pneumonia COVID-19 tissue atlases reveal SARS-CoV-2 pathology and cellular targets Single cell 1026 RNA sequencing identifies an early monocyte gene signature in acute respiratory 1027 distress syndrome Active replication of Middle East respiratory 1030 syndrome coronavirus and aberrant induction of inflammatory cytokines and 1031 chemokines in human macrophages: implications for pathogenesis Severe 1034 acute respiratory syndrome and the innate immune responses: modulation of 1035 effector cell function without productive infection Middle East respiratory 1038 syndrome coronavirus infection: virus-host cell interactions and implications on 1039 pathogenesis Lower 1042 Respiratory Tract Myeloid Cells Harbor SARS-Cov-2 and Display an Inflammatory 1043 Phenotype SARS-CoV-2 viral 1046 load is associated with increased disease severity and mortality Multiplex cytokine profile from dengue 1050 patients: MIP-1beta and IFN-gamma as predictive factors for severity Temporal dynamics of inflammatory 1054 cytokines/chemokines during sofosbuvir and ribavirin therapy for genotype 2 and 1055 3 hepatitis C infection Type I interferon-dependent CCL4 is induced by a cGAS/STING 1058 pathway that bypasses viral inhibition and protects infected tissue, independent of 1059 viral burden Longitudinal analyses reveal 1062 immunological misfiring in severe COVID-19 Imbalanced Host 1066 Response to SARS-CoV-2 Drives Development of COVID-19 Cell Hashing with barcoded antibodies 1070 enables multiplexing and doublet detection for single cell genomics Comprehensive 1074 Integration of Single-Cell Data SCENIC: single-cell regulatory network inference and clustering Building predictive models in R using the caret Package pROC: an open-source package for R and S+ to analyze and compare 1084 ROC curves Modeling Survival Data: Extending the Cox 1086 Model Using Time Dependent Covariates and Time Dependent 1088 Coefficients in the Cox Model We performed permutation testing to determine if machine learning models outperformed 753 the same models with permuted class labels. We determined the accuracy of each of the 754 machine learning models when leaving one patient out for the Pitt cohort, and compare 755 the resulting accuracy scores to models generated with the outcome labels for each 756 patient permuted. We then compared the resulting accuracy using a t-test. 757Validation of the peripheral immune signature for mortality 758 J o u r n a l P r e -p r o o f 35 To determine if the peripheral immune signatures derived from our discovery cohort were 759 valid across cohorts, we identified and analysis 3 external datasets consisting of single-760 cell RNAseq of PBMC from patients with COVID-19. Datasets were selected by having 761 publicly available single-cell RNAseq data from PBMC of COVID-19 patients and 762 outcome data for each patient. For each of these external datasets, the processed data 763 was downloaded from the Gene Expression Omnibus. Cell types annotations were used 764 when provided by authors, and were characterized by canonical gene expression profiles 765 otherwise. Patients were selected for inclusion based on availability of information 766 pertaining to survival. Patients were retained in all instances where symptomatic disease 767 was noted. Gene module scores for the statistically significant gene modules on day 1 in 768 the discovery cohort were calculated for each immune lineage for each patient in the 769 validation cohorts using the AddModuleScore function from Seurat. After gene module 770 scores were calculated, the machine learning algorithms derived from the discovery 771 cohort were applied to the validation cohorts. Receiver operating curves and area under 772 the curve were calculated using the R package pROC 61 as described above. 773 Further dissection of myeloid lineages was performed by bioinformatically isolating and 775 re-clustering all myeloid cells. Datasets were once again integrated for downstream 776 visualization and clustering based on fresh versus frozen status using the workflow 777 described above. Differentially expressed genes were identified using a Wilcoxon rank 778 sum test as implemented in Seurat. Gene set enrichment analysis was performed as 779 previously described using the R package singleseqgset 34 . Statistically significant gene 780 J o u r n a l P r e -p r o o f 36 sets were identified as those that had p values corrected for false discovery rate of less 781 than 5%. 782 We Sample concentrations for each marker were then calculated based on the respective 799 standard curve. For analysis, the limit of detection was set at 50% of the lowest limit of 800 detection across all analytes (i.e. 0.045 pg/mL). 801 J o u r n a l P r e -p r o o f 37 We sought to evaluate whether there were consistent gene expression profiles of infected 803 myeloid cells in the lower respiratory tract of patients with COVID-19. To do this, we first 804 identified datasets in which single-cell RNAseq was performed on either sputum, 805 endotracheal aspirates, bronchoalveolar lavage (BAL) fluid, or lung tissues. We identified 806 4 studies that fit these criteria, from which we downloaded the processed gene expression 807 datasets from the Gene Expression Omnibus. We note that 1 study used was a post-808 mortem autopsy study in which single nuclei were processed rather than single cells. We 809 next visualized all cells from each study by UMAP, and either leveraged the cell type 810 annotations noted by the authors when available or defined cells by canonical gene 811 expression profiles otherwise. Likewise, we also used cluster annotations provide by the 812 authors for downstream analysis or generated new graph-based clusters as previously 813 described. 814To perform a meta-analysis to robustly characterize expression profiles of infected 815 myeloid cells, we next identified the most highly infected myeloid cell cluster in each 816 individual study. This cluster was identified as the cluster with the highest frequency of 817 Survival analysis was performed using Cox's proportional hazard regression analysis as 827 implement in the R package survival 62,63 . Likelihood ratio tests were used to determine 828 the statistical significance of the survival models. Wilcoxon rank sum tests were used to 829 evaluate differences in immune cell frequencies and mean fluorescence intensity by flow 830 cytometry, to calculate differentially expressed genes from the scRNAseq analysis, and 831 identify gene modules that were statistically different between patients who survived and 832 those that died. Spearman's correlation was used to calculate the correlation between 833 gene module scores and cytokine levels in plasma, between gene modules on day 1 and 834 day 5, and between cytokines and gene modules on day 5. A two-sided alpha of 5% was 835 considered significant unless otherwise noted.