key: cord-261435-wcn4bjnw authors: Ren, Xianwen; Wen, Wen; Fan, Xiaoying; Hou, Wenhong; Su, Bin; Cai, Pengfei; Li, Jiesheng; Liu, Yang; Tang, Fei; Zhang, Fan; Yang, Yu; He, Jiangping; Ma, Wenji; He, Jingjing; Wang, Pingping; Cao, Qiqi; Chen, Fangjin; Chen, Yuqing; Cheng, Xuelian; Deng, Guohong; Deng, Xilong; Ding, Wenyu; Feng, Yingmei; Gan, Rui; Guo, Chuang; Guo, Weiqiang; He, Shuai; Jiang, Chen; Liang, Juanran; Li, Yi-min; Lin, Jun; Ling, Yun; Liu, Haofei; Liu, Jianwei; Liu, Nianping; Liu, Yang; Luo, Meng; Ma, Qiang; Song, Qibing; Sun, Wujianan; Wang, GaoXiang; Wang, Feng; Wang, Ying; Wen, Xiaofeng; Wu, Qian; Xu, Gang; Xie, Xiaowei; Xiong, Xinxin; Xing, Xudong; Xu, Hao; Yin, Chonghai; Yu, Dongdong; Yu, Kezhuo; Yuan, Jin; Zhang, Biao; Zhang, Tong; Zhao, Jincun; Zhao, Peidong; Zhou, Jianfeng; Zhou, Wei; Zhong, Sujuan; Zhong, Xiaosong; Zhang, Shuye; Zhu, Lin; Zhu, Ping; Zou, Bin; Zou, Jiahua; Zuo, Zengtao; Bai, Fan; Huang, Xi; Bian, Xiuwu; Zhou, Penghui; Jiang, Qinghua; Huang, Zhiwei; Bei, Jin-Xin; Wei, Lai; Liu, Xindong; Cheng, Tao; Li, Xiangpan; Zhao, Pingsen; Wang, Fu-Sheng; Wang, Hongyang; Su, Bing; Zhang, Zheng; Qu, Kun; Wang, Xiaoqun; Chen, Jiekai; Jin, Ronghua; Zhang, Zemin title: Large-scale single-cell analysis reveals critical immune characteristics of COVID-19 patients date: 2020-10-29 journal: bioRxiv DOI: 10.1101/2020.10.29.360479 sha: doc_id: 261435 cord_uid: wcn4bjnw Dysfunctional immune response in the COVID-19 patients is a recurrent theme impacting symptoms and mortality, yet the detailed understanding of pertinent immune cells is not complete. We applied single-cell RNA sequencing to 284 samples from 205 COVID-19 patients and controls to create a comprehensive immune landscape. Lymphopenia and active T and B cell responses were found to coexist and associated with age, sex and their interactions with COVID-19. Diverse epithelial and immune cell types were observed to be virus-positive and showed dramatic transcriptomic changes. Elevation of ANXA1 and S100A9 in virus-positive squamous epithelial cells may enable the initiation of neutrophil and macrophage responses via the ANXA1-FPR1 and S100A8/9-TLR4 axes. Systemic upregulation of S100A8/A9, mainly by megakaryocytes and monocytes in the peripheral blood, may contribute to the cytokine storms frequently observed in severe patients. Our data provide a rich resource for understanding the pathogenesis and designing effective therapeutic strategies for COVID-19. HIGHLIGHTS Large-scale scRNA-seq analysis depicts the immune landscape of COVID-19 Lymphopenia and active T and B cell responses coexist and are shaped by age and sex SARS-CoV-2 infects diverse epithelial and immune cells, inducing distinct responses Cytokine storms with systemic S100A8/A9 are associated with COVID-19 severity The coronavirus disease 2019 (COVID-19) is an ongoing pandemic infectious disease, 104 caused with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Currently, 105 it has caused with around 29 million infections and close to 1 million deaths according to 106 the statistics of World Health Organization until September 15, 2020, with the fatality rate as 107 high as ~10% in specific regions. Although many COVID-19 patients experience 108 asymptomatic, mild or moderate symptoms, some patients progress to severe conditions 109 and even death. It is thus of paramount importance to understand the disease mechanisms 110 and the underlying factors associated with vulnerabilities, which are critical for controlling 111 the pandemic and alleviating the global crisis. It is also critical to systematically investigate 112 differences between clinical presentations (mild/moderate and severe), or between 113 treatment outcomes (disease progression and convalescence) of patients, as they can 114 provide important guidance to the development of effective therapeutics and vaccines. findings. Here we applied scRNA-seq to a large cohort with 205 individuals, including 139 hospitalized COVID-19 patients with moderate or severe disease, and patients in the 140 convalescent stage, as well as healthy controls. With high-quality transcriptomics data of 141 ~1.5 million single cells, we reveal that SARS-CoV-2 could infect a wider range of cell types 142 than previous understanding, and induce distinct phenotypic changes in those infected cells. 143 Such heterogeneity of SARS-CoV-2 infection has important immunological implications as 144 such cells exhibit distinct interaction potentials with innate and adaptive immune cells. We 145 also observed critical changes in the peripheral blood discriminating mild/moderate from 146 severe COVID-19 patients in the disease progression or convalescence stages, and found 147 their association with patient sex and age. Further, our large cohort analysis provides a 148 unique opportunity to reveal the characteristics of cytokine storms in patients, and to further 149 illustrate the cell subpopulations that might contribute to the inflammatory responses and 150 the hyper-inflammatory genomic signatures under SARS-CoV-2 infection. Our findings may 151 have important implications to the research, treatment, control and prevention of COVID-19. 152 Integrated analysis of the COVID-19 scRNA-seq data 154 To systematically characterize the immune properties at single-cell resolution in the COVID-155 19 patients, we formed a Single Cell Consortium for COVID-19 in China (SC4), which 156 consisted of researchers from 36 research institutes or hospitals from different regions of 157 China. Members of SC4 contributed COVID-19 related scRNA-seq data, mostly still 158 unpublished, for a total of 205 individuals, including 25 patients with mild/moderate 159 symptoms, 63 hospitalized patients with severe symptoms, and 92 recovered convalescent 160 persons, as well as 25 healthy controls ( Figure 1A and Table S1 ). While most previous 161 studies did not discriminate whether convalescent individuals recovered from mild/moderate 162 or severe symptoms, we divided the convalescent group into two subgroups, 54 recovered 163 from mild/moderate symptoms and 38 recovered from severe symptoms, to investigate the 164 effects of disease severity on the immune status of recovered individuals. This cohort 165 covered a wide age range (from 6 to 92 years old), with the mild/moderate and severe 166 groups having significant age differences (Figure S1A We applied a common set of stringent quality control criteria to ensure that the selected 184 data were from single and live cells and that their transcriptomic phenotypes were 185 comprehensively characterized. A total of 1,462,702 high-quality single cells were ultimately 186 obtained, with an average of 4,835 unique molecular identifiers (UMIs), representing 1,587 187 genes ( Figures S1D and S1E ). With the large-scale of data, we obtained 64 cell clusters, 188 covering diverse epithelial cells in the respiratory system, megakaryocytes, mast cells, 189 myeloid cells, and NK/T/B cells ( Figure 1B ). Such an information-rich resource (available at 190 http://covid19.cancer-pku.cn/ for quick browsing) enabled accurate annotation and analysis 191 of these cell clusters at different resolutions ( Figure 1C , Figure S1F-J and Table S2), which 192 allow the elucidation of potential molecular and cellular mechanisms underlying the 193 pathogenesis of SARS-CoV-2 infection and differences of human immune responses for 194 patients with distinct symptoms. 195 196 Notable differences could be observed in the immune compositions of healthy controls and 197 COVID-19 patients with mild/moderate or severe symptoms ( Figure 1D ) or between the 198 disease progression stages and convalescence ( Figure 1E ) based on the t-distributed 199 stochastic neighbor embedding (t-SNE) projection. The tissue preference of each cluster 200 was illustrated based on the ratio of observed to randomly expected cell numbers (Ro/e, 201 Figure 1F ), partially reflecting the validity of cell clustering. Notably, various clusters of 202 proliferating CD8+ and CD4+ T, and plasma B cells were more enriched in BALF than 203 PBMCs, indicating activated adaptive immune responses in the lung ( Figure 1F ). 204 205 We first analyzed the compositional changes of the broad categories of immune cells for 206 PBMCs in different COVID-19 patient groups. Notably, the percentages of megakaryocytes 207 and monocytes in PBMCs were elevated, particularly in severe COVID-19 patients during 208 the disease progression stage (Figure 2A) The increased plasma B cells in peripheral blood appeared to be derived from active 231 proliferation of plasmablasts and transitions from memory B cells based on the paired BCR 232 sequencing analyses. Both the extent of BCR clonal expansion and the diversity of the total 233 BCR repertoire of these cells were significantly increased in severe COVID-19 patients 234 ( Figure 2C ). Plasmablast cells (B_c06_MKI67), characterized by high expression of MKI67 235 and thus indicating a proliferative state, were elevated in the peripheral blood of severe 236 COVID-19 patients ( Figure S2A ) and shared the most clonotypes with plasma cells (Figure 237 2D). The memory B cell cluster expressing high levels of CD27, CD80, AIM2, GRIP2, and 238 COCH (B_c03-CD27-AIM2) was the second major source of plasma B cells in the 239 peripheral blood, which shared a large proportion of clonotypes with plasma cells and 240 plasmablasts ( Figure 2D ). Distinct from plasma cells and plasmablasts which were mainly 241 composed of IgAs and IgGs, B_c03-CD27-AIM2 had a higher proportion of IgMs ( Figure 242 2E), indicating a precursor state. 243 244 We applied analysis of variance (ANOVA) to dissect the associations of compositional 245 changes of plasma B cells with disease severity, stage (progression or convalescence), age, 246 sex, or the interactions of these factors. We found that plasma B cells in blood were 247 specifically associated with the disease severity of COVID-19, and then disease stage, but 248 had no associations with age or sex observed ( Figure 2F ) (Takahashi et al., 2020). In fact, 249 for the mild/moderate disease, convalescent patients harbored higher levels of plasma B 250 cells than those in the disease progression stage. By contrast, the plasma B cell levels in 251 convalescent patients who recovered from severe disease were significantly lower than 252 those in the disease progression stage ( Figure 2B ). Interestingly, the precursors of plasma 253 B cells, i.e., cells of B_c03-CD27-AIM2, appeared to be associated with sex differences 254 ( Figure 2G ). In females, the percentage of B_c03-CD27-AIM2 cells was significantly higher 255 than that of males ( Figure 2G ). Almost all B cell clusters were associated with disease 256 stages, implying the importance of humoral immune response changes between disease 257 progression and convalescence ( Figure S2B and Table S3 ). 258 In summary, plasma B cells appeared to be significantly elevated in the peripheral blood 260 regarding either the composition, proliferation, or developmental transition from memory B 261 cells, and were more associated with disease severity. While their precursor cells were also 262 elevated, they were more prone to be influenced by sex differences, providing a plausible 263 explanation for the epidemiological observations on sex differences of COVID-19. Taken 264 together with the observation that plasma B cells were more enriched in BALF ( Figure 1F ), 265 these observations may suggest that humoral immune responses were actively initiated to 266 combat SARS-CoV-2 infection and contributed to disease severity. 267 268 Two proliferative CD4+ T cell clusters were also identified, with T_CD4_c13-MKI67-294 CCL5 low characterized by high expression of SELL and low CCL5 and T_CD4_c14-MKI67-295 CCL5 high characterized by low SELL and high CCL5. The counts of T_CD4_c14-MKI67-296 CCL5 high in PBMCs did not show significant differences among different COVID-19 patients. 297 By contrast, the T_CD4_c13-MKI67-CCL5 low counts were elevated in COVID-19 patients, 298 particularly in severe patients during the disease progression stage ( Figure 3C ). Similar to 299 plasma B cells, the diversity and clonality of this cluster were both increased in severe 300 patients with disease progression ( Figure 3D ), indicating an expanded TCR repertoire and 301 developmental transitions from other clusters. Unlike plasma B cells whose source cluster 302 B_c03-CD27-AIM2 was increased in peripheral blood ( Figure S2C ), the major source 303 cluster of proliferative CD4+ T cells T_c04_CD4−ANXA2 was decreased in COVID-19 304 patients, particularly in severe patients during the disease progression stage (Figures 3E 305 and 3F). This may partially explain the dichotomous and incomplete adaptive immunity 306 previously observed in COVID-19 patients (Catanzaro et al., 2020). ANOVA analyses 307 revealed that different from T_CD4_c13-MKI67-CCL5 low ( Figure 3G ), the percentage of 308 T_CD4_c04−ANXA2 was associated with disease severity, progression/convalescence, 309 and sex ( Figure 3H ). In particular, female patients generally had higher levels of 310 T_CD4_c04−ANXA2 than males ( Figure 3H) While the decrease of γ δ T cells, MAIT cells, and effector memory T cells abovementioned 329 were primarily associated with disease severity, the decreases of naive and central memory 330 T cells were associated with the age but not sex difference of patients ( Figure S2D and S2E 331 and Table S3 ). Such clusters included the naive CD8+ cluster T_CD8_c01-LEF1, the CD8+ 332 central memory cluster T_CD8_c02-GPR183, the naive CD4+ cluster T_CD4_c01-LEF1, 333 and two CD69+ CD4+ clusters T_CD4_c06-NR4A2 and T_CD4_c05-FOS. cohort also enabled us to dissect the impact of age and sex on the immune responses of 344 COVID-19 patients. We found that, rather than associated with T cell proliferation, age and 345 sex are more likely associated with the abundance of naive/central memory T cells and the 346 precursor cells of proliferative T cells, respectively, highlighting the complexity of human T 347 cell responses to SARS-CoV-2 infection. 348 349 350 Our scRNA-seq data also coupled with TCR/BCR repertoire sequencing and thus provided 351 a rich resource to investigate the TCR/BCR usage of COVID-19 patients, which is 352 instructive for the development of anti-SARS-CoV-2 therapeutics and vaccines. We first 353 examined whether identical TCRs or BCRs could be identified across COVID-19 patients. 354 We found that only a few TCRs or BCRs were shared between two patients, and no 355 identical TCRs or BCRs were shared beyond three patients. The diversity of TCR or BCR repertoires of various T and B clusters might also be 383 influenced by age, sex, COVID-19 severity, and disease stages. While age was mainly 384 associated with the abundance of only naive and central memory T cells in PBMC, ANOVA 385 analysis revealed that age might influence the decrease of TCR diversity in a wider range of 386 T cells, including naive, central memory, and diverse effector memory T cells ( Figure S2I -J). 387 By contrast, sex differences were mainly associated with the BCR diversity of naive and 388 memory B cells (B_c01-TCL1A, B_c02-MS4A1-CD27, and B_c04-SOX5-TNFRSF1B) and 389 the TCR diversity of a subset of effector memory CD4+ T cells (T_CD4_c08-GZMK-FOS high ) 390 ( Figure S2I -K). After correcting the effects of age and sex, the decrease of diversity in MAIT 391 cells, naive B and CD8+ and CD4+ T cells, effector memory CD8+ T cells (T_CD8_c03-392 GZMK and T_CD8_c04-COTL1), and a few CD69+ CD4+ T cell clusters (T_CD4_c03-393 ITGA4, T_CD4_c04-ANXA2) remained independently associated with COVID-19 severity 394 ( Figure S2I -K), highlighting the importance of these cells in COVID-19. Importantly, the 395 TCR diversity of one proliferative CD8+ T cell cluster, i.e., T_CD8_c11-MKI67-FOS, was 396 associated with the triad interaction by disease severity, age, and sex ( Figure S2J ), 397 indicating the impacts of age and sex on disease severity. Similarly, the clonal expansion of 398 a central memory CD4+ T cell cluster highly expressing AQP3 (T_CD4_c02-AQP3) was 399 also associated with the triad interaction by disease severity, age, and sex ( SARS-CoV-2 detected in multiple epithelial and immune cell types with 414 interferon response phenotypes 415 The enrichment of plasma B and proliferative T cells in BALF and the elevation of these 416 cells in PBMCs of COVID-19 patients highlighted the roles of these cells in combating 417 SARS-CoV-2 infection. To explore potential interactions between these cells and SARS-418 CoV-2 infected cells, we examined the characteristics of cell types that harbored SARS-419 CoV-2 sequences in our dataset. we examined their expression levels in these cells ( Figure 4C ) (Netea et al., 2020). We 428 found that at least a subset of those epithelial cells expressed ACE2 and TMPRSS2, 429 consistent with the notion that SARS-CoV-2 employs ACE2 and TMPRSS2 to invade these 430 cells. Interestingly, those immune cells, which did not express ACE2 or TMPRSS2, 431 harbored even more viral RNA sequences than the epithelial cells ( Figure 4D ). The high 432 viral load reassured that the detection of SARS-CoV-2 RNAs in these immune cells was 433 unlikely caused by experimental contamination. Consistently, an independent scRNA-seq 434 study of COVID-19 patients also identified SARS-CoV-2 RNAs in neutrophils and 435 macrophages from the respiratory samples of COVID-19 patients (Bost et al., 2020). 436 437 Since interferon-stimulated genes (ISGs) are typically activated in virus-infected cells 438 (Schoggins and Rice, 2011), we next examined the expression of ISGs in these cells 439 (Figure 4E and CLCA4, and SULT2B1 ( Figure 5A ). These genes were enriched in pathways such as 470 "response to virus", "response to type I interferon" and "response to hypoxia", consistent 471 with viral infection and the subsequent respiratory distress, reflecting the host immune 472 response via type I interferons ( Figure 5B ). By contrast, the numbers of genes with 473 significant changes after SARS-CoV-2 infection for ciliated and secretory epithelial cells 474 were much smaller than squamous cells, and few genes showed consistent changes in all 475 the three epithelial cell types ( Figure 5C ). 476 477 We cells with no viral detection ( Figure S4C ). Such changes were consistent across COVID-19 496 patients ( Figure 5F ). Comparison across ciliated, secretory, and squamous epithelial cells 497 infected by SARS-CoV-2 also highlighted the dispersing tendency of ciliated cells and the 498 interacting potentials among squamous cells themselves (Figures 5G and 5H ). 499 500 Such interaction distinctions not only existed among epithelial cells, but also impacted their 501 interactions with immune cells. Consistent with the dispersing nature of ciliated cells in the 502 outer compartment of the pseudo-space, no significant interactions were observed between 503 virus+ ciliated cells and immune cells. By contrast, virus+ secretory epithelial cells showed 504 significant interactions with neutrophils and macrophages in mild/moderate COVID-19 505 patients via the SCGB3A1-MARCO axis ( Figures S4D and S4E ), but such interactions were 506 subdued in severe COVID-19 patients due to the down-regulation of MARCO in neutrophils 507 and macrophages ( Figure S4F ). In severe patients, virus+ squamous cells showed 508 significant interactions with neutrophils and macrophages via the ANXA1-FPR1 and 509 S100A9/A8-TLR4 axes ( Figure 5I ). Neutrophils and macrophages exhibiting high interacting 510 potentials with virus+ squamous epithelial cells were also prone to be SARS-CoV-2 infected 511 ( Figure 5J ). As ANXA1-FPR1 and S100A9/A8 immune response. It is noteworthy that plasma B cells in BALF also tended to be SARS-519 CoV-2-positive and displayed close interactions with virus+ neutrophils and squamous 520 epithelial cells via the S100A9/A8-TLR4 axes ( Figure 5L ). 521 522 We then investigated the cell types expressing ANXA1, FPR1, S100A9, S100A8, and TLR4 523 in both BALF and PBMC across COVID-19 patients to evaluate the possible inflammatory 524 cascade mediated by these LR pairs. It was evident that ANXA1 was highly expressed in a 525 wide range of immune cells except B cells and naive T cells ( Figures S5A and S5B ) and its 526 receptor FPR1 was highly expressed in neutrophils, macrophages, and monocytes ( Figures 527 S5A and S5B) . Interestingly, for most immune cell clusters in BALF, the expression levels 528 of ANXA1 and FPR1 were down-regulated in severe COVID-19 patients compared with 529 those of mild/moderate COVID-19 patients ( Figure S5A ). But in PBMCs, except for MAIT 530 cells (T_CD8_c09-SLC4A10) and γ δ T cells (T_gdT_c14-TRDV2), ANXA1 and FPR1 were 531 significantly up-regulated in many cell types in severe COVID-19 patients compared with 532 those of mild/moderate COVID-19 patients ( Figure S5B ). S100A9 and S100A8 were highly 533 expressed in neutrophils, macrophages, and monocytes in COVID-19 patients with 534 mild/moderate symptoms and had no expression in T, B, NK, or dendritic cells ( Figures 535 S4H and S5B ). However, for severe COVID-19 patients in the disease progression stage, 536 S100A9 and S100A8 were significantly up-regulated in almost all cell clusters for both 537 BALF and PBMCs ( Figures S4H and S5B ). In particular, T, B, NK, and dendritic cells had 538 no or minimal levels of S100A9 and S100A8 expression in mild/moderate COVID-19 539 patients ( Figures S4H and S5B ). By contrast, in severe COVID-19 patients, the levels of 540 S100A9 and S100A8 were significantly up-regulated in T, B, NK, and dendritic cells 541 ( Figures S4H and S5B) , indicating a systemic inflammatory response. TLR4 did not exhibit 542 significant differences in PBMCs between severe and mild/moderate COVID-19 patients but 543 was significantly down-regulated in certain BALF monocyte and macrophage subsets 544 ( Figure S5B ). 545 546 In summary, our data indicated that SARS-CoV-2 infection in different types of epithelial 547 cells might trigger different transcriptomic changes and thus could modulate their 548 interactions with themselves and with immune cells. In particular, squamous epithelial cells 549 could up-regulate ANXA1 and S100A8/A9 after SARS-CoV-2 infection, enhancing their 550 interactions with neutrophils and macrophages via the axes of ANXA1-FPR1 and 551 S100A8/A9-TLR4. The systemic up-regulation of ANXA1, FPR1, and S100A8/A9 in 552 immune cells from peripheral blood may indicate, at least partially, the molecular 553 mechanism of aberrant inflammation in severe COVID-19 patients. This hypothesis is 554 supported by a preliminary finding that small molecules targeting S100A8/A9 could inhibit compromised adaptive immune response in severe patients. Furthermore, the 561 megakaryocytes in PBMCs, followed by monocytes, exhibited higher interaction potentials 562 with epithelial and immune cells in BALF than adaptive immune cells ( Figure S4I ), 563 suggesting the critical roles of these cells in the pathogenesis of COVID-19. 564 565 Megakaryocytes and monocyte subsets as critical peripheral sources of 566 cytokine storms 567 With our large scale scRNA-seq dataset, we next sought to investigate whether any crucial 568 cell subtypes in peripheral blood contribute to the bulk of inflammatory cytokine production. 569 We first defined a cytokine score and inflammatory score for each cell based on the 570 expressions of the collected cytokine genes and reported inflammatory response genes 571 (Liberzon et al., 2015) (Table S6) , respectively, and used these two scores as indicators to 572 evaluate the levels of inflammatory cytokine storm for each cell. We found apparent 573 elevated expression of cytokine and inflammatory genes in patients, especially at the 574 severe progression stage ( Figures 6A and S6A T_CD8_c09-SLC4A10) and one subtype of megakaryocytes was detected with significantly 579 higher cytokine and inflammatory scores ( Figure. S9B, Table S2 , P < 0.0001), indicating 580 that these cells might be major sources of inflammatory storm. Interestingly, 581 megakaryocytes, which have not been reported in the inflammatory response in COVID-19 582 patients, may affect the functions of platelets at the disease stage, in consistent to a 583 previous study (Manne et al., 2020). 584 Each of the hyper-inflammatory subtypes highly expressed several cytokine genes that are 585 known to be involved in the inflammatory storm, such as CCL3, IL1B, CXCL8, CCL4, CCL6, 586 IL32, LTB and TGFB1, but with different patterns ( Figure 6B ), suggesting divergent 587 genomic signatures of these cells. We then investigated the proportion of each of the 7 cell 588 subtypes in every patient and found that these hyper-inflammatory cell subtypes were in 589 general slightly more frequent in patients at severe stage ( Figure. S6C ). When we clustered 590 these cell subtypes with each individual patient based on the proportions of the hyper-591 inflammatory cell subtype in PBMCs, we found distinct enrichment of these cell subtypes in 592 different groups of patients ( Figure 6C ). Mono_c1-CD14-CCL3, known be associated with 593 tocilizumab-responding cytokine storm (Guo et al., 2020a), was highly enriched in a 594 subpopulation of severe onset patients likely to be accompanied by inflammatory storm 595 ( Figures 6C and 6D ). The proportion of Mono_c1-CD14-CCL3 subtype was also correlated 596 with the age of the corresponding patients ( Figure. 6E) . The hyper-inflammatory 597 megakaryocytes were enriched in another batch of severe onset patients, which could also 598 be under excessive inflammatory response ( Figure. 6C and 6D) . 599 By contrast, Mono_c2-CD14-HLA-DPB1 and Mono_c3-CD14-VCAN subtypes were widely 600 distributed in every disease stage, and the hyper-inflammatory T cells showed decreased 601 proportions in patients at the severe onset stage such as T_CD4_c08-GZMK-FOS high 602 subtype ( Figures 6C, 6D and S6B ), although both of these two monocyte subtypes 603 exhibited increased proportions in elder convalescent patients ( Figure 6E ). Taken Next, we investigated the inflammatory signatures for each hyper-inflammatory cell subtype 612 and found unique pro-inflammatory cytokine gene expressions in each cell subtype ( Figure 613 6F), suggesting diverse mechanisms by which these cell subtypes may contribute to the 614 cytokine storm. The hyper-inflammatory Mono_c1-CD14-CCL3 and megakaryocytes largely 615 expressed more cytokines, suggesting central roles of the two cell types in driving the 616 inflammatory storm. Specifically, Mono_c1-CD14-CCL3 highly expressed CXCL8, TNF, 617 IL1RN, IL1B, and CCL3, which we also detected with significantly higher levels in serum 618 from patients at the severe stage, especially those critically ill patients ( Figures 6F and 619 S6D ). Although the inflammatory megakaryocytes highly expressed the cell type identity 620 marker genes such as PPBP (Zhang et al., 1997), the expression level of these genes was 621 significantly decreased in patients compared to healthy controls, indicating a loss of 622 function of these cells after inflammatory activation ( Figures 6F and 6G) . Notably, the 623 T_CD8_c06-TNF subtype specifically and highly expressed IFNG, a pro-inflammatory 624 cytokine highly enriched in patients at the severe onset stage also confirmed by serum 625 cytokine detection ( Figures 6F, G and S6D) . Moreover, pro-inflammatory cytokines CXCL8 626 and IFNG showed significant age-dependent expressions in patients with disease 627 progression, while no significance was observed in healthy controls ( Figure 6H ). PPBP 628 showed no correlation with the age in either patients or healthy controls, suggesting that the 629 loss of function of megakaryocytes might not be age-dependent ( Figure 6H ). To assess the 630 dynamic changes of cytokines in COVID-19 patients with different periods, we compared 631 them with healthy controls for these seven hyper-inflammatory subtypes, and found that 632 IFNG, IL6, CCL3, TNF, CXCL2, CXCL8, IL1RN, etc, were highly expressed in cells of 633 severe patients with disease progression ( Figure S6E ). 634 We also observed eight cell subtypes with significantly higher cytokine scores even though 635 their inflammatory scores showed no difference to other cell clusters ( Figure S6B , Table S7, 636 p < 0.0001). These cell subtypes exhibited uniform and relatively low expressions of 637 cytokine genes such as IGF1, TXLNA, SCYL1, CCL5 and IL16 ( Figure 6F ), likely not 638 involved in the cytokine storm. No significant differences were observed at the serum level 639 for these cytokines between the different groups of patients ( Figure S6F ). These genes 640 specific for hyper-inflammatory cells may serve as signatures for the inflammatory storm 641 and be helpful in deepening the understanding of COVID-19 pathogenesis. 642 Figure 7A ). Similar to our analysis on PBMCs, we identified five 653 hyper-inflammatory cell subtypes, including Macro_c2-CCL3L1, the three subtypes of 654 monocytes and the neutrophils ( Figure 7B ), suggesting that these cell subtypes might be 655 the major sources driving inflammatory storm in the lung tissue. Neither CD4+ nor CD8+ T 656 cells were detected with an elevated inflammatory score or the cytokine score in BALF 657 samples, which was different from those in PBMCs. Each hyper-inflammatory subtype 658 highly expressed specific cytokines; for example, Macro_c2-CCL3L1 specifically expressed 659 CCL8 infection rates and disease severity of COVID-19 patients. Our data, covering a wide age 702 range and a sex-balanced COVID-19 cohort, proved to be powerful at dissecting the 703 associations of age and sex in the immune responses to SARS-CoV-2 infection. Our data 704 revealed an apparent involvement of age and sex in the diverse human immune responses 705 via multiple mechanisms, at least partially reflected at the immune cell sub-cluster level. In 706 general, plasma B and proliferative T cells were associated with disease severity, while 707 compositional differences of the precursor cells of these adaptive immune cell types were 708 more prone to be influenced by sex and age seemed to impact more on naive and central 709 memory cells. Of note, age and sex also seemed to impact the diversity of TCR/BCR 710 repertoires for a wide range of T and B cells, which may have clinical implications. 711 712 The single-cell resolution of our data also enabled us to examine the in vivo potential host 713 cells of SARS-CoV-2 and the transcriptomic changes caused by SARS-CoV-2 infection. We 714 observed the presence of SARS-CoV-2 RNAs in multiple epithelial cell types in the human 715 respiratory tract, including ciliated, secretory, and squamous cells. Although prominent type 716 I interferon responses could be identified in these cells, distinct transcriptomic changes 717 appeared to be caused by SARS-CoV-2 infection. Such distinctions were exhibited not only 718 in the correlations of interferon responses and viral load, but also in the genes of specific 719 immune relevance, including those encoding LR interactions which are pivotal to cell-cell 720 communications. Of hundreds of immune-relevant LR pairs, ANXA1-FPR1 and S100A8/A9-721 TLR4 seemed to be critical in mediating the interactions of virus+ squamous epithelial cells 722 and neutrophils and macrophages. Although S100A8/A9 were not expressed in lymphoid 723 cells in mild/moderate COVID-19 patients, they were highly expressed in the T, B, and NK 724 cells of severe patients, likely contributing to the aberrant inflammation of these patients. 725 Coincidentally, small molecule inhibitors of S100A8/A9 could reduce the aberrant 726 inflammation and SARA-CoV-2 replication in mice (Guo et al., 2020b), supporting our 727 findings. Both S100A8/9 and FPR1 should be evaluated further as targets for modulating 728 the immune responses to SARS-CoV-2. 729 730 In addition to epithelial cells, RNAs of SARS-CoV-2 were also identified in various immune 731 cell types, including neutrophils, macrophages, plasma B cells, T and NK cells, often with 732 even higher levels than those in epithelial cells. The viral infection status of these cells 733 could also be supported by the prominent interferon responses in these cells. It is still not 734 clear how such immune cells would acquire viral sequences in the absence of either ACE2 735 or TMPRSS2, but it is evident that the pattern of SARS-CoV-2 infection is more complicated 736 than initial understanding. Such complexity needs to be thoroughly addressed before this 737 dreadful infectious disease can be effectively controlled. 738 739 The rich information of our data also allowed us to dissect the cellular origins of potential 740 cytokine storms. We found that megakaryocytes and a few monocyte subsets might be key 741 sources of a diverse set of cytokines highly elevated in COVID-19 patients with severe 742 disease progression. We suspect that in severe patients, infected epithelial cells would 743 secrete cytokines such as IL1RN into the peripheral, and monocytes expressing IL1R2 744 could be stimulated and in turn produce multiple proinflammatory cytokines such as CXCL8, 745 IL6, IL1B, and TNF ( Figure 7E ). Through IL1R2, these hyperactive monocytes could also 746 interact with dysfunctional megakaryocytes producing TGFB1, TNFSF4, PF4 and FTH1. 747 Meanwhile, the T cells in the blood go through lymphopenia, while the residual ones are 748 hyperactive in secreting many inflammatory cytokines such as IFNG and TNFSF8. Such 749 proinflammatory cytokines secreted by the cells in the blood could also infiltrate into the 750 lung tissue, and thus activating the tissue resident monocytes, macrophages and 751 neutrophils for further cytokine production. We acknowledge that this is only one of many 752 possible scenarios where an inflammatory storm could form, although our data revealed 753 key actors in the final cytokine screenplay. 754 755 In conclusion, we generated a large scRNA-seq dataset including ~1 Violin plots of selected marker genes (rows) for major cell subpopulations (columns) 1324 ordered by cell lineage relationships. NK, natural killer cells; Mono, monocyte; Macro E) t-SNE representations of integrated single-cell transcriptomes of 1,462,702 cells 1328 coloured by disease symptoms (D) and disease progression stages (E) See also Figure S1 Dynamic changes of B cell composition across disease conditions 1334 (A) Differences in immune cell composition across disease conditions for PBMC. Conditions 1335 are shown in different colors. Each boxplot represents one cell cluster. All differences with 1336 adjusted P-value < 0.05 are indicated; two-sided unpaired Wilcoxon was used for analysis. 1337 (B) Changes of XBP1+ plasma cells proportion across disease conditions. Composition of 1338 XBP1+ plasma cell BCRH cgene Differences of XBP1+ plasma cells clonal expansion and BCR diversity across disease 1341 conditions. BCR clonal expansion level is calculated by STARTRAC-expa. Shannon's 1342 entropy reveals the diversity of BCR repertoire. All differences with P-value < 0.05 are 1343 indicated; two-sided unpaired Wilcoxon was used for analysis Clonotypes 1345 of clones contain XBP1+ plasma cells (right); only shows clones with more than 5 cells. 1346 (E) Composition of B_c03-CD27-AIM2 memory cells BCRH cgene. 1347 (F) ANOVA of XBP1+ plasma cells proportion. 1348 (G) ANOVA of B_c03-CD27-AIM2 memory cells proportion (left) and differences of B_c03-1349 CD27-AIM2 memory cells proportion between male and female (right) See also Figure S2 Differences in T cell composition across disease conditions 1354 (A) Changes of proliferating CD4 and CD8 T cells across disease conditions for PBMC All differences with adjusted P-value < 0.05 are 1356 indicated; two-sided unpaired Wilcoxon was used for analysis Differences of three CD8 proliferating T cell sub clusters proportion across disease 1358 conditions. All differences with P-value < 0.05 are indicated; two-sided unpaired Wilcoxon 1359 was used for analysis Differences of two CD4 proliferating T cell sub clusters proportion across disease 1361 conditions. All differences with P-value < 0.05 are indicated; two-sided unpaired Wilcoxon 1362 was used for analysis Differences of T_CD4_c13-MKI67-CCL5 low proliferating cells clonal expansion and TCR 1364 diversity across disease conditions. TCR clonal expansion level is calculated by 1365 STARTRAC-expa. Shannon's entropy reveals the diversity of BCR repertoire. All 1366 differences with P-value < 0.05 are indicated; two-sided unpaired Wilcoxon was used for 1367 analysis Transition between T_CD4_c13-MKI67-CCL5 low proliferating cells and other CD4 cell 1369 sub clusters (left) and clonotypes of clones contain T_CD4_c13-MKI67-CCL5 low 1370 proliferating cells (right) All 1372 differences with P-value < 0.05 are indicated; two-sided unpaired Wilcoxon was used for 1373 analysis. 1374 (G) ANOVA of T_CD4_c13-MKI67-CCL5 low proliferating cells proportion. 1375 (H) ANOVA of T_c04_CD4-ANXA2 T cell proportion (left) and differences of ANXA2 T cell proportion between male and female (right). Two-sided unpaired Wilcoxon 1377 test See also Figure S2 Figure 4 Landscape Of Cell Types Detected SARS-Cov-2 Sequences and Their 1381 Antiviral Response Uniform Manifold Approximation and Projection (UMAP) of all cells with SARS-CoV-2 1383 genome UMI > 0 after quality control containing 3085 cells in total Characteristic markers we chose to identify each cell type. The purple box indicates 1385 immune cell types (top), and the red one indicates epithelial cell types (bottom) UMAP showing expression level of known SARS-CoV-2 infected receptor ACE2 (left) 1387 and TMPRSS2 (right) UMAP showing the viral load of each cell. The darker colors in the bar indicate a higher 1390 viral load in cells UMAP showing the activation of Interferon-stimulated genes (ISGs) in cells with viral 1392 detection Violin plots showing differential expression of ISGs between cells with viral detection 1394 (virus+) and cells without (virus-) in PBMC-derived neutrophils (left panel), BALF-derived 1395 neutrophils (middle panel) and squamous cells (right panel). The y axis represents the 1396 expression level of each gene Scatter plots showing the correlation between viral load and ISGs in neutrophil (left 1399 panel) and squamous cells (right panel). The line in scatter plots represent the result of 1400 linear regression. Each point in the graph represents one single cell The x axis shows virus load in each cell while the y axis represents the expression level of 1402 one of the ISG genes. Correlation coefficient (R) and probability (p) are acquired using 1403 Pearson See also Figures S3 and Tables S4 Boxplot showing the pseudo space distance within squamous cells. Each dot represents 1418 an individual patient. Two-sided paired Wilcoxon test Violin plot showing the pseudo space distance within each type of epithelial cells in one 1420 example. Two-sided unpaired Wilcoxon test Boxplot showing the median of pseudo space distance within each type of epithelial 1422 cells of all the patients with BALF data. Each dot represents an individual patient. Two-1423 sided unpaired Wilcoxon test Boxplot showing the normalized connection between squamous cells and virus-detected 1425 plasma B cells of all the patients with BALF data. Each dot represents an individual patient. 1426 Two-sided unpaired Wilcoxon test Pie chart showing the ligand-receptor contribution proportion between virus+ squamous 1428 and Macro_c6-VCAN in one example. Ligand-receptor pairs with contribution less than 0.05 1429 were merged as 'Other LRs Boxplot showing the normalized connection between squamous cells and virus-1431 macrophage (left), virus+ macrophage (middle) and virus+ neutrophils (left) Mono_c1-CD14-CCL3 and megakaryocytes in peripheral blood 1436 appear as dominant source for inflammatory cytokine storm A) t-SNE plots of PBMC cells colored by major cell types (top left panel), inflammatory cell 1438 types (top right panel), cytokine score (middle panel) and inflammatory score Violin plots of selected cytokine genes for seven hyper-inflammatory cell subtypes. 1441 (C) Heatmap of an unsupervised clustering of cell proportion of seven hyper-inflammatory 1442 cell subtypes in all samples analyzed. 1443 (D) Box plots of the cell proportion of Mono_c1-CD14-CCL3, Mega and T_CD4_c08-GZMK-1444 FOS high clusters from healthy controls (n=20), moderate convalescent (n=48), moderate 1445 onset (n=18), severe convalescent (n=35) and severe onset (n=38) patients Ordinary least squares model of age to cell proportion of Mono_c1-CD14-CCL3 convalescent (n=83) and onset (n=56) patients. P value was assessed with F-1450 statistic for ordinary least squares model. 1451 (F) Heatmap of cytokines genes' expression among seven hyper-inflammatory cell 1452 subtypes. Seven hyper-inflammatory cell subtypes are colored in red and others are 1453 colored in grey. 1454 (G) Box plots of the cytokines T_CD8_c06-TNF clusters from healthy controls (n=20), moderate convalescent (n=48), 1456 moderate onset (n=18), severe convalescent (n=35) and severe onset (n=38) patients. 1457 Two-sided Wilcoxon rank-sum test Ordinary least squares model of age to cytokines' expression of Mono_c1-CD14-CCL3 Mega and T_CD8_c06-TNF clusters from healthy controls (n=20), convalescent (n=48+35) 1460 and onset (n=18+38) patients. P value was assessed with F-statistic for ordinary least 1461 squares model the box represents the second, third quartiles and median, whiskers each 1463 extend 1.5 times the interquartile range; dots represent outliers Mono_c1, Mono_c2, Mono_c3, T_CD4_c08, T_CD8_c09, T_CD8_c06 and Mega 1465 correspond to Mono_c1-CD14-CCL3, Mono_c2-CD14-HLA-DPB1 T_CD8_c09-SLC4A10, T_CD8_c06-TNF and Mega, 1467 respectively. In panel (F), T_CD4_c11, T_CD8_c03, T_CD8_c04, T_CD8_c05 T_CD8_c04-COTL1, T_CD8_c05-ZNF683 T_CD8_c08-IL2RB and NK_c01-FCGR3A, respectively. DC, dendritic 1471 cells. Mega, megakaryocytes. Mono, monocytes Figure 7. The interactions of hyper-inflammatory cell subtypes in lung and 1474 peripheral blood A) t-SNE plots of BALF cells colored by major cell types (top panel), cytokine score (middle 1476 panel) and inflammatory score Significance was evaluated with Wilcoxon rank-sum test. **** P < 0.0001. 1479 (C) Heatmap of an unsupervised clustering of cytokine genes' expression among five 1480 hyper-inflammatory cell subtypes The outer ring 1483 displays color coded cell types and the inner ring represents the involved ligand-receptor 1484 interacting pairs. The line width and arrow width are proportional to the log fold change 1485 between severe onset and moderate onset patient groups in ligand and receptor, 1486 respectively. Colors and types of lines are used to indicate different types of interactions as 1487 shown in the legend. The bar plot at bottom indicates the interaction score for each 1488 interaction Summary illustration depicting the potential cytokine/receptor interactions of hyper-1490 inflammatory cell subtypes involved in the cytokine storm Epi, epithelial cells. Macro, macrophage cells. Mono, monocytes. Neu, 1492 neutrophils Basic information of the dataset quality and cell subsets in major 1497 cell lineages, Related to Figure 1 1498 (A) Sorted age span of donors color-coded by disease symptoms. 1499 (B) Distribution of sex in donors with different disease symptoms. Chi-square test. 1500 (C-E) Distribution of unique molecular identifier (UMI) count per cell (C), gene count per cell 1501 (D), and percentage of mitochondrial transcripts per cell (E) detected for cells pleural effusion/sputum. 1504 (F-J) Violin plots of selected marker genes (rows) for cell subsets (columns) within each cell 1505 lineage, including 6 B/plasma B cell clusters (F), 23 Myeloid cell clusters (G), 3 NK cell 1506 clusters (H), 4 Epithelial cell clusters (I) and 28 T cell clusters (J). 1507 1508 1532 (A) 2D pseudo space calculated by CSOmap, showing the location of ciliated cells Violin plot showing the distance calculated from space shown in (A) within each ciliated 1535 cell group. Two-sided unpaired Wilcoxon test Violin plot showing the distance within each squamous cell group. Two-sided unpaired 1537 Wilcoxon test Bar plot showing the mean of normalized connections of the interaction between virus+ 1539 secretory and Macro_c1-C1QC in patients categorized by two states. Error bar, s.e.m. 1540 (E) Pie chart showing the ligand-receptor contribution proportion between virus+ secretory 1541 and Macro_c6-VCAN in one example Dotplot showing the mean expression level of MARCO in BALF samples. Pct, 1544 percentage of expressed cells Boxplot of normalized connection between major cell types and ciliated (top), secretory 1546 (middle) and squamous (bottom) cells with viral detection. Kruskal-Wallis Rank Sum Test. 1547 (H) Dot plots showing the expression of S100A9(left) and S100A8 (right) in PBMC samples. 1548 Each dot is colored by the means of the expression and sized by the scaled means (Z 1549 scores) Boxplot of normalized connection between PBMC-derived cell types and BALF. Each dot 1551 represents a sample. Kruskal-Wallis Rank Sum Test. 1552 1553 1555 (A) Dot plots showing the expression of ANXA1 (top), FPR1 (middle) and TLR4 (bottom) in 1556 PBMC samples Dot plots showing the expression of ANXA1 (first panel), FPR1 (second panel) S100A9 1558 (third panel), S100A8 (fourth panel) and TLR4 (bottom panel) in BALF samples. 1559 Each dot is colored by the means of the expression and sized by the scaled means Identification of hyper-inflammatory subtypes associated with 1563 cytokine storm in PBMCs 1564 (A) t-SNE plots of PBMC cells colored by cytokine score (top panel) and inflammatory score 1565 severe onset (color, 1567 n=38) and average of all samples (n=159) (top panel); the inflammatory score (middle panel) 1568 and cytokine score (bottom panel) of subtypes from healthy controls (n=20), moderate 1569 convalescent (n=48), moderate onset (n=18), severe convalescent (n=35) and severe onset 1570 (n=38) patients. Significance was evaluated with Mann-Whitney rank test for each subtype 1571 versus all the other subtypes. **** P < 0.0001. 1572 (C) Barplots of subtypes' (seven hyper-inflammatory cell types, eight cytokine cell types and 1573 others) frequencies across each individual samples from healthy controls (n=20) Bar graphs showing cytokine concentration at the serum levels of CCL3, IFNG, IL1RN 1577 and TNF from healthy controls (n=5), convalescent (n=7), non-severe (n=4), severe (n=4), 1578 death case (n=7) patients The triangle 1581 represents severe onset versus healthy controls. Circle stands for moderate onset versus 1582 healthy controls. The square stands for convalescent versus healthy controls. All rings in 1583 the plot from the inside to the outside represent the range of P value Bar graphs showing cytokine concentration at the serum levels of CCL5 and IL16 from 1588 healthy controls (n=5), convalescent (n=7), non-severe (n=4), severe (n=4), death case 1589 (n=7) patients. Kruskal-Wallis H-test between non-severe, severe and death case. 1590 In panel (D) and panel (F), all points are shown and bars represent mean with the 95% 1591 confidence intervals. DC, dendritic cells Intercellular interaction alterations among cell types between 1594 severe and moderate onset sample groups Circos plot showing the prioritized interactions mediated by ligand-receptor pairs 1596 between inflammation-related cell subtypes for each tissue, namely The outer ring displays color coded cell types and the inner ring 1598 represents the involved ligand-receptor interacting pairs. The line width and arrow width are 1599 proportional to the log fold change between severe onset and moderate onset patient 1600 groups in ligand and receptor, respectively. Colors and types of lines are used to indicate 1601 different types of interactions as shown in the legend. The barplot at bottom indicates the 1602 interaction score for each ligand-receptor interaction Epi, epithelial cells. Macro, macrophage cells. Mono, monocytes. Neu, 1605 neutrophils. Mega, megakaryocytes Yan, X., Li, F., Wang, X., Yan, J., Zhu, F., Tang, S., Deng, Y., Wang, H., Chen, R., Yu, Z., 990 et al. (2020) . Neutrophil to lymphocyte ratio as prognostic and predictive factor in patients 991 with coronavirus disease 2019: A retrospective cross-sectional study. J Med Virol. 992 21 Yang, Y., Shen, C., Li, J., Yuan, J., Wei, J., Huang, F., Wang, F., Li, G., Li, Y., Xing, L., et al. 993 (2020). Plasma IP-10 and MCP-3 levels are highly associated with disease severity and 994predict the progression of COVID-19. J Allergy Clin Immunol 146, 119-127 e114. 995 Yu, G., Wang, L.G., Han, Y., and He, Q.Y. (2012) years old) and the elderly (70+ years old). Interactions between variables were regarded as 1200 significantly associated with cell type proportions when FDR < 0.05. 1201 To investigate the impact of virus infection on epithelial cells, we identify differential 1203 expressed genes by performing two-sided unpaired Wilcoxon tests on all the expressed 1204 genes (expressed in at least 10% cells in either group of cells). P values were adjusted 1205following Benjamini & Hochberg protocol. Top 100 highly expressed genes of each group 1206were shown in the volcano plots. Based on these genes, enriched GO terms were then 1207 acquired for each group of cells using R package clusterProfiler (Yu et al., 2012) . 1208 To illustrate the cell-cell interaction potential of cells with viral detection, we first created a 1210 set of datasets by joining 7 BALF samples with the virus+ dataset separately. Then, we 1211used CSOmap (Ren et al., 2020) to construct a 3D pseudo space and calculate the 1212 significant interaction for each dataset. To investigate the interaction potentials of the cell 1213 types, we used two indexes, distances within cell type and normalized connection. Distance 1214 within each cell type is calculated based on the aforementioned 3D coordinates. The 1215shorter the distance, the closer the cells are located in the 3D space, which indicates that 1216 they are more likely to interact with each other. To further investigate the interaction 1217 between different cell types, we made use of the CSOmap output connection matrix. For a 1218 cluster pair, normalized connection was calculated by dividing its corresponding connection 1219 value by the product of their respective cell numbers. Normalized connections were then 1220 multiplied by 10,000. Meanwhile, to highlight the key ligand-receptor pairs function in the 1221 interaction, we also examine the contribution output by CSOmap. 1222In addition, normalized connections were also calculated on another set of cohorts where 1223we combined virus+ dataset with samples with paired PBMC and BALF tissues, in order to 1224 investigate the interaction potential between cells from two tissues, PBMC and BALF. 1225Inflammatory and cytokine score related subtypes analysis. 1226 Briefly, we firstly filtered out samples with less than 1000 cells. For PBMC, only subtypes 1227 with more than 1000 cells were included in the subsequent analysis. For BALF data 1228 analysis, we removed major cell types with less than 500 cells. To define inflammatory and 1229 cytokine score, we downloaded a gene set termed 1230 'HALLMARK_INFLAMMATORY_RESPONSE' from MSigDB (PMID: 26771021) and 1231 28 collected cytokine genes based on these references (see Table S1 ). Cytokine and 1232inflammatory score were evaluated with the AddModuleScore function built in the Seurat 1233 (PMID: 31178118). To select the most promising hyper-inflammatory cell types, we 1234performed Mann-Whitney rank test (single-tail) for each subtype's score versus all the other 1235 subtypes' score. Seven subtypes (Mono_c1-CD14-CCL3, Mono_c2-CD14-HLA-DPB1, 1236 Mono_c3-CD14-VCAN, T_CD4_c08-GZMK-FOS high , T_CD8_c06-TNF, T_CD8_c09-1237 SLC4A10 and Mega) in PBMC were defined as hyper-inflammatory cell types with 1238 significantly statistical parameters (P < 0.0001) in both cytokine and inflammatory score. In 1239 addition, we defined 8 subtypes (T_CD8_c08-IL2RB, T_CD4_c11-GNLY, NK_c01-FCGR3A, 1240T_CD8_c05-ZNF683, T_CD8_c04-COTL1, T_CD8_c07-TYROBP, T_CD8_c03-GZMK and 1241T_gdT_c14 1257 To identify and visualize the possible cell-cell interactions in terms of cytokine storm 1258 between the highly inflammation-correlated cell types evaluated by the inflammation score 1259 within each tissue and the crosstalk between lung and circulating blood, we employed an R 1260 package iTALK introduced by Wang et al. (Wang et al., 2019, bioRxiv, 1261 https://www.biorxiv.org/content/10.1101/507871v1). Cytokine/chemokine category (n = 320) 1262 in the ligand-receptor database was selectively used for our purpose. Wilcoxon rank sum 1263 test was used to identify the differentially expressed genes (DEGs) between severe onset 1264 and moderate onset patient groups for each cell type. DEGs were then matched and paired 1265 against the ligand-receptor database to construct a putative cell-cell communication 1266network. An interaction score defined as the product of the log fold change of ligand and 1267 receptor was used to rank these interactions. In addition, the expression level of both ligand 1268 and receptor were also considered. We defined severe gained interaction if a ligand gene 1269 was upregulated in severe onset group and its paired gene upregulated or remains no 1270 change. We defined severe lost interaction if a ligand(receptor) gene was downregulated in 1271severe onset group regardless of the expression level of its paired gene. 1272 Cytokine analysis of serum by using multiplex bead-based immunoassay 1273 Human cytokines in the serum were measured by Bio-plex Pro TM Human Cytokine 1274Screening 48 plex Bio-PlexTM 200 System (# 12007283, Bio-Rad, US) and Bio-PlexTM 1275200 System (Bio-Rad). Bio-Plex ProTM assays are essentially immunoassays formatted on 1276 magnetic beads and are built upon three core elements of xMAP technology, fluorescently 1277 29 dyed microspheres (also called beads), a dedicated flow cytometer with two lasers to 1278 measure the different molecules bound to the surface of the beads, and a high-speed digital 1279 signal processor that efficiently manages the fluorescence data. 1280 1281Sample preparation 1282Whole blood from COVID-19 patients and healthy controls were drawn into collection tubes 1283 containing anticoagulant. Centrifugation the tubes at 1,000 x g for 15 min at 4°C and 1284transfer the serum to a clean polyprophylene tube, followed by another centrifugation at 1285 10,000 x g for 10 min at 4°C to completely remove platelets and precipitates. The raw sequencing and processed gene expression data in this paper have been 1312 deposited into GSA (Genome Sequence Archive in BIG Data Center, Beijing Institute of 1313Genomics, Chinese Academy of Sciences) and the NCBI GEO database, respectively. 1314Visualization of this dataset can be found at http://covid19.cancer-pku.cn. 1315 1316