key: cord-303934-8gh3q7p3
authors: Sungnak, Waradon; Huang, Ni; Bécavin, Christophe; Berg, Marijn; Network, HCA Lung Biological
title: SARS-CoV-2 Entry Genes Are Most Highly Expressed in Nasal Goblet and Ciliated Cells within Human Airways
date: 2020-03-13
journal: Nature medicine
DOI: 10.1038/s41591-020-0868-6
sha:
doc_id: 303934
cord_uid: 8gh3q7p3

The SARS-CoV-2 coronavirus, the etiologic agent responsible for COVID-19 coronavirus disease, is a global threat. To better understand viral tropism, we assessed the RNA expression of the coronavirus receptor, ACE2, as well as the viral S protein priming protease TMPRSS2 thought to govern viral entry in single-cell RNA-sequencing (scRNA-seq) datasets from healthy individuals generated by the Human Cell Atlas consortium. We found that ACE2, as well as the protease TMPRSS2, are differentially expressed in respiratory and gut epithelial cells. In-depth analysis of epithelial cells in the respiratory tree reveals that nasal epithelial cells, specifically goblet/secretory cells and ciliated cells, display the highest ACE2 expression of all the epithelial cells analyzed. The skewed expression of viral receptors/entry-associated proteins towards the upper airway may be correlated with enhanced transmissivity. Finally, we showed that many of the top genes associated with ACE2 airway epithelial expression are innate immune-associated, antiviral genes, highly enriched in the nasal epithelial cells. This association with immune pathways might have clinical implications for the course of infection and viral pathology, and highlights the specific significance of nasal epithelia in viral infection. Our findings underscore the importance of the availability of the Human Cell Atlas as a reference dataset. In this instance, analysis of the compendium of data points to a particularly relevant role for nasal goblet and ciliated cells as early viral targets and potential reservoirs of SARS-CoV-2 infection. 
This, in turn, serves as a biological framework for dissecting viral transmission and developing clinical strategies for prevention and therapy.

In December 2019, a cluster of atypical pneumonia associated with a novel coronavirus was detected in Wuhan, China 1 . This coronavirus disease, termed COVID-19, was caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2; previously termed 2019-nCoV) 2 . The virus has since spread worldwide, emerging as a serious global health concern in early 2020 3, 4 . Human-to-human transmission of the virus has been reported in several instances [5] [6] [7] and is thought to have occurred since mid-December 2019 8 . As of early March 2020, there were more than 100,000 confirmed COVID-19 cases 4 .

Patients with suspected COVID-19 have been treated in the Wuhan Jin Yintan Hospital since Dec 31st, 2019 9 . In a meta-analysis of 50,466 hospitalized patients with COVID-19 from 10 studies, most patients were from China and the average age in the included studies ranged from 41 to 56 years old 10 . The prevalence rates of fever, cough, and muscle soreness or fatigue were 89.1%, 72.2%, and 42.5%. Critical illness requiring admission to an intensive care unit occurred in 18.1% of patients, and 14.8% developed acute respiratory distress syndrome (ARDS) 10 . Acute renal injury and septic shock have been observed in 4% and 5% of patients hospitalized with COVID-19, respectively 1,9 . Chest imaging demonstrated bilateral pneumonia involvement in more than 80% of cases 1, 9, 11 . Ground-glass opacities were the most common radiologic finding on chest computed tomography (CT) 11, 12 . Abnormalities on CT were also observed preceding symptom onset in patients exposed to infected individuals, with an incidence of 93% 10, 11 . Pathological evaluation of a patient who died of severe disease revealed diffuse alveolar damage consistent with ARDS 13 . Currently, the estimated mortality rate is 3.4% 14 . 
These clinical data underscore the severity of this infection. The involvement of both lungs in most of the cases suggests viral dissemination after initial infection. Viral RNA was detected in the upper airways from symptomatic patients, with higher viral loads observed in nasal swabs compared to those obtained from the throat 15 . Similar viral loads were observed in an asymptomatic patient 15 , indicating that the nasal epithelium is an important portal for initial infection, and may serve as a key reservoir for viral spread across the respiratory mucosa and an important locus mediating viral transmission. Identification of the cells hosting viral entry and permitting viral replication as well as those contributing to inflammation and disease pathology is essential to improve diagnostic and therapeutic interventions. Cellular entry of coronaviruses depends on the binding of the spike (S) protein to a specific cellular receptor and subsequent S protein priming by cellular proteases. Similar to severe acute respiratory syndrome-associated coronavirus (SARS-CoV) 16, 17 , the SARS-CoV-2 employs angiotensin-converting enzyme-2 (ACE2) as a receptor for cellular entry. In addition, studies have shown that the serine protease TMPRSS2 can prime S protein 15, 18 although other proteases like cathepsin B/L can also be involved 18 . For SARS, the binding affinity between the S protein and the ACE2 receptor was found to be a major determinant of viral replication rates and disease severity 19 . The SARS-CoV-2 has been shown to infect and replicate in Vero cells, a Cercopithecus aethiops (old world monkey) kidney epithelial cell line, and huh7 cells, a human hepatocarcinoma cell line 15 . The BHK21 cell line has been shown to facilitate viral entry by the SARS-CoV-2 S protein only when engineered to express the ACE2 receptor ectopically 18 . 
In addition, viral entry was found to depend on TMPRSS2 activity, although cathepsin B/L activity might substitute for the loss of TMPRSS2 18 .

The in vivo expression of ACE2 and TMPRSS2 (as well as other candidate proteases) by cells of the upper and lower airways and alveoli must be defined. Previously, gene expression of ACE2 and TMPRSS2 has been reported to occur largely in type-2 alveolar (AT-2) epithelial cells 15 , which are central to SARS-CoV pathogenesis. A study reported that ACE2 expression is absent from the upper airways 20 . The rapid spread of SARS-CoV-2 suggests efficient human-to-human transmission, which would, in turn, seem to argue against dependence on alveolar epithelial cells as the primary point of entry and viral replication 8, 21, 22 . Indeed, protein expression, based on immunohistochemistry, of ACE2 and TMPRSS2 has been reported in both nasal and bronchial epithelium 23 . To clarify the expression patterns of ACE2 and TMPRSS2 and analyze the expression of the other potential genes associated with SARS-CoV-2 pathogenesis at cellular resolution, we interrogated single-cell transcriptome expression data from published scRNA-seq datasets from healthy donors generated by the Human Cell Atlas consortium 24 .

We investigated the gene expression of ACE2 in multiple scRNA-seq datasets from different tissues, including those of the respiratory tree 25 , ileum 26 , colon 27 , liver 28 , placenta/decidua 29 , kidney 30 , testis 31 , pancreas 32 , and prostate gland 33 . While scRNA-seq is a comprehensive assay, we note that some studies may still miss specific cell types, due to their rarity, challenges associated with their isolation, or the analysis methodology used. Thus, while positive (presence) results are highly reliable, absence should be interpreted with care. The expression of ACE2, in general, is relatively low in all of the datasets analyzed. 
Consistent with independent analyses 34 , we found that ACE2 is expressed in lung, airways, ileum, colon, and kidney (Fig. 1a; first column). It is worth noting that TMPRSS2, the primary protease important for viral entry, is highly expressed with a broader distribution (Fig. 1a; second column), suggesting that ACE2, rather than TMPRSS2, may be a limiting factor for viral entry at the initial stage of infection. When taking into account the expression of both genes, the cells found in mucosal epithelia in the respiratory tree, ileum, and colon are ACE2+ (Fig. 1a; third column), consistent with viral transmission by respiratory droplets, and the potential of fecal-oral transmission 35 . We also assessed ACE2 and TMPRSS2 expression in developmental datasets from fetal liver, fetal thymus, fetal skin, fetal bone marrow and fetal yolk sac 36, 37 and found little to no expression of ACE2 and no co-expression with TMPRSS2 (data not shown), although expression of ACE2 alone is noticeable in certain cell types in placenta/decidua (Fig. 1a). While we cannot rule out the possibility that the virus uses alternative proteases for entry in such contexts, or that lung fetal tissue expresses the relevant genes, these results are at least consistent with early reports that fail to detect evidence of intrauterine infection through vertical transmission in women who develop COVID-19 pneumonia in late pregnancy 38 . If future epidemiologic data are consistent with a lack of vertical viral transmission, these findings may form the basis of an explanatory model for the clinical finding. However, if future evidence for vertical transmission emerges, additional scRNA-seq data can be collected and further scrutinized for the presence of rare co-expressers or alternative receptors or proteases. 
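The co-expression reasoning above (which cell types contain cells positive for both ACE2 and TMPRSS2) can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function name `coexpression_by_cell_type`, the input layout (one `(cell_type, counts)` pair per cell), the detection threshold of one or more counts, and the toy cells are all assumptions made for illustration.

```python
from collections import defaultdict


def coexpression_by_cell_type(cells, receptor="ACE2", protease="TMPRSS2"):
    """Summarise receptor/protease detection per cell type.

    `cells` is an iterable of (cell_type, counts) pairs, where `counts`
    maps a gene symbol to the raw count in one cell. A gene is called
    "detected" when its count is > 0 (an assumed threshold).
    """
    tally = defaultdict(lambda: {"n": 0, "receptor": 0, "both": 0})
    for cell_type, counts in cells:
        t = tally[cell_type]
        has_receptor = counts.get(receptor, 0) > 0
        has_protease = counts.get(protease, 0) > 0
        t["n"] += 1
        t["receptor"] += int(has_receptor)
        t["both"] += int(has_receptor and has_protease)

    summary = {}
    for cell_type, t in tally.items():
        summary[cell_type] = {
            "n_cells": t["n"],
            # Fraction of all cells of this type with the receptor detected.
            "frac_receptor": t["receptor"] / t["n"],
            # Fraction of all cells of this type with both genes detected.
            "frac_both": t["both"] / t["n"],
            # Share of receptor-positive cells that also express the protease;
            # None when no receptor-positive cell was observed.
            "frac_protease_given_receptor": (
                t["both"] / t["receptor"] if t["receptor"] else None
            ),
        }
    return summary


# Invented toy data, not taken from any of the datasets in the paper.
toy_cells = [
    ("nasal goblet", {"ACE2": 2, "TMPRSS2": 1}),
    ("nasal goblet", {"ACE2": 1}),
    ("nasal goblet", {"TMPRSS2": 3}),
    ("nasal goblet", {}),
    ("fibroblast", {"TMPRSS2": 1}),
    ("fibroblast", {}),
]
result = coexpression_by_cell_type(toy_cells)
for cell_type, stats in result.items():
    print(cell_type, stats)
```

Because absence in scRNA-seq can reflect dropout rather than true lack of expression, a `None` or zero fraction from a summary like this should be read with the same caution the text applies to negative results.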
Nasal goblet and ciliated cells display the highest expression of ACE2 within the larger population of respiratory epithelial cells

To further characterize specific epithelial cell types expressing ACE2, we evaluated the expression of ACE2 within lung/airway epithelia from a previous study 25 . We found that, despite a low level of expression overall, ACE2 is expressed in multiple epithelial cell types across the airway, as well as in AT-2 cells in the parenchyma, consistent with previous studies 20, 39 . Importantly, nasal epithelial cells, including two previously described clusters of goblet cells and one cluster of ciliated cells, have the highest expression among all investigated cells in the respiratory tree (Fig. 1b; left panel). We confirmed enriched ACE2 expression in nasal epithelial cells from a second scRNA-seq study, which, in addition to nasal brushing samples seen in the earlier dataset, included nasal biopsies 40 . The results were consistent: we found the highest expression of ACE2 in nasal secretory cells (equivalent to the two goblet cell clusters in the previous dataset) and ciliated cells (Fig. 1b; right panel). In addition, scRNA-seq data from an in vitro 3D epithelial regeneration system from nasal epithelial cells 41 corroborated the expression of ACE2 in goblet/secretory cells and ciliated cells in these air-liquid interface (ALI) cultures (Extended Data Fig. 1). Of note, the differentiating cells in ALI acquire progressively more ACE2 and, unlike their corresponding progenitors, they have large luminal surfaces in the mature differentiated epithelium where viral entry is likely to occur (Extended Data Fig. 1). These results also suggest that such an in vitro culture system is biologically relevant to the study of viral pathogenesis.

We also investigated the expression of known proteases associated with the entry of SARS-CoV and SARS-CoV-2. 
TMPRSS2, which was shown to be important for SARS-CoV/SARS-CoV-2 viral entry and SARS-CoV transmission [16] [17] [18] , is expressed in only a subset of ACE2+ cells (Extended Data Fig. 2), suggesting that the virus might use alternative pathways for entry. In fact, it was previously shown that SARS-CoV-2 could enter TMPRSS2-negative cells using cathepsin B/L 18 . Indeed, we found that cathepsins B and L are much more promiscuously expressed than TMPRSS2, especially cathepsin B, which is expressed in more than 70%-90% of ACE2+ cells (Extended Data Fig. 2). However, whether cathepsin B/L can functionally replace TMPRSS2 has not been empirically determined. In the case of SARS-CoV, TMPRSS2 activity is documented to be important for viral transmission 42, 43 .

[...] 49 , with the skewed expression towards lower airway/lung parenchyma (Fig. 2a). Therefore, our data highlight the possibility that viral transmissivity is dependent on receptor accessibility based on spatial distribution along the respiratory tract.

Expression of genes associated with ACE2 expression: innate immunity and carbohydrate metabolism

To gain more insight into the expression patterns of genes associated with ACE2, we performed Spearman correlation analysis with Benjamini-Hochberg-adjusted p-values on genes associated with ACE2 across all cells within the lung epithelial cell dataset 25 . While the correlation coefficients are relatively low (< 0.11), likely due to low expression of ACE2, the expression pattern of the top 50 ACE2-correlated genes (all with adjusted p-value close to 0; ranked by correlation coefficients) across the respiratory tree is similar to that of ACE2, with a skewed expression toward upper airway (Fig. 2b) [...] Table 1 ). These genes have the highest expression in nasal goblet 2 cells (Fig. 2b), consistent with the phenotype previously described 25 . Nonetheless, nasal goblet 1 and nasal ciliated 2 cells also significantly express these genes, but less so elsewhere (Fig. 2b). 
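The correlation analysis described above (Spearman correlation of each gene with ACE2 across cells, followed by Benjamini-Hochberg adjustment and ranking by coefficient) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the toy expression values, and the gene labels `GENE_A`, `GENE_B`, `GENE_C` are hypothetical, and the p-value uses a large-sample normal approximation, which may differ from whatever test the study used.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho (Pearson correlation of ranks) and an approximate p-value."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0, 1.0  # a constant gene carries no rank information
    rho = cov / math.sqrt(vx * vy)
    # Assumed approximation: under the null, rho * sqrt(n - 1) ~ N(0, 1).
    z = abs(rho) * math.sqrt(n - 1)
    return rho, 2 * (1 - NormalDist().cdf(z))


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_genes_against(target, expression):
    """Rank genes by Spearman rho with `target`.

    `expression` maps gene -> per-cell values, all in the same cell order.
    Returns (gene, rho, adjusted_p) rows sorted by decreasing rho.
    """
    genes = [g for g in expression if g != target]
    stats = [spearman(expression[target], expression[g]) for g in genes]
    adjusted = benjamini_hochberg([p for _, p in stats])
    rows = [(g, rho, q) for g, (rho, _), q in zip(genes, stats, adjusted)]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Invented toy matrix with hypothetical gene labels.
toy = {
    "ACE2": [0, 1, 2, 3, 4, 5],
    "GENE_A": [0, 2, 4, 6, 8, 10],
    "GENE_B": [5, 4, 3, 2, 1, 0],
    "GENE_C": [1, 1, 1, 1, 1, 1],
}
ranked = rank_genes_against("ACE2", toy)
for gene, rho, q in ranked:
    print(gene, round(rho, 3), round(q, 4))
```

With sparse single-cell counts most cells tie at zero, which is one reason coefficients stay small even when the adjusted p-values are extreme, so ranking by coefficient after filtering on adjusted p-value, as the text describes, is a reasonable way to order candidates.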
Given their environmental exposure and the high expression of receptor/receptor-associated enzymes (Fig. 2a), it is plausible that the nasal epithelial cells were conditioned and primed to express these immune-associated genes to reduce viral susceptibility. This association with innate immune pathways not only highlights the importance of host-microbe dynamics in nasal epithelia, but it may also have implications for subsequent viral pathogenesis and immune-associated protection/pathology.

In this study, we explored multiple scRNA-seq datasets generated within the HCA consortium, and found that SARS-CoV-2 entry receptor ACE2 is more highly expressed [...]

The datasets were retrieved from existing sources based on previously published data as specified in the references. We retained the cell clustering when available or reprocessed using scanpy 50 and harmony 51 , and annotated the clusters with marker genes and cell type nomenclature based on the respective studies. Illustration of the results was generated using scanpy 50 and Seurat 52 .

[...] is a co-founder and equity holder of Celsius Therapeutics, an equity holder in Immunitas, and an SAB member of ThermoFisher Scientific, Syros Pharmaceuticals, Asimov, and Neogene Therapeutics. O.R. is a coinventor on patent applications filed by the Broad Institute to inventions relating to single cell genomics applications, such as in PCT/US2018/060860 and US Provisional Application No. 62/745,259. A.K.S. 
reports compensation for consulting and/or SAB membership from Merck

Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Naming the coronavirus disease (COVID-19) and the virus that causes it
A Novel Coronavirus from Patients with Pneumonia in China
Covid-19: UK records first death, as world's cases exceed 100 000
Importation and Human-to-Human Transmission of a Novel Coronavirus in Vietnam
Transmission of 2019-nCoV Infection from an Asymptomatic Contact in Germany
A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Clinical characteristics of 50466 hospitalized patients with 2019-nCoV infection
Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study
Clinical Characteristics of Coronavirus Disease 2019 in China
Pathological findings of COVID-19 associated with acute respiratory distress syndrome. The Lancet. Respiratory medicine
World Health Organization. WHO Director-General's opening remarks at the media briefing on COVID-19 -3
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus
Efficient activation of the severe acute respiratory syndrome coronavirus spike protein by the transmembrane protease TMPRSS2
The novel coronavirus 2019 (2019-nCoV) uses the SARS-coronavirus receptor ACE2 and the cellular protease TMPRSS2 for entry into target cells. bioRxiv
Receptor and viral determinants of SARS-coronavirus adaptation to human ACE2
Tissue distribution of ACE2 protein, the functional receptor for SARS coronavirus. A first step in understanding SARS pathogenesis
Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures
Euro surveillance : bulletin Europeen sur les maladies transmissibles
Influenza and SARS-coronavirus activating proteases TMPRSS2 and HAT are expressed at multiple sites in human respiratory and gastrointestinal tracts
A cellular census of human lungs identifies novel cell states in health and in asthma
Single-Cell Analysis of Crohn's Disease Lesions Identifies a Pathogenic Cellular Module Associated with Resistance to Anti-TNF Therapy
Intra- and Inter-cellular Rewiring of the Human Colon during Ulcerative Colitis
Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations
Single-cell reconstruction of the early maternal-fetal interface in humans
Spatiotemporal immune zonation of the human kidney
The adult human testis transcriptional cell atlas
A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure
A Cellular Anatomy of the Normal Adult Human Prostate and Prostatic Urethra
Single cell RNA sequencing of 13 human tissues identify cell types and receptors of human coronaviruses. bioRxiv
Molecular and serological investigation of 2019-nCoV infected patients: implication of multiple shedding routes. Emerging microbes & infections 9
Decoding human fetal liver haematopoiesis
A cell atlas of human thymic development defines T cell repertoire formation
Clinical characteristics and intrauterine vertical transmission potential of COVID-19 infection in nine pregnant women: a retrospective review of medical records
Single-cell RNA expression profiling of ACE2, the putative receptor of Wuhan 2019-nCov. bioRxiv
A single-cell atlas of the human healthy airways. bioRxiv
Novel dynamics of human mucociliary differentiation revealed by single-cell RNA sequencing of nasal epithelial cultures
Protease inhibitors targeting coronavirus and filovirus entry
TMPRSS2 Contributes to Virus Spread and Immunopathology in the Airways of Murine Models after Coronavirus Infection
Human aminopeptidase N is a receptor for human coronavirus 229E
Dipeptidyl peptidase 4 is a functional receptor for the emerging human coronavirus-EMC
N-Glycolylneuraminic Acid as a Receptor for Influenza A Viruses
Modeling influenza epidemics and pandemics: insights into the future of swine flu (H1N1)
Coronavirus infections in working adults. Eight-year study with 229E and OC43. The American review of respiratory disease 105
Middle East Respiratory Syndrome Coronavirus Transmission
SCANPY: large-scale single-cell gene expression data analysis
Fast, sensitive and accurate integration of single-cell data with Harmony
Integrating single-cell transcriptomic data across different conditions, technologies, and species

We are grateful to Cori Bargmann, Jeremy Farrar, and Sarah Aldridge for stimulating discussions. We thank Jana Eliasova (scientific illustrator) for support with the design of figures. This publication is part of the Human Cell Atlas - www.humancellatlas.org/publications.

W.S., N.H., C.B., and M.B. performed data analyses. W.S., N.H. and the HCA Lung Biological Network interpreted the data. W.S., with significant input from the HCA Lung Biological Network, wrote the paper. All authors read the manuscript, offered feedback, and approved it before submission.