key: cord-350627-4pgish5x
authors: Zhao, Yu; Zhao, Zixian; Wang, Yujia; Zhou, Yueqing; Ma, Yu; Zuo, Wei
title: Single-cell RNA expression profiling of ACE2, the receptor of SARS-CoV-2
date: 2020-01-26
journal: bioRxiv
DOI: 10.1101/2020.01.26.919985
sha:
doc_id: 350627
cord_uid: 4pgish5x

The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. All rights reserved. No reuse allowed without permission. https://doi.org/10.1101/2020.01.26.919985

A novel coronavirus, SARS-CoV-2, was identified in Wuhan, Hubei Province, China in December 2019. According to the WHO report, this new coronavirus had resulted in 76,392 confirmed infections and 2,348 deaths in China by 22 February 2020, with a rapidly growing number of additional patients being identified internationally. SARS-CoV-2 was reported to share the same receptor, angiotensin-converting enzyme 2 (ACE2), with SARS-CoV. Here, based on a public database and state-of-the-art single-cell RNA-Seq technology, we analyzed the ACE2 RNA expression profile in normal human lungs. The results indicate that expression of the ACE2 virus receptor is concentrated in a small population of type II alveolar cells (AT2). Surprisingly, we found that this population of ACE2-expressing AT2 cells also highly expresses many other genes that positively regulate viral entry, reproduction and transmission. A comparison of the eight individual samples showed that the one from the Asian male donor had an extremely large number of ACE2-expressing cells in the lung. This study provides a biological background for the epidemic investigation of COVID-19, and could be informative for the future development of anti-ACE2 therapeutic strategies.

Severe infection by 2019-nCoV (the provisional name of SARS-CoV-2) could result in acute respiratory distress syndrome (ARDS) and sepsis, causing death in approximately 15% of infected individuals [1, 2]. Once in contact with the human airway, the spike proteins of this virus can associate with the surface receptors of susceptible cells, which mediates the entry of the virus into target cells for further replication. Recently, Xu et al. modeled the spike protein to identify the receptor for 2019-nCoV, and indicated that angiotensin-converting enzyme 2 (ACE2) could be the receptor for this virus [3]. ACE2 was previously known as the receptor for SARS-CoV and NL63 [4-6]. According to their modeling, although the binding strength between 2019-nCoV and ACE2 is weaker than that between SARS-CoV and ACE2, it is still much higher than the threshold required for virus infection. Zhou et al. conducted virus infectivity studies and showed that ACE2 is essential for 2019-nCoV to enter HeLa cells [7]. These data indicated that ACE2 is likely to be the receptor for 2019-nCoV.

The expression and distribution of the receptor determine the route of virus infection, and the route of infection has major implications for understanding the pathogenesis and designing therapeutic strategies. Previous studies have investigated the RNA expression of ACE2 in 72 human tissues [8]. However, the lung is a complex organ with multiple cell types, and such real-time PCR RNA profiling is based on bulk tissue analysis, which cannot resolve ACE2 expression in each cell type of the human lung.
The ACE2 protein level has also been investigated by immunostaining in the lung and other organs [8, 9]. These studies showed that in normal human lung, ACE2 is mainly expressed by type II and type I alveolar epithelial cells. Endothelial cells were also reported to be ACE2 positive. However, immunostaining analysis is known for its lack of signal specificity, and accurate quantification is another challenge for such analysis. The recently developed single-cell RNA sequencing (scRNA-Seq) technology enables us to study ACE2 expression in each cell type and provides quantitative information at single-cell resolution. Previous work established an online database of scRNA-Seq data from 8 normal human lung transplant donors [10]. In the current work, we used updated bioinformatics tools to analyze these data. In total, we analyzed 43,134 cells derived from normal lung tissue of

To further understand the special population of ACE2-expressing AT2 cells, we performed gene ontology (GO) enrichment analysis to study which biological processes are associated with this cell population, by comparing it with the AT2 cells not expressing ACE2. Surprisingly, we found that multiple viral process-related GO terms are significantly over-represented, including "positive regulation of viral process" (P = 0.001), "viral life cycle" (P = 0.005), "virion assembly" (P = 0.03) and "positive regulation of viral genome replication" (P = 0.04). The highly expressed viral process-related genes in ACE2-expressing AT2 cells include: SLC1A5, CXADR, CAV2, NUP98, CTBP2, GSN, HSPA1B, STOM, RAB1B, HACD3, ITGB6, IST1, NUCKS1, TRIM27, APOE, SMARCB1, UBP1, CHMP1A, NUP160, HSPA8, DAG1, STAU1, ICAM1, CHMP5, DEK, VPS37B, EGFR, CCNK, PPIA, IFITM3, PPIB, TMPRSS2, UBC, LAMP1 and CHMP3.
Therefore, it seems that 2019-nCoV has cleverly evolved to hijack this population of AT2 cells for its reproduction and transmission.

We further compared the characteristics of the donors and their ACE2 expression patterns. No association was detected between the number of ACE2-expressing cells and the age or smoking status of the donors. Of note, the 2 male donors have a higher ACE2-expressing cell ratio than all 6 female donors (1.66% vs. 0.41% of all cells, P = 0.07, Mann-Whitney test). In addition, the distribution of ACE2 is more widespread in male donors than in female donors: at least 5 different cell types in the male lung express this receptor, while only 2 to 4 cell types in the female lung express it. This result is highly consistent with the epidemic investigation showing that most of the confirmed 2019-nCoV-infected patients were men (30 vs. 11, by Jan 2, 2020). We also noticed that the only Asian donor (male) has a much higher ACE2-

Altogether, in the current study, we report the RNA expression profile of ACE2 in the human lung at single-cell resolution. Our analysis suggests that the expression of ACE2 is concentrated in a special population of AT2 cells which expresses many other genes favoring the viral process. This conclusion differs from the previous report, which observed abundant ACE2 not only in AT2 cells but also in endothelial cells [8]. In fact, to our knowledge, endothelial cells can sometimes be non-specifically stained in immunohistochemical analysis.
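The male-versus-female comparison reported above (2 male vs. 6 female donors, Mann-Whitney test, P = 0.07) can be illustrated with a small exact test. The following is a minimal sketch, not the authors' analysis: the per-donor percentages are hypothetical placeholders, and the only property taken from the text is that both male values exceed all six female values. The function name `mann_whitney_exact` is likewise an illustrative choice.

```python
from itertools import combinations


def mann_whitney_exact(x, y):
    """Exact two-sided Mann-Whitney U test (assumes no tied values)."""
    pooled = list(x) + list(y)
    n_x, n = len(x), len(pooled)

    def u_stat(x_idx):
        # U = number of (x, y) pairs in which the x value is larger.
        chosen = set(x_idx)
        xs = [pooled[i] for i in chosen]
        ys = [pooled[i] for i in range(n) if i not in chosen]
        return sum(1 for a in xs for b in ys if a > b)

    observed = u_stat(range(n_x))
    mean_u = n_x * (n - n_x) / 2
    # Enumerate every way of labelling n_x of the n donors as group x.
    all_u = [u_stat(c) for c in combinations(range(n), n_x)]
    extreme = sum(1 for u in all_u if abs(u - mean_u) >= abs(observed - mean_u))
    return observed, extreme / len(all_u)


# Hypothetical per-donor percentages, NOT the paper's data: only the ordering
# (both male values above all six female values) is taken from the text.
male = [2.0, 1.3]
female = [0.2, 0.3, 0.35, 0.4, 0.5, 0.6]
u_obs, p_value = mann_whitney_exact(male, female)
print(u_obs, round(p_value, 4))
```

With complete separation of 2 versus 6 values there are 28 possible labellings, of which 2 are as extreme as the observed one, so the exact two-sided P value is 2/28, about 0.071. Under this exact test, that is the smallest P value attainable with these group sizes.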
Public datasets (GEO: GSE122960) were used for bioinformatics analysis. First, we used Seurat (version 2.3.4) to read a combined gene-barcode matrix of all samples. We removed low-quality cells with fewer than 200 or more than 6,000 detected genes, or with a mitochondrial gene content > 10%. Genes detected in fewer than 3 cells were filtered out. For normalization, the combined gene-barcode matrix was scaled by total UMI counts, multiplied by 10,000 and transformed to log space. The highly variable genes were identified using the function FindVariableGenes. Variation arising from the number of UMIs and the percentage of mitochondrial genes was regressed out by specifying the vars.to.regress argument in the Seurat function ScaleData. The expression level of highly variable genes in the cells was scaled and centered along each gene, and was then subjected to principal component analysis.

We then assessed the number of PCs to be included in downstream analysis by (1) plotting the cumulative standard deviations accounted for by each PC using the function PCElbowPlot in Seurat to identify the 'knee' point, i.e. the PC number after which successive PCs explain diminishing degrees of variance, and (2) exploring primary sources of heterogeneity in the datasets using the PCHeatmap function in Seurat. Based on these two methods, we selected the top significant PCs for two-dimensional t-distributed stochastic neighbor embedding (tSNE), implemented by the Seurat software with the default parameters. We used FindClusters in Seurat to identify cell clusters for each sample.
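The quality-control and normalization steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the Seurat implementation the authors used: the function names are hypothetical, each cell is represented as a gene-to-UMI-count dictionary, mitochondrial genes are assumed to be recognizable by an "MT-" prefix, and "transformed to log space" is taken to mean log(1 + x).

```python
import math


def passes_qc(counts, min_genes=200, max_genes=6000, max_mito_frac=0.10):
    """Keep a cell with 200-6,000 detected genes and <= 10% mitochondrial UMIs."""
    total = sum(counts.values())
    if total == 0:
        return False
    detected = sum(1 for c in counts.values() if c > 0)
    # Assumption: mitochondrial genes carry the "MT-" prefix.
    mito = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
    return min_genes <= detected <= max_genes and mito / total <= max_mito_frac


def filter_genes(cells, min_cells=3):
    """Return the genes detected (count > 0) in at least min_cells cells."""
    seen = {}
    for counts in cells:
        for gene, c in counts.items():
            if c > 0:
                seen[gene] = seen.get(gene, 0) + 1
    return {gene for gene, n in seen.items() if n >= min_cells}


def log_normalize(counts, scale=10_000):
    """Scale by total UMI count, multiply by 10,000, then apply log(1 + x)."""
    total = sum(counts.values())
    return {gene: math.log1p(c / total * scale) for gene, c in counts.items()}


# Toy cell with made-up counts, for illustration only.
toy_cell = {"ACE2": 5, "SFTPC": 5, "MT-CO1": 0}
print(log_normalize(toy_cell)["ACE2"])
```

The thresholds are exposed as keyword arguments so that the toy example can be checked with smaller limits than the 200 and 6,000 detected genes used in the text.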
Following clustering and visualization with tSNE, initial clusters were inspected and merged based on the similarity of marker genes and on a measure of phylogenetic identity computed with BuildClusterTree in Seurat. Identification of cell clusters was performed on the final aligned object, guided by marker genes. To identify the marker genes, differential expression analysis was performed with the function FindAllMarkers in Seurat using the Wilcoxon rank sum test. Differentially expressed genes that were expressed in at least 25% of the cells within the cluster and with a fold change of more than 0.25 (log scale) were considered to be marker genes. tSNE plots and violin plots were generated using Seurat.

b. Cellular cluster map of the Asian male. All 8 samples were analyzed using the Seurat R package. Cells were clustered using a graph-based shared nearest neighbor clustering approach and visualized using a t-distributed stochastic neighbor embedding (tSNE) plot.
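The marker-gene thresholds given in the methods (expression in at least 25% of cells within the cluster, log fold change above 0.25) can be sketched as a simple filter. This is a hedged illustration, not Seurat's FindAllMarkers: the function names are hypothetical, the inputs are assumed to be log(1 + x)-normalized values for one gene, the log fold change is assumed to be the natural-log difference of pseudocounted mean expression, and the Wilcoxon rank sum test is omitted.

```python
import math


def marker_stats(in_cluster, out_cluster):
    """Return (fraction of in-cluster cells expressing the gene, log fold change)."""
    pct_in = sum(1 for v in in_cluster if v > 0) / len(in_cluster)
    # Undo the log(1 + x) transform before averaging (assumed convention).
    mean_in = sum(math.expm1(v) for v in in_cluster) / len(in_cluster)
    mean_out = sum(math.expm1(v) for v in out_cluster) / len(out_cluster)
    log_fc = math.log(mean_in + 1) - math.log(mean_out + 1)
    return pct_in, log_fc


def is_marker(in_cluster, out_cluster, min_pct=0.25, min_log_fc=0.25):
    """Apply the two thresholds stated in the text; no significance test here."""
    pct_in, log_fc = marker_stats(in_cluster, out_cluster)
    return pct_in >= min_pct and log_fc > min_log_fc


# Made-up normalized values for one gene, for illustration only.
in_cluster = [math.log1p(9)] * 4
out_cluster = [0.0] * 4
print(is_marker(in_cluster, out_cluster))
```

In a real analysis the filter would be applied per gene and per cluster, with the rank sum test deciding significance before these thresholds are used.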
1. Clinical features of patients infected with 2019 novel coronavirus in Wuhan
2. A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
3. Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission
4. The S proteins of human coronavirus NL63 and severe acute respiratory syndrome coronavirus bind overlapping regions of ACE2
5. Crystal structure of NL63 respiratory coronavirus receptor-binding domain complexed with its human receptor
6. Expression of elevated levels of pro-inflammatory cytokines in SARS-CoV-infected ACE2+ cells in SARS patients: relation to the acute lung injury and pathogenesis of SARS
7. Discovery of a novel coronavirus associated with the recent pneumonia outbreak in humans and its potential bat origin. bioRxiv
8. Tissue distribution of ACE2 protein, the functional receptor for SARS coronavirus. A first step in understanding SARS pathogenesis
9. Binding of SARS coronavirus to its receptor damages islets and causes acute diabetes
10. Single-Cell Transcriptomic Analysis of Human Lung Provides Insights into the Pathobiology of Pulmonary Fibrosis