key: cord-295142-5sqkdpi8 authors: Han, Y.; Jia, Z.; Shi, J.; Wang, W.; He, K. title: The active lung microbiota landscape of COVID-19 patients date: 2020-08-23 journal: nan DOI: 10.1101/2020.08.20.20144014 sha: doc_id: 295142 cord_uid: 5sqkdpi8 With the outbreak of COVID-19 causing by SARS-CoV-2, the interaction between the host and SARS-CoV-2 was widely studied. However, it is unclear whether and how SARS-CoV-2 infection affects lung microflora, which contributes to COVID-19 complications. Here, we analyzed the metatranscriptomic data of bronchoalveolar lavage fluid (BALF) of 19 COVID-19 patients and 23 healthy controls from 6 independent projects and detailed the active microbiota landscape in both healthy individuals and COVID-19 patients. The infection of SARS-CoV-2 could deeply change the lung microbiota, evidenced by the -diversity, {beta}-diversity and species composition analysis based on bacterial microbiota and virome. Pathogens (such as Klebsiella oxytoca causing pneumonia as well), immunomodulatory probiotics (such as Lactic Acid Bacteria and Faecalibacterium prausnitzii, a butyrate producer) and Tobacco mosaic virus (TMV) were enriched in the COVID-19 group, suggesting a severe microbiota dysbiosis. The significant correlation between Rothia mucilaginosa, TMV and SARS-CoV-2 revealed drastic inflammatory battles between the host, SARS-CoV-2 and other microbes in the lungs. Notably, TMV only existed in the COVID-19 group, while Human respirovirus 3 only existed in the healthy group. Our study provides insight into the active microbiota in the lungs of COVID-19 patients and will contribute to the understanding of the infection mechanism of SARS-CoV-2 and the treatment of the disease and complications. With the outbreak of severe acute respiratory syndrome coronavirus 2, SARS-CoV-2(1), leading to Coronavirus disease 2019 (COVID- 19) , many studies on clinical characteristics, three-dimensional structure(2) and interactions between host and virus(3) were reported. The symptoms of COVID-19 patients (COV) usually include fever, vomiting, diarrhea(4). As bacterial or viral infections could also induce several symptoms among them, it is necessary to profile the microbiota of COVID-19 patients to effectively relieve their symptoms(5, 6). The bronchoalveolar lavage fluid (BALF), containing microenvironment information on bronchioles and lung alveoli from the lower respiratory tract, is one of key sample types for characterizing the host inflammatory response and microbiota of COVID-19 patients as lung is one of main organs for the infection of SARS-CoV-2 (7, 8) . The human microbiota in BALF, including bacterial microbiota and virome, is a diverse microbial ecosystem associated with beneficial or deleterious physiological functions as well as disease etiologies, including COVID-19 (9) . It was reported that multiple common respiratory pathogens were co-infected with SARS-CoV-2 in COVID-19 patients (10) . Moreover, the role of commensal viruses in human lungs is poorly understood, particularly in the respiratory tract (11) . Besides, there are close interactions between microorganisms including symbiosis and competition (12, 13) . Therefore, the relationship between SARS-CoV-2 and other microorganisms in the body is of great significance for studying its infection mechanism and developing effective treatments. Shen et al. briefly reported that the bacterial microbiota in COVID-19 patients was dominated by the pathogens and oral and upper respiratory commensal bacteria (14) . However, there is no systematic microbiota landscape in BALF samples from COVID-19 patients as well as the healthy individuals. In this study, we systematically profiled the transcriptionally active microbiota landscape in BALF from COVID-19 patients and healthy individuals, identified microorganism composition in healthy individuals and COVID-19 patients, found disease-specific active microbes in the COVID-19 patient group, revealed the interaction between several bacteria or viruses and SARS-CoV-2. The systematic microbiota landscape provided crucial insight into understanding the mechanism of SARS-CoV-2 infection and the treatment of COVID-19. The raw metatranscriptomic fastq data of for 42 BALF samples, consist of 19 COVID-19 patients and 23 healthy controls, and two negative controls (NC) were downloaded from Sequence Read Archive, EMBL-EBI ArrayExpress and National Genomics Data Center, China with project IDs, PRJCA002202, PRJNA601736, PRJCA002326, PRJNA605983, PRJNA615032 and PRJNA434133. The basic information of these samples was described in Supplementary Table 1 . The raw fastq data were analyzed using IDSeq webserver(15) (Pipeline v4.9) to obtain the raw read count of each bacterial and viral species. Briefly, the raw reads underwent removal of host reads (Hg38 reference genome) via STAR (Spliced Transcripts Alignment to a Reference), a series of quality control, consist of removal of low-quality, low-complexity and redundant sequences via Paired-Read Iterative Contig Extension . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted August 23, 2020. . https://doi.org/10.1101/2020.08.20.20144014 doi: medRxiv preprint (PRICE), CD-HIT-DUP and Lempel-Ziv-Welch (LZW) compression score, respectively (15) . After quality control, the assignment of taxonomic IDs was implemented by aligning to the NCBI nucleotide (nt) and non-redundant protein (nr) databases via GSNAPL and RAPsearch (15) , respectively. Then the read counts of nt and reads per million (rPM) of nt at the species level were obtained. To guarantee the quality and credibility of the lung microbiota due to the low biomass, species with rPM value less than 50 in NC were used for downstream analysis. Moreover, bacteria detected in at least two study projects were kept to largely eliminate the reagent and laboratory contamination. Microbial diversity was analyzed by R package vegan (v2.5-6). Shannon diversity index was calculated to evaluate α -diversity. Bacterial or viral loads were computed based on rPM of microorganism. β -diversity was studied by Principal coordinate analysis, PCoA, using R package ape (v5.3) based on Bray-Curtis distance. Differential species were analyzed by DEseq2 (v1.24.0). Differential bacteria with a statistical significance (q-value) < 1e-11 and absolute value of log 2 (Fold Change) > 3, virus with a statistical significance (q-value) < 0.01 and absolute value of log 2 (Fold Change) > 1 were retained. Student's t test and Mann-Whitney rank test were used in variables with normal and without normal distribution, respectively. The Benjamini-Hochberg correction was used to obtain FDR adjusted p-values (q-values) for multiple hypothesis testing. Correlation analysis was implemented by Spearman's rank correlation. Only bacteria and viruses retained in the differential analysis were used for correlation analysis. Bacteria correlations with a coefficient (r) > 0.8 and a statistical significance (q-value) < 1e-13 were considered for further analysis and visualization. Virus correlations with a coefficient (r) > 0.5 and a statistical significance (q-value) < 0.01 were considered for further analysis and visualization. Bacteria and virus correlations with a coefficient (r) > 0.7 and a statistical significance (q-value) < 5e-8 were considered for further analysis and visualization. Correlation network was visualized by Cytoscape 3.7.2 (16) . We collected the metatranscriptomic data of 42 BALF samples, consist of 19 COVID-19 patients The diversity analysis of bacteria and virus, including α -and β -diversity, showed large differences between two groups. The COVID-19 group had higher Shannon's Diversity Index of bacteria than . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 23, 2020. . https://doi.org/10.1101/2020.08.20.20144014 doi: medRxiv preprint healthy group (p-value = 0.0268, Student's t test, Figure 1a ). Also, Bacterial loads of COVID-19 group were much higher than healthy group (p-value = 9e-4, Mann-Whitney rank test, Figure 1b ). For virome, COVID-19 patients had much higher viral loads than healthy controls (p-value < 1e-4, Mann-Whitney rank test) while Shannon's Diversity Index showed no statistical difference ( Figure 1cd ). Furthermore, Principal Coordinates Analysis (PCoA) with Bray-Curtis distance was applied to evaluate the β -diversity in BALF microbiota across groups. The PCoA results of bacterial microbiome showed there was a significant difference of β -diversity between two groups (p-value < 1e-3, R = 0.217, ANOSIM Analysis) ( Figure 2a) . Also, similar results for virome were found (p-value < 1e-3, R = 0.541, ANOSIM Analysis) (Figure 2b ), indicating a heterogeneous community diversity in the patient and control groups. The diversity analysis revealed that the infection of SARS-CoV-2 probably caused a different lung microbiota composition in the COVID-19 patient group compared with the healthy group. To get an insight of the microbial composition of BALF, we screened out the 10 most abundant bacteria at the species level in healthy controls and COVID-19 patients (Figure 3a ), respectively. Concerning the bacteria community, Porphyromonas gingivalis, often found in the oral cavity where it is implicated in periodontal disease(21), had a relatively higher abundance in some healthy individuals. The abundance of Lautropia mirabilis, which was commonly found in the human oral cavity and the upper respiratory tract (22) , and Enterobacter kobei, leading to nosocomial infections (23) patients. To identify the species associated with COVID-19, differential abundance analysis was performed between the two groups. 99 bacteria at the species level were significantly different between the two groups (q-value < 1e-11, absolute value of log 2 (Fold Change) > 3) (Supplementary Table 2 ). Meanwhile, 82 species with the maximum relative abundance greater than 0.05% were selected from the 99 bacteria and visualized using heatmap to get a deeper investigation (Figure 4a ). The abundance of Lactic Acid Bacteria (LAB), such as Lactobacillus fermentum, Lactobacillus reuteri, Lactobacillus delbrueckii and Lactobacillus salivarius, were higher in the COVID-19 group than in the healthy group (Supplementary Table 2 ). Of note, Lactobacillus reuteri was detected only in . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 23, 2020. . https://doi.org/10.1101/2020.08.20.20144014 doi: medRxiv preprint the patient group. Some pathogens were enriched in COVID-19 patients and depleted in healthy controls. For example, Klebsiella oxytoca, leading to pneumonia, colitis and sepsis (24) , and Enterobacter cloacae, positively correlated with COVID-19 severity(25), were increased in COVID-19. The abundance of some nosocomial infection pathogens, such as Enterobacter kobei (23) , Enterobacter cloacae (26) and Ralstonia pickettii (27) , were higher in patients than healthy controls. Several gut bacteria like Faecalibacterium prausnitzii (28) , Enterococcus faecium (29) and Citrobacter freundii (30) and commensal bacteria residing in the mouth and respiratory tracts, such as Rothia mucilaginosa, were also enriched in the lung of COVID-19 patients. In addition, a few of pathogens causing severe nausea, vomiting and diarrhea, like Bacillus cereus (31) increased in patients significantly. The abundance of 18 viruses significantly differed between the two groups (q-value < 0.01, absolute value of log 2 (Fold Change) > 1) (Figure 4b,Supplementary presented that severe acute respiratory syndrome-related coronavirus was positively correlated to Rothia mucilaginosa, Schaalia meyeri and Phyllobacterium myrsinacearum which was also positively related to TMV. Taken together, the results of correlation analysis suggest that the altered lung microbiota, particularly species from the Lactobacillus and Rothia, Enterobacter and TMV, actively played a certain role during the infection of SARS-CoV-2. The negative correlation between SARS-CoV-2 and Human respirovirus 3 indicates that the former probably have an inhibitory effect on the latter. The transcriptionally active microbiota landscape of BALF in COVID-19 patients and healthy controls could shed light on the infection mechanism of SARS-COV-2. Our study described the . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 23, 2020. . https://doi.org/10.1101/2020.08.20.20144014 doi: medRxiv preprint microbial composition at the species level in BALF of COVID-19 patients and healthy controls. Although Porphyromonas gingivalis existed widely in healthy lungs and accounted for a high proportion, lower richness and bacterial loads means it still stay a low level. In addition, though Human respirovirus 3 is one of the most common respiratory viruses, particularly among young children under 5 years of age and immunocompromised patients (32, 33) , it was detected in the healthy group with no symptoms. The infection of SARS-CoV-2 changed the active microbiota in the lungs, characterized by the Although Rothia mucilaginosa was a part of the normal oropharyngeal flora, it was linked to causing disease, including pneumonia, in immunosuppressed humans (36) . The enrichment of it in COVID-19 patients may be the result of the impact of SARS-CoV-2 on the patients' immune system. In our study, we also found that some intestinal bacteria (such as Faecalibacterium prausnitzii and The metatranscriptome of human BALF can reveal transcriptionally active microbiota in the lungs. The microbes in the lungs have a great influence on human health and disease. We detailed the active microbiota landscape in both healthy individuals and COVID-19 patients and found the infection of SARS-CoV-2 could deeply change the microbiota in the lungs. This is the first detail description of lung colonization in COVID-19 patients. The enrichment of several microbes, Lactic Acid Bacteria, Faecalibacterium prausnitzii and TMV, the significant correlation between Rothia, TMV and . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 23, 2020. . https://doi.org/10.1101/2020.08.20.20144014 doi: medRxiv preprint SARS-CoV-2 reveals a microbiota disorder and drastic battles between the host, SARS-CoV-2 and other microbes in the lungs. Our study provides insight into the active microbiota in the lungs of COVID-19 patients and will contribute to the understanding of the infection mechanism of SARS-CoV-2 and the treatment of the disease and complications. The raw datasets analyzed for this study were downloaded from Sequence Read Archive, EMBL-EBI ArrayExpress and National Genomics Data Center, China with project IDs, PRJCA002202, PRJNA601736, PRJCA002326, PRJNA605983, PRJNA615032 and PRJNA434133. The authors have declared that no conflicts of interest . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 23, 2020. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 23, 2020. Bacterial and viral loads respectively. Boxes represent the interquartile ranges (IQRs) between the first and third quartiles, and the line inside the box represents the median; whiskers represent the lowest or highest values within 1.5 times IQR from the first or third quartiles. * p-value < 0.05, ** p-value < 0.01, *** p-value <0.001, **** p-value <0.0001. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted August 23, 2020. . https://doi.org/10.1101/2020.08.20.20144014 doi: medRxiv preprint The lung tissue microbiota features of 20 deceased patients with COVID-19 The microbiome in the lower respiratory tract The microbiota in bronchoalveolar lavage from young children with chronic lung disease includes taxa present in both the oropharynx and nasopharynx Gut Microbiota beyond Bacteria-Mycobiome, Virome, Archaeome, and Eukaryotic Parasites in IBD Rates of Co-infection Between SARS-CoV-2 and Other Respiratory Pathogens Transcriptionally Active Lung Microbiome and Its Association with Bacterial Biomass and Host Inflammatory Status. mSystems Gut microbiota: Role in pathogen colonization, immune responses, and inflammatory disease Depicting SARS-CoV-2 faecal viral activity in association with gut microbiota composition in patients with COVID-19 Genomic diversity of SARS-CoV-2 in Coronavirus Disease 2019 patients IDseq -An Open Source Cloud-based Pipeline and Analysis Service for Metagenomic Pathogen Detection and Monitoring Cytoscape: a software environment for integrated models of biomolecular interaction networks Transcriptomic characteristics of bronchoalveolar lavage fluid and peripheral blood mononuclear cells in COVID-19 patients RNA based mNGS approach identifies a novel human coronavirus from two individual pneumonia cases in 2019 Wuhan outbreak Imbalanced Host Response to SARS-CoV-2 Drives Development of COVID-19 Obesity and disease severity magnify disturbed microbiome-immune interactions in asthma patients Lautropia mirabilis gen. nov., sp. nov., a gram-negative motile coccus with unusual morphology isolated from the human mouth Nosocomial urosepsis caused by Enterobacter kobei with aberrant phenotype Klebsiella oxytoca as a causative organism of antibiotic-associated hemorrhagic colitis Alterations in Gut Microbiota of Patients With COVID-19 During Time of Hospitalization Occurrence of virulence-associated properties in Enterobacter cloacae Ralstonia pickettii: a persistent gram-negative nosocomial infectious organism Complete Genome Sequence of Faecalibacterium prausnitzii Isolated from the Gut of a Healthy Indian Adult Emerging Infectious Diseases Genomic variation landscape of the human gut microbiome Epidemiology and pathogenesis of Bacillus cereus infections Epidemiology and clinical impact of parainfluenza virus infections in otherwise healthy infants and young children < 5 years old Epidemiology and clinical presentation of the four human parainfluenza virus types Lactobacillus casei downregulates commensals' inflammatory signals in Crohn's disease mucosa Probiotic Lactobacillus rhamnosus downregulates FCER1 and HRH4 expression in human mast cells Induction of cyclooxygenase-2 signaling by Stomatococcus mucilaginosus highlights the pathogenic potential of an oral commensal Translocation of a gut pathobiont drives autoimmunity in mice and humans Supplementary table 1. The basic information of the samples. The table lists the Project ID, sample ID, sample name, total reads, non-host reads, the percentage of non-host reads, water control and collection location.Supplementary table 2. Differential bacteria at the species level (q-value < 1e-11, absolute value of log 2 (Fold Change) > 3).Supplementary table 3. Differential viruses at the species level (q-value < 0.01, absolute value of log 2 (Fold Change) > 1).Supplementary table 4. Correlation within differential bacteria (r > 0.8, q-value < 1e-13).Supplementary table 5. Correlation within differential viruses (r > 0.5, q-value < 0.01).Supplementary table 6. Correlation of differential bacteria with differential viruses (r > 0.7, q-value < 5e-8).. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted August 23, 2020. . https://doi.org/10.1101/2020.08.20.20144014 doi: medRxiv preprint