key: cord-0903022-0xtmdfk5 authors: Chakraborty, Chiranjib; Sharma, Ashish Ranjan; Bhattacharya, Manojit; Zayed, Hatem; Lee, Sang-Soo title: Understanding Gene Expression and Transcriptome Profiling of COVID-19: An Initiative Towards the Mapping of Protective Immunity Genes Against SARS-CoV-2 Infection date: 2021-12-15 journal: Front Immunol DOI: 10.3389/fimmu.2021.724936 sha: 0d09aa0cb56a29033acd97b91eb23a0a71ba35da doc_id: 903022 cord_uid: 0xtmdfk5 The COVID-19 pandemic has created an urgent situation throughout the globe. Therefore, it is necessary to identify the differentially expressed genes (DEGs) in COVID-19 patients to understand disease pathogenesis and the genetic factor(s) responsible for inter-individual variability. The DEGs will help understand the disease’s potential underlying molecular mechanisms and genetic characteristics, including the regulatory genes associated with immune response elements and protective immunity. This study aimed to determine the DEGs in mild and severe COVID-19 patients versus healthy controls. The Agilent-085982 Arraystar human lncRNA V5 microarray GEO dataset (GSE164805 dataset) was used for this study. We used statistical tools to identify the DEGs. Our 15 human samples dataset was divided into three groups: mild, severe COVID-19 patients and healthy control volunteers. We compared our result with three other published gene expression studies of COVID-19 patients. Along with significant DEGs, we developed an interactome map, a protein-protein interaction (PPI) pattern, a cluster analysis of the PPI network, and pathway enrichment analysis. We also performed the same analyses with the top-ranked genes from the three other COVID-19 gene expression studies. We also identified differentially expressed lncRNA genes and constructed protein-coding DEG-lncRNA co-expression networks. We attempted to identify the regulatory genes related to immune response elements and protective immunity. 
We prioritized the 29 most significant protein-coding DEGs. Our analyses showed that several DEGs were involved in the interactome maps, PPI networks, and clusters, similar to the results obtained with the protein-coding genes from the other investigations. Interestingly, we found six lncRNAs (TALAM1, DLEU2, UICLM, CASC18, SNHG20, and GNAS) involved in the protein-coding DEG-lncRNA network, which might serve as potential biomarkers for COVID-19 patients. We also identified three regulatory genes from our study and 44 regulatory genes from the other investigations related to immune response elements and protective immunity. We were able to map the regulatory genes associated with immune elements and identify the virogenomic responses involved in protective immunity against SARS-CoV-2 infection during COVID-19 development. The COVID-19 pandemic is one of the most devastating infectious disease outbreaks in recent times, spreading rapidly to more than 188 countries. As of November 20, 2021, over 256.5 million confirmed cases and nearly 5.15 million deaths had been reported (1, 2). COVID-19 has also occurred in waves globally. The second wave caused an increased number of confirmed cases and deaths in different parts of the world (3–5). COVID-19 infection in humans can be categorized into mild and severe conditions (6). In the second wave, most patients showed mild to severe symptoms. However, 15% of the COVID-19 patients progressed towards acute or severe disease, requiring hospitalization (7). Studies have been performed to understand the differences between mild and severe infections in patients. In one study, viral dynamics were investigated in 76 patients whose clinical presentation was classified as mild or severe (8). In that study, 61% of patients (46 patients) were categorized as mild, and the remaining 39% (30 patients) were classified as severe.
Patients with mild infection cleared the virus very early, while patients with severe infection had an extended virus-shedding phase with a high viral load (8). Velavan and Meyer attempted to understand the host markers associated with mild and severe infection (9). The C-reactive protein (CRP) levels in patients with mild and severe infection were also studied to develop a predictive marker (10). Numerous other studies have also been performed to understand the molecular biology, immunological impact, and pathogenicity of this virus (11–13). Many efforts have been made to design and develop effective diagnostics, therapeutics, and vaccines against the virus (14–18). It is essential to understand the differences in gene expression in patients with different levels of infection severity to help develop therapies against the virus. Gene expression studies and network analyses are of immense importance for understanding complex diseases (19, 20). They aid in understanding the underlying mechanisms of, and genetic vulnerability to, complex diseases (21). Gene expression studies can also help understand the transcriptomic landscape of cells (22). Identifying the gene regulatory networks and host immune response dynamics can help to develop therapeutics, as transcriptomic profiling of cells during virus infection helps to understand host gene regulatory networks and the host immune response (23–25). Recently, Xiong et al. studied the transcriptomic pattern of COVID-19 patients using peripheral blood mononuclear cells (PBMCs) and various body fluids (26). Ziegler et al. determined a gene expression profile in interferon-stimulated airway epithelial cells infected by SARS-CoV-2 in humans and non-human primates (27). In another study, Jain et al. evaluated transcriptomic profiling of COVID-19 patients with mild, moderate, and severe infections (28).
The differentially expressed genes were evaluated using microarray technology in patients with different severities of the disease. Microarray technology is a robust procedure that is commonly used to study differentially expressed genes to understand gene mapping, association, linkage, and expression (29, 30). However, the number of studies comparing the whole-genome transcriptome of PBMCs isolated from COVID-19 patients with mild and severe infections versus healthy controls is limited. Mapping the genes related to the activation of immune cells, immune system-related components, and protective immunity may help understand the genomic landscape and the modulator genes or proteins of a disease. It will also provide a better understanding of the immunology of the disease (31). The immune-mediated approach may aid in developing immunotherapeutics for the treatment of COVID-19 (32, 33). In this study, we attempted to understand the expression of genes related to the activation of immune cells, immune system-related components, and protective immunity in COVID-19 patients. This study aimed to identify the DEGs in COVID-19 patients with mild and severe symptoms versus healthy controls. Using the significantly upregulated DEGs, we developed an interactome map, a protein-protein interaction (PPI) pattern, and a cluster analysis of the PPI network, and performed pathway enrichment analysis. We also developed a transcriptome network profile, a PPI pattern, a cluster analysis of the PPI network, and a pathway enrichment analysis of the top-ranked genes from three other COVID-19 gene expression studies (26–28), and compared the results with those obtained from our gene expression study. Additionally, we identified the differentially expressed lncRNA genes of COVID-19 patients and constructed DEG-lncRNA co-expression networks.
Finally, an attempt was made to identify the regulatory genes related to immune response elements and protective immunity by combining the analyses from the three previous COVID-19 gene expression studies and this study. The GEO database, an NCBI resource, was used for data acquisition; the GSE164805 dataset was used in this study. In this dataset, gene expression was profiled by microarray. GEO is a database in which gene expression profiles are stored and from which users can download gene expression datasets (34). We used the keywords "COVID-19", "Homo sapiens", and "Microarray" to search GEO datasets. All selected expression data were log2-transformed and then standardized. The outline of the gene expression and transcriptome landscape data analysis of patients with COVID-19 is shown in Figure 1A. All the patient data were derived from the GEO database, which is an open database from NCBI. We divided our dataset into three groups: COVID-19 patients with mild infection, COVID-19 patients with severe infection, and healthy controls. Our dataset contains 15 human samples: five COVID-19 patients with mild infection, five COVID-19 patients with severe infection, and five healthy control samples. Among the healthy controls, four were males and one was female. The COVID-19 patient group with mild infection also comprised four males and one female. All the patients with severe COVID-19 infection were males (Table 1). To analyze the raw gene expression data, we used the statistical tool GEO2R, which in turn uses R/Bioconductor and the limma package (34, 35). We developed different types of statistical plots using RStudio: volcano plots, mean difference (MD) plots, a uniform manifold approximation and projection (UMAP) plot, a Venn diagram, a box plot, an expression density plot, an adjusted p-value histogram, a moderated t-statistic quantile-quantile (q-q) plot, and a mean-variance trend plot.
These plots were used to identify and analyze DEGs using PBMCs from the different groups: COVID-19 patients with mild infection, COVID-19 patients with severe infection, and healthy controls (36–38). The cut-off criteria for acquiring significant DEGs were set to |log2 (fold change)| > 1 and p < 0.05. We acquired the top-ranking genes from other studies to compare gene expression and transcriptome profiling; these genes were taken from the studies conducted by Xiong et al. (26), Ziegler et al. (27), and Jain et al. (28). Xiong et al. performed a gene expression study using bronchoalveolar lavage fluid and peripheral blood mononuclear cell samples. The researchers constructed RNA sequencing libraries and used high-throughput RNA sequencing for gene expression studies (26). Ziegler et al. performed a gene expression study using lung lobe, nasal polyps, ethmoid sinus surgical tissue, and ileum samples; they used a single-cell RNA-sequencing assay for gene expression studies (27). Jain et al. analyzed gene expression profiles using nasopharyngeal swab samples; they used shotgun transcriptome sequencing of RNA for their gene expression profiling study (28). The acquired top-ranking expressed genes of COVID-19 patients from the different studies are shown in Table 2. Our study mapped gene expression from the diverse sample types (Figure 1B). To understand the associations between the DEGs from our dataset, we constructed a protein interactome using HuRI (39). The protein interactome was generated from binary protein interactions using approximately 53,000 high-quality PPIs. We also developed a transcriptome network by acquiring data from the other three studies (26–28). To understand the associations between protein-coding DEGs, we constructed a PPI network using the web-based tool STRING (40, 41).
The cut-off criterion was fixed at an interaction confidence score of ≥ 0.4 to obtain consistent PPI interactions from the dataset. The PPI network was exported from STRING and visualized in Cytoscape to better understand and conceptualize the PPI interactions among the most significant DEGs (42, 43). STRING can integrate data from several resources: ConsensusPathDB, HitPredict, IMP, IMID, VisANT, GeneMANIA, and I2D. In parallel, we generated PPI networks from the top-ranking expressed genes of COVID-19 patients acquired from the other studies (26–28). We compared all PPI networks. For cluster analysis, we assessed intra-cluster and inter-cluster similarities. We visualized the Metascape output (44) using the Cytoscape software. Metascape performed the cluster analysis using different databases such as InWeb_IM (45), BioGrid (46), and OmniPath (45), and the MCODE algorithm. To capture the relationships between terms, a subset of enriched terms was selected and rendered as a network plot, in which terms with a similarity > 0.3 are connected by edges. The terms with the best p-values from each of the 20 clusters were selected. We also developed different clusters of PPI networks using the top-ranking expressed genes of COVID-19 patients from the other studies (26–28). We compared all the cluster analyses of the PPI networks. Pathway enrichment analysis was performed using Metascape (45) with the 29 significantly expressed genes, drawing on different ontology sources: GO biological processes, KEGG pathway, Reactome gene sets, and so on. Terms with a p-value < 0.01 and a minimum count of 3 were retained. The q-values were computed using the Benjamini-Hochberg procedure, which accounts for multiple testing (47).
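As a rough illustration of how Benjamini-Hochberg q-values are obtained from a list of p-values, consider the following minimal sketch. It is not the Metascape implementation; the function name and the example p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Illustrative sketch only; the function name is hypothetical.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1),
    # scaling each p-value by m / rank and enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four enrichment terms.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
```

Each p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the q-values from decreasing as the p-values increase.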
Similar to the previous analysis, we performed pathway enrichment analysis using the top-ranking expressed genes of COVID-19 patients from the different studies (26–28). We compared all the results of the pathway enrichment analysis. We mapped the top-ranking differentially expressed lncRNA genes from the 250 DEGs. Using the top-ranking differentially expressed lncRNA genes and other DEGs, we constructed a network of DEG-lncRNA pairs using the Cytoscape software (44). In this case, the Cytoscape MCODE plug-in was used (48). Before network construction, we cross-verified the lncRNAs through a non-coding RNA sequence database, RNAcentral (49), a database for subcellular localization of lncRNAs. We attempted to map the genes from the DEGs with immunomodulatory and protective immunity properties. We used NCBI GenBank (50) and GeneCards (51, 52). Our study analyzed the gene expression profiles and transcriptome landscape of the GSE164805 dataset from the GEO database. The Agilent-085982 Arraystar human lncRNA V5 microarray platform was used for this expression analysis. In this study, PBMCs were taken from COVID-19 patients with mild and severe infections and healthy controls for gene expression analysis. Our dataset contains 15 human samples in three groups (the two COVID-19 patient groups and one healthy control group) (Table 1). A volcano plot is a type of scatter plot that shows statistical significance (p-value) against magnitude of change (fold change). The top 250 DEGs were ranked in this study (Tables S1-S3). Table S1 describes the ID, p-value, F, and gene description of the top 250 DEGs. Table S2 describes the top 250 DEG sequences, and Table S3 represents the accession number and chromosome of the top 250 DEGs. The volcano plots of the DEGs were created from the dataset, and the significant genes showed satisfactory values (Figure 2).
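The logic behind a volcano plot can be sketched in a few lines. The following is a minimal illustration, not the GEO2R/limma code used in the study: it assumes the cut-offs stated for this study (|log2 FC| > 1 and p < 0.05), and the function names and example values are hypothetical. Whether the raw or the adjusted p-value is passed in is left to the caller.

```python
import math


def classify_gene(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as 'up', 'down', or 'not significant'.

    Hypothetical helper; defaults mirror |log2 FC| > 1 and p < 0.05.
    """
    if p_value < p_cutoff and log2_fc > fc_cutoff:
        return "up"
    if p_value < p_cutoff and log2_fc < -fc_cutoff:
        return "down"
    return "not significant"


def volcano_point(log2_fc, p_value):
    """Return the (x, y) coordinates of a gene on a volcano plot.

    x is the log2 fold change; y is -log10(p), so smaller p-values
    appear higher on the plot.
    """
    return (log2_fc, -math.log10(p_value))


# Hypothetical genes: (name, log2 fold change, p-value).
example_genes = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", -2.3, 0.02),
    ("GENE_C", 0.4, 0.0001),
    ("GENE_D", 1.5, 0.2),
]
labels = {name: classify_gene(fc, p) for name, fc, p in example_genes}
```

In this sketch a gene must pass both thresholds to be labeled, so a large fold change with a weak p-value (or the reverse) is treated as not significant.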
Using the cut-off criteria (p < 0.05 and |log2 FC| > 1), the upregulated DEGs were acquired. We next developed the DEG volcano plot using the data of control vs. COVID-19 patients with mild infection (Figure 2A). Similarly, we developed another DEG volcano plot using the data of COVID-19 patients with severe infection vs. healthy control volunteers (Figure 2B). We also illustrated the DEG volcano plot comparing the data of COVID-19 patients with mild vs. severe infections (Figure 2C). In all cases, an adjusted p-value cut-off of 0.05 was used for the volcano plots. Red dots represent the upregulated DEGs, and blue dots represent the downregulated DEGs. For the visualization of the DEGs, we also developed MD plots, which show the log2 fold change against the average log2 expression values, with an adjusted p-value cut-off of 0.05 (Figure 3). Figure 3A shows the DEG MD plot using the data of control vs. COVID-19 patients with mild infection. Figure 3B depicts the DEG MD plot using COVID-19 patients with severe infection vs. healthy control volunteers. Figure 3C depicts the DEG MD plot using the data of COVID-19 patients with mild vs. severe infections. In all cases, red dots represent the upregulated DEGs, and blue dots represent the downregulated DEGs. For better visualization, we represented the DEG data using several other statistical plots. First, we developed a UMAP plot (Figure 4A). UMAP is a dimension reduction procedure that is useful for visualizing how samples are related to each other. Our analysis distinguished the control, mild, and severe samples. We depicted a Venn diagram showing the DEGs shared between the comparisons. The "COVID-19 patients with severe infection vs. healthy control volunteers" and "control vs. COVID-19 patients with mild infection" contrasts share 10794 significant DEGs (Figure 4B). Similarly, 2338 DEGs were common to both of our groups. We also depicted a box plot from the dataset, which shows the distribution of the values of the samples selected for this study (Figure 4C). The data distribution indicated that the data were suitable for the DEG analysis. We also developed an expression density plot of the distribution of the values of the DEGs of the three groups (Figure 4D). An adjusted p-value histogram was also developed, which showed that the p-values in the experiment match those of the top DEGs. In this histogram, the p-values are relatively consistent (Figure 4E). We also illustrated the moderated t-statistic q-q plot, which compares the quantiles of our DEG data sample against the theoretical quantiles of a Student's t-distribution (Figure 4F). In our study, the values lie along a straight line, which indicates that the DEG data are well suited to the analysis; that is, the sample quantiles of the DEGs follow the distribution of the theoretical quantiles. Finally, the mean-variance trend plot shows the mean-variance relationship of the gene expression data (Figure 4G). Each dot represents a gene, and the plot is drawn after fitting a linear model. The average log expression line shows that the values of the early DEGs are highly dense. We have listed the significant DEGs (both protein-coding genes and long non-coding RNAs) from our study in Table 3. Among the significant DEGs, we found several protein-coding genes and lncRNAs. The percentage of significantly expressed protein-coding genes and lncRNAs is depicted in a pie diagram (Figure 5A). The acquired top-ranking expressed genes of COVID-19 patients from the various studies are shown in Table 2.
The number of protein-coding genes in our investigation is represented in a bar diagram in Figure 5B. This figure also shows the total numbers of top-ranking protein-coding genes from the other studies. Protein interactions within a cell can be represented through a protein interactome map, which provides global insights into genome function and cellular organization. Such a map provides a comprehensive understanding of the interactome networks of SARS-CoV-2-infected human cells. We developed a protein interactome with the protein-coding genes from the 250 DEGs in our study (Figure 6A). We found that the number of interactions was 2901, the number of proteins that participated in the interactions was 453, and the average node degree was 12.53. Next, we depicted the protein interactome of the central cluster from the preceding analysis (Figure 6B). We also portrayed the protein interactome using the data from the Xiong et al. study in which the samples were bronchoalveolar lavage fluid (Figure 6C). We found that the number of interactions was 76, the number of proteins that participated in the interactions was 35, and the average node degree was 4.06. We then illustrated the protein interactome with the top-ranked genes of the Xiong et al. study from the PBMC samples (Figure 6D). Here, we found that the number of interactions was 103, the number of proteins that participated in the interactions was 62, and the average node degree was 3.11. Similarly, we represented one protein interactome with data from the Ziegler et al. study, in which the samples were collected from the lung lobe, nasal polyps, ethmoid sinus surgical tissue, and ileum (Figure 6E). We found that the number of interactions was 2613, the number of proteins that participated in the interactions was 504, and the average node degree was 10.15. Finally, we illustrated the protein interactome with the top-ranked genes of the Jain et al. study, which used nasopharyngeal swabs as samples (Figure 6F).
We found that the number of interactions was 437, the number of proteins that participated in the interactions was 128, and the average node degree was 6.56. Integrative gene expression analysis and the construction of PPI networks from the proteins encoded by the DEGs are essential to understanding the molecular pathology of diseases. The PPI network analysis also showed the functional and physical associations among the proteins encoded by the DEGs of the other COVID-19 samples. From this analysis, we depicted a PPI network using the significant protein-coding genes from the 250 DEGs in our study (Figure 7A). We also developed a PPI network using the top-ranked genes of the Xiong et al. study that used bronchoalveolar lavage fluid as samples (Figure 7B). Similarly, we depicted a PPI network using the top-ranked genes of the Xiong et al. study that used PBMCs as samples (Figure 7C). We also illustrated a PPI network using the top-ranked genes from the Ziegler et al. study (Figure 7D). Finally, we depicted a PPI network using the top-ranked genes of the Jain et al. study (Figure 7E). The enrichment network cluster shows the intra-cluster and inter-cluster similarities of the input genes involved in different biological processes, enzymatic functions, and protein localization. It also shows the similarities among the cluster proteins from the DEGs according to their function. In this study, we developed a PPI network enrichment cluster using the significant protein-coding genes from the 250 DEGs (Figure 8A). Our analysis also represents the enrichment cluster of the PPI network using the top-ranked genes of the Xiong et al. study in which bronchoalveolar lavage fluid was used as samples (Figure 8B). We then depicted the enrichment cluster of the PPI network using data from the Xiong et al. study that used PBMCs as samples (Figure 8C). Similarly, we developed a PPI network enrichment cluster using data from the Ziegler et al. study (Figure 8D).
Finally, the analysis depicted the enrichment cluster of the PPI networks using data from the Jain et al. study (Figure 8E). Pathway enrichment analysis helps researchers gain mechanistic insights into the DEGs (gene lists) generated from genome-scale (omics) experiments. In this pathway enrichment analysis, gene list enrichment was identified in the COVID-19 categories and transcription factor targets. First, we depicted the gene list enrichments in COVID-19 categories from the 250 DEGs of our study (Figure 9A). Subsequently, we developed the gene list enrichments in COVID-19 categories with the top-ranked genes of the Xiong et al. study that used bronchoalveolar lavage fluid as samples (Figure 9B). Similarly, we illustrated the gene list enrichments in COVID-19 categories with the top-ranked genes of the Xiong et al. study that used PBMC samples (Figure 9C). Then, we developed the gene list [...] (Figure 10D). Finally, this analysis depicted the gene list enrichment in transcription factor targets of the top-ranked genes of the Jain et al. study (Figure 10E).

Cross-Verification of the Construction of the LncRNA and DEG-lncRNA Co-Expression Networks

Table 4 lists the significantly upregulated lncRNA DEGs from the three groups in our dataset, which were identified using GEO2R. Furthermore, we cross-verified the lncRNAs and identified 24 significant lncRNA genes, which are recorded in Table 4. We developed the co-expression networks of the protein-coding DEGs and lncRNAs (Figure 11). Our gene function annotation and categorization show the genes from the DEGs that are associated with immune response elements and protective immunity (Figure 12). The genes we identified for the activation of immune cells and components, and for protective immunity, correlate with the other networks (Table 5). High-throughput technologies such as DNA microarrays and next-generation sequencing are beneficial for discoveries in the biomedical field.
Gene expression profiling using microarrays is a promising way to gain insight into the intrinsic molecular pathways, which helps to understand the complex machinery of biological systems (98, 99). We used these gene expression profiling methods to identify genes that are differentially expressed in the PBMCs of COVID-19 patients with mild or severe infections compared with healthy volunteers. Zhang and Diao submitted the dataset to the GEO database. They illustrated the antiviral and inflammation mechanisms related to the immune response in severe COVID-19 patients (100). However, we analyzed their dataset differently. Our study investigated the DEGs in PBMCs of five patients with mild COVID-19, five patients with severe COVID-19, and five healthy volunteers using the GSE164805 dataset. We believe that our findings will help better understand the pathogenesis of SARS-CoV-2 and the host gene response during infection. Here we have performed a comprehensive analysis of the DEGs and a comparison of three groups of human subjects using advanced methods and statistical techniques. We extracted the 250 top-ranked DEGs using GEO2R and further evaluated them. Our study listed the significant DEGs (both protein-coding genes and long non-coding RNAs). Some of the significant DEGs in our study are CERKL, RPL18A, STRN4, RPL3L/RPL35/RPL1BA/RPL19, RPS3/RPS16, AP2M1, EDN1, ARHGEF1, DUS1L, and RBM5, whereas those in the other studies include CXCL1, CXCL6, CXCL8, IL33, TIMP1, IL18, IFNGR2, TRIM27, TRIM28, IFI6, XAF1, CXCL5, IFIT1, IL6, IL10, and CSF2. The analysis will help to understand the DEGs in mild and severe COVID-19 patients. Our protein interactome map analysis found that the number of interactions was 2901, the number of proteins that participated in the interactions was 453, and the average node degree was 12.53. We also developed a protein interactome from Xiong et al.'s study using bronchoalveolar lavage fluid.
We found that the number of interactions was 76, the number of proteins that participated in the interactions was 35, and the average node degree was 4.06. Similarly, the data from the same authors but from different samples (the PBMC samples) were used to develop a protein interactome map. The analysis revealed that the number of interactions was 103, the number of proteins that participated in the interactions was 62, and the average node degree was 3.11. The protein interactome developed from the study of Ziegler et al. showed that the number of interactions was 2613, the number of proteins that participated in the interactions was 504, and the average node degree was 10.15. Finally, we developed a protein interactome from the study of Jain et al. We found that the number of interactions was 437, the number of proteins that participated in the interactions was 128, and the average node degree was 6.56. The analysis provides global insights into genome function and cellular organization in COVID-19 patients by revealing their interactome networks. The enrichment cluster analysis of the PPI network showed that MPHOSPH6, RPL19, EIF3G, EIF3E, RPL1BA, EXOSC5, EXOSC2, RPL3L, RPS3, RPS16, RPL35, EIF4G1, and SKIV2L2 firmly formed the central cluster of this PPI network. However, some side clusters associated with proteins such as SKIV2L2, EIF3G, EXOSC2, and RPS16 were also noted. Our analysis also identified 24 significant lncRNA genes, which will help to understand the differentially expressed lncRNA genes and help future researchers learn more about SARS-CoV-2-infected human cells. Moreover, we found six lncRNAs (TALAM1, DLEU2, UICLM, CASC18, SNHG20, and GNAS) involved in the protein-coding DEG-lncRNA network. This finding is significant from the point of view of next-generation biomarker detection.
Finally, the analysis of gene function annotation and categorization of regulated genes related to immune response elements found that several genes, such as EDN1, MPHOSPH6, and RPL19, are directly or indirectly associated with the immune defense mechanism. The EDN1 gene is associated with the TLR4 response (54). RPL19 might be related to TLR3 receptor-associated signaling and promotes cytokine secretion (101). Moreover, our analysis (of our genes and the genes from the other studies) found genes that are associated with cytokine up-regulation. The study generated an interactome map, a PPI pattern, a cluster analysis of the PPI network, and a pathway enrichment analysis from our research and other experimental investigations to understand the gene expression and transcriptome profiling of SARS-CoV-2-infected human cells in mild and severe COVID-19 patients. We identified the differentially expressed lncRNA genes of COVID-19 patients and constructed the DEG-lncRNA co-expression networks. We also found six lncRNAs that are involved in the protein-coding DEG-lncRNA network. This effort will help to understand lncRNA expression in mild and severe COVID-19 patients. These lncRNAs might serve as potential next-generation biomarkers for COVID-19 patients. Presently, one significant objective worldwide is to understand the dysregulation of immune response and inflammation, immunity, and intervention in COVID-19 patients. We attempted to understand the gene [...] Previously, to understand the host transcriptional responses to SARS-CoV-2, an interactome study was performed and reported by Messina et al. (102). Their study suggested that the host interactome is linked to the S-glycoprotein of the virus mainly via the innate immunity machinery, such as cytokines, chemokines, and TLRs. We, too, found in our interactome study that several proteins from COVID-19 patients are linked to innate immunity and the regulation of protective immunity.
However, our report is more detailed because our analysis of protein-coding DEGs considers data not only from our group but also from several interactome studies from around the globe, and the patient samples from which the data were obtained were diverse. Recently, Gordon [...] (104). The investigation also reveals new therapeutic targets and will be beneficial for discovering therapeutics. The construction of a PPI network from patient samples is of immense advantage for understanding disease mechanisms (105–107). Zhang et al. recently developed a SARS-CoV-2 virus-human PPI network using the random walk model to understand pathological biomarkers (108). We have developed human PPI networks from independent studies utilizing samples collected from COVID-19 patients presenting mild and severe symptoms. They comprise the upregulated protein-coding genes and PPIs in COVID-19 patients from an entire proteome landscape. Enrichment cluster analysis shows densely interlinked regions of proteins, both intra-cluster and inter-cluster, in the global proteome landscape and in the proteome related to innate immunity and protective immunity in response to SARS-CoV-2 infection. We used the MCODE plug-in from Cytoscape for the PPI network construction and enrichment cluster analysis. We performed functional pathway enrichment analysis to understand both the COVID-19 regulated genes and the target genes. We report several genes that are regulated in COVID-19 and target genes related to immune cell activation, such as the T- and B-cell-activating protein-coding genes (109). Other researchers have also performed pathway enrichment analysis, for example to understand the lncRNA prognostic signature of ovarian cancer (110). Our functional pathway enrichment analysis may inform us about the significant immune marker genes in COVID-19 patients.
Our results also corroborate the study of Wu et al., who performed functional enrichment analysis to understand the possible role of naïve B cells as a trigger for severe immune responses in the lungs of COVID-19 patients (111). Our protein-coding DEG-lncRNA co-expression network pattern revealed the prospective function of differentially expressed lncRNAs in the context of COVID-19. Recently, Hu et al. constructed a co-expression network using DEG-lncRNA pairs to understand lncRNAs and proteins in hypertrophic cardiomyopathy (112). Our analysis of protein-coding DEG-lncRNA pairs revealed that six lncRNAs participate in the protein-coding DEG-lncRNA network (TALAM1, DLEU2, UICLM, CASC18, SNHG20, and GNAS). These lncRNAs may serve as potential biomarkers for COVID-19. However, further functional studies in a larger cohort of patients are needed. It is critical to understand the dysregulation of immune response and inflammation, immunity, and intervention in COVID-19 patients (113). Recently, Zhou et al. mapped DEGs involved in innate immunity from COVID-19 patients (25). We have annotated and categorized the function of genes related to the regulation of immune elements and protective immunity through the analysis of DEGs in COVID-19, and we present a list of the immune system's regulatory genes and immune-related transcripts regulated in COVID-19. The main implications of the study can be summarized as follows. First, the study characterizes the gene expression and transcriptome profiles of mild and severe COVID-19 patients. Second, the six lncRNAs noted in our analysis might serve as next-generation biomarkers for COVID-19, although this requires functional validation in a larger cohort of patients.
Finally, the study prepared a detailed list of the immune system's regulatory genes and immune-related transcripts in COVID-19, which has important implications for understanding the dysregulation of the immune response and intervention in COVID-19 patients. The study is limited by its relatively small sample size, which reflects the available datasets. The dataset we used (GSE164805) contains fifteen human subjects (five controls, five mild COVID-19 patients, and five severe COVID-19 patients) and was submitted by other researchers (100). The dataset is also limited in that it was collected only from peripheral blood mononuclear cell (PBMC) samples from the three groups of human subjects. No similar dataset that uses PBMCs from COVID-19 patients and compares them with controls is available in the database. This dataset was the first in the GEO database to capture gene expression data from these three groups of human subjects and to report differential gene expression across all three groups. The gene expression data were analyzed using a microarray platform. The dataset was submitted to the database early (January 2021), when no gene expression data were available for all three groups. From this point of view, it is a very significant dataset. However, because the sample size was small, we compared our results with three other COVID-19 gene expression studies, performed by Xiong et al. (26), Ziegler et al. (27), and Jain et al. (28). Here, we report DEG data from COVID-19 patients that will help to understand global gene expression in COVID-19 patients.
The data provide valuable information about the immune response in patients infected with SARS-CoV-2, highlighting the molecular genetic mechanisms related to immune elements and protective immunity against COVID-19. We acknowledge that only a limited number of patient datasets were analyzed to map the DEGs. In the future, we plan to perform an scRNA-seq study in this direction, which will help to better understand the underlying gene expression mechanisms of severe COVID-19 patients compared with mild patients. We believe that similar studies with more patient datasets from other parts of the world will significantly augment our understanding of this complex host-virus interaction during COVID-19 disease progression and will help to map the genes involved in protective immunity.

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

CC analyzed and interpreted the patient dataset from the GEO, performed the main experiments, and wrote the main manuscript. ARS performed the data validation, formal analysis, review, and editing of the manuscript. MB performed the data validation and formal analysis. HZ and S-SL performed review and editing of the manuscript. All authors contributed to the article and approved the submitted version.
Substantial Undocumented Infection Facilitates the Rapid Dissemination of Novel Coronavirus (SARS-CoV-2) Response to: Status of Remdesivir: Not Yet Beyond Question Second Wave COVID-19 Pandemics in Europe: A Temporal Playbook Beware of the Second Wave of COVID-19 Real-Time Monitoring Shows Substantial Excess All-Cause Mortality During Second Wave of COVID-19 in Europe Systems Biological Assessment of Immunity to Mild Versus Severe COVID-19 Infection in Humans Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention Viral Dynamics in Mild and Severe Cases of COVID-19 Mild Versus Severe COVID-19: Laboratory Markers Neutrophil to Lymphocyte Ratio and C-Reactive Protein Level as Prognostic Markers in Mild Versus Severe COVID-19 Patients SARS-CoV-2 and COVID-19 in Older Adults: What We May Expect Regarding Pathogenesis, Immune Responses, and Outcomes Immunology of COVID-19: Current State of the Science Current Status of Epidemiology, Diagnosis, Therapeutics, and Vaccines for Novel Coronavirus Disease 2019 (COVID-19) Development of Epitope-Based Peptide Vaccine Against Novel Coronavirus 2019 (SARS-COV-2): Immunoinformatics Approach A SARS-CoV-2 Vaccine Candidate: In-Silico Cloning and Validation Repurposing Drugs, Ongoing Vaccine and New Therapeutic Development Initiatives Against COVID-19 SARS-CoV-2 Protein Drug Targets Landscape: A Potential Pharmacological Insight View for the New Drug Development Molecular Networks as Sensors and Drivers of Common Human Diseases Integrating Gene Expression and Protein-Protein Interaction Network to Prioritize Cancer-Associated Genes Mapping Complex Disease Traits With Global Gene Expression Transcriptomics Technologies RNA-Seq Analysis of Chikungunya Virus Infection and Identification of Granzyme A as a Major Promoter of Arthritic Inflammation RNA-Seq Signatures Normalized by mRNA Abundance 
Allow Absolute Deconvolution of Human Immune Cell Types Heightened Innate Immune Responses in the Respiratory Tract of COVID-19 Patients Transcriptomic Characteristics of Bronchoalveolar Lavage Fluid and Peripheral Blood Mononuclear Cells in COVID-19 Patients SARS-CoV-2 Receptor ACE2 Is an Interferon-Stimulated Gene in Human Airway Epithelial Cells and Is Detected in Specific Cell Subsets Across Tissues Host Transcriptomic Profiling of COVID-19 Patients With Mild, Moderate, and Severe Clinical Outcomes Analysis of Microarray Experiments of Gene Expression Profiling Microarray and Its Applications Genomic Modulators of the Immune Response A Global Effort to Define the Human Genetics of Protective Immunity to SARS-CoV-2 Infection Immune-Mediated Approaches Against COVID-19 Archive for Functional Genomics Data Sets-Update Limma Powers Differential Expression Analyses for RNA-Sequencing and Microarray Studies Determination of the Differentially Expressed Genes in Microarray Experiments Using Local FDR Robust Volcano Plot: Identification of Differential Metabolites in the Presence of Outliers Interpretation of Differential Gene Expression Results of RNA-Seq Data: Review and Integration A Reference Map of the Human Binary Protein Interactome The STRING Database in 2017: Quality-Controlled Protein-Protein Association Networks, Made Broadly Accessible Protein-Protein Association Networks With Increased Coverage, Supporting Functional Discovery in Genome-Wide Experimental Datasets Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks Biological Network Exploration With Cytoscape 3 Metascape Provides a Biologist-Oriented Resource for the Analysis of Systems-Level Datasets A Scored Human Protein-Protein Interaction Network to Catalyze Genomic Interpretation BioGRID: A General Repository for Interaction Datasets Detection of the Number of Signals Using the Benjamini-Hochberg Procedure A Travel Guide to Cytoscape Plug-Ins RNAcentral: A Hub 
of Information for Non-Coding RNA Sequences GeneCards Version 3: The Human Gene Integrator The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses Evolution of Male Pregnancy Associated With Remodeling of Canonical Vertebrate Immunity in Seahorses and Pipefishes Histone Deacetylase 7 Promotes Toll-Like Receptor 4-Dependent Proinflammatory Gene Expression in Macrophages Loss of ARHGEF1 Causes a Human Primary Antibody Deficiency Role for CXCR2 and CXCL1 on Glia in Multiple Sclerosis Neutrophils Self-Regulate Immune Complex-Mediated Cutaneous Inflammation Through CXCL2 ORIGINAL ARTICLE: CXCL6 (Granulocyte Chemotactic Protein-2): A Novel Chemokine Involved in the Innate Immune Response of the Amniotic Cavity The CXCL8/IL-8 Chemokine Family and Its Receptors in Inflammatory Diseases Interleukin (IL)-33 Induces the Release of Pro-Inflammatory Mediators by Mast Cells The Emerging Role of CXCL10 in Cancer (Review) MCP-1: Chemoattractant With a Role Beyond Immunity: A Review CXC Chemokines That Target Lymphocytes Chemokines as Adjuvants for Immunotherapy: Implications for Immune Activation With CCL3 B Cells and Professional APCs Recruit Regulatory T Cells via CCL4 A Co-Evolution Perspective of the TNFSF and TNFRSF Families in the Immune System TIMP-1 Promotes the Immune Response in Influenza-Induced Acute Lung Injury Production of Complement Components by Cells of the Immune System Annotating Genes With Potential Roles in the Immune System: Six New Members of the IL-1 Family Neuregulin-1 Elicits a Regulatory Immune Response Following Traumatic Spinal Cord Injury Transcriptional Regulation of the Anti-Inflammatory Cytokine IL-10 in Acquired Immune Cells Human Adenosine Deaminases ADA1 and ADA2 Bind to Different Subsets of Immune Cells GABA Is an Effective Immunomodulatory Molecule LAIR-1, A Novel Inhibitory Receptor Expressed on Interferon-g: An Overview of Signals, Mechanisms and Functions TRIM Family Proteins and Their Emerging Roles in Innate 
Immunity Evidence That TMPRSS2 Activates the Severe Acute Respiratory Syndrome Coronavirus Spike Protein for Membrane Fusion and Reduces Viral Control by the Humoral Immune Response Phosphorylation of TRIM28 Enhances the Expression of IFN-b and Proinflammatory Cytokines During HPAIV Infection of Human Lung Epithelial Cells ApoA-I), Immunity, Inflammation and Cancer Chapter Four -STAT Transcription Factors in T Cell Control of Health and Disease Association Between Interferon-Inducible Protein 6 (IFI6) Polymorphisms and Hepatitis B Virus Clearance The IFITM Protein Family in Adaptive Immunity Regulation of Innate Immune Functions by Guanylate-Binding Proteins CXCL5 Drives Neutrophil Recruitment in TH17-Mediated GN CXCL12 Is a Constitutive and Inflammatory Chemokine in the Intestinal Immune System The Effects of Immune Cell Products (Cytokines and Hematopoietic Cell Growth Factors) on Bone Cells Ifih1 Gene Dose Effect Reveals MDA5-Mediated Chronic Type I IFN Gene Signature, Viral Resistance, and Accelerated Autoimmunity Interferon-Induced Ifit Proteins: Their Role in Viral Pathogenesis The Role of Interleukin 6 During Viral Infections IL-12 and IL-10 Polymorphisms and Their Effects on Cytokine Production NF-kb Controls Il2 and Csf2 Expression During T Cell Development and Activation Process The Immune System, Bone and RANKL Modulation of Bone Morphogenic Protein Signaling in T-Cells for Cancer Immunotherapy Complement Inhibitor C4b-Binding Protein-Friend or Foe in the Innate Immune System? An Immune Paradox: How Can the Same Chemokine Axis Regulate Both Immune Tolerance and Activation? 
CCR6/CCL20: A Chemokine Axis Balancing Immunological Tolerance and Inflammation in Autoimmune Disease Interleukin-11: Review of Molecular, Cell Biology, and Clinical Use Human Interleukin-19 and Its Receptor: A Potential Role in the Induction of Th2 Responses Integrative Bioinformatics Approaches to Map Potential Novel Genes and Pathways Involved in Ovarian Cancer Analysis of Differentially Expressed Genes and Molecular Pathways in Familial Hypercholesterolemia Inflammation and Antiviral Immune Response Associated With Severe Progression of COVID-19 Ribosomal Protein L19 and L22 Modulate TLR3 Signaling Looking for Pathways Related to COVID-19: Confirmation of Pathogenic Mechanisms by SARS-CoV-2-Host Interactome A SARS-CoV-2 Protein Interaction Map Reveals Targets for Drug Repurposing Proteomics of SARS-CoV-2-Infected Host Cells Reveals Therapy Targets Landscape Mapping of Functional Proteins in Insulin Signal Transduction and Insulin Resistance: A Network-Based Protein-Protein Interaction Analysis Protein-Protein Interaction Networks: Probing Disease Mechanisms Using Model Systems Evaluating Protein-Protein Interaction (PPI) Networks for Diseases Pathway, Target Discovery, and Drug-Design Using Identification of COVID-19 Infection-Related Human Genes Based on a Random Walk Model in a Virus-Human Protein Interaction Network The Transcriptomic Profiling of SARS-CoV-2 Compared to SARS, MERS, EBOV, and H1N1 Identification Three LncRNA Prognostic Signature of Ovarian Cancer Based on Genome-Wide Copy Number Variation Silico Immune Infiltration Profiling Combined With Functional Enrichment Analysis Reveals a Potential Role for Naïve B Cells as a Trigger for Severe Immune Responses in the Lungs of COVID-19 Patients Identification of Key Proteins and lncRNAs in Hypertrophic Cardiomyopathy by Integrated Network Analysis The Trinity of COVID-19: Immunity, Inflammation and Intervention Conflict of Interest: The authors declare that the research was conducted in the absence of 
any commercial or financial relationships that could be construed as a potential conflict of interest.