key: cord-0022601-xoew9mpc
authors: Wang, Yingshuang; Feng, Feifei; Zheng, Pingping; Wang, Lijuan; Wang, Yanjun; Lv, Yaogai; Shen, Li; Li, Kexin; Feng, Tianyu; Chen, Yang; Liu, Zhigang; Yao, Yan
title: Dysregulated lncRNA and mRNA may promote the progression of ischemic stroke via immune and inflammatory pathways: results from RNA sequencing and bioinformatics analysis
date: 2021-10-26
journal: Genes Genomics
DOI: 10.1007/s13258-021-01173-1
sha: 736a3e39218830677a1d8009b0383b95fab60f66
doc_id: 22601
cord_uid: xoew9mpc

BACKGROUND: Long non-coding RNAs (lncRNAs) are widely involved in the regulation of gene transcription and act as epigenetic modifiers in many diseases.
OBJECTIVE: To determine whether lncRNAs are involved in ischemic stroke (IS), we analyzed the expression profiles of lncRNAs and mRNAs in IS.
METHODS: RNA sequencing was performed on the blood of three pairs of IS patients and healthy controls. Differential expression analysis was used to identify differentially expressed lncRNAs (DElncRNAs) and mRNAs (DEmRNAs). Based on the co-expression relationships between lncRNAs and mRNAs, a series of bioinformatics analyses, including GO and KEGG enrichment analysis and PPI analysis, were conducted to predict the function of the lncRNAs.
RESULTS: RNA sequencing identified a total of 5 DElncRNAs and 144 DEmRNAs. The Influenza A pathway and the Herpes simplex infection pathway were the most significant pathways. EP300 and NFKB1 were the most important target proteins, and members of the human leucocyte antigen (HLA) family were the key genes in IS.
CONCLUSIONS: This study suggests that dysregulated lncRNAs may contribute to IS by affecting the immune and inflammatory systems.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s13258-021-01173-1.

Ischemic stroke (IS) is a cardiovascular disease and a leading cause of death and disability worldwide, especially in developing countries (Strong et al. 2007).
Cardiovascular and cerebrovascular diseases, diabetes, dyslipidemia, obesity, smoking, drinking, and anticoagulants may be the main factors affecting the occurrence of IS (Lavados et al. 2005; O'Donnell et al. 2010). These factors may lead to IS through vascular and neuronal damage, dysfunction of molecular signaling pathways, inflammatory cytokine damage, and oxidative stress damage (Chaitanya et al. 2013; Jin et al. 2013; Li et al. 2018). Focusing on the mechanisms of IS injury, repair and inflammation, and on the molecular mechanisms that may underlie them, may provide a stronger theoretical basis for IS screening, prevention and treatment.

Biomarkers associated with stroke are mainly proteins associated with pathological processes such as inflammation-related damage, neuronal apoptosis, and vascular endothelial dysfunction (Serena et al. 2005; An et al. 2013). However, because protein composition is complex and proteins undergo various post-translational modifications, hydrolysis and denaturation, it is difficult to select an appropriate method for accurate detection (Calligaris et al. 2011; Lam et al. 2016). The physical and chemical properties of traditional biomarkers are unstable and poorly conserved, and their abundance depends on the regulation of gene expression. Therefore, variation in gene expression can reasonably be used for diagnostic prediction in the early stages of the disease (Ebert et al. 2006; Sepramaniam et al. 2014).

LncRNAs are a group of non-protein-coding RNA molecules whose transcripts are more than 200 nucleotides in length (Yang et al. 2014). LncRNAs regulate DNA methylation, histone modification and chromosome remodeling at multiple levels through diverse molecular mechanisms, and participate in many important biological regulatory processes such as genomic imprinting, transcriptional interference and nuclear transport (Wilusz et al. 2009; Chen and Carmichael 2010; Marchese et al. 2016, 2017).
Current lncRNA research covers many fields, such as tumors, blood system diseases, and cardiovascular and cerebrovascular diseases, especially ischemic and hypoxic diseases (Harries 2012; Li and Chen 2013; Sánchez and Huarte 2013). LncRNA UCA1 can be used as a potential diagnostic marker and therapeutic target for acute myocardial infarction (Yan et al. 2016), and lncRNA ANRIL is associated with susceptibility to atherosclerotic disease and can be used as a diagnostic marker (Holdt and Teupser 2018). Changes in the expression of certain lncRNAs after ischemia-reperfusion injury may be biomarkers of ischemia-reperfusion injury during liver surgery or transplantation. In addition, lncRNAs are also considered effective in the treatment of ischemic diseases by promoting stem cell differentiation and preventing erythrocyte apoptosis (Yang and Lu 2016; Zhang et al. 2016). Although the regulatory mechanisms of lncRNAs and their relationship with some ischemic and hypoxic diseases are preliminarily understood, the lncRNA-mediated regulatory network of gene expression and its molecular mechanism in IS remain to be explored. To determine the functional significance of lncRNAs in the pathophysiological regulation of IS, RNA sequencing technology, which is superior to microarrays, was used to analyze lncRNA and mRNA expression profiles in this study.

Three patients with a first diagnosis of IS and three healthy controls, all Han Chinese males aged 40-60 years (Supplementary Table 1), were recruited from the hospital of Jilin University from July to December 2017. IS patients were diagnosed for the first time based on the "Chinese guidelines for diagnosis and treatment of acute ischemic stroke 2014", and none of them had a history of using antiplatelet or antidiabetic agents.
All subjects with a history of diabetes mellitus, atrial fibrillation, myocardial infarction, tumor, acute infectious disease, immune disease, blood disease, renal or liver failure, hemorrhagic stroke or recurrent stroke were excluded. Written informed consent was obtained. The study was approved by the Ethics Committee of the School of Public Health, Jilin University. The design flow chart of this study is shown in Fig. 1.

Peripheral blood samples were collected the next morning after the participants had fasted for ten hours or overnight. Total RNA was isolated and purified using standard TRIzol Reagent (Invitrogen, Carlsbad, CA) according to the manual. RNA concentration, purity and integrity were then assessed using the Nanodrop2000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE) and an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). The cDNA library was constructed using the Truseq RNA sample Prep Kit (Illumina, Inc., San Diego, CA). RNA sequencing (2 × 150 bp) was conducted on the Illumina HiSeq 4000 sequencing system (Illumina, Inc.).

Quality-control filtering was performed by removing the adapter sequences and low-quality bases to generate high-quality clean reads. The clean reads were mapped to the reference genome using HISAT2 v2.0.4 (Pertea et al. 2016), and reference-based assembly of transcripts was performed using StringTie v1.3.1 (Pertea et al. 2015). The reference genome sequence and annotation files were downloaded from GENCODE (GRCh38, https://www.gencodegenes.org/). The lncRNAs were screened according to the number of exons, length, known annotation and coding potential of the transcripts. Transcripts including mRNAs, lncRNAs and transcripts of unknown coding potential (TUCPs) were then quantified with StringTie (-eB).
To identify differentially expressed lncRNAs and mRNAs between IS patients and controls, differential expression analyses were performed with the "ballgown" package of R software (Pertea et al. 2016). For statistical testing, the different types of transcripts (lncRNAs, TUCPs and mRNAs) were analyzed together, so that the results were not biased toward any molecular type. Transcripts with a P-value < 0.05 and |log2foldchange| > 1 were considered differentially expressed.

Gene Ontology (GO) enrichment analysis of differentially expressed genes or lncRNA target genes was implemented with the "GOseq" R package, which corrects for gene length bias (Young et al. 2010). GO terms with a corrected P-value < 0.05 were considered significantly enriched. Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database resource for understanding high-level functions and utilities of the biological system (Kanehisa et al. 2008). We used KOBAS software to test the statistical enrichment of differentially expressed genes or lncRNA target genes in KEGG pathways (Mao et al. 2005). The P-value was used to determine whether a pathway was significant.

Protein-Protein Interaction (PPI) analysis of differentially expressed genes was based on the online database resource STRING (https://string-db.org/cgi/input.pl), which contains known and predicted interactions between proteins. We used the mRNAs co-expressed with DElncRNAs to construct the PPI network and visualized it in Cytoscape 3.6.1 (Shannon et al. 2003). The GO/KEGG enrichment pathway network was constructed from all differentially expressed mRNAs with the "ClueGO" plug-in in Cytoscape 3.6.1 (Bindea et al. 2009). ClueGO integrates GO terms and KEGG/BioCarta pathways and provides a comprehensive visualization by creating a GO/KEGG pathway network.
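The differential-expression criterion stated above (P-value < 0.05 and |log2foldchange| > 1) amounts to a two-condition filter on each transcript. The following is a minimal Python sketch of that rule only; it is not the ballgown workflow used in the study. The `Transcript` type, the `classify` function and the toy values are hypothetical and serve purely as illustration.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    """One row of a differential-expression result table (hypothetical layout)."""
    name: str
    log2_fold_change: float
    p_value: float


def classify(t: Transcript, p_cutoff: float = 0.05, lfc_cutoff: float = 1.0) -> str:
    """Label a transcript 'up', 'down' or 'not_de' using the stated thresholds.

    Both inequalities are strict, matching "P-value < 0.05" and
    "|log2foldchange| > 1".
    """
    if t.p_value < p_cutoff and abs(t.log2_fold_change) > lfc_cutoff:
        return "up" if t.log2_fold_change > 0 else "down"
    return "not_de"


# Toy values, invented for illustration; they are not data from the study.
toy = [
    Transcript("tx_A", 2.3, 0.004),   # passes both thresholds, positive change
    Transcript("tx_B", -1.6, 0.020),  # passes both thresholds, negative change
    Transcript("tx_C", 0.8, 0.001),   # significant, but fold change too small
    Transcript("tx_D", -3.1, 0.200),  # large fold change, but not significant
    Transcript("tx_E", 1.0, 0.010),   # |log2FC| equals 1, so the strict test fails
]

calls = {t.name: classify(t) for t in toy}
print(calls)
```

Because both tests are strict, a transcript sitting exactly on a threshold (such as `tx_E`) is not called differentially expressed.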
Gene expression profiling in the GSE22255 dataset was performed on peripheral blood mononuclear cells of 20 IS patients and 20 sex- and age-matched controls using the GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array. To be consistent with the design of this study, we downloaded only the expression matrix of male cases and male healthy controls (10 vs. 10), used the R "limma" package (Ritchie et al. 2015) for standardization and differential expression analysis, and obtained the log2(foldchange) values and P-values to validate the significance of the RNAs that we had found.

After filtering out reads containing adapters, reads with more than 10% N bases and low-quality reads from the raw reads, 97,077,054, 86,224,578 and 104,855,436 clean reads were produced in the IS patients, and 97,740,922, 97,583,408 and 100,893,034 clean reads were produced in the control group (Supplementary Table 2). The percentage of total mapped reads was 94.96%, 95.37% and 96.44% in the IS patient group, and 95.34%, 97.23% and 96.32% in the control group, respectively. The total mapped rate was more than 90%, indicating that the choice of reference genome was appropriate and that there was no contamination in the experiment (Supplementary Table 3).

A total of five lncRNAs were differentially expressed in IS, including 4 up-regulated lncRNAs and 1 down-regulated lncRNA (Table 1). There were 144 differentially expressed mRNAs, of which 70 were up-regulated and 74 were down-regulated; the top 10 of each are listed in Table 2.

LncRNAs can regulate the expression of target coding genes to exert their function (Bazin et al. 2017). We analyzed the function of the mRNAs co-expressed with lncRNAs to clarify the biological role of the DElncRNAs. The lncRNA-mRNA co-expression network (Supplementary Fig. 1) was established to show their co-expression relationships; the network comprised 946 nodes and 1009 connections.
The most crucial sub-network was constructed by the transcripts with

The PPI network (Fig. 3) contains 417 nodes and 810 edges; protein interactions were retained at the highest confidence score (0.9). Degree was used to assess the importance of genes, and the genes with the top 30 degrees were listed. EP300 and NFKB1 were regarded as the two most important target genes regulating protein expression (Fig. 4).

We used all DEmRNAs for the GO and KEGG analysis and constructed a GO/KEGG pathway network to illustrate the critical mRNAs in the process of IS. The network comprised 13 groups composed of GO terms and KEGG pathways, and these functional groups with their genes are shown in Table 3. HLA-DQB1, HLA-DQA1 and HLA-DRB5 were the key genes, overlapping in at least four pathway groups, and all of them belong to the HLA family (Fig. 5). In addition, these three genes were mainly involved in the inflammatory bowel disease (IBD) pathway and the asthma pathway, both in the Th17 cell differentiation group.

Five valuable coding genes and three DElncRNAs in this study were verified. Because the earlier microarray did not detect all of the genes in this study, we could only verify the expression of seven IS-related genes. EP300, NFKB1, SNHG8 and MIRLET7BHG met the differential expression criterion of P < 0.1, and HLA-DQB1, HLA-DQA1 and HLA-DRB5 met the criterion of |log2foldchange| > 1 (Table 4).

In this study, 5 DElncRNAs were identified in IS, among which LNC_000015 and LNC_001727 were novel lncRNAs, and SNHG8, MIRLET7BHG and AF001548.5 are annotated lncRNAs in the database. After a series of bioinformatics analyses, our main finding was that dysregulated lncRNAs and mRNAs in IS vs. controls may lead to IS by affecting the immune system of the body.

There are few published RNA-Seq studies of IS patients, and some of these studies did not strictly control variables, so a large number of differentially expressed lncRNAs were identified.
Some of these differentially expressed genes may be related not to IS itself but to comorbid diseases, owing to selection bias. In our study, only five differentially expressed lncRNAs were identified after setting strict inclusion and exclusion criteria for the patients and the control group.

Among the five differentially expressed lncRNAs, lncRNA SNHG8 has been widely reported. Up-regulation of lncRNA SNHG8 is a risk factor for many diseases. It plays an important regulatory role in the occurrence and development of acute myocardial infarction (Zhuo et al. 2019), a variety of cancers (such as liver cancer, pancreatic cancer, nasopharyngeal carcinoma, endometrial cancer, esophageal squamous cell carcinoma, ovarian cancer, gastric cancer, breast cancer, cervical cancer, non-small cell lung cancer, prostate cancer, etc.) (Yuan et al. 2021) and ischemic diseases (Liu et al. 2019; Tian et al. 2020). Liu et al. (2019) verified that the Snhg8/miR-384/Hoxa13/FAM3A axis regulates neuronal apoptosis in an ischemic mouse model, and Tian et al. (2020) then showed that lncRNA Snhg8 attenuates the microglial inflammatory response and blood-brain barrier damage in IS by regulating miR-425-5p-mediated SIRT1/NF-κB signaling. Similarly, we found the hub genes EP300 and NFKB1 in the PPI network constructed from the mRNAs co-expressed with DElncRNAs. The conclusion of Tian et al. (2020) is consistent with our study. Although the other four DElncRNAs have not been reported to be associated with IS, they may serve as potential biomarkers and therapeutic targets of IS, providing direction for future research.

Fig. 3 caption (beginning missing): ... Tool for the Retrieval of Interacting Genes/Proteins (STRING), based on the mRNAs co-expressed with DElncRNAs. The larger the circle and the redder the color, the greater the degree of the gene and the more genes connected to it. Conversely, the smaller the circle and the bluer the color, the smaller the degree of the gene and the fewer genes connected to it.
Stroke remains a severe challenge for both developed and developing countries and poses a substantial socioeconomic burden (Addo et al. 2012). It is urgent to clarify its pathogenesis and control it effectively. KEGG analysis found that the Influenza A and Herpes simplex infection pathways were the two most significant pathways in this study, both related to immunity and inflammation. Inflammation is considered to be involved in all forms of brain damage, and immune mechanisms play an important role in the risk and progression of stroke and in cerebral ischemia (Smith et al. 2013; Fu et al. 2015). Recent data from clinical and experimental research clearly show that systemic inflammatory diseases such as atherosclerosis, obesity, diabetes or infection are associated with dysregulated immune responses, which can profoundly contribute to cerebrovascular inflammation and injury in the central nervous system (Ling et al. 2015). Some findings have shown that several immune-mediated diseases (IMDs) are linked to cerebrovascular diseases, and hospitalization for many IMDs has been shown to be associated with an increased risk of ischemic or hemorrhagic stroke (Zöller et al. 2012; Cho et al. 2014).

Evidence suggests that acute bacterial and viral infections are prime factors for an increased risk of IS (Urbanek et al. 2010), and mortality from vascular disease and hospitalization for stroke increased during and after the influenza pandemic. Influenza vaccination can reduce hospitalization and mortality in the elderly and prevent incapacity in working-aged adults (Madjid et al. 2009). Herpes simplex virus type 2 (HSV-2) has been regarded as a cause of IS, and researchers found that untreated HSV-2 meningitis could lead to vascular inflammation and ultimately to IS (Snider et al. 2014; Zis et al. 2016).
The pathophysiological mechanism of stroke caused by varicella zoster virus (VZV) infection is believed to be similar to that of IS caused by HSV infection of the central nervous system, and the prevention of VZV infection is considered a treatable factor for transient IS (Nagel and Gilden 2014).

In this study, EP300 and NFKB1 were the hub target genes in the PPI network, which regulate protein expression in IS. EP300, also known as histone acetyltransferase p300 or p300, is an enzyme encoded by the EP300 gene (Eckner et al. 1994). It regulates the transcription of genes via chromatin remodeling, has been found to play an essential role in the biological function of regulatory T cells, and is expected to be used in cancer immunotherapy in the future (Ghosh et al. 2016). To date, no studies have described a mechanism of action of EP300 related to IS. However, given the important role of EP300 in the PPI network, we believe that EP300 deserves more attention in future studies of the molecular mechanism of IS. NFKB1, a factor that suppresses inflammation, aging and cancer, is thought to be associated with cerebral ischemia-reperfusion injury, and a study of Korean adults found that genetic polymorphisms of NFKB1 are associated with stroke susceptibility (Cartwright et al. 2016; Kim et al. 2018; Zhu et al. 2018). NFKB1 was confirmed as a potential biomarker for the diagnosis and treatment of IS in the study of Liang (2015), indicating that the PPI network in this study predicts hub genes with a certain accuracy.

From the network of GO/KEGG pathways, we found that HLA-DQB1, HLA-DQA1 and HLA-DRB5 were the key genes, overlapping in at least four pathway groups; in addition, all of them belong to the human leucocyte antigen (HLA) family.
A previous study of IS patients in China showed that the HLA-DRB1*04, HLA-DRB1*03 and HLA-DRB1*12 alleles have protective effects against stroke (Liu et al. 2011). Similarly, a study of South Indian patients reported an association of HLA-DRB1/DQB1 alleles and haplotypes with IS (Murali et al. 2016). Interestingly, we found that the HLA genes (HLA-DQB1, HLA-DQA1, HLA-DRB5) were simultaneously connected to the inflammatory bowel disease (IBD) pathway. It is worth noting that IBD was found to increase the risk of IS in a retrospective cohort study of a Taiwanese population, because the systemic inflammatory burden caused by IBD may be a key determinant of atherosclerotic thrombosis (Huang et al. 2014). This finding is supported by previous studies showing a disease severity-dependent increase in the risk of stroke and myocardial infarction among patients with other chronic inflammatory diseases such as rheumatoid arthritis and psoriasis (Solomon et al. 2010; Ahlehoff et al. 2012).

This is a preliminary exploratory study, and further experiments are needed to validate the sequencing results. We included a total of 6 subjects (3 patients vs.
3 controls), the sample size of RNA sequencing

Table 3 ClueGO results: functional groups with genes
Bold genes were the key genes that overlapped in at least four pathway groups, and all of them belong to the HLA family
Groups | Group genes
Apoptosis (Group01) | ACTG1, AKT1, BIRC2, CAPN2, GZMB, NFKBIA, SPTAN1, TUBA1A
Influenza A (Group09) | ACTG1, AKT1, ARHGDIA, CAMK2G, CCND2, HLA-DQA1, HLA-DQB1, HLA-DRB5, IL2RG, MAPK14, MX1, NFKBIA, OAS1, OSBPL8, STAT5A
Pertussis (Group02) | C4B, CFL1, MAPK14, SERPING1
S100 protein binding (Group00) | AHNAK, EZR, S100A6
Th17 cell differentiation (Group13) | ACTG1, AKT1, ARHGDIA, BIRC2, C4B, CAMK2G, CCND2, CD2, CD37, GLG1, GZMB, HIST1H2BD, HIST1H2BE, HLA-DQA1, HLA-DQB1, HLA-DRB5, IL2RG, MAPK14, MX1, NFATC2, NFKBIA, OAS1, OASL, RARA, STAT5A, STX5, TAP2, TBKBP1

Funding This work was supported by the Bethune program of Jilin University (No. 470110000715). The funders had no role in the design of the study or in the collection, analysis, and interpretation of data or in writing the manuscript.

Socioeconomic status and stroke: an updated review
Psoriasis and risk of atrial fibrillation and ischaemic stroke: a Danish Nationwide Cohort Study
Limited clinical value of multiple blood markers in the diagnosis of ischemic stroke
Global analysis of ribosome-associated noncoding RNAs unveils new modes of translational regulation
ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks
Advances in top-down proteomics for disease biomarker discovery
NFKB1: a suppressor of inflammation, ageing and cancer
A recombinant inhibitory isoform of vascular endothelial growth factor164/165 aggravates ischemic brain damage in a mouse model of focal cerebral ischemia
Decoding the function of nuclear long non-coding RNAs
Silencing of long noncoding RNA AK139328 attenuates ischemia/reperfusion injury in mouse livers
Impact of thyroid autoantibodies on functional outcome in patients with acute ischemic stroke
Advances, challenges, and limitations in serum-proteome-based cancer diagnosis
Molecular cloning and functional analysis of the adenovirus E1A-associated 300-kD protein (p300) reveals a protein with properties of a transcriptional adaptor
Immune interventions in stroke
Regulatory T Cell Modulation by CBP/EP300 Bromodomain Inhibition
Long non-coding RNAs and human disease
Long noncoding RNA ANRIL: Lncing genetic variation at the chromosome 9p21 locus to molecular mechanisms of atherosclerosis
Inflammatory bowel diseases increase future ischemic stroke risk: a Taiwanese population-based retrospective cohort study
Role of inflammation and its mediators in acute ischemic stroke
KEGG for linking genomes to life and the environment
The promoter polymorphism of NFKB1 gene contributes to susceptibility of ischemic stroke in Korean population
Proteomics research in cardiovascular medicine and biomarker discovery
Incidence, 30-day case-fatality rate, and prognosis of stroke in Iquique
Targeting long non-coding RNAs in cancers: progress and prospects
Oxidative stress and DNA damage after cerebral ischemia: potential therapeutic targets to repair the genome and improve stroke recovery
Identification of autophagy signaling network that contributes to stroke in the ischemic rodent brain via gene expression
Peripheral blood neutrophil cytokine hyper-reactivity in chronic periodontitis
Association of atherosclerotic cerebral infarction and human leukocyte antigen-DRB in a North Chinese Han population
Inhibition of p300 impairs Foxp3+ T regulatory cell function and promotes antitumor immunity
Mechanism of Snhg8/miR-384/Hoxa13/FAM3A axis regulating neuronal apoptosis in ischemic mice model
The influence of oseltamivir treatment on the risk of stroke after influenza infection
Automated genome annotation and pathway identification using the KEGG Orthology (KO) as a controlled vocabulary
A long noncoding RNA regulates sister chromatid cohesion
The multidimensional mechanisms of long noncoding RNA function
Susceptible and protective associations of HLA DRB1*/DQB1* alleles and haplotypes with ischaemic stroke
Risk factors for ischaemic and intracerebral haemorrhagic stroke in 22 countries (the INTERSTROKE study): a case-control study
StringTie enables improved reconstruction of a transcriptome from RNA-seq reads
Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown
limma powers differential expression analyses for RNA-sequencing and microarray studies
Long non-coding RNAs: challenges for diagnosis and therapies
Circulating microRNAs as biomarkers of acute stroke
The prediction of malignant cerebral infarction by molecular brain barrier disruption markers
Cytoscape: a software environment for integrated models of biomolecular interaction networks
The immune system in stroke: clinical challenges and their translation to experimental research
Hemorrhagic and ischemic stroke secondary to herpes simplex virus type 2 meningitis and vasculopathy
Explaining the cardiovascular risk associated with rheumatoid arthritis: traditional risk factors versus markers of rheumatoid arthritis severity
Preventing stroke: saving lives around the world
lncRNA SNHG8 promotes aggressive behaviors of nasopharyngeal carcinoma via regulating miR-656-3p/SATB1 axis
Influenza and stroke risk: a key target not to be missed?
Long noncoding RNAs: functional surprises from the RNA world
Circulating long noncoding RNA UCA1 as a novel biomarker of acute myocardial infarction
Long non-coding RNA HOTAIR promotes ischemic infarct induced by hypoxia through up-regulating the expression of NOX2
Long noncoding RNAs: fresh perspectives into the RNA world
Gene ontology analysis for RNA-seq: accounting for selection bias
Small nucleolar RNA host gene 8: A rising star in the targets for cancer therapy
Altered long non-coding RNA transcriptomic profiles in brain microvascular endothelium after cerebral ischemia
Berberine attenuates ischemia-reperfusion injury through inhibiting HMGB1 release and NF-κB nuclear translocation
LncRNA SNHG8 is identified as a key regulator of acute myocardial infarction by RNA-seq analysis
Herpes simplex virus type 2 encephalitis as a cause of ischemic stroke: case report and systematic review of the literature
Risk of subsequent ischemic and hemorrhagic stroke in patients hospitalized for immune-mediated diseases: a nationwide follow-up study from Sweden