key: cord-0835162-4dxhaxfb authors: Li, Yanchang; Wang, Yihao; Liu, Huiying; Sun, Wei; Ding, Baoqing; Zhao, Yinghua; Chen, Peiru; Zhu, Li; Li, Zhaodi; Li, Naikang; Chang, Lei; Wang, Hengliang; Bai, Changqing; Xu, Ping title: Urine Proteome of COVID-19 Patients date: 2021-03-05 journal: Urine (Amst) DOI: 10.1016/j.urine.2021.02.001 sha: 24fb243333d98a87e4d8b011d4c1231c04581996 doc_id: 835162 cord_uid: 4dxhaxfb
The atypical pneumonia (COVID-19) caused by SARS-CoV-2 is a serious threat to global public health. However, early detection and effective prediction of patients with mild to severe symptoms remain challenging. The urine proteomic profiles of healthy individuals and of mild and severe COVID-19 positive patients with comorbidities can be clearly differentiated. Multiple pathways are compromised after COVID-19 infection, including dysregulation of complement activation, platelet degranulation, lipoprotein metabolic process and response to hypoxia. This study demonstrates that molecular alterations related to COVID-19 pathophysiology can be detected in urine and points to the potential application of urine proteomics in the auxiliary diagnosis of COVID-19.
liver, heart, testis, bladder and kidney, where ACE2 is highly expressed [5, 8, 9, 11-13]. It has been estimated that about 80% of COVID-19 patients experience mild symptoms (M-COVID) and recover with, or even without, conventional medical treatment [10]. However, the remaining 20% of patients, who develop respiratory distress symptoms, may die rapidly without urgent and specialized intensive medical care, including immediate oxygen therapy and mechanical ventilation [14, 15]. Disease stage significantly affects COVID-19 treatment and survivorship. The overall mortality rate for hospitalized patients varied from 2.3% for patients diagnosed at the early stage to 11% at the advanced stage [16].
Unfortunately, the majority of cases are diagnosed at the advanced stage because biomarkers and medical resources are lacking at the early stage. Therefore, it is critical to develop novel approaches to estimate the disease stage of patients in order to select appropriate treatments and allocate scarce medical resources. In addition, novel detection methods that genuinely reflect the underlying changes in the molecular and biological processes of COVID-19 patients would help in understanding SARS-CoV-2 pathogenesis. Blood and urine are frequently used specimens for the discovery of biomarkers of human diseases because of their accessibility and non-invasiveness [17, 18]. The composition of proteins detected in blood and urine samples can genuinely reflect changes in the health condition of the body; thus, they are considered an important and sensitive source for early warning and disease detection [19, 20]. Recently, MS-based serum proteomics studies have been used to predict the severity of COVID-19 infection [18, 21]. Additionally, urinary proteomics analysis showed molecular changes of immunosuppression and tight junction impairment occurring in the early stage of COVID-19 infection [22]. Therefore, a more detailed and comprehensive profiling of the serum or urine proteome of COVID-19 patients will likely support better diagnostics and clinical investigations of this disease.
In this study, we evaluated the diagnostic value of urine samples for the progression from the mild to the severe type of COVID-19, and for the recovery state, using cutting-edge urine proteomics [17, 23]. Six COVID-19 patients were investigated: three diagnosed as severe cases (including one death) and three as mild cases. To confirm the findings derived from the urine proteome, two recovery samples were further analyzed.
We found that proteins related to complement activation and hypoxia were highly up-regulated, while proteins associated with platelet degranulation and with glucose and lipid metabolic processes were markedly down-regulated in the severe type COVID-19 patients. However, the proteins that changed during the infectious phase returned to normal levels in the recovery stage. We propose that urine proteome characterization can potentially be used to distinguish and predict the progression of COVID-19 from the mild to the severe type. These urine proteome characteristics and changes may also shed light on COVID-19 pathogenesis.
In total, we assayed 40 urine specimens that passed quality check (QC), including 32 from healthy controls, 6 from COVID-19 patients and 2 corresponding recovery samples (Figures 1 and S1). All patients tested positive for the presence of SARS-CoV-2 nucleic acid. They all developed either fever or cough. Severe patients showed typical symptoms of fatigue and dyspnea (Figure 1A). All patients had comorbidities, including 4 patients with essential hypertension, 1 patient with both essential hypertension and diabetes, and 1 patient with multiple metastases of colon cancer (died on March 3, 2020) (Figure 1A). According to the diagnostic standards [10], these six patients were categorized into two disease types: three patients were defined as severe type acute respiratory syndrome (S-COVID) and the other three were diagnosed as mild type (M-COVID). The severe COVID-19 patients showed ground-glass opacity in the lungs on computed tomography (CT) scanning (Figure 1B). After treatment, the lung shadows gradually disappeared (Figure 1B). Because patient 4 (P4) had multiple metastases of colon cancer, only an X-ray examination was obtained (Figure S2). Interleukin-6 (IL-6) is an indicator of inflammatory storms [24].
We found that the level of IL-6 in mild patients was 4.73 ± 2.03 pg/mL (mean ± standard deviation), whereas the level of IL-6 in severe patients was significantly higher than the normal standard (≤ 7.0 pg/mL) and fluctuated drastically during the infection, indicating that the stress response to viral infection was more severe in S-COVID patients (Figures 1C and S3). The urine samples were collected after the diagnosis of COVID-19. Four urine samples from healthy controls (H01-H04) were processed in parallel with the samples from COVID-19 patients (Figure 1D). To detect sample heterogeneity, these were further compared with the other healthy sample datasets (H05-H32), which were generated in the laboratory with the same sample preparation and mass spectrometry analysis. To confirm the proteome shift observed in the COVID-19 patients, we also collected urine samples from two recovered patients (P1 and P6) (Figures 1A and S3).
As the sample size increased, the number of identified proteins in the control group grew quickly and then gradually saturated (Figure 2A). The peptide-to-protein ratio was 6.0 (Table S1), indicating high quality and reliability of our protein identification. To improve the accuracy for the COVID-19 and recovery samples, 2 technical repeats were measured for each sample. A total of 2656 proteins were identified from the 32 healthy control samples (Figure 2A and Table S1). In total, we identified and quantified 1380 and 1641 proteins in urine samples from the COVID-19 patients and the two recovered patients, respectively, which was significantly lower than in healthy controls ( Figure 2B Figure 2E ). To check whether SARS-CoV-2 proteins were present in the urine samples, we added SARS-CoV-2 protein sequences to the human proteome database, and no related proteins were identified.
To assess the quantitative variation and accuracy of the MS datasets, each urine sample from the COVID-19 patients and the respective recovery samples was measured in two technical repeats. The absolute quantitative iBAQ value was used for further comparison and analysis. The correlation coefficient (R²) of the two replicates for each sample was higher than 0.80 (Figure S4), indicating that the MS data in this study were acquired with a high degree of consistency and reproducibility. Owing to differences in sample size and handling during sample processing, we found significant quantitative variation among samples (Figure S5A). Therefore, the iBAQ values of each sample dataset were normalized to an equal median to reduce potential biases before quantitatively comparing the COVID-19 samples with the healthy ones (Figure S5B). We found that the correlation of samples within the healthy and recovery groups was higher than that between the healthy and patient groups (Figure S6). Patients and healthy people could be divided into two categories in our cluster analysis (Figure 3), indicating distinctive molecular characteristics of the healthy and COVID-19 conditions. Interestingly, the urine samples of the two recovered patients clustered with those of healthy people (Figure 3). We also found that normal control individual H5 and
To corroborate the distinct clusters identified in our cluster analysis, we also performed principal component analysis (PCA). The result showed that patients and healthy people were clearly divided into two groups (Figure S7). The two recovery samples, one from a mild patient (Recovery P1, RP1) and one from a severe patient (Recovery P6, RP6), were grouped with the healthy samples. Interestingly, patients P1 and P2, who belonged to the M-COVID group and had only hypertension as a comorbidity, were closer to the healthy controls.
We found that these two patients could be distinguished from the S-COVID patients and from the mild patient P3, who had both hypertension and diabetes. The mild patient P3 was incorrectly classified as severe (Figure S7), possibly because this female patient has diabetes as a comorbidity (Figure 1A). These results imply that urine proteomics analysis can serve as a potential auxiliary prediction tool to differentiate M-COVID from S-COVID patients.
[21, 22] . We also chose clusters 1 and 12 as the down-regulated filter to separate severe from mild COVID-19 (Figure 5C). These filtered proteins were highly associated with platelet degranulation, glucose metabolic process, protein metabolic process, and lipid metabolic and transport pathways. The molecular features used to distinguish the patient type (M and S) in our classifier (Figures 5B and 5D, Tables S4-5) contain several potential biomarkers that were highly associated with the clinical characteristics of mild and severe COVID-19. For example, hypoxia up-regulated protein 1 (HYOU1), which belongs to cluster 2, was more than three-fold higher in severe COVID-19 (Figure 5B). HYOU1 plays a pivotal role in cyto-protective cellular mechanisms triggered by oxygen deprivation; it is highly expressed in tissues such as liver and pancreas that contain well-developed endoplasmic reticulum, and it also regulates large amounts of secretory proteins [26, 27]. In patients with hypoxia, intravascular coagulation warrants more attention, for example elevated levels of D-dimer, a blood marker of excess clotting. It was reported that heparin could boost patients' low oxygen levels regardless of whether they were struggling to breathe [28]. In this study, we found that heparin cofactor 2 (SERPIND1), which belongs to cluster 10, was specifically up-regulated, more than four-fold in mild and two-fold in severe COVID-19 (Table S4).
SERPIND1, also known as heparin cofactor II, is a glycoprotein in human plasma that inhibits thrombin and chymotrypsin, and the rate of thrombin inhibition is rapidly increased by dermatan sulfate (DS), heparin (H) and glycosaminoglycans (GAG) [29, 30]. We speculate that the up-regulation of SERPIND1 could be a protective response that reduces the risk of excess intravascular coagulation in COVID-19 patients. We also found that cyclic AMP-responsive element-binding protein 3-like protein 3 (CREB3L3), which belongs to cluster 10 (Figure S8), was specifically up-regulated in M-COVID (Table S4). In the acute inflammatory response, CREB3L3, which is activated in response to cAMP stimulation, may activate the expression of acute phase response (APR) genes [31]. This might be a protective mechanism by which the body fights the virus.
For the down-regulated molecular clusters, changes in proteins related to platelet degranulation were also reported recently in serum proteomics studies [21, 22]. Additionally, the down-regulation of lipid metabolic and transport pathways in the COVID-19 patients caught our attention. Cholesterol homeostasis was reported to affect COVID-19 prognosis, virus entry and antiviral therapies [32]. In our data, lipid metabolism and transport, including cholesterol homeostasis, were down-regulated in S-COVID (Figure 5D and Table S5). The proteins NPC intracellular cholesterol transporter 2 (NPC2), apolipoprotein A1 (APOA1) and cubilin (CUBN) changed with similar trends (Figure 5D). These results indicate that lipoprotein-mediated cholesterol uptake and transport were disordered after SARS-CoV-2 infection. Our study suggests that more characteristic molecular changes at the protein level can be used to build a predictive filter for the prospective identification of severe cases and can shed light on COVID-19 pathophysiology.
The COVID-19 pandemic caused by SARS-CoV-2 is not only putting huge pressure on global healthcare, but also having a devastating impact on the economy and society. Although much effort has been devoted to COVID-19 diagnostics and treatment, the mortality of this infectious disease has not been significantly reduced because of the limited mechanistic understanding of its pathogenesis [2, 33]. Patients progressing to S-COVID often face very limited treatment options [8, 9]. Imaging technology such as CT has been widely used to diagnose COVID-19 patients, but it suffers from high cost and a demand for technical expertise. There is an urgent need for low-cost and reliable diagnostic techniques to estimate and predict the transition of patients from mild to severe COVID-19.
Urine is one of the most frequently studied biomaterials for biomarkers of human diseases in proteomics because of its accessibility. It is less complex than blood and has a relatively lower dynamic range, which poses fewer technical challenges [17, 22, 34-37]. Urinary proteomics is a powerful way to identify molecular groups that distinguish healthy controls, mild COVID-19 patients and severe COVID-19 patients. More robust and rapid approaches, such as targeted MS detection or multi-target microarrays, could then be developed to improve the speed and throughput of sample detection. Our study demonstrated that urine profiling could separate healthy controls from COVID-19 patients and could also distinguish recovered patients from COVID-19 patients. Specific
Altogether, our data demonstrate that a urine proteome-based study can reliably and sensitively differentiate COVID-19 patients from healthy people. It might serve as a powerful tool to help scientists and clinicians fight the COVID-19 pandemic.
Human urine proteomics samples were prepared as described previously with slight modification [23, 34-37].
Briefly, 1 mL urine samples were centrifuged at 2,000 g for 4 min to remove cell debris and then reduced with 5 mM dithiothreitol (DTT) at 56 ℃ for 30 min, a step that could also inactivate the virus. The treated samples were alkylated with 10 mM iodoacetamide in the dark at room temperature for 30 min. The supernatant was loaded into a 10 kDa ultrafiltration tube, and the larger molecular weight proteins (proteome) were separated from the endogenous peptides (peptidome) by centrifugation. Proteome samples were digested with trypsin at 37 ℃ for 14 h, and the digestion reaction was then terminated with 1% formic acid (FA). The digested peptides were desalted through a StageTip [38, 39] ions was set to 20 ppm. Full cleavage by trypsin was set and a maximum of two missed cleavages was allowed. Protein identifications had to meet the following criteria: (1) peptide length ≥ 7 amino acids; (2) FDR ≤ 1% at the PSM, peptide and protein levels. The peptides were quantified by the peak area derived from their MS1 intensity with MaxQuant software [40]. The intensity of unique and razor peptides was used to calculate the protein intensity. The intensity-based absolute quantification (iBAQ) algorithm was used to obtain the protein quantification value [41]. To exclude the influence of differences in sample size and loading amount for MS analysis, we used the median value of each sample to normalize the protein iBAQ values [42]. All missing values were substituted with the minimal value. The 1008 overlapping proteins were used for the subsequent statistical analysis. Pearson correlation analysis of all datasets was performed in Perseus [43]. Differential proteins were filtered using the R package limma (version 3.34.9). Significantly differentially expressed proteins were selected using the criteria of an adjusted p value less than 0.05 and a log2 FC larger than 1. Proteins were clustered into 16 significant discrete clusters using the R package Mfuzz (version 2.46.0).
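The normalization, imputation and filtering steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which used MaxQuant, Perseus and limma. All function and variable names here are hypothetical, and three points are assumptions rather than statements from the text: the common normalization target is taken as the median of the per-sample medians, the "minimal value" used for imputation is taken as the smallest observed value across all samples, and the fold-change cutoff is applied to the absolute log2 FC so that down-regulated proteins are retained. The adjusted p values and log2 FC are assumed to come from an external moderated test such as limma.

```python
from statistics import median


def median_normalize(samples):
    """Rescale each sample so that its median observed value equals a common target.

    `samples` maps sample name -> {protein: iBAQ value, or None if missing}.
    Assumption: the common target is the median of the per-sample medians.
    """
    medians = {
        name: median(v for v in proteins.values() if v is not None)
        for name, proteins in samples.items()
    }
    target = median(medians.values())
    return {
        name: {
            protein: None if value is None else value * target / medians[name]
            for protein, value in proteins.items()
        }
        for name, proteins in samples.items()
    }


def impute_minimum(samples):
    """Replace missing values with the smallest observed value.

    Assumption: the minimum is taken over all samples, not per sample.
    """
    observed = [
        v for proteins in samples.values() for v in proteins.values() if v is not None
    ]
    floor = min(observed)
    return {
        name: {p: floor if v is None else v for p, v in proteins.items()}
        for name, proteins in samples.items()
    }


def select_differential(stats, max_adj_p=0.05, min_abs_log2_fc=1.0):
    """Return proteins passing both cutoffs, sorted by name.

    `stats` maps protein -> (log2 fold change, adjusted p value).
    Assumption: the fold-change cutoff is applied to the absolute value.
    """
    return sorted(
        protein
        for protein, (log2_fc, adj_p) in stats.items()
        if adj_p < max_adj_p and abs(log2_fc) > min_abs_log2_fc
    )


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the study.
    toy = {
        "S1": {"A": 10.0, "B": 20.0, "C": 30.0},
        "S2": {"A": 100.0, "B": None, "C": 300.0},
    }
    print(impute_minimum(median_normalize(toy)))
```

Normalizing before imputing keeps the substituted minimum on the same scale as the normalized values; whether the original analysis used this order is not stated in the text.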
The function of the differential proteins, including tissue-specific enrichment, molecular function, biological process and cellular component, was analyzed on the DAVID Bioinformatics (https://david.ncifcrf.gov/) and Human Protein Atlas (http://www.proteinatlas.org/) platforms.
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the iProX partner repository [44]. The accession number for the mass spectrometry proteomics data reported in this paper is the iProX (https://www.iprox.org/) dataset identifier IPX0002166000. All the data will be publicly released upon publication.
A pneumonia outbreak associated with a new coronavirus of probable bat origin
A Novel Coronavirus from Patients with Pneumonia in China
A new coronavirus associated with human respiratory disease in China
The architecture of SARS-CoV-2 transcriptome
Liver injury in COVID-19: management and challenges
Autopsy in suspected COVID-19 cases
Multi-organ proteomic landscape of COVID-19 autopsies
Care for Critically Ill Patients With COVID-19
Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. The Lancet
Proof-of-Concept Workflow for Establishing Reference Intervals of Human Urine Proteome for Monitoring Physiological and Pathological Changes
Multi-Omics Resolves a Sharp Disease-State Shift between Mild and Moderate COVID-19
Urine proteomics for discovery of improved diagnostic markers of Kawasaki disease
A comprehensive analysis and annotation of human normal urinary proteome. Sci Rep
Proteomic and Metabolomic Characterization of COVID-19 Patient Sera. medRxiv
Immune suppression in the early stage of COVID-19 disease
Urine proteome profiling predicts lung cancer from control cases and other tumors. EBioMedicine
Rethinking IL-6 and CRP: Why they are more than inflammatory biomarkers, and why it matters
Hepatocyte nuclear factor 4alpha is implicated in endoplasmic reticulum stress-induced acute phase response by regulating expression of cyclic adenosine monophosphate responsive element binding protein H
Cholesterol Metabolism--Impact for SARS-CoV-2 Infection Prognosis, Entry, and Antiviral Therapies. medRxiv
Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus-Infected Pneumonia in
Exosomal proteins as potential diagnostic markers in advanced non-small cell lung carcinoma
Prospects for urinary proteomics: exosomes as a source of urinary biomarkers
Urine proteomics: the present and future of measuring urinary protein components in disease
A tool for biomarker discovery in the urinary proteome: a manually curated human and animal urine protein biomarker database
Systematic research on the pretreatment of peptides for quantitative proteomics using a C18 microcolumn
A rapid and easy protein N-terminal profiling strategy using (N-Succinimidyloxycarbonylmethyl)tris(2,4,6-trimethoxyphenyl)phosphonium bromide (TMPP) labeling and StageTip
MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification
Global quantification of mammalian gene expression control
The MaxQuant computational platform for mass spectrometry-based shotgun proteomics
The Perseus computational platform for comprehensive analysis of (prote)omics data
iProX: an integrated proteome resource. Nucleic acids research