key: cord-0832093-nifz133q
authors: Shen, Bo; Yi, Xiao; Sun, Yaoting; Bi, Xiaojie; Du, Juping; Zhang, Chao; Quan, Sheng; Zhang, Fangfei; Sun, Rui; Qian, Liujia; Ge, Weigang; Liu, Wei; Liang, Shuang; Chen, Hao; Zhang, Ying; Li, Jun; Xu, Jiaqin; He, Zebao; Chen, Baofu; Wang, Jing; Yan, Haixi; Zheng, Yufen; Wang, Donglian; Zhu, Jiansheng; Kong, Ziqing; Kang, Zhouyang; Liang, Xiao; Ding, Xuan; Ruan, Guan; Xiang, Nan; Cai, Xue; Gao, Huanhuan; Li, Lu; Li, Sainan; Xiao, Qi; Lu, Tian; Zhu, Yi Judy; Liu, Huafen; Chen, Haixiao; Guo, Tiannan
title: Proteomic and Metabolomic Characterization of COVID-19 Patient Sera
date: 2020-04-07
journal: nan
DOI: 10.1101/2020.04.07.20054585
sha: 8695528795dd02723afd17afba6e768d08372aba
doc_id: 832093
cord_uid: nifz133q

Severe COVID-19 patients account for most of the mortality of this disease. Early detection and effective treatment of severe patients remain major challenges. Here, we performed proteomic and metabolomic profiling of sera from 46 COVID-19 and 53 control individuals. We then trained a machine learning model using proteomic and metabolomic measurements from a training cohort of 18 non-severe and 13 severe patients. The model correctly classified severe patients with an accuracy of 93.5%, and was further validated using ten independent patients, seven of whom were correctly classified. We identified molecular changes in the sera of COVID-19 patients implicating dysregulation of macrophage, platelet degranulation and complement system pathways, and massive metabolic suppression. This study shows that it is possible to predict progression to severe COVID-19 disease using serum protein and metabolite biomarkers. Our data also uncovered molecular pathophysiology of COVID-19 with potential for developing anti-viral therapies.

were performed at later disease stages.
We used the stable isotope-labeled proteomics strategy TMTpro (16plex) (Li

Identification of severe patients using machine learning
We next investigated the possibility of classifying the severe COVID-19 patients based on the molecular signatures of proteins and metabolites (Table S2). We built a random forest machine learning model based on proteomic

the reason for the incorrect classification is unclear.

We then tested the model on an independent cohort of ten patients (Figure 2E, Table S3). All severe patients were correctly identified, except one

previously been reported in COVID-19 (Liang et al., 2020). SAA1 was reported to be elevated in severe SARS patients, but was not specific to SARS-CoV (Pang et al., 2006). As a major contributor to the acute phase response, the complement system plays a crucial role in eliminating invading pathogens in the early stage of infection. Among those APPs, two proteins belong to the complement membrane attack complex, including complement 6

We also observed an accumulation of mannose and its derivatives in severe patients. In the complement system, binding of mannose to lectin leads to cleavage of C2 and C4, which then form a C3 convertase to promote complement activation (Ricklin et al., 2010).

Suppressed platelet degranulation in severe sera
Fifteen of 17 proteins involved in platelet degranulation were down-regulated in SARS-CoV-2 infected patients, which may be associated with the thrombocytopenia observed in this patient cohort (Zheng et al., 2020). Low platelet count is also reported to be associated with severe COVID-19 and

2.07-fold (p = 1.86e-04) and 3.31-fold (p = 9.07e-07), respectively.

We also detected low levels of fatty acids such as arachidonate and

medRxiv preprint doi: https://doi.org/10.1101/2020.04.07.20054585. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Seven patients were correctly classified in the independent validation cohort containing ten patients. Two of them could be explained by the patients' complex comorbidity and medication history. The relatively small sample size necessitates future validation studies in independent cohorts.

Our data shed light on the molecular changes reflected in COVID-19 sera, which could potentially yield critical diagnostic markers or therapeutic targets for managing severe COVID-19 patients (Figure 5). These molecular

Our proteomic data showed that proteins related to platelet degranulation

Jiang, Y., Zhao, G., Song, N., Li, P., Chen, Y., Guo, Y., Li, J., Du, L., Jiang, S., Guo, R., et al. (2018). Blockade of the C5a-C5aR axis alleviates lung damage in hDPP4-transgenic mice infected with

lung injury of influenza pneumonitis. Am J Pathol 179, 199-210.

Nie, S., Zhao, X., Zhao, K., Zhang, Z., Zhang, Z., and Zhang, Z. (2020). Metabolic disturbances and inflammatory dysfunction predict severity of coronavirus disease 2019 (COVID-19): a retrospective study.
medRxiv, 2020.2003.2024.

Zou, Z., Yang, Y., Chen, J., Xin, S., Zhang, W., Zhou, X., Mao, Y., Hu, L., Liu, D., Chang, B., et al. (2004).

Table S1. Patients labeled in red (Y-axis) indicate chronic infection with viruses including HBV. (B) Study design for machine learning-based classifier development for severe COVID-19 patients. We first procured samples in a training cohort for proteomic and metabolomic analysis. The classifier was then validated in an independent cohort.

Public Health Medical Center between January 23 and February 4, 2020. They

peptides' N-termini. Precursor ion mass tolerance was set to 20 ppm, and product ion mass tolerance was set to 0.06 Da. The peptide-spectrum match allowed 1% target false discovery rate (FDR) (strict) and 5% target FDR (relaxed). Normalization was performed against the total peptide amount. The other parameters followed the default setup.

Quality control of proteomic data
The quality of proteomic data was ensured at multiple levels.
First, a mouse liver digest was used for instrument performance evaluation. We also ran water samples (buffer A) as blanks every 4 injections to avoid carry-over.

The mobile solutions used in the gradient elution were water and methanol

After raw data pre-processing, peak finding/alignment, and peak annotation using in-house software, metabolites were identified by searching an in-house library containing more than 3,300 standards, with library data entries generated from running purified compound standards through the

Metabolites and therapeutic compounds with missing ratios over 80% in a particular patient group were removed from the metabolomics dataset

Hochberg correction. The statistically significantly changed proteins or metabolites were selected using the criteria of an adjusted p value less than 0.05 and an absolute log2 FC larger than 0.25. From the training cohort, the important features were selected with a mean decrease in accuracy larger than 3, using a random forest containing a thousand trees built with the R package randomForest (version 4.6.14)

random forest analysis with 10-fold cross-validation
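The differential-expression criteria above (adjusted p value below 0.05 and absolute log2 FC above 0.25) can be illustrated with a short sketch. This is not the authors' pipeline, which used R; it is a minimal Python illustration that assumes the truncated "Hochberg correction" refers to the Benjamini-Hochberg procedure. The function names `benjamini_hochberg` and `select_features`, and all example values, are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the input order.

    Assumption: the correction named in the text is Benjamini-Hochberg.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted
    # values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_features(names, p_values, log2_fc, alpha=0.05, min_abs_log2_fc=0.25):
    """Keep features with adjusted p < alpha and |log2 FC| > min_abs_log2_fc."""
    adjusted = benjamini_hochberg(p_values)
    return [
        name
        for name, q, fc in zip(names, adjusted, log2_fc)
        if q < alpha and abs(fc) > min_abs_log2_fc
    ]


if __name__ == "__main__":
    # Hypothetical molecules and statistics, for illustration only.
    names = ["A", "B", "C", "D"]
    p_values = [0.01, 0.04, 0.03, 0.005]
    log2_fc = [0.5, -0.1, -0.3, 0.2]
    print(benjamini_hochberg(p_values))
    print(select_features(names, p_values, log2_fc))
```

Both thresholds are exposed as keyword arguments so that the cut-offs quoted in the text (0.05 and 0.25) stay visible and easy to change.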
inactivated and sterilized at 56℃ for 30 min, and processed as described previously, with some modifications. Five μL of serum from each specimen was lysed in 50 μL lysis buffer (8 M urea in 100 mM triethylammonium bicarbonate, TEAB) at 32℃ for 30 min. The lysates were reduced with 10 mM tris(2-carboxyethyl)phosphine (TCEP) at 32℃, then alkylated for 45 min with 40 mM iodoacetamide (IAA) in darkness at room temperature (25℃). The protein extracts were diluted with 200 μL 100 mM TEAB, and digested by double-step trypsinization, each step with an enzyme-to-substrate ratio of 1:20, at 32℃ for 60 min. The reaction was stopped by adding 30 µL of 10% (v/v) trifluoroacetic acid (TFA). Digested peptides were cleaned up with SOLAμ (…, USA) following the manufacturer's instructions, and labeled with TMTpro 16plex label reagents (…, USA) as described previously. The TMT samples were fractionated using a nanoflow DIONEX UltiMate 3000 RSLCnano System (…, USA) with an XBridge Peptide BEH C18 column. The samples were separated using a gradient from 5% to 35% acetonitrile (ACN) in 10 mM ammonia (pH 10.0) at a flow rate of 1 mL/min. Peptides were separated into 120 fractions. The fractions were subsequently dried and re-dissolved in 2% ACN/0.1% formic acid (FA). The re-dissolved peptides were analyzed by LC-MS/MS with the same LC system coupled to a Q Exactive (…, USA) in data-dependent acquisition (DDA) mode. For each acquisition, peptides were loaded onto a precolumn (3 µm, 100 Å, 20 mm × 75 µm i.d.)
at a flow rate of 6 μL/min for 4 min and then injected using a 35 min LC gradient (from 5% to 28% buffer B) at a flow rate of 300 nL/min. Buffer A was 2% ACN, 98% H2O containing 0.1% FA, and buffer B was 98%

The m/z range of MS1 was 350-1,800, with a resolution of 60,000 (at 200 m/z), an AGC target of 3e6, and a maximum ion injection time (max IT) of 50 ms. The top 15 precursors were selected for MS/MS, with a resolution at 45