key: cord-0845576-09madxcy
authors: Li, Na; An, Peng; Wang, Jifeng; Zhang, Tingting; Qing, Xiaoqing; Wu, Bowen; Sun, Lang; Ding, Xiang; Niu, Lili; Xie, Zhensheng; Zhang, Mengmeng; Guo, Xiaojing; Chen, Xiulan; Cai, Tanxi; Luo, Jianming; Wang, Fudi; Yang, Fuquan
title: Plasma proteome profiling combined with clinical and genetic features reveals the pathophysiological characteristics of β-thalassemia
date: 2022-03-16
journal: iScience
DOI: 10.1016/j.isci.2022.104091
sha: 0361de1c98690e7196c085cdbacc3dae60c7f847
doc_id: 845576
cord_uid: 09madxcy

The phenotype of β-thalassemia underlies multigene interactions, making clinical stratification complicated. An increasing number of genetic modifiers affecting the disease severity have been identified, but are still unable to meet the demand of precision diagnosis. Here, we systematically conducted a comparative plasma proteomic profiling on patients with β-thalassemia and healthy controls. Among 246 dysregulated proteins, 13 core protein signatures with excellent biomarker potential are proposed. The combination of proteome and patients' clinical data revealed patients with codons 41/42 -TTCT mutations have an elevated risk of higher iron burden, dysplasia, and osteoporosis than patients with other genotypes. Notably, 85 proteins correlating to fetal hemoglobin (Hb F) were identified, among which the abundance of 27 proteins may affect the transfusion burden in patients with β-thalassemia. The current study thus provides protein signatures as potential diagnostic biomarkers or therapeutic clues for β-thalassemia.
Highlights
246 dysregulated proteins are detected in plasma of patients with β-thalassemia
13 potential biomarkers and 27 proteins related to disease progression are found
Variations in plasma proteome reveal the disease pathophysiological characteristics
Codons 41/42 -TTCT carriers have higher ferritin levels compared to non-carriers

INTRODUCTION
β-thalassemia is one of the most common inherited single-gene diseases, with a high prevalence in tropical and sub-tropical areas (Marengo-Rowe, 2007; Peters et al., 2012). Epidemiological surveys indicate that β-thalassemia is highly prevalent in the provinces of southern China, with a rate of 6.43% in the Guangxi Zhuang Autonomous Region and 3.80% in Guangdong Province (Li et al., 2014; Xiong et al., 2010). β-thalassemia results from mutations in the β-globin gene (HBB gene), causing reduced (β++ and β+) or absent (β0) synthesis of the β-globin chain of hemoglobin. The relative excess of α-globin chains precipitates in the erythroid cell membrane, leading to ineffective erythropoiesis and hemolytic anemia. Subsequent clinical manifestations include splenomegaly, marrow expansion, extramedullary hemopoiesis, and bone marrow dilatation (Taher et al., 2018). In addition, excess iron from blood transfusion and ineffective erythropoiesis is deposited in major organs, causing cardiac dysfunction, liver fibrosis, diabetes, and neurological complications (Palmer et al., 2018; Taher and Saliba, 2017). The severity and clinical presentation of a patient with β-thalassemia are largely determined by the extent to which the mutation impairs the synthesis of the β-globin chain. Thus, β-thalassemia classification is based on the heterozygous or homozygous state of the mutation(s). β-thalassemia trait (TT) is usually asymptomatic, while homozygosity or compound heterozygosity for β-thalassemia mutations causes β-thalassemia intermedia (TI) and β-thalassemia major (TM) (Shang and Xu, 2017).
However, varying levels of disease severity can be observed in patients with the same genotype. This is because multiple factors contribute to the clinical severity of this disease, such as genetic modifiers involved in the production of fetal hemoglobin (Hb F) (Sripichai and Fucharoen, 2016) and coexisting α-thalassemia (Sripichai et al., 2008). Moreover, complications from standard treatment (i.e., blood transfusion, iron chelation therapy, and splenectomy) also affect the clinical signs. This poor correlation between genotype and phenotype prevents precise clinical stratification. Therefore, exploring the genotype-phenotype relationship and identifying disease biomarkers are of paramount importance for precise diagnosis and treatment of the disease.

Despite complicating the classification of β-thalassemia, certain genetic modifiers have been further studied as therapeutic targets for this disease. For example, BCL11A is a well-known repressor of Hb F. Frangoul et al. (2021) reported that the production of Hb F was successfully reactivated in one patient with β-thalassemia after receiving autologous CD34+ cells edited with CRISPR-Cas9 targeting the BCL11A enhancer. As increased Hb F is a well-characterized ameliorating factor in patients, identifying proteins potentially affected by Hb F levels may provide new therapeutic approaches for the disease.

To date, extensive research on β-thalassemia has mainly used genomic or transcriptomic methods. However, genetic or transcriptional information is not sufficient to predict the corresponding protein levels because of the poor correlation between mRNA and protein abundance (Greenbaum et al., 2003). Mass spectrometry (MS)-based proteomics has become a powerful approach for exploring biomarkers or pathophysiological characteristics in various diseases (Shen et al., 2020; Tewari et al., 2018; Wewer Albrechtsen et al., 2018).
Previous studies of the proteomic profile of patients with β-thalassemia reported dysregulated proteins associated with the key pathological conditions implicated in thalassemia, including oxidative stress (Ponnikorn et al., 2019), hemolysis (Kittivorapart et al., 2018), and the hypercoagulable state (Chanpeng et al., 2019), when compared with healthy individuals. Several proteins correlating with the disease, for example, haptoglobin, hemopexin, and cathepsin S, were proposed as potential clinically relevant biomarkers (Kittivorapart et al., 2018). In addition, although their links to thalassemia are not yet clear, the novel differentially expressed proteins identified may lead to a better understanding of the pathophysiology of thalassemia (Hatairaktham et al., 2013).

As differential constituents of plasma proteins can reflect the ongoing physiological and pathological state of an individual, we systematically carried out a deep plasma proteomic profiling study using tandem mass tag (TMT)-labeled quantitative proteomics to delineate protein molecular features in patients with β-thalassemia.

Overall synopsis of the plasma proteomic profiling of patients with β-thalassemia

Plasma samples from 20 patients with TM, 20 patients with TI, and 20 healthy controls were used for the quantitative proteomic analysis. The overall workflow of this study is shown in Figure S1A. To monitor the stability of the LC-MS/MS platform, 1 μg of a tryptic digest of 293T whole-cell lysates was analyzed between batches. Similar distributions of peptide spectrum matches (PSMs) across all quality control samples and strong positive correlations between samples indicated that the platform was stable enough for data acquisition of the TMT-labeled samples (Figures S1B and S1C). We then performed unsupervised principal component analysis (PCA) and hierarchical clustering of quantified proteins.
The samples clustered by clinical group rather than by TMT set, indicating no significant batch effect or measurable bias (Figures 1A and 1B, Figure S2). Although the patients with β-thalassemia were clearly separated from the controls by PCA and the heatmap, these measures could not statistically separate patients with TM from patients with TI. A strong correlation was noted between serum ferritin (SF) quantified by immunoassay methods and ferritin light chain (FTL) quantified by our proteomics (r = 0.91, p < 0.001). Transferrin receptor protein 1 (TFRC), measured by TMT-based proteomics, and serum TfR (sTfR), measured by immunoturbidimetry on a biochemistry analyzer, also exhibited a high degree of correlation (r = 0.96, p < 0.001) (Figure 1C).

In the multiplexed TMT-based quantitative proteomics, a total of 868 proteins were identified with high confidence, of which 592 proteins were quantified across all samples. Subsequent data analysis was performed on these 592 proteins and on clinical data collected from 337 recruited individuals (Table S1). First, we calculated the protein fold changes in pairwise comparisons between experimental groups and checked their correlation between TMT sets. Even with inter-individual proteomic heterogeneity, positive correlations indicated consistent proteomic features in patients with β-thalassemia and similar proteomic profiles in patients with TI and TM (Figures S3A and S3B). Next, prior to differential protein analysis, we analyzed the contribution of quantitative variances from TMT sets and experimental groups. Quantitative variances across TMT sets were smaller than those attributable to experimental groups (Figure S3D), indicating negligible batch effects. Therefore, one-way ANOVA was used for comparison between experimental groups.
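As an illustration of the one-way ANOVA step, the following is a minimal sketch for a single protein measured in three groups. It is not the authors' pipeline: the function name and the abundance values are hypothetical, and the closed-form p value used here is valid only when exactly three groups are compared (two numerator degrees of freedom). In practice a statistics library routine such as `scipy.stats.f_oneway` would normally be used instead.

```python
from statistics import fmean


def one_way_anova_three_groups(a, b, c):
    """Return (F, p) for a one-way ANOVA across exactly three groups.

    With three groups the numerator has 2 degrees of freedom, so the upper
    tail of the F distribution has the closed form
    (d2 / (d2 + 2 * F)) ** (d2 / 2), where d2 is the within-group df.
    """
    groups = [list(a), list(b), list(c)]
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    means = [fmean(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1            # equals 2 for three groups
    df_within = len(all_values) - len(groups)

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    p_value = (df_within / (df_within + 2.0 * f_stat)) ** (df_within / 2.0)
    return f_stat, p_value


# Hypothetical relative abundances of one protein (illustration only).
control = [1.0, 1.1, 0.9, 1.0]
ti = [1.9, 2.1, 2.0, 2.2]
tm = [2.4, 2.6, 2.5, 2.3]
f_stat, p_value = one_way_anova_three_groups(control, ti, tm)
print(f"F = {f_stat:.2f}, p = {p_value:.2e}")
```

A protein passing the ANOVA would then go to a post-hoc pairwise comparison (the figure legends mention Tukey's HSD), which this sketch does not cover.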
Of 280 significantly altered proteins in the three groups (p < 0.05 and fold change >2 or <0.85), a comparison of patients with TM and healthy controls showed 193 upregulated proteins and 39 downregulated proteins in patients with TM. Similarly, a total of 199 upregulated and 41 downregulated proteins were identified in patients with TI versus healthy controls. However, only slight changes in protein abundance were found when comparing TM and TI (Table S2). Interestingly, a large number of proteins were upregulated in patients with β-thalassemia when compared with healthy controls (Figure S3C).

The plasma proteome can represent relevant characteristics of diseases (Pernemalm et al., 2019; Sun et al., 2019; Tewari et al., 2018; Wewer Albrechtsen et al., 2018). To explore the pathways or protein processes altered by β-thalassemia, we conducted comparative enrichment map analysis on the differential protein profiles in TM vs. control and TI vs. control (Table S3). The significantly enriched gene sets were organized into higher-level modules based on shared components to minimize redundant gene-set information (Figure 2A). By capturing the entire enriched results of the gene set enrichment analysis (GSEA), we found the direction of change of gene sets in TM vs. control and TI vs. control to be strikingly similar. In total, there were 115 overlapping gene sets (over 50%) significantly enriched in the two comparisons, further supporting the similarity of altered pathways and protein processes between TM and TI (Figure S4A). Compared with healthy controls, patients with TM and patients with TI both showed dozens of upregulated functional modules. For instance, driven by C-C motif chemokine 18/14 (CCL18/CCL14), platelet basic protein (PPBP), and platelet factor 4 (PF4), abnormal granulocyte chemotaxis was enriched.
In addition, platelet functions were prominently affected, including platelet activation, platelet aggregation, platelet degranulation, and response to elevated platelet cytosolic Ca2+. Platelet activation (Fayed et al., 2018) and aggregation (Winichagoon et al., 1981) observed in patients with β-thalassemia are risk factors for hypercoagulopathy, resulting in thromboembolic events (Cappellini et al., 2012; Chanpeng et al., 2019). In the above-mentioned platelet modules, we also observed a significant upregulation of pleckstrin (PLEK), whose intronic polymorphism is an independent genetic risk factor for venous thromboembolism (Lindstrom et al., 2019). Some other modules, such as regulation of nervous system development and synapse, consisting of 14-3-3 protein zeta/delta (YWHAZ), 14-3-3 protein epsilon (YWHAE), and amyloid-beta precursor protein (APP), pointed to the underlying neurodegenerative disorders linked to iron accumulation in the brain (Manfre et al., 1999) (Figure 2B).

Downregulated gene sets in patients included the triglyceride and neutral lipid metabolic process, steroid metabolic process, and scavenging of heme from plasma. The downregulated phosphatidylinositol-glycan-specific phospholipase D (GPLD1), apolipoprotein C-I (APOC1), and catalase (CAT) are the leading proteins in the triglyceride and neutral lipid metabolic process. Impaired lipid metabolism in patients with β-thalassemia is further supported by the higher triglyceride (TG) levels found in patients with β-thalassemia compared with healthy controls (Figure S4B). Significant downregulation of hemopexin (HPX), HBB, hemoglobin subunit alpha (HBA1), apolipoprotein L1 (APOL1), and apolipoprotein A-I (APOA1) was identified in patients with β-thalassemia, contributing to the enrichment of the scavenging of heme from plasma pathway (Figure 2C).
Heme is a toxic and proinflammatory molecule that activates the Toll-like receptor signaling pathway, thus inducing the production of reactive oxygen species and inflammatory cytokines (Belcher et al., 2014). As the main plasma scavenger of heme, HPX transports heme to the liver to be broken down. The deficiency of HPX reflects the severity of hemolysis in thalassemia (Smith and McCulloh, 2015).

Figure 2. Enrichment map analysis of altered pathways and protein processes
(A) The significantly enriched pathways and protein processes of TM vs. control or TI vs. control. Each gene set is a node whose size is scaled to the number of genes in the gene set; edge thickness is scaled to the degree of overlap between gene sets. Quadrants delineate the comparison groups, in which the color is mapped to the normalized enrichment scores (NES). Red refers to the upregulated gene sets, whereas blue refers to the downregulated gene sets.
(B) The shared proteins in upregulated gene-set modules including platelet function, synapse, and nervous system.
(C) The shared proteins in downregulated gene-set modules including the triglyceride and neutral lipid metabolic process and scavenging of heme from plasma and transporter activity. (*p < 0.05). Statistical analysis was performed using ANOVA, followed by post-hoc analysis using TukeyHSD at a 95% confidence level.
See also Figure S4 and Table S3.

iScience 25, 104091, April 15, 2022

Differential proteomic analysis identifies potential diagnostic biomarkers of β-thalassemia

As mentioned above, patients with TM and TI have quite similar plasma proteomic profiles. The discrepancies in differential proteins between TM and TI were small; however, the majority of these proteins were directly related to the pathophysiology of the disease, such as hemoglobin subunit gamma-2 (HBG2), HBA1, and HBB (Figure 1E).
Therefore, differential protein analysis was concentrated on the comparison between patients with β-thalassemia and healthy controls to discover potential diagnostic biomarkers of β-thalassemia. Significant dysregulation was observed in 246 proteins, accounting for approximately 40% of all quantified proteins (Figure 3A, Table S4). As a result of cell death or damage, tissue proteins are proposed to leak into the blood. To explore the sources of these altered proteins, they were queried using PaGenBase (Pan et al., 2013), a database repository for the collection of tissue- and time-specific pattern genes (Figure 3B). As plasma proteins are largely secreted by the liver (Anderson and Anderson, 2002), it is no surprise that the liver was the most significant tissue source for differential proteins. In addition, proteins from muscle, spleen, adipocytes, cardiac myocytes, bone marrow, and CD34+ cells were also enriched. This observation is consistent with current knowledge about organ damage brought on by iron deposition and ineffective erythropoiesis in β-thalassemia (Taher et al., 2021).

The differentially expressed proteins between patients with β-thalassemia and healthy individuals then underwent a protein-protein interaction enrichment analysis using Metascape (Zhou et al., 2019) (Figure S5). Densely connected protein complexes detected by the Molecular Complex Detection (MCODE) clustering algorithm (Bader and Hogue, 2003) were also highly enriched in the pathways or processes revealed by the enrichment map analysis shown in Figure S5. These include the complement system, immune cell response, platelet degranulation, and actin-myosin-associated movement. Moreover, glycolysis and gluconeogenesis, post-translational protein phosphorylation, and regulation of IGF transport and uptake by IGFBPs were markedly enriched.

To further screen the most discriminant proteins, a stricter criterion (p value < 0.05 and fold change >2 or <0.7) was used.
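The stricter criterion can be read as a filter on two per-protein statistics. The sketch below uses the cutoffs stated in the text; everything else is an assumption made for illustration. The function names, the protein identifiers, and the numbers are hypothetical, and fold change is assumed to be the ratio of mean patient abundance to mean control abundance.

```python
from statistics import fmean


def fold_change(patient_values, control_values):
    """Ratio of mean abundance in patients to mean abundance in controls
    (an assumed definition; the text does not spell it out here)."""
    return fmean(patient_values) / fmean(control_values)


def select_discriminant(stats, p_cutoff=0.05, fc_up=2.0, fc_down=0.7):
    """Return {protein: 'up' | 'down'} for proteins passing both cutoffs.

    `stats` maps a protein name to a (fold_change, p_value) pair.
    """
    selected = {}
    for name, (fc, p_value) in stats.items():
        if p_value >= p_cutoff:
            continue  # not significant, regardless of effect size
        if fc > fc_up:
            selected[name] = "up"
        elif fc < fc_down:
            selected[name] = "down"
    return selected


# Hypothetical proteins and statistics (not taken from the study).
example_stats = {
    "PROT_A": (3.1, 0.001),  # significant and strongly increased
    "PROT_B": (0.5, 0.020),  # significant and strongly decreased
    "PROT_C": (1.4, 0.001),  # significant but small effect
    "PROT_D": (2.5, 0.200),  # large effect but not significant
}
print(select_discriminant(example_stats))
```

Requiring both conditions keeps proteins whose change is statistically supported and also large enough to matter for a diagnostic readout.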
A total of 36 proteins were identified, which were divided into two main clusters by network interaction analysis (Table S5). One cluster, annotated by gene ontology (GO), corresponds to platelet degranulation and the actin filament-based process; the other cluster relates to iron metabolism (Figure 3C). Furthermore, we assessed the differential proteins that changed consistently across all six TMT sets by processing the raw data of each set separately to avoid batch effects. At the general significance levels (p value < 0.05 and fold change >2 or <0.7), the intersection of differential proteins in the six TMT sets resulted in the identification of 30 proteins. The core signature of 13 proteins was identified by combining the stricter criterion with the consistently changed results from the separate analyses, as demonstrated in Figure 3D.

With respect to clinical application, diagnostic biomarker candidates should meet the requirements of sensitivity and specificity. Thus, receiver operating characteristic (ROC) curve analysis was performed for the most discriminant proteins, and the corresponding area under the curve (AUC) values were calculated. With the highest AUC values, FTL, ferritin heavy chain (FTH1), cathepsin S (CTSS), and platelet-activating factor acetylhydrolase (PLA2G7) were identified as having the best diagnostic potential (Figure 3E). FTL and FTH1 are the light and heavy chains of ferritin, respectively. The increase of ferritin directly reflects the iron overload status in patients with β-thalassemia. CTSS is a member of the cysteine cathepsin family. It acts as a lysosomal protease, which promotes the degradation of damaged or harmful proteins via the lysosome. The fourth protein, PLA2G7, is reported to be associated with cardiovascular disease risk (Jensen et al., 2014).

Global correlation analysis was performed to discover co-regulated protein functions and the links between altered proteins and patients' clinical data.
Three high correlation squares (S1, S2, and S3) with different enriched pathways are shown in Figure 4A (Table S6). Four of these pathways were significantly upregulated in the GSEA (Figure 4B), including the lysosome, regulation of actin cytoskeleton, focal adhesion, and hypertrophic cardiomyopathy pathways. The upregulated hypertrophic cardiomyopathy pathway in S3 recapitulates iron-overload-induced cardiomyopathy in β-thalassemia, which is the leading cause of mortality in patients with β-thalassemia (Shah et al., 2019). Interestingly, compared with healthy controls, the lysosome pathway is upregulated more significantly in TM than in TI (Figure 4C). Here, iron overload in β-thalassemia causes large amounts of iron to accumulate in the lysosome, which may jeopardize lysosomal membrane integrity (Terman and Kurz, 2013). In addition, one major cause of ineffective erythropoiesis is accelerated apoptosis, a pathway involving the lysosomal degradation of cytoplasmic components (Rund and Rachmilewitz, 2005). Therefore, the lysosome pathway may be directly related to the severity of the mutations.

We observed that ferritin in S1 correlated negatively with high-density lipoprotein cholesterol (HDL-C). Thus, we analyzed the relationship between ferritin and lipid profiles in 170 patients with β-thalassemia. HDL-C showed a significant negative correlation with ferritin after adjusting for age and body mass index (r = −0.64), while there was no correlation between ferritin and TG or between ferritin and total cholesterol (T-CHO) (Figure 4D, Figure S6). [...] Figure S8). These results indicate that patients with TM with codons 41/42 -TTCT have a significantly higher iron burden than those with other mutations. Likewise, even though TT does not require blood transfusion, such carriers may require closer monitoring of their iron content. We also comparatively analyzed the differential proteins between codons 41/42 -TTCT mutation carriers and non-carriers.
These altered proteins were mainly enriched in three significant functional components: the regulation of IGF transport and uptake by IGFBPs, the lysosome, and the low-density lipoprotein particle (Figure S9A). Among them, the proteins associated with the insulin-like ternary complex were all downregulated (Figure S9B), while the proteins associated with the lysosome were all upregulated.

Plasma proteins related to the fetal hemoglobin level and the progression of β-thalassemia

Increased Hb F levels in patients with β-thalassemia can compensate for the loss of β-globin, reduce the redundant α-globin chain in red blood cells, and ameliorate the ineffective erythropoiesis and anemia. To identify proteins potentially affected by Hb F levels, three methods were used to distinguish proteins associated with Hb F levels. For the first and second methods, patients were categorized into high Hb F and low Hb F groups according to the upper and lower quartiles of Hb F levels (first method) or Hb F levels ≥10 g/dL and <10 g/dL (second method). The second cutoff was chosen because Hb F levels ≥10 g/dL have been reported to alleviate transfusion burden in patients (CRISPR gene therapy trial on hold, 2018). Comparative proteome analyses were then applied, and the numbers of differential proteins identified using the two methods were 34 and 19, respectively. In the third method, Hb F levels were directly correlated with plasma proteins after adjusting for disease status, and 50 proteins were significantly associated with Hb F level (Figure 5A). In total, 85 proteins related to Hb F levels were identified using the three methods (Figure 5B).

We then wanted to determine whether these proteins are associated with the progression of β-thalassemia. The age at first transfusion represents the severity of ineffective hematopoiesis (Danjou et al., 2012).
Based on the age at first transfusion, the cumulative incidence rates of patients with β-thalassemia receiving transfusion therapy (referred to as the cumulative transfusion rate here) were compared between patients with higher and lower levels of Hb F-related proteins, as shown in Figure 5B. Kaplan-Meier curve analyses showed that the levels of 27 proteins were associated with cumulative transfusion rates, with these proteins relating to the C1q complex, gas transport, and myeloid leukocyte activation (Figures 5C and 5D). Using a more stringent significance level (p value ≤ 0.01), Figure 5E shows that higher abundance of some proteins may alleviate the transfusion burden of patients with β-thalassemia, including L-lactate dehydrogenase B chain (LDHB), C1QC, galectin-3-binding protein (LGALS3BP), C1QB, TFRC, heat shock 70 kDa protein 1B (HSPA1B), C1QA, and carbonic anhydrase 2 (CA2); conversely, lower abundance of FTH1, ectonucleoside triphosphate diphosphohydrolase 5 (ENTPD5), or FTL was potentially associated with alleviated transfusion burden. These proteins displayed higher significance levels than HBG2 (the subunit of Hb F; p value = 0.04) with respect to the cumulative transfusion rate.

Exploration of proteins closely associated with β-thalassemia phenotypes can illuminate the underlying mechanisms involved in disease pathogenesis, ultimately improving the diagnosis and clinical management of β-thalassemia. To this end, our study utilized a TMT-based quantitative proteomics approach to characterize the plasma differences between 40 patients with β-thalassemia and 20 healthy controls. With the combination of high-abundance protein removal and prefractionation techniques, we achieved an extensive identification of the plasma proteome and detected a plethora of dysregulated proteins.
Of the 246 differential proteins detected between patients with β-thalassemia and healthy controls, 13 core protein signatures exhibited significant biomarker potential for β-thalassemia diagnosis (Table S7), most of which have direct or indirect links to β-thalassemia. For instance, the upregulation of TFRC, lactotransferrin (LTF), FTH1, and FTL is directly related to iron overload and the impaired hemoglobin synthesis in patients. Myeloperoxidase (MPO) is an oxidative stress biomarker, and upregulation of neutrophil gelatinase-associated lipocalin (LCN2) was suspected to decrease ROS or iron in β-thalassemia (Roudkenar et al., 2008). Excess heme produced by hemolysis in β-thalassemia depletes endogenous HPX (Muller-Eberhard et al., 1968), the most prominently downregulated protein in this study. Administration of exogenous HPX has been shown to prevent heme-iron loading in the cardiovascular system (Vinchi et al., 2013) and revert the heme-induced proinflammatory state of macrophages (Vinchi et al., 2016). Importantly, significant changes in the abundance of HPX and CTSS were also observed in thalassemic plasma extracellular vesicles by Kittivorapart et al. (2018), and elevated plasma proteins, including PLA2G7, CCL18, and LCN2, have been reported separately in other comparative studies of β-thalassemia (Dimitriou et al., 2005; Tselepis et al., 2010), reinforcing their potential as clinical biomarkers in plasma. However, Gumus et al. (2016) did not find a significant difference in plasma MMP9 levels between patients with β-thalassemia and healthy individuals. Notably, this is the first report of elevated multiple inositol polyphosphate phosphatase 1 (MINPP1) in the plasma of patients with β-thalassemia. As MINPP1 is involved in pontocerebellar hypoplasia (Appelhof et al., 2020; Ucuncu et al., 2020), it may be a modulator of neurodegenerative syndrome in β-thalassemia.
The diverse and complex clinical manifestations observed in patients with β-thalassemia prompted us to further investigate the perturbed pathways based on the previous identification of differential proteins. Our analysis of differential protein profiles using an enrichment map revealed a series of pathways affected by β-thalassemia, the majority of which are confirmed or proposed to be related to the disease. It is worth mentioning that upregulated pathways such as myeloid leukocyte differentiation, regulation of DNA binding transcription factor activity, and the cell cycle may be caused by the compensatory increase in hematopoiesis. We also observed that the synapse and regulation of nervous system development modules were upregulated in patients, which implies a pathophysiological change in the nervous system. Hemoglobin was reported to be expressed in neurons (Biagioli et al., 2009). The impaired production of hemoglobin and iron deposition in the nervous system may be involved in neurodegenerative diseases such as Alzheimer disease (Altinoz et al., 2019; Bagwe-Parab and Kaur, 2020).

Global correlation analysis found that upregulation of the lysosome pathway increases stepwise from healthy controls to patients with TI to patients with TM. Concurrently, hydrolytic enzymes and other proteins in the lysosome (cathepsin B, cathepsin C, cathepsin D, cathepsin S, cathepsin H, cathepsin Z, NPC intracellular cholesterol transporter 2, lysosome-associated membrane glycoprotein 2, and prosaposin) showed the same trend and high detection accuracy (Figure S10). This suggests that these lysosomal proteins may serve as indicators of the severity of β-thalassemia. A possible explanation for the correlation of the lysosome with disease severity is iron accumulation in the intracellular compartments. Transferrin-bound or ferritin-bound iron can be transported into the lysosome for recycling (Li et al., 2010).
Accumulation of large amounts of iron in the lysosome yields highly reactive oxygen species (ROS), which may impair lysosomal activity. Impaired lysosomal activity by ROS accumulation has also been reported to induce ferroptotic cell death triggered by HSP90, along with HSC70 and Lamp-2a. In our plasma proteomics, increased levels of HSP90 and Lamp-2a proteins underline an important role of the lysosome in β-thalassemia (Table S2). In addition, hypertrophic cardiomyopathy, a highly lethal complication of iron overload, was enriched in high correlation square S3.

Figure 5 legend (panels B to E)
(B) Venn diagram of identified proteins in the three methods. In total, 85 proteins were found to be related to Hb F level.
(C) Proteins significantly associated with the cumulative incidence rate of patients with β-thalassemia receiving transfusion therapy. For Kaplan-Meier curve analysis, each patient was classified into high- or low-abundance subgroups based on the median protein abundance. The log rank test was used to assess the statistical significance of differences between subgroups, and the corresponding p value is scaled to the color. Up: proteins for which higher levels in patients correspond to a later first transfusion. Down: proteins for which lower levels in patients correspond to a later first transfusion.
(D) Bar graph of enriched terms in Metascape across proteins related to cumulative transfusion rates.
(E) Kaplan-Meier curve analysis presenting significant differences in overall age at first transfusion between the high-abundance and low-abundance groups (high-abundance group: patients with protein abundance > median; low-abundance group: patients with protein abundance < median). HBG2 and proteins with p < 0.01 are shown. The statistical analysis was performed using the log rank test.
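The Kaplan-Meier comparison described in the Figure 5 legend (median split on protein abundance, followed by a log rank test on age at first transfusion) can be sketched as follows. This is an illustrative reimplementation using only the Python standard library, not the authors' code: the function names and the toy data are hypothetical, and a dedicated survival analysis package would normally be preferred.

```python
import math
from statistics import median


def split_by_median(abundance):
    """Indices of patients above (high) and below (low) the median abundance."""
    m = median(abundance)
    high = [i for i, v in enumerate(abundance) if v > m]
    low = [i for i, v in enumerate(abundance) if v < m]
    return high, low


def cumulative_transfusion_rate(ages, transfused):
    """Kaplan-Meier based cumulative incidence, 1 - S(t), at each event age.

    `transfused[i]` is False when patient i is censored (not yet transfused).
    """
    curve, free_fraction, at_risk = [], 1.0, len(ages)
    for t in sorted(set(ages)):
        leaving = [e for a, e in zip(ages, transfused) if a == t]
        events = sum(1 for e in leaving if e)
        if events:
            free_fraction *= 1.0 - events / at_risk
            curve.append((t, 1.0 - free_fraction))
        at_risk -= len(leaving)
    return curve


def log_rank(ages_a, events_a, ages_b, events_b):
    """Two-group log rank test; returns (chi-square, p) with 1 degree of freedom."""
    ages_a, ages_b = list(ages_a), list(ages_b)
    events_a, events_b = list(events_a), list(events_b)
    event_ages = sorted({t for t, e in zip(ages_a + ages_b, events_a + events_b) if e})
    observed_minus_expected, variance = 0.0, 0.0
    for t in event_ages:
        n_a = sum(1 for x in ages_a if x >= t)   # at risk in group A
        n_b = sum(1 for x in ages_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in zip(ages_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(ages_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))   # chi-square upper tail, 1 df
    return chi2, p_value


# Toy data (illustration only): age at first transfusion in years.
low_group_ages, low_group_events = [1, 2, 3, 4], [True, True, True, True]
high_group_ages, high_group_events = [5, 6, 7, 8], [True, True, True, True]
chi2, p_value = log_rank(low_group_ages, low_group_events,
                         high_group_ages, high_group_events)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Patients whose abundance equals the median fall into neither group, which matches the strict inequalities given in the legend.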
In contrast to the upregulated signaling pathways, the triglyceride and neutral lipid metabolic pathway was significantly downregulated (Figure 2C). This implies a dysregulation of lipid metabolism, as abnormal TG, HDL-C, and low-density lipoprotein cholesterol (LDL-C) levels were seen in patients (Figure S4B). Moreover, high levels of plasma lipids and lipid peroxidation were commonly found in previous studies investigating the lipid profiles of patients with β-thalassemia (Madani et al., 2011; Nasr et al., 2008; Rachmilewitz et al., 1976). The observation of altered lipid metabolic pathways in the plasma further validates our proteomic strategy.

Although patients with TM and TI displayed some obvious distinctions in their clinical manifestations, hematological parameters, and complications (Karimi et al., 2014), we observed a minimal difference between patients with TM and TI in their plasma proteome. The reasons may be as follows: first, TI and TM share the same pathological mechanisms in the erythroid cell; second, TI and TM were categorized based on genotype, but the disease severity of patients with the same genotype is always affected by genetic or environmental modifiers. However, several pronounced changes were still identified between patients with TI and TM, especially in proteins directly related to disease pathophysiology. For example, the incremental abundances of FTL and FTH1 from controls to TI to TM directly support the clinical distinction between TI and TM by transfusion dependence and degree of ineffective erythropoiesis. Moreover, the incremental abundances of lysosome-associated proteins among the three groups were also apparent, indicating a continuous rather than discrete spectrum of disease variation from TI to TM (Figure S10).
Despite the minimal difference between TI and TM in their plasma proteome, we noticed that ferritin levels of codons 41/42 -TTCT carriers are higher than those of non-carriers, regardless of the type of β-thalassemia (TM, TI, or TT). This is consistent with evidence that more severe clinical symptoms are observed in patients with β0-thalassemia with either the homozygous or heterozygous codons 41/42 -TTCT mutation (Laosombat et al., 2001). It also reinforces that TT with the codons 41/42 -TTCT mutation may require stricter iron status monitoring than other β-thalassemia genotypes. Differential proteins between codons 41/42 -TTCT carriers and non-carriers also highlighted the downregulation of the insulin-like ternary complex, including IGF1, IGFBP3 (insulin-like growth factor-binding protein 3), IGFALS (insulin-like growth factor-binding protein complex acid labile subunit), and IGFBP5 (Figure S9). As reported previously, patients with β-thalassemia with growth retardation had lower IGF1 and IGFBP3 levels than those without growth retardation (Wu et al., 2003). Furthermore, ferritin levels have been reported to be significantly higher in patients with β-thalassemia with IGF1 at less than 0.1 percentile compared to patients with normal IGF1 (Roth et al., 1997). IGF1 is a major effector of bone development, and IGFBPs mediate the bioavailability of IGFs (Clemmons, 2018). Disruption of both the IGF1 and IGFALS genes in mice caused a dramatic retardation in growth (Yakar et al., 2002). In addition, a positive correlation between bone mineral density and the levels of IGF-1 and IGFBP3 in patients with β-thalassemia was observed (Lasco et al., 2002; Soliman et al., 1998). Therefore, patients with the codons 41/42 -TTCT mutation are more likely to develop iron overload than patients with other mutations. This in turn leads to an increased risk of growth retardation, dysplasia, and osteoporosis.
Patients with the codons 41/42 -TTCT mutation thus also need their growth and development monitored. To the best of our knowledge, this is the first time these associations have been reported in patients with β-thalassemia carrying the codons 41/42 -TTCT mutation.
Plasma proteins associated with Hb F levels could potentially be utilized as therapeutic targets for β-thalassemia. Our analysis revealed 85 candidate proteins. Furthermore, cumulative transfusion rate analyses in patients suggested that 27 of these proteins are potentially related to the progression of β-thalassemia (Table S7), such as LDHB, C1QC, and LGALS3BP (Figure 5C). For example, LGALS3BP is a negative regulator of the NF-κB signaling pathway (Hong et al., 2019). Evidence indicates that NF-κB plays a crucial role in erythropoiesis (Jeong et al., 2011; Zhang et al., 1998), while the induction of Hb F by decitabine may occur through the suppression of NF-κB activity (Theodorou et al., 2020). Therefore, increased LGALS3BP protein expression may suggest restricted NF-κB activity and increased Hb F production in the erythroid cells. Further work is required to delineate the roles of these proteins in β-thalassemia. To our knowledge, this is the first report showing significant associations between these proteins and patients' cumulative transfusion rates.
In summary, this study revealed protein signatures potentially involved in the pathophysiology of β-thalassemia by deep plasma proteomics coupled with genetic and clinical information. We further discovered plasma proteins that correlate with Hb F levels and could affect the transfusion burden in patients with β-thalassemia. These proteins can be further explored as potential diagnostic biomarkers or therapeutic targets for β-thalassemia.
iScience 25, 104091, April 15, 2022
Owing to the relatively small number of participants, the diagnostic value of the 13 core protein signatures awaits further verification in larger independent populations. In addition, co-inheritance of β-thalassemia and other hemoglobinopathies such as α-thalassemia complicates clinical manifestations. Whether these proteins could be used in such conditions needs to be further explored. Another important finding of this study is that codons 41/42 -TTCT carriers have higher ferritin levels, implying a higher risk or level of iron overload. However, the lack of long-term follow-up studies limits the assessment of growth and development in these carriers. Finally, functional investigations of the proteins related to Hb F levels and cumulative transfusion rates are needed.
Detailed methods are provided in the online version of this paper and include the following:
This paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
Patients and healthy volunteers were recruited at The First Affiliated Hospital of Guangxi Medical University (Guangxi Zhuang Autonomous Region, China). Written informed consent was provided by all participants. For patients under the age of 18 years, consent was provided by the parents or legal guardian. Two weeks after receiving their last blood transfusion, patients provided a blood sample and completed a questionnaire covering the time of first transfusion, treatment conditions, and other medical conditions. A total of 286 β-thalassemia patients (mean ± SD, 23.3 ± 14.6 years) and 51 healthy controls (8.7 ± 2.1 years) were recruited for this study. Of these, 20 TI (β+β+/β+β0, 9.2 ± 2.7 years), 20 TM (β0β0, 8.9 ± 2.3 years), and 20 age- and gender-matched healthy controls (9.9 ± 2.3 years) were enrolled for the TMT-based plasma proteomic study. The detailed clinical information of these participants is listed in Table S1.
The patients included in the plasma proteomic study had all received blood transfusions and iron chelation therapy, and none had undergone splenectomy. Whole blood was collected into an EDTA Vacutainer and centrifuged at 3,000 × g for 10 min. Plasma harvested from the supernatant was stored at −80 °C until use. This study was designed and conducted in accordance with the Declaration of Helsinki. The Ethics Committee of The First Affiliated Hospital of Guangxi Medical University approved this study.
β-thalassemia genotyping
Hematological and serum iron parameters were measured from fasting blood samples collected two weeks after the most recent blood transfusion. Hematological values were measured using a CELL-DYN automated hematology analyzer (Abbott Diagnostics). Fetal hemoglobin (Hb F) was measured using high-performance liquid chromatography (VARIANT II, Bio-Rad). Serum ferritin (SF) was measured using an electrochemiluminescence-based immunoassay (Cobas e601, Roche). Serum soluble transferrin receptor (sTfR) was measured by immunoturbidimetry using an automatic biochemistry analyzer (Abbott C8000).
To remove the top 14 high-abundance proteins, 10 μL of plasma was loaded on High-Select™ Top14 Abundant Protein Depletion Mini Spin Columns (Thermo Fisher Scientific, USA) according to the manufacturer's protocols. The plasma and resin slurry was incubated in the column with gentle end-over-end mixing for 10 min at room temperature. After incubation, the mini column was placed into a 2 mL collection tube and centrifuged at 1,000 × g for 2 min to collect the low-abundance components. The collected sample was concentrated and buffer exchanged with 8 M urea/100 mM NH4HCO3 using a Millipore Amicon® Ultra-0.5 3 kDa MWCO filter according to the manufacturer's protocol. The protein concentration was quantified using a BCA Protein Assay Kit (Thermo Fisher Scientific, USA).
For digestion, the concentrated proteins were reduced with 20 mM DTT (Sigma-Aldrich) at 37 °C for 1 h, followed by alkylation with 40 mM iodoacetamide (IAM, Sigma-Aldrich) for 30 min at room temperature in the dark. The urea in the sample solution was then diluted to less than 2 M with 50 mM NH4HCO3. The sample was digested with Lys-C at an enzyme:protein ratio of 1:50 (w/w) for 2 h at 37 °C and further digested at 37 °C overnight with trypsin at a trypsin:protein ratio of 1:50 (w/w).
TMT 10-plex labeling
Desalted peptides were resuspended in 100 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES, pH 8.5). TMT 10-plex reagents (0.4 mg) in acetonitrile were then added to 30 μg of peptides to a final acetonitrile concentration of ~30% (v/v) and incubated for 1 h at room temperature. The reaction was finally quenched with hydroxylamine to a final concentration of 0.1% (v/v). To test the labeling efficiency and optimize the mixing ratio, 2 μg of TMT-labeled peptides from each sample were pooled together. After the labeling check, the remaining labeled peptides were mixed at a ratio of 1:1:1:1:1:1:1:1:1:1 and vacuum centrifuged to dryness. The pooled peptide sample was then dissolved in 0.1% FA/H2O, subjected to HLB solid-phase desalting, and dried.
Prior to subsequent liquid chromatography-tandem mass spectrometry (LC-MS/MS) processing, peptides were dissolved in buffer A (2% ACN/98% H2O, pH = 10) and fractionated using an XBridge peptide BEH C18 column (130 Å, 3.5 μm, 2.1 mm × 150 mm, Waters) on a Rigol L-3000 HPLC system with a binary buffer system. The gradient of buffer B (98% ACN/2% H2O, pH = 10) was set as follows: 5-8% for 5 min, 8-18% for 35 min, 18-32% for 22 min, 32-95% for 2 min, and 95% for 4 min at a constant flow rate of 0.23 mL/min. All fractions were collected at 90 s intervals and concatenated into 12 post-fractions as described (Li et al., 2016).
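The concatenation step can be illustrated with a short sketch. It assumes an interleaved pooling scheme in which collected fraction i is assigned to pool i mod 12. The text only states that fractions were concatenated into 12 post-fractions as described by Li et al. (2016), so the scheme, the fraction count (derived here from the 68 min gradient at 90 s per fraction), and the function name `concatenate_fractions` are all assumptions made for illustration.

```python
def concatenate_fractions(n_fractions, n_pools=12):
    """Assign collected fractions to pools in an interleaved fashion.

    Assumption: fraction i (0-based) goes to pool i % n_pools. The paper
    does not spell out the scheme; this is one common choice.
    """
    pools = [[] for _ in range(n_pools)]
    for i in range(n_fractions):
        pools[i % n_pools].append(i + 1)  # store 1-based fraction numbers
    return pools


# Gradient segments in minutes (5 + 35 + 22 + 2 + 4), one fraction per 90 s.
gradient_min = 5 + 35 + 22 + 2 + 4
n_fractions = gradient_min * 60 // 90  # whole fractions only (assumption)
pools = concatenate_fractions(n_fractions)
```

Interleaving of this kind places early-, mid-, and late-eluting peptides in each pool, which tends to spread peptides more evenly across the subsequent low-pH LC-MS/MS runs.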
The combined fractions were lyophilized and stored at −80 °C for further LC-MS/MS analysis. All mass spectrometry data were collected on a Q Exactive mass spectrometer (Thermo Fisher Scientific) coupled with an EASY-nLC 1000 HPLC system (Thermo Fisher Scientific). Each sample was resuspended in 0.1% FA, loaded onto an in-house packed C18 trap column (100 μm × 2 cm, Reprosil-Pur C18 AQ, 5 μm, Dr. Maisch GmbH), and separated on a PFSelf-Pack C18 analytical column (75 μm × 20 cm, Reprosil-Pur C18 AQ, 3 μm, Dr. Maisch GmbH) with mobile phase A (0.1% FA) and mobile phase B (ACN/0.1% FA) using the following gradient: 4-9% B, 5 min; 9-22% B, 80 min; 22-32% B, 25 min; 32-90% B, 1 min; 90% B, 9 min at a constant flow rate of 310 nL/min. Full scan spectra were acquired at a resolution of 70,000 (m/z 200), and the dynamic exclusion time was 50 s. The AGC value was 3e6 and the maximum fill time was 60 ms in full scan mode. MS/MS spectra were acquired at AGC 5e4 with a maximum injection time of 80 ms. The collision energy of HCD was 32%.
Raw MS/MS data were searched against the UniProt human protein database (released July 13, 2018), combined with commonly observed contaminants (245 sequences), in Proteome Discoverer 2.2 (PD 2.2, Thermo Scientific) using the Sequest HT search engine for protein identification. Reporter ion intensities were corrected according to the TMT reporter ion isotope distribution specified by the manufacturer. Full tryptic digestion was specified, and up to two missed cleavages were allowed during the database search. Mass tolerances for precursor and fragment ions were 10 ppm and 0.02 Da, respectively. Cysteine carbamidomethylation (+57.021) and TMT labeling of lysine and the peptide N-terminus (+229.163) were specified as fixed modifications, while methionine oxidation (+15.995) and N-terminal protein acetylation (+42.011) were selected as variable modifications.
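For orientation, a relative tolerance in ppm corresponds to an absolute m/z window that scales with the precursor m/z, whereas a fragment tolerance in Da is fixed. The sketch below shows only that arithmetic. The function names and the example m/z values are illustrative assumptions and are not part of the search engine's interface.

```python
def ppm_window(mz, tol_ppm=10.0):
    """Return the (low, high) m/z bounds for a relative tolerance in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """Check whether the observed m/z lies within tol_ppm of the theoretical m/z."""
    error_ppm = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return abs(error_ppm) <= tol_ppm


# Hypothetical precursor at m/z 500: 10 ppm corresponds to +/- 0.005 m/z units.
low, high = ppm_window(500.0)
```

At m/z 1000 the same 10 ppm setting corresponds to ±0.01, which is why relative tolerances are the usual choice for high-resolution precursor scans.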
The false discovery rate (FDR) was determined by searching a reverse-sequence decoy database and was set to 1% at both the protein and peptide levels, with a minimum peptide length of 7 amino acids. Proteins matching contaminants were filtered out. To control for inconsistencies in sample loading, the protein abundance was normalized by setting the Normalization Mode parameter in PD 2.2 to total peptide amount. The average normalized abundance of each protein was then scaled to 100 to enable comparison across experiments. The scaled protein abundances were used in the following bioinformatics analysis.
The data were processed using R (version 3.6.2). Log2-transformed protein abundances were used for statistical tests. A two-way analysis of variance (ANOVA) was performed to assess the effects of TMT sets and experimental groups on the abundance of each protein. The resulting p value was adjusted using the Benjamini-Hochberg method, and an adjusted p value < 0.05 was considered significant. The magnitude of the studied effects was estimated by partial eta squared (partial η²). Differential protein analyses were performed using a one-way ANOVA for comparison of three groups (TM, TI, and control) and a two-sample t-test for comparison of two groups (β-thalassemia patients and healthy controls). Tests for two-sample comparisons were two-sided. For multiple comparisons among the three groups, Tukey's Honest Significant Difference (TukeyHSD) test was used for the post-hoc analysis. At the general significance level, proteins with p value < 0.05 and fold change >1.2 or <0.85 were considered up- or downregulated. The stricter criterion was p value < 0.05 and fold change >2 or <0.7. The global correlation analysis (Wewer Albrechtsen et al., 2018) was performed with the corrplot package. Functional enrichment analysis of proteins in high correlation squares was performed in STRING (version 11.0) (Szklarczyk et al., 2019).
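The multiple-testing adjustment and the fold-change thresholds described above can be sketched as follows. This is a plain-Python illustration rather than the R code used in the study. The function names are hypothetical, and because the text does not state whether raw or adjusted p values were combined with the fold-change cutoffs, `classify` simply takes whichever p value the caller supplies.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(fold_change, p_value, up=1.2, down=0.85, alpha=0.05):
    """Label a protein using the general thresholds quoted in the text."""
    if p_value >= alpha:
        return "unchanged"
    if fold_change > up:
        return "up"
    if fold_change < down:
        return "down"
    return "unchanged"


# Toy p values (not study data), adjusted and returned in input order.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Passing `up=2` and `down=0.7` to `classify` reproduces the stricter criterion mentioned in the same paragraph.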
For Gene Set Enrichment Analysis (GSEA), all quantified proteins were ranked by log2(fold change) and analyzed using the GSEA (V4.1.0) software (Subramanian et al., 2007). Differential proteins identified in the comparison of β-thalassemia patients and healthy controls were submitted to Metascape (version 3.5) (Zhou et al., 2019) for tissue-specific enrichment analysis using the PaGenBase database and for protein-protein interaction enrichment analysis using the Molecular Complex Detection (MCODE) clustering algorithm. Network interaction and functional enrichment analyses were performed with STRING or Metascape. Pearson's correlation coefficient was used to measure the strength of a linear association between two variables. The receiver operating characteristic (ROC) curve analysis was performed with the pROC package (Robin et al., 2011). The relationship between the time of patients' first transfusion and proteins associated with Hb F was estimated by Kaplan-Meier curves using the survival and survminer packages.
For enrichment map analysis, ranked proteins in TM vs. control and TI vs. control were separately analyzed using the GSEA (V4.1.0) software (Subramanian et al., 2007). The enriched gene-sets with p value < 0.05 and FDR < 0.25 were visualized using the EnrichmentMap 3.3 plug-in of Cytoscape (Merico et al., 2010), where nodes represented gene-sets and colors represented Normalized Enrichment Scores (NES) (red: positive; blue: negative). Gene-sets were grouped and scaled according to shared components (edge >0.5) and their similarity (size).
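The Pearson correlation and ROC analyses mentioned above can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the pROC-based workflow used in the study. The function names are hypothetical, and the AUC is computed under the assumption that higher values indicate patients.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs in which the case scores higher.

    Ties count as half a win. Assumes higher scores indicate cases.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to no discrimination between groups and 1.0 to perfect separation, which is the scale on which candidate biomarkers are usually compared.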
To determine the effect of Hb F at the protein level, the following linear regression model was used:
$$\text{Protein level} \sim \alpha + \beta_1 \times \text{Hb F level} + \beta_2 \times \text{group} + \varepsilon$$
The human plasma proteome: history, character, and diagnostic prospects Pontocerebellar hypoplasia due to biallelic variants in MINPP1 An automated method for finding molecular complexes in large protein interaction networks Molecular targets and therapeutic interventions for iron induced neurodegeneration Heme triggers TLR4 signaling leading to endothelial cell activation and vaso-occlusion in murine sickle cell disease Unexpected expression of alpha- and beta-globin in mesencephalic dopaminergic neurons and glial cells Hypercoagulability in beta-thalassemia: a status quo Platelet proteome reveals specific proteins associated with platelet activation and the hypercoagulable state in beta-thalassmia/HbE patients Role of IGF-binding proteins in regulating IGF responses to changes in metabolism CRISPR gene therapy trial on hold Genetic modifiers of beta-thalassemia and clinical severity as assessed by age at first transfusion Elevated plasma chemokine CCL18/PARC in beta-thalassemia Study of platelet activation, hypercoagulable state, and the association with pulmonary hypertension in children with beta-thalassemia CRISPR-Cas9 gene editing for sickle cell disease and beta-thalassemia Comparing protein abundance and mRNA expression levels on a genomic scale Association of thalassemia major and gingival inflammation: a pilot study Differential plasma proteome profiles of mild versus severe beta-thalassemia/Hb E Gal-3BP negatively regulates NF-kappaB signaling by inhibiting the activation of TAK1 Novel metabolic biomarkers of cardiovascular disease Resveratrol ameliorates TNFalpha-mediated suppression of erythropoiesis in human CD34(+) cells via modulation of NF-kappaB signalling Guidelines for diagnosis and management of beta-thalassemia intermedia Quantitative proteomics of plasma vesicles identify
novel biomarkers for hemoglobin E/beta-thalassemic patients IthaGenes: an interactive database for haemoglobin variations and epidemiology Clinical and hematologic features of beta0-thalassemia (frameshift 41/42 mutation) in Thai patients Binding and uptake of H-ferritin are mediated by human transferrin receptor-1 High prevalence of thalassemia in migrant populations in Guangdong Province Identification and characterization of 293T cellderived exosomes by profiling the protein, mRNA and MicroRNA components Genomic and transcriptomic association studies identify 16 novel susceptibility loci for venous thromboembolism iProX: an integrated proteome resource Plasma lipids and lipoproteins in children and young adults with major betathalassemia from western Iran: influence of genotype MR imaging of the brain: findings in asymptomatic patients with thalassemia intermedia and sickle cellthalassemia disease The thalassemias and related disorders Enrichment map: a networkbased method for gene-set enrichment visualization and interpretation Plasma concentrations of hemopexin, haptoglobin and heme in patients with various hemolytic diseases Plasma lipid profile and lipid peroxidation in beta-thalassemic children Diagnosis and management of genetic iron overload disorders PaGenBase: a pattern gene database for the global and dynamic understanding of gene function In-depth human plasma proteome analysis captures tissue proteins and transfer of protein variants across the placenta Diagnosis and management of thalassaemia Comparative proteome-wide analysis of bone marrow microenvironment of beta-thalassemia Lipid membrane peroxidation in betathalassemia major pROC: an open-source package for R and S+ to analyze and compare ROC curves Short stature and failure of pubertal development in thalassaemia major: evidence for hypothalamic neurosecretory dysfunction of growth hormone secretion and defective pituitary gonadotropin secretion Upregulation of neutrophil gelatinase-associated 
lipocalin, NGAL/Lcn2, in beta-thalassemia patients Betathalassemia Challenges of blood transfusions in beta-thalassemia Update in the genetics of thalassemia: what clinicians need to know Proteomic and metabolomic characterization of COVID-19 patient sera Hemopexin and haptoglobin: allies against heme toxicity from hemoglobin not contenders Bone mineral density in prepubertal children with betathalassemia: correlation with growth and hormonal data Fetal hemoglobin regulation in beta-thalassemia: heterogeneity, modifiers and therapeutic approaches Coinheritance of the different copy numbers of alpha-globin gene modifies severity of beta-thalassemia/Hb E disease GSEA-P: a desktop application for gene set enrichment analysis Circulating proteomic panels for diagnosis and risk stratification of acute-on-chronic liver failure in patients with viral hepatitis B STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets Iron overload in thalassemia: different organs at different rates beta-thalassemias 4885. 
sickle cell anemia and silent cerebral infarction Proteomic studies for the investigation of gamma-globin induction by decitabine in human primary erythroid progenitor cultures Plasma levels of lipoprotein-associated phospholipase A(2) are increased in patients with beta-thalassemia MINPP1 prevents intracellular accumulation of the chelator inositol hexakisphosphate and is mutated in Pontocerebellar Hypoplasia Hemopexin therapy improves cardiovascular function by preventing heme-induced endothelial toxicity in mouse models of hemolytic diseases Hemopexin therapy reverts heme-induced proinflammatory phenotypic switching of macrophages in a mouse model of sickle cell disease Plasma proteome profiling reveals dynamics of inflammatory and lipid homeostasis markers after roux-En-Y gastric bypass surgery Increased circulating platelet aggregates in thalassaemia Growth hormone (GH) deficiency in patients with beta-thalassemia major and the efficacy of recombinant GH treatment Chaperone-mediated autophagy is involved in the execution of ferroptosis Molecular epidemiological survey of haemoglobinopathies in the Guangxi Zhuang Autonomous Region of Southern China Circulating levels of IGF-1 directly regulate bone growth and density NF-kappaB transcription factors are involved in normal erythropoiesis Metascape provides a biologist-oriented resource for the analysis of systems-level datasets
This study did not generate new unique reagents. The mass spectrometry data have been deposited to the ProteomeXchange Consortium via the iProX partner repository. Accession number is listed in the key resources