key: cord-0021641-vp7cvbs9
authors: Liu, Hui; Zhang, Luming; Xu, Fengshuo; Li, Shaojin; Wang, Zichen; Han, Didi; Zhang, Feng; Lyu, Jun; Yin, Haiyan
title: Establishment of a prognostic model for patients with sepsis based on SOFA: a retrospective cohort study
date: 2021-09-29
journal: J Int Med Res
DOI: 10.1177/03000605211044892
sha: 5ded513e341b4d3e7d676a45206d6f3377638c7b
doc_id: 21641
cord_uid: vp7cvbs9

OBJECTIVE: To construct a nomogram based on the Sequential Organ Failure Assessment (SOFA) that is more accurate in predicting 30-, 60-, and 90-day mortality risk in patients with sepsis.
METHODS: Data from patients with sepsis were retrospectively collected from the Medical Information Mart for Intensive Care (MIMIC) database. Included patients were randomly divided into training and validation cohorts. Variables were selected using a backward stepwise selection method with Cox regression, then used to construct a prognostic nomogram. The nomogram was compared with the SOFA model using the concordance index (C-index), area under the time-dependent receiver operating characteristics curve (AUC), net reclassification improvement (NRI), integrated discrimination improvement (IDI), calibration plotting, and decision-curve analysis (DCA).
RESULTS: A total of 5240 patients were included in the study. Patient age, SOFA score, metastatic cancer, SpO2, lactate, body temperature, albumin, and red blood cell distribution width were included in the nomogram. The C-index, AUC, NRI, IDI, and DCA of the nomogram showed that it performs better than the SOFA alone.
CONCLUSION: A nomogram was established that performed better than the SOFA in predicting 30-, 60-, and 90-day mortality risk in patients with sepsis.

According to the third international consensus definitions for sepsis and septic shock (Sepsis-3), 1 sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection. 
In the clinic, organ dysfunction can be indicated by an increase in the Sequential Organ Failure Assessment (SOFA) score of 2 points or more. Sepsis has a very high morbidity and mortality rate, and more than 30 million people worldwide are estimated to be affected by sepsis each year, which may relate to 6 million deaths annually. 2 Early sepsis prognostic assessment is as critical as early sepsis diagnosis, as it may lead to greater vigilance among medical workers and the provision of timely and appropriate treatment for the patient. Several scores are widely used for such clinical situations, including the SOFA score, Logistic Organ Dysfunction Score (LODS), 3 Acute Physiology and Chronic Health Evaluation (APACHE) II score, and Simplified Acute Physiology Score (SAPS). 4 The SOFA score, in particular, is demonstrated to be a valuable tool for predicting short-term mortality in patients with sepsis, 3,5 but is not without limitations and cannot replace conventional evaluation indicators (such as procalcitonin and lactate clearance rate) to evaluate the prognosis of patients with sepsis. 6 Lactate displays similar ability to SOFA in discriminating the mortality of patients with sepsis. 7 Furthermore, research by Zhang et al. 8 into the mortality of sepsis has included the development and verification of a new scoring system based on machine learning, that may replace the SOFA score to more effectively predict 30-day mortality of intensive care unit (ICU) patients with sepsis. 9 A nomogram is a graphical tool based on a statistical prediction model that is used to determine the probability that a single clinical event may occur in a patient. 10 Nomograms combine several risk factors to make an accurate prediction, and are widely used in the clinic. 11 There are numerous predictive models for sepsis, including a nomogram to predict 30-day mortality in patients with septic encephalopathy, 12 and a nomogram to predict the risk of sepsis in patients with cholangitis. 
13 The purpose of the present study was to construct a new nomogram based on the SOFA score, that is more suitable for patients with sepsis, and to discuss its value in predicting the risk of mortality in such patients. The Medical Information Mart for Intensive Care (MIMIC)-III database was established in 2003 at Beth Israel Deaconess Medical Centre and Massachusetts General Hospital and the Massachusetts Institute of Technology (MIT) , with funding from the National Institutes of Health (NIH). 14 Version 1.4 of the MIMIC-III database was used in the current study, which covers data obtained between 2001 and 2012 from more than 58 000 patient hospitalizations at Beth Israel Deaconess Medical Centre, including 38 645 adult patients and 7875 neonatal patients. 15 The database provides a large amount of real data that can be utilized in clinical research and comprises information related to patients admitted to critical care units at a large tertiary care hospital. Data include vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more. All data can be extracted in the SQL language for further analysis. Personnel involved in the current research participated in a series of courses provided by the NIH and obtained authorization to access the MIMIC-III database after passing the required assessment (certificate No. 38601114). The present retrospective study analysed data from a third-party anonymized, publicly available database (MIMIC-III), with pre-existing institutional review board approval. Patient information in the database is anonymous, thus, informed consent for the study was not required. The reporting of this study conforms to STROBE guidelines. 
16 International Classification of Diseases (ICD)-9 codes 99591, 99592, and 78552 were used to extract data from the MIMIC-III database for patients diagnosed with sepsis, severe sepsis, and septic shock, for subsequent retrospective analyses. Exclusion criteria were as follows: (1) patients aged <18 years; (2) patients with an ICU stay of <24 h (to ensure sufficient data for evaluation); and (3) patients with SOFA scores <2 (as the database comprises data collected between 2001 and 2012, before the Sepsis-3 criteria were available for diagnosis). In patients with ≥2 admissions to the ICU, data from the first ICU admission only were extracted. All data were extracted using SQL for further analysis. The hadm_id (hospital admission id) variable for each included patient was used to extract the following information from the MIMIC-III database: sex; age; SOFA score; continuous renal replacement therapy use; first care unit (surgical ICU, trauma surgical ICU, medical ICU [MICU], coronary care unit, or cardiac surgery recovery unit); comorbidities, namely, congestive heart failure, cardiac arrhythmia, renal failure, liver disease, metastatic cancer (MC), diabetes, coagulopathy, fluid and electrolyte disorders, and blood loss anaemia; laboratory tests, namely, white blood cell count (WBC), neutrophil percentage (NET), red blood cell distribution width (RDW), haematocrit, sodium, potassium, albumin, and lactate levels, and blood pH; and vital signs, comprising heart rate, respiratory rate, body temperature, and SpO2. All of the above information and data were extracted for the first 24 h of ICU stay. Categorical variables are presented as frequency and percentage values, and the χ² test or Fisher's exact test was used to determine differences between cohorts. Continuous variables were assessed for normality of distribution with the Shapiro-Wilk test, and are presented as mean ± SD or median (interquartile range) depending on normality of distribution. 
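The inclusion and exclusion criteria described above can be sketched as a simple filter. This is a minimal illustration in Python rather than the SQL used in the study; apart from `hadm_id` and the three ICD-9 codes, every name (`IcuStay`, `icu_admission_rank`, `select_cohort`, and the other fields) is hypothetical and does not reflect the actual MIMIC-III schema. The order in which the criteria are applied is also an assumption.

```python
from dataclasses import dataclass

# ICD-9 codes named in the text (sepsis, severe sepsis, septic shock).
SEPSIS_ICD9 = frozenset({"99591", "99592", "78552"})


@dataclass(frozen=True)
class IcuStay:
    """One ICU stay; every field name except hadm_id is hypothetical."""
    hadm_id: int
    icd9_codes: frozenset
    age_years: float
    icu_hours: float
    sofa: int
    icu_admission_rank: int  # 1 = the patient's first ICU admission


def select_cohort(stays):
    """Return the stays that pass the inclusion and exclusion criteria."""
    kept = []
    for stay in stays:
        if not (stay.icd9_codes & SEPSIS_ICD9):
            continue  # no sepsis-related diagnosis code
        if stay.age_years < 18:
            continue  # exclusion 1: aged <18 years
        if stay.icu_hours < 24:
            continue  # exclusion 2: ICU stay of <24 h
        if stay.sofa < 2:
            continue  # exclusion 3: SOFA score <2
        if stay.icu_admission_rank != 1:
            continue  # keep the first ICU admission only
        kept.append(stay)
    return kept


# Toy records, invented purely for illustration.
example = [
    IcuStay(1, frozenset({"99592"}), 67.0, 72.0, 6, 1),   # kept
    IcuStay(2, frozenset({"99591"}), 16.0, 48.0, 5, 1),   # too young
    IcuStay(3, frozenset({"78552"}), 70.0, 12.0, 8, 1),   # stay too short
    IcuStay(4, frozenset({"4280"}), 55.0, 96.0, 4, 1),    # no sepsis code
    IcuStay(5, frozenset({"99592"}), 60.0, 50.0, 1, 1),   # SOFA <2
    IcuStay(6, frozenset({"99592"}), 67.0, 30.0, 7, 2),   # not first admission
]
cohort_ids = [stay.hadm_id for stay in select_cohort(example)]
```

In this toy example only the first record satisfies all criteria.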
Multivariate Cox regression was used to select variables for plotting the 30-, 60-, and 90-day survival curves of the patients. The survival-probability nomogram was constructed using Cox regression. Data were analysed using SPSS, version 24.0 (IBM, Armonk, NY, USA) and R software, version 4.0.2 (https://www.r-project.org/), and a P value <0.05 in a two-sided test was considered statistically significant. Effectiveness of the nomogram was evaluated by comparing the nomogram and SOFA models using the concordance index (C-index) and the area under the receiver operating characteristic curve (AUC). 17 The improvement in predictive accuracy was quantified by calculating the integrated discrimination improvement (IDI) and the net reclassification improvement (NRI). 18, 19 Calibration plots were used to evaluate consistency between the survival predicted by the nomogram and the observed outcome. Finally, the net clinical benefit of the predictive model developed in the present study was assessed using decision-curve analysis (DCA). 20 Data from a total of 7770 patients diagnosed with sepsis, severe sepsis and septic shock between 2001 and 2012 were extracted from the MIMIC-III database. After exclusion of patients who did not meet the study criteria, a total of 5240 patients were included in the study (Figure 1). For laboratory examination results, variables with >20% missing values were omitted, and the remaining missing data were filled using multiple imputation. Patients were randomly assigned to the training cohort (70%, n = 3667) and validation cohort (30%, n = 1573) for constructing and validating the nomogram, respectively (Table 1). The median SOFA score of the entire patient cohort was 6; median ages in the training and validation cohorts were 68 and 67 years, respectively; most patients were male (55.7% and 54.9%); the MICU was the most common first care unit (69.2% and 69.9%); and fluid and electrolyte disorder was the most common comorbidity (54.8% and 53.6%). 
Moreover, most patients in both groups had a WBC count between 10 and 40 k/ml (61.5% and 61.7%); the NET was mainly >70% (78.6% and 78.2%); median SpO2 values were 97.14% and 97.19%, respectively; and median albumin levels were 2.90 and 2.80 mg/dl, respectively. In both cohorts, the median lactate level was 2.20 mmol/l, median RDW was 15.40%, the most common sodium level was between 130 and 149 k/ml, the most common potassium level was between 3.5 and 5.6 k/ml, the most common PaCO2 value was between 36 and 45 mmHg, the most common heart rate was between 60 and 100 beats/min, the most common respiratory rate was between 20 and 30 breaths/min, and the most common body temperature was between 36 and 37.2 °C (all summarised in Table 1). Multifactorial Cox regression analysis was performed with data from the training cohort to control for confounding factors (Table 2). After performing a comprehensive evaluation of the variables and then applying Occam's razor (the simplest explanation is preferable to one that is more complex), the variables included in the nomogram for predicting 30-, 60-, and 90-day mortality were age, SOFA score, MC, SpO2, lactate level, body temperature, albumin level, and RDW. A graphical illustration of the nomogram is presented in Figure 2, in which age, SOFA score, SpO2 and RDW are shown to have larger weighting than other included factors. Nomogram performance was evaluated using a variety of metrics, including C-index, AUC, NRI and IDI. The C-index was higher for the nomogram than for the SOFA score alone, whether in the training cohort or the validation cohort. As described above, a nomogram for predicting the outcome of patients with sepsis was established by including age, SOFA score, MC, SpO2, lactate level, body temperature, albumin level, and RDW. Correlations between calibration and standard curves in the nomogram calibration plots for the training and validation cohorts indicated that predicted values for 30-, 60-, and 90-day survival were in good agreement with observed values (Figure 4). 
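For readers unfamiliar with the C-index used in the comparison above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is an illustrative Python implementation, not the R code used in the study; the function name `concordance_index` and the toy data are assumptions. For simplicity, pairs with tied survival times are skipped, and pairs with tied risk scores count as half concordant.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    times  -- follow-up time for each patient
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- predicted risk score (higher means earlier expected death)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # A pair is comparable when patient i died strictly before
            # the last time patient j was known to be alive.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration: survival in days, event flag, risk.
toy_times = [5, 10, 15]
toy_events = [1, 1, 0]
perfect = concordance_index(toy_times, toy_events, [3.0, 2.0, 1.0])
reversed_order = concordance_index(toy_times, toy_events, [1.0, 2.0, 3.0])
uninformative = concordance_index(toy_times, toy_events, [1.0, 1.0, 1.0])
```

A value of 1.0 indicates that predicted risk orders every comparable pair correctly, 0.5 corresponds to an uninformative score, and 0.0 to a perfectly reversed ordering.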
The clinical value of the model and its impact on actual decision-making was verified with DCA. The net benefit of the nomogram was shown to be greater than that of the SOFA score for any predicted probability, both in the training and validation cohorts (Figure 5), indicating that the nomogram has a substantial net benefit in predicting 30-, 60-, and 90-day survival rates. In this retrospective study of data from the MIMIC-III database, Cox regression analysis was used to select variables comprising age, SOFA score, MC, SpO2, lactate, body temperature, albumin, and RDW that were integrated to generate a prediction model for mortality in patients with sepsis, visualized as a nomogram. The C-index and AUC were used to evaluate nomogram performance, and showed that the model has good discriminative ability. In addition, the NRI and IDI showed that the nomogram improved prediction relative to the SOFA score. In the training and validation cohorts, the calibration curve of the new nomogram matched the standard curve very well. In addition, the 30-, 60-, and 90-day DCA curves of the training and validation cohorts revealed that the nomogram generated net benefits. Overall, the results suggest that use of the nomogram may benefit patients with sepsis, as it shows improved performance compared with the SOFA score alone in predicting 30-, 60-, and 90-day mortality. Age had a relatively strong influence (higher weighting) in the present nomogram. Age has been shown to be an independent risk factor for death in patients with sepsis, and fatality rate is shown to increase linearly with age. 21 The present research revealed the same trend; the nomogram score increased with age, which might be due to elderly patients being more susceptible to infection with gram-negative bacteria, having more comorbidities such as cancer, and having weaker immune function. 
22, 23 The contribution of MC to the nomogram is likewise related to the immune system, and the inclusion of albumin in the nomogram reflects the tendency of older patients to have a worse nutritional status than their younger counterparts. 24, 25 The haemodynamic performance of patients with septic shock is exceptionally complicated. 26, 27 In sepsis, endothelial cell dysfunction, the interaction of leukocytes/platelets and endothelial cells, coagulation activation, inflammation, abnormal haemorheology, and functional shunting together lead to microcirculatory disorders, insufficient tissue perfusion, and hypoxia, ultimately leading to multiple-organ dysfunction or even septic shock. 28 SpO2 can reflect the body's real-time oxygen supply state and degree of hypoxia, and may therefore serve as a factor associated with sepsis. 29 Additionally, SpO2 can be measured more conveniently and rapidly than arterial blood gas level, making it an attractive parameter for use in prediction models, and justifying its inclusion in the present nomogram. Patients with sepsis or septic shock will experience anaerobic metabolism due to microcirculatory disturbances and lactate accumulation. Serum lactate level is conventionally regarded as a marker of tissue hypoxia, and used as a clinical indicator of sepsis/septic shock severity and patient prognosis. 30 Lactate also has a high predictive value in the present nomogram, with the score increasing with rising lactate levels. 31 The indicator with the highest weight in the present nomogram was RDW, a prognostic biomarker for cardiovascular disease, stroke, and metabolic syndrome mortality. 
32 A previous retrospective study, based on the MIMIC-III database, found that RDW was useful in predicting long-term all-cause mortality in patients with severe sepsis, 33 and a multicentre observational study showed that RDW is a powerful predictor of risk of all-cause death and blood infection in critically ill patients. 34 Moreover, an increase in the distribution of red blood cells when patients are discharged from the ICU is a powerful predictor of subsequent all-cause mortality. 35 These observations show the predictive value of RDW, however, while this index is commonly measured in clinical workups, it is often ignored and, given its importance in the above studies and the present results, deserves greater attention. Sepsis is a complex disease, and while guidelines on sepsis are continually updated, the mechanism underlying the pathophysiological changes induced by sepsis remains difficult to characterize. Our understanding of sepsis may be enhanced by summarizing patient data, identifying the degree of risk by patient indicators in the early stages of disease, and adopting reasonable intervention methods, such as improving patient prognosis by improving SpO2 level. The nomogram constructed here may have considerable advantages in predicting the risk of death in patients with sepsis, and is easy to use, making it very suitable for clinical applications. The results of the present study may be limited by several factors. First, it was a retrospective design based on the MIMIC-III database. Secondly, the nomogram was internally validated using data from only one database. This limitation should be addressed in future studies by using a separate database or data from a separate group or study. Thirdly, further research on the aetiology and antibiotic treatment of sepsis was not conducted. 
In addition, as the database collected patient data between 2001 and 2012, before the Sepsis-3 criteria were available, the diagnosis of sepsis was based on Sepsis-1 criteria. Patients with SOFA scores <2 were excluded on this basis, but sepsis diagnoses were still not completely consistent with Sepsis-3. Lastly, sepsis is a heterogeneous disease. 36 In future research, sepsis subclasses should be identified, and reliable predictions should be generated for each subclass, in order to promote and optimize the clinical management of sepsis. In the information era, data reusability and data sharing strategies are receiving increasing attention worldwide. Nomograms are a significant component of modern medical decision-making. The present study established a nomogram that performed better than the SOFA alone in predicting 30-, 60-, and 90-day mortality risk in patients with sepsis. Patient data are available on the MIMIC-III website at https://mimic.physionet.org/, https://doi.org/10.13026/C2HM2Q. 
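As a supplementary illustration of the decision-curve analysis used in this study, the net benefit at a threshold probability p_t is conventionally defined as TP/n - (FP/n) × p_t/(1 - p_t). The sketch below computes this quantity in Python for a binary outcome at a fixed time horizon. It ignores censoring, which is a simplifying assumption, and is not the authors' R implementation; the function name `net_benefit` and the toy data are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    y_true    -- 1 if the patient died by the time horizon, else 0
    y_prob    -- predicted probability of death by that horizon
    threshold -- threshold probability p_t, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    pairs = list(zip(y_true, y_prob))
    true_pos = sum(1 for y, p in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


# Toy data, invented for illustration.
outcomes = [1, 1, 0, 0]
model_probs = [0.9, 0.6, 0.4, 0.1]
treat_all_probs = [1.0, 1.0, 1.0, 1.0]  # "treat everyone" reference strategy

nb_model_50 = net_benefit(outcomes, model_probs, 0.5)
nb_model_20 = net_benefit(outcomes, model_probs, 0.2)
nb_all_20 = net_benefit(outcomes, treat_all_probs, 0.2)
```

A decision curve plots this quantity across a range of thresholds for each model alongside the "treat all" and "treat none" (net benefit of zero) strategies; a model is preferred at a given threshold when its net benefit is the highest.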
1. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3)
2. Mortality in sepsis and septic shock in Europe, North America and Australia between 2009 and 2019-results from a systematic review and meta-analysis
3. Prognostic values of SOFA score, qSOFA score, and LODS score for patients with sepsis
4. Predictive value of SAPS II and APACHE II scoring systems for patient outcome in a medical intensive care unit
5. The early change of SOFA score as a prognostic marker of 28-day sepsis mortality: analysis through a derivation and a validation cohort
6. Predictive value of SOFA, qSOFA score and traditional evaluation index on sepsis prognosis
7. Prognostic accuracy of the serum lactate level, the SOFA score and the qSOFA score for mortality among adults with sepsis
8. Defining persistent critical illness based on growth trajectories in patients with sepsis
9. Development and validation of a sepsis mortality risk score for Sepsis-3 patients in intensive care unit
10. Nomograms in oncology: more than meets the eye
11. How to build and interpret a nomogram for cancer prognosis
12. Development of a nomogram to predict 30-day mortality of patients with sepsis-associated encephalopathy: a retrospective cohort study
13. A nomogram for predicting the risk of sepsis in patients with acute cholangitis
14. Brief introduction of medical database and data mining technology in big data era
15. MIMIC-III, a freely accessible critical care database
16. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies
17. Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation
18. The net reclassification index (NRI): a misleading measure of prediction improvement even with independent test data sets
19. Clinical risk reclassification at 10 years
20. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers
21. The effect of age on the development and outcome of adult sepsis
22. Global cancer incidence and mortality rates and trends-an update
23. The immune system in cancer prevention, development and therapy
24. Could nutritional and functional status serve as prognostic factors for COVID-19 in the elderly?
25. Nutritional determinants of frailty in older adults: a systematic review
26. A multicentre prospective observational study comparing arterial blood gas values to those obtained by pulse oximeters used in adult patients attending Australian and New Zealand hospitals
27. Comparison of SpO2 to PaO2 based markers of lung disease severity for children with acute lung injury
28. Sepsis and septic shock-is a microcirculation a main player
29. Microcirculation in sepsis: new perspectives
30. Lactate and immunosuppression in sepsis
31. Lactate level versus lactate clearance for predicting mortality in patients with septic shock defined by Sepsis-3
32. Red blood cell distribution width is associated with mortality in elderly patients with sepsis
33. Red blood cell distribution width predicts long-term outcomes in sepsis patients admitted to the intensive care unit
34. Red cell distribution width and all-cause mortality in critically ill patients
35. The association of red cell distribution width at hospital discharge and out-of-hospital mortality following critical illness
36. Identification of subclasses of sepsis that showed different clinical outcomes and responses to amount of fluid resuscitation: a latent profile analysis

The authors declare that there is no conflict of interest.

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science

Jun Lyu https://orcid.org/0000-0002-2237-8771
Haiyan Yin https://orcid.org/0000-0002-9680-4219