key: cord-0912188-usszuzax
authors: Sherak, R. A. G.; Sajjadi, H.; Khimani, N.; Tolchin, B.; Jubanyik, K.; Taylor, R. A.; Schulz, W.; Mortazavi, B. J.; Haimovich, A. D.
title: SOFA score performs worse than age for predicting mortality in patients with COVID-19
date: 2022-05-03
journal: nan
DOI: 10.1101/2022.05.02.22274575
sha: 1d587102fadfadc985e03e84bac8632e5e88d874
doc_id: 912188
cord_uid: usszuzax

The Sequential Organ Failure Assessment (SOFA) score, originally developed to describe disease morbidity, is commonly used to predict in-hospital mortality. During the COVID-19 pandemic, many protocols for crisis standards of care used the SOFA score to select patients to be deprioritized due to a low likelihood of survival. A prior study found that age outperformed the SOFA score for mortality prediction in patients with COVID-19, but it was limited to a small cohort of intensive care unit (ICU) patients and did not address whether its findings were unique to patients with COVID-19. Moreover, it is not known how well these measures perform across races. In this retrospective study, we compare the performance of age and SOFA scores in predicting in-hospital mortality across two cohorts: a cohort of 2,648 consecutive adult patients diagnosed with COVID-19 who were admitted to a large academic health system in the northeastern United States over a 4-month period in 2020 and a cohort of 75,601 patients admitted to one of 335 ICUs in the eICU database between 2014 and 2015. Among the COVID-19 cohort, age (area under receiver-operating characteristic curve (AU-ROC) 0.795, 95% CI 0.762, 0.828) had significantly better discrimination than SOFA score (AU-ROC 0.679, 95% CI 0.638, 0.721) for mortality prediction. Conversely, age (AU-ROC 0.628, 95% CI 0.608, 0.628) underperformed compared to SOFA score (AU-ROC 0.735, 95% CI 0.726, 0.745) in non-COVID-19 ICU patients in the eICU database. There was no difference between Black and White COVID-19 patients in performance of either age or SOFA score. Our findings bring into question the utility of SOFA score-based resource allocation in COVID-19 crisis standards of care.


The COVID-19 pandemic has prompted hospitals to develop protocols for allocating resources if the number of patients exceeds their capacity, in order to save as many lives as possible. Many of these protocols use the Sequential Organ Failure Assessment (SOFA) score to identify patients who are unlikely to survive and thus should be deprioritized for care. There are concerns that the SOFA score may not accurately predict mortality in patients with COVID-19 or that it may perform better in one racial group than in another. We asked whether a simple measure, patient age, could better predict mortality than SOFA score in a group of adult patients admitted to a large academic health system in 2020. To see if any findings are unique to patients with COVID-19, we performed the same analysis in a group of adult patients taken from the eICU database, a large publicly available dataset that was collected prior to the COVID-19 pandemic.

We found that age was better than SOFA score at predicting patient mortality in patients with COVID-19, but not in patients without COVID-19. For COVID-19, neither age nor SOFA score performed better in one racial group than in another. Caution is needed when applying an established disease severity index model to a new illness.

The Sequential Organ Failure Assessment (SOFA) score was developed in 1994 by a European Society of Intensive Care Medicine working group to objectively quantify the degree of organ dysfunction and failure in intensive care unit (ICU) patients with sepsis (1). The SOFA score assigns a value of 0 to 4 to the dysfunction of 6 organ systems (respiratory, coagulation, hepatic, cardiovascular, neurologic, and renal), with higher numbers indicating more dysfunction. The score is calculated using the worst clinical values observed in the previous 24 hours. It may be used to describe each organ system or as a summative measure.

Although the initial intention of the SOFA score was to describe morbidity in ICU patients with sepsis, subsequent studies have used the SOFA score or SOFA score-based models to predict mortality (2-4). While the SOFA score on admission performed well for mortality prediction in sepsis in some studies (5, 6), other studies have suggested only intermediate discriminatory accuracy (7, 8). The use of SOFA score has also expanded beyond patients with sepsis (9, 10) and is used outside of the ICU (6, 11).

With COVID-19 positive patients occupying up to 90% of ICU beds during the global SARS-CoV-2 pandemic (12), the SOFA score gained new applications. One study reported that 20 of the 26 COVID-19 ventilator triage policies surveyed used the SOFA score (13). Many of these guidelines for crisis standards of care involve withholding resources from patients with an expected low likelihood of survival, with the rationale of saving the most lives possible with the limited resources available. One way this is done is by assuming patients with a SOFA score at or above a predetermined threshold are unlikely to survive the hospitalization. However, there is a dearth of data about the use of the SOFA score to predict mortality in COVID-19, potentially leading to improper allocation of resources (14, 15).

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license. medRxiv preprint doi: https://doi.org/10.1101/2022.05.02.22274575; this version posted May 3, 2022.

One prominent study of 675 COVID-19 positive patients from a single healthcare system found that SOFA score had inferior discriminant prognostic accuracy for in-hospital mortality compared to patient age (16). However, the study was limited by its small sample size, with missing data for approximately 25% of its cohort, and by its restriction to patients requiring mechanical ventilation. Moreover, the study population had higher and less variable SOFA scores than the average ICU population (6, 7).

Here, we expand on prior work of SOFA score mortality prediction in COVID-19 by assessing SOFA score performance for an undifferentiated population of admitted patients. The primary aim of this retrospective study was to assess how well SOFA score predicts in-hospital mortality in a consecutive cohort of patients with COVID-19 admitted to a quaternary medical center in the Northeast United States (US). We compare the discriminative performance of SOFA to age alone. To examine whether any findings are specific to patients with COVID-19, we contrast these findings to a large cohort of general ICU patients from the publicly accessible eICU database. As a secondary objective, given concerns that prediction models can perpetuate systemic inequities (17-22), we compared the prognostic value of SOFA across race in both cohorts.

This was a retrospective study comprising two separate patient cohorts.

Data Acquisition and Preprocessing:

The COVID-19 cohort consisted of all patients aged ≥ 18 years admitted to any of the 5 hospitals in the Yale-New Haven Health System (YNHH) from March 29, 2020, to August 1, 2020, with a diagnosis of COVID-19, defined as either a positive PCR test for COVID-19 or designation as a COVID-19 patient by an attending physician. Data were obtained retrospectively from the YNHH electronic medical record (EMR, Epic Systems Corporation, Verona, WI). SOFA score was automatically calculated for all [...]

[...] Patients with an ICU stay of at least 24 hours were included. For patients with more than one hospitalization, only the most recent hospitalization was used.

The SOFA scores for the eICU cohort were calculated and extracted using Python's pandas library, following the standard definition, from the values of creatinine, bilirubin, platelets, fraction of inspired oxygen, partial pressure of arterial oxygen, Glasgow Coma Scale, mean arterial pressure, and mechanical ventilation status. The corresponding tables and columns were pre-processed to calculate the final scores; the overall steps taken in this procedure are described in the supplement. As part of the eICU patient de-identification process, all patients older than 89 years are grouped together; these patients were assigned an age of 90 for the analysis.
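As an illustration of how a score can be assembled from the inputs listed above, the following is a minimal sketch, not the authors' supplement code. It assumes the commonly published SOFA cutoffs, conventional units (mg/dL, 10^3 platelets/µL, mmHg, FiO2 as a fraction), and a cardiovascular component based on mean arterial pressure alone; vasopressor doses and urine output are omitted because they are not among the listed inputs. All function and parameter names are hypothetical.

```python
def points_below(value, cutoffs):
    """Count how many descending cutoffs the value falls below (0-4 points)."""
    return sum(value < c for c in cutoffs)


def points_at_or_above(value, cutoffs):
    """Count how many ascending cutoffs the value reaches (0-4 points)."""
    return sum(value >= c for c in cutoffs)


def respiratory_points(pao2_mmhg, fio2_fraction, ventilated):
    """Score the PaO2/FiO2 ratio; 3 or 4 points require respiratory support."""
    ratio = pao2_mmhg / fio2_fraction
    if ratio >= 400:
        return 0
    if ratio >= 300:
        return 1
    if ratio >= 200 or not ventilated:
        return 2
    return 3 if ratio >= 100 else 4


def sofa_score(creatinine_mg_dl, bilirubin_mg_dl, platelets_k_per_ul,
               pao2_mmhg, fio2_fraction, gcs, map_mmhg, ventilated):
    """Sum the six organ-system components (simplified, see assumptions above)."""
    renal = points_at_or_above(creatinine_mg_dl, [1.2, 2.0, 3.5, 5.0])
    hepatic = points_at_or_above(bilirubin_mg_dl, [1.2, 2.0, 6.0, 12.0])
    coagulation = points_below(platelets_k_per_ul, [150, 100, 50, 20])
    neurologic = points_below(gcs, [15, 13, 10, 6])
    # Assumption: without vasopressor data the cardiovascular component is 0 or 1.
    cardiovascular = 1 if map_mmhg < 70 else 0
    respiratory = respiratory_points(pao2_mmhg, fio2_fraction, ventilated)
    return renal + hepatic + coagulation + neurologic + cardiovascular + respiratory
```

In a pandas workflow, a function like this would typically be applied to the worst values observed in the first 24 hours for each patient.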

In both cohorts, gender was dichotomized and race was classified as a single category based on predefined fields in the EMR. Depending on a patient's clinical status, gender and race were either self-selected by patients or assigned by hospital registration staff.

Age in years at time of admission and the maximum SOFA score recorded in the first 24 hours after admission were each used as predictor variables in separate univariate logistic regression models for the binary outcome of in-hospital mortality. Each model was fit to a sample of 60% of the respective cohorts.
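The model-fitting step can be sketched as follows. This is a minimal standard-library illustration, assuming a random 60/40 split and a Newton-Raphson fit of a one-predictor logistic model; the source does not state how the split was drawn or which library fit the models, so the seed, function names, and fitting routine are all assumptions.

```python
import math
import random


def split_indices(n, train_fraction=0.6, seed=0):
    """Shuffle row indices and return (train, validation) index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(train_fraction * n)
    return indices[:cut], indices[cut:]


def fit_univariate_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson; returns (b0, b1)."""
    mean_x = sum(x) / len(x)
    centred = [xi - mean_x for xi in x]  # centring improves numerical stability
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(centred, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p          # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w              # entries of the (negated) Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0 - b1 * mean_x, b1  # undo the centring for the intercept


def predict(b0, b1, x):
    """Predicted probability of the outcome for predictor value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
```

The same routine would be run twice per cohort, once with age and once with maximum 24-hour SOFA score as the single predictor, and the fitted model would then score the held-out 40%.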

Model Assessment:

The remaining 40% of each cohort was used as a validation cohort to calculate the area under the receiver operating characteristic curve (AU-ROC). AU-ROC confidence intervals were estimated using DeLong's method (25). The same procedure was then repeated to calculate the area under the precision-recall curve (AU-PRC). All analyses were performed in Python (Version 3.7.7) and R (Version 1.4.1717).
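To make the AU-ROC interval concrete, the sketch below computes the AU-ROC as the Mann-Whitney statistic and a DeLong-style variance from the structural components of positives and negatives. It is a direct quadratic-time illustration under the assumption of a normal-approximation 95% interval, not the fast algorithm cited in (25) and not the authors' code; the function name is hypothetical.

```python
import math


def auc_with_delong_ci(scores, labels, z=1.959964):
    """Return (auc, lower, upper) for binary labels (1 = died, 0 = survived)."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    m, n = len(pos), len(neg)

    def psi(a, b):
        # 1 if the positive outranks the negative, 0.5 for ties, 0 otherwise
        return 1.0 if a > b else 0.5 if a == b else 0.0

    # Structural components: per-positive and per-negative mean rankings
    v10 = [sum(psi(p, q) for q in neg) / n for p in pos]
    v01 = [sum(psi(p, q) for p in pos) / m for q in neg]
    auc = sum(v10) / m

    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    se = math.sqrt(s10 / m + s01 / n)
    return auc, max(0.0, auc - z * se), min(1.0, auc + z * se)
```

Because the AU-ROC depends only on the ranking of scores, passing the fitted probabilities from a one-predictor logistic model with a positive coefficient gives the same value as passing the raw predictor.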

We performed several secondary analyses. We stratified each cohort by race, including only patients identifying as Black/African American or White/Caucasian, as we had insufficient sample sizes for further groups. Because the eICU cohort is restricted to patients admitted to the ICU, we performed an additional analysis of the COVID-19 cohort restricted to ICU patients. We also conducted an exploratory analysis of survival rates in the COVID-19 cohort at and above a given SOFA score. We then stratified by race to explore whether there was a difference in survival rates at various SOFA score thresholds. The binomial exact test was used to calculate confidence intervals.

The Yale and eICU cohorts respectively were 52.4% and 46.1% female and 25.4% and 10.7% Black/African American (Table 1).
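The binomial exact confidence intervals named in the secondary analyses can be sketched as below. This is a minimal illustration assuming the usual Clopper-Pearson construction at the 95% level, solved by bisection on the binomial tail probabilities; the source does not specify the software or level used, and the function names are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _bisect(func, target, increasing, steps=60):
    """Find p in (0, 1) with func(p) == target for a monotone func."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k events (e.g. survivors) out of n patients."""
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= k | p) = alpha / 2, which increases with p
        lower = _bisect(lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2, True)
    if k == n:
        upper = 1.0
    else:
        # Upper bound: P(X <= k | p) = alpha / 2, which decreases with p
        upper = _bisect(lambda p: binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper
```

For example, `exact_binomial_ci(5, 10)` gives an interval that is symmetric around 0.5, as expected for 5 events in 10 patients.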

[...] (Figure 3 and Table 3). There was also no significant difference in the performance of the SOFA score between Black and White patients with COVID-19. In the eICU cohort, SOFA score was better than age at discriminating between survivors and non-survivors in both Black and White patients. Additionally, the SOFA score performed better in Black patients than in White patients in the eICU cohort.

In the COVID-19 cohort, there was no significant difference in survival rates between Black and White patients at and above SOFA scores of 3 (Supplement Figure 2). The survival rate of COVID-19 patients only dropped below 50% at SOFA scores greater than or equal to 12 (Supplement Figure 3).
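The "at and above" survival rates reported here are cumulative: each threshold pools every patient whose score meets or exceeds it. A minimal sketch of that calculation follows; the function name is hypothetical and any values passed to it in examples are illustrative, not study data.

```python
def survival_at_or_above(sofa_scores, died, thresholds):
    """Map each threshold to the survival rate among patients with score >= threshold."""
    rates = {}
    for threshold in thresholds:
        outcomes = [d for s, d in zip(sofa_scores, died) if s >= threshold]
        if outcomes:  # skip thresholds that no patient reaches
            rates[threshold] = 1.0 - sum(outcomes) / len(outcomes)
    return rates
```

Running the function separately on each racial group yields the stratified comparison, and each rate can be paired with a binomial exact interval computed from the survivor count and group size.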

Age significantly outperformed SOFA score for predicting mortality in hospitalized COVID-19 patients, including those in the ICU. This phenomenon may be unique to COVID-19, as SOFA score was significantly better at predicting mortality in the eICU cohort. The finding that a simple metric such as age outperformed SOFA score for mortality prediction in COVID-19 patients suggests that caution should be taken when applying established prediction models to a completely novel disease process. Such caution is especially warranted when using mortality prediction models to guide treatment decisions and resource allocation. Many guidelines for crisis standards of care suggested a SOFA score threshold of ≥ 6 to identify patients with low likelihood of survival (26, 27). However, there was only a 33% mortality rate using that threshold in our COVID-19 cohort.

Although the SOFA score was originally created for ICU patients (1), the inclusion of non-ICU patients in the COVID-19 cohort does not explain our findings. When we restricted the COVID-19 cohort to ICU patients, a similar trend was noted: age performed better than SOFA score for mortality prediction, with minimally overlapping confidence intervals. However, these results were limited by the relatively small sample size of COVID-19 ICU patients (642). While applying SOFA score to non-ICU patients is outside of the original intent of SOFA score, so is using it for mortality prediction. In the original study describing the SOFA score, Vincent et al. state, "it is important to realize that the SOFA score is designed not to predict outcome but to describe a sequence of complications" (1).

Prior studies have theorized that SOFA score underperforms for mortality prediction in COVID-19 patients because the illness affects fewer organ systems, resulting in lower variability in scores (16). Contrary to that theory, patients in the COVID-19 cohort had the same standard deviation in scores (2.7) as the eICU cohort. Since age >65 is one of the characteristics with the strongest association with increased mortality in COVID-19 positive patients (28, 29), it may partially explain why age outperforms SOFA score only in COVID-19 patients.

There was no difference between White and Black COVID-19 patients in the performance of either age or SOFA score for mortality prediction. Additionally, there was no significant difference in mortality rate between White and Black COVID-19 patients at SOFA scores greater than 2. Notably, in a prior study of patients from the same COVID-19 cohort, Black patients had 1.5 times the odds of a SOFA score ≥ 6 compared with White patients, even when adjusting for age, sex, insurance status, BMI, and liver and renal diseases (23). This suggests that Black patients with COVID-19 may be more likely than White patients to be assigned higher SOFA scores but will have similar mortality rates at those higher SOFA scores.

This study builds on a recent analysis from a single health system of 675 COVID-19 positive patients requiring mechanical ventilation in which SOFA score had inferior prognostic accuracy for in-hospital mortality compared to simply using patient age (16). However, that study was limited by its cohort having higher and less variable SOFA scores than the average ICU population as well as missing data for approximately 25% of the cohort. Our study had over 2,600 COVID-19 positive patients with an average SOFA score similar to prior analyses on non-COVID-19 patients (6, 7) and a mortality rate comparable to other cohorts of COVID-19 patients that were admitted to US hospitals during a similar time period (28). Moreover, the SOFA scores used in the COVID-19 cohort were automatically calculated by the electronic health record, a pragmatic approach that would be used in triage scenarios.

Our study has several limitations. The cohort of COVID-19 patients was restricted to a single health system in the northeast US, which may not have a comparable population to COVID-19 patients at other academic health systems or those in the eICU database. Due to sample size, we also restricted our analysis to two racial groups and did not consider patients who identified as multi-racial. We did not consider sex or its potential interaction with race or age in our study. Furthermore, an unknown proportion of patients in the COVID-19 cohort were too critically ill to answer demographic questions and had their race and sex recorded by a hospital clerk based on assumption.

Additionally, this study only describes the performance of the maximum SOFA score within 24 hours of hospital admission. Many COVID-19 positive patients present to the hospital with respiratory complaints and develop multisystem organ dysfunction later in their disease course (29). However, information on SOFA subscores in the cohort of COVID-19 patients was not available, so we were unable to test this hypothesis. SOFA score may have greater utility for predicting mortality with serial measurements or later in the disease course (2).


As the COVID-19 pandemic continues well into 2022, SOFA score continues to feature prominently in guidelines for crisis standards of care (30). This study suggests caution should be used when considering SOFA score as a prognostic tool, as it has limited prognostic performance.


The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine

Serial evaluation of the SOFA score to predict outcome in critically ill patients

Use of the SOFA score to assess the incidence of organ dysfunction/failure in intensive care units: Results of a multicenter, prospective study

Crossvalidation of a Sequential Organ Failure Assessment score-based model to predict mortality in patients with cancer admitted to the intensive care unit

The Sequential Organ Failure Assessment score for predicting outcome in patients with severe sepsis and evidence of hypoperfusion at the time of emergency department presentation

Comparative prognostic accuracy of sepsis scores for hospital mortality in adults with suspected infection in non-ICU and ICU at an academic public hospital

Prognostic Accuracy of the SOFA Score, SIRS Criteria, and qSOFA Score for In-Hospital Mortality Among Adults With Suspected Infection Admitted to the Intensive Care Unit

Assessment of Clinical Criteria for Sepsis: For the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3)

Evaluation of SOFA-based models for predicting mortality in the ICU: A systematic review

Comparison of risk prediction scoring systems for ward patients: a retrospective nested case-control study. Crit Care

ICU Bed Utilization During the Coronavirus Disease 2019 Pandemic in a Multistate Analysis-March to

Ventilator Triage Policies During the COVID-19 Pandemic at U.S. Hospitals Associated With Members of the Association of Bioethics Program Directors

Sequential Organ Failure Assessment in H1N1 pandemic planning

An assessment of the validity of SOFA score based triage in H1N1 critically ill patients during an influenza pandemic

Discriminant Accuracy of the SOFA Score for Determining the Probable Mortality of Patients With COVID-19 Pneumonia Requiring Mechanical Ventilation

Triage and justice in an unjust pandemic: ethical allocation of scarce medical resources in the setting of racial and socioeconomic disparities

Structural Racism, Social Risk Factors, and Covid-19 -A Dangerous Convergence for Black Americans

Accuracy of the Sequential Organ Failure Assessment Score for In-Hospital Mortality by Race and Relevance to Crisis Standards of Care

Equitably Allocating Resources During Crises: Racial Differences in Mortality Prediction Models

Assessment of Disparities Associated With a Crisis Standards of Care Resource Allocation Algorithm for Patients in 2 US Hospitals During the COVID-19 Pandemic

Performance of intensive care unit severity scoring systems across different ethnicities in the USA: a retrospective observational study. Lancet Digit Health

Racial disparities in the SOFA score among patients hospitalized with COVID-19

The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Sci Data [Internet]

Fast Implementation of DeLong's Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves

A Framework for Rationing Ventilators and Critical Care Beds During the COVID-19 Pandemic

Developing a Triage Protocol for the COVID-19 Pandemic: Allocating Scarce Medical Resources in a Public Health Emergency

Factors associated with hospital admission and critical illness among 5279 people with coronavirus disease

Pathophysiology, Transmission, Diagnosis, and Treatment of Coronavirus Disease 2019 (COVID-19): A Review
