key: cord-0755745-enyqzuct authors: Wolfisberg, Selina; Gregoriano, Claudia; Struja, Tristan; Kutz, Alexander; Koch, Daniel; Bernasconi, Luca; Hammerer-Lercher, Angelika; Mohr, Christine; Haubitz, Sebastian; Conen, Anna; Fux, Christoph A.; Mueller, Beat; Schuetz, Philipp title: Call, chosen, HA(2)T(2), ANDC: validation of four severity scores in COVID-19 patients date: 2021-11-19 journal: Infection DOI: 10.1007/s15010-021-01728-0 sha: 401505f3dc619e2a221ff11f6b1f81e458347017 doc_id: 755745 cord_uid: enyqzuct PURPOSE: To externally validate four previously developed severity scores (i.e., CALL, CHOSEN, HA(2)T(2) and ANDC) in patients with COVID-19 hospitalised in a tertiary care centre in Switzerland. METHODS: This observational analysis included adult patients with a real-time reverse-transcription polymerase chain reaction or rapid-antigen test confirmed severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) infection hospitalised consecutively at the Cantonal Hospital Aarau from February to December 2020. The primary endpoint was all-cause in-hospital mortality. The secondary endpoint was disease progression, defined as needing invasive ventilation, ICU admission or death. RESULTS: From 399 patients (mean age 66.6 years ± 13.4 SD, 68% males), we had complete data for calculating the CALL, CHOSEN, HA(2)T(2) and ANDC scores in 297, 380, 151 and 124 cases, respectively. Odds ratios for all four scores showed significant associations with mortality. The discriminative power of the HA(2)T(2) score was higher compared to CALL, CHOSEN and ANDC scores [area under the curve (AUC) 0.78 vs. 0.65, 0.69 and 0.66, respectively]. Negative predictive values (NPV) for mortality were high, particularly for the CALL score (≥ 6 points: 100%, ≥ 9 points: 95%). For disease progression, discriminative power was lower, with the CHOSEN score showing the best performance (AUC 0.66). 
CONCLUSION: In this external validation study, the four analysed scores performed worse than in their original cohorts regarding prediction of mortality and disease progression. However, all scores were significantly associated with mortality and the NPV of the CALL and CHOSEN scores in particular allowed reliable identification of patients at low risk, who may therefore be suitable for outpatient management. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s15010-021-01728-0.

The coronavirus disease 2019 (COVID-19) pandemic, with its overwhelming resource use, has been a major challenge for clinicians and health care institutions worldwide. Identifying patients at high risk of disease progression may help allocate resources more efficiently. Since presentation and course of the infection can vary considerably (including asymptomatic cases), no single trait is sufficient to appropriately categorise patients [1-9]. Thus, several scores have attempted to improve identification of patients at high risk of progression or death from COVID-19. Among these scores, the CALL, CHOSEN, HA(2)T(2) and ANDC scores have generated much interest [10-13].

This retrospective observational analysis included all consecutive adult patients (≥ 18 years) with a confirmed severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) infection that required hospitalisation for at least 24 h at the Medical University Clinic of the Cantonal Hospital Aarau (Switzerland) between February 26, 2020 and April 30, 2020 (first wave) and between October 1, 2020 and December 31, 2020 (second wave). In this tertiary care centre with 130 medical ward beds, indications for in-hospital treatment of COVID-19 were respiratory distress with need for oxygen supplementation, high fever or relevant clinical deterioration. This study was approved by the local ethics committee (EKZN, 2020-01306).
A detailed description of the study methodology has been reported previously [6, 15]. A confirmed SARS-CoV-2 infection was defined as a combination of typical clinical symptoms (e.g., respiratory symptoms with or without fever, and/or pulmonary infiltrates and/or anosmia/dysgeusia) and a positive real-time reverse-transcription polymerase chain reaction (RT-PCR) test, obtained from nasopharyngeal swabs or lower respiratory tract samples, according to guidance by the World Health Organization (WHO) [16, 17]. Data for the second wave also included patients with positive rapid-antigen tests. However, due to the lower positive predictive value of these tests, we excluded asymptomatic patients unless their rapid-antigen results were confirmed by a positive RT-PCR test. We further excluded patients from the analysis if they did not provide general informed consent or if they had not yet been discharged when data collection was closed (January 20, 2021). This study adheres to the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement for reporting of prediction models.

All analysed data were collected as part of the clinical routine during the hospitalisation (from admission to discharge or death). We performed chart reviews and automatic export from electronic health records (EHR), including vital signs and clinical characteristics upon admission as well as sociodemographic factors, comorbidities based on pre-existing diagnoses and home medication. COVID-19-specific inpatient medication was assessed until hospital discharge or death and exported from the EHR. Experimental treatment was offered to all suitable patients according to ongoing clinical trials and WHO guidelines [16-18]. During the second wave, this also included the application of high-dose glucocorticoids [19].
The age-adjusted Charlson comorbidity index (ACCI) [20] and the Clinical Frailty Scale score (CFS) [21] were calculated for all patients as part of the clinical routine or through chart review. Laboratory values were available according to clinical routine and derived from the first blood draw obtained within 7 days of admission.

All-cause in-hospital mortality was defined as the primary endpoint. The secondary endpoint, disease progression, had different definitions in the original studies. For easier comparability between the scores, we defined disease progression as needing invasive ventilation, ICU admission or death in our own analysis. Originally, the CALL score defined progression as respiratory rate ≥ 30 bpm, SpO2 ≤ 93%, PaO2/FiO2 ≤ 300 mmHg, requiring mechanical ventilation or worsening of lung CT findings. CT findings were not available for our analysis and thus not considered. The definition of progression for the CHOSEN score was requirement of supplemental oxygen, admission to the ICU or death. Validation results were based on these original definitions.

Discrete variables are expressed as frequency (percentage) and continuous variables as medians with interquartile ranges (IQR, for skewed data) or means with standard deviation (SD, for normally distributed data). We used the Wilcoxon rank-sum test to compare continuous variables and Pearson's chi-squared test to compare categorical or binary variables. Odds ratios (OR) were calculated with corresponding 95% confidence intervals (CI) as measures of association. We assessed calibration for mortality numerically by tabulating the observed risks against those reported in the original studies. Reported risks were not available for the CALL and CHOSEN scores. We considered a two-sided p-value of < 0.05 significant and calculated the unadjusted area under the receiver operating characteristic curve (AUC) as a measure of discrimination.
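As an illustration of the two measures described above, the following Python sketch computes an unadjusted AUC as the proportion of non-survivor/survivor pairs in which the non-survivor has the higher score (ties counted as one half), and an odds ratio with a Wald-type 95% CI from a 2x2 table. It is a minimal sketch, not the analysis code of this study: the function names, the simplification to a dichotomised score in a 2x2 table (rather than a regression model) and all input values are assumptions made for illustration.

```python
import math


def auc_rank(scores, died):
    """Unadjusted AUC: share of (non-survivor, survivor) pairs in which the
    non-survivor has the higher score; tied pairs count as one half."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table (all cells must be > 0).
    a: score-positive deaths      b: score-positive survivors
    c: score-negative deaths      d: score-negative survivors"""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical data, for illustration only (not taken from the study).
example_scores = [2, 3, 3, 5, 1, 4]
example_died = [0, 0, 1, 1, 0, 1]
example_auc = auc_rank(example_scores, example_died)
example_or = odds_ratio_wald(10, 20, 5, 40)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of survivors and non-survivors. An OR whose 95% CI excludes 1 corresponds to a significant association at the two-sided 0.05 level.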
Statistical analysis was performed as a complete-case analysis based on the original regression coefficients using Stata 15.1 (StataCorp, College Station, TX, USA).

Figure 1 provides an overview of the study flow and Table 1 shows patient demographics, comorbidities, laboratory values and vital signs on admission, both overall and stratified by the individual score cohorts. In total, 399 patients hospitalised with a confirmed SARS-CoV-2 infection were included in this analysis (mean age 66.6 years ± 13.4 SD, 68% male). Complete data sets to allow for the calculation of the CALL and CHOSEN scores were available in 297 and 380 patients, respectively. Fewer patients had all values necessary to calculate the HA(2)T(2) (n = 151) and ANDC scores (n = 124). There were several noticeable differences between the score cohorts, for example, transfer rates from other hospitals (range from 14.5% for ANDC to 28.5% for HA(2)T(2)), supplemental oxygen (29.8% for CALL to 45.7% for HA(2)T(2)), obesity (30.8% for CHOSEN to 41.7% for ANDC) and ICU admission (19.5% for CHOSEN to 46.4% for HA(2)T(2)). However, overall comorbidity and frailty were similar.

Table 2 shows the discriminative power of each score for mortality and disease progression (defined as requiring invasive ventilation, ICU admission or death for all scores for easier comparability). For mortality, the HA(2)T(2) score performed best (AUC 0.78, 95%-CI 0.70-0.85). For progression, overall discriminative capacity was lower, with the CHOSEN score performing slightly better than the others (AUC 0.66, 95%-CI 0.60-0.72). All scores were associated with mortality. Sensitivity and specificity as well as positive and negative predictive value for each proposed cut-off are summarised in Table 3 and visualised in Fig. 2.
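To make the cut-off-based measures concrete, the following Python sketch derives sensitivity, specificity and predictive values from a 2x2 table and adds an exact (Clopper-Pearson) binomial confidence interval for a proportion. It is a minimal sketch under stated assumptions: the text does not specify which interval method was used, so the exact method is an assumption, and the function names and all counts are hypothetical.

```python
from math import comb


def diagnostic_metrics(tp, fp, fn, tn):
    """Cut-off performance from a 2x2 table.
    tp: at/above cut-off and died      fp: at/above cut-off and survived
    fn: below cut-off and died         tn: below cut-off and survived"""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def _binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, iterations=100):
    """Root in [0, 1] of a function that decreases from positive to negative."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect_decreasing(
        lambda p: alpha / 2 - (1 - _binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect_decreasing(
        lambda p: _binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


# Hypothetical counts, for illustration only (not taken from the study).
metrics = diagnostic_metrics(tp=8, fp=12, fn=2, tn=28)
npv_ci = exact_ci(28, 30)
```

When every patient below a cut-off survives, the NPV point estimate is 100% and only the lower confidence bound is informative. With a hypothetical 13 of 13 survivors, for example, `exact_ci(13, 13)` gives a lower bound of about 75.3%, which shows how wide such intervals remain in small subgroups.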
The negative predictive value of the CALL score was highest (≥ 6 points: 100%, 95%-CI 75.3-100), while the highest positive predictive value was found for the HA(2)T(2) score (≥ 3 points: 58.6%, 95%-CI 38.9-76.5). The direct comparison with the original outcomes can be found in Table 4. Only the HA(2)T(2) score performed similarly with an AUC of 0.78 (95%-CI 0.72-0.84) in the original validation cohort and an AUC of 0.78 (95%-CI 0.70-0.85) in our sample. The discriminative power for all other scores was markedly worse in comparison with their respective original cohorts. These results persisted when performed in the cohort with full data sets for all scores (n = 67, data not shown). The calibration assessment for mortality for the HA(2)T(2) and ANDC scores can be found in the additional files 1 and 2 (Tables S1 and S2). Overall, calibration was poor, with the ANDC score performing slightly better (overprediction up to 18 percentage points) than the HA(2)T(2) score (underprediction up to 30 percentage points). Calibration for the CALL and CHOSEN scores was not possible due to lacking published data.

Fig. 1 Overview of study flow. In total, 399 patients were included in the final analysis, 67 of whom had complete data sets available

In this validation study, four currently available scores to predict mortality and disease progression in COVID-19 patients performed markedly worse in patients hospitalised at a Swiss tertiary care centre than in their original cohorts. The HA(2)T(2) score showed the best discrimination for mortality (AUC 0.78, 95%-CI 0.70-0.85) and the only results similar to the derivation cohort. Some loss of predictive ability can be explained by the differences between our study population and the original derivation cohorts. This is most apparent when comparing age, which has been recognised as an important risk factor for worse outcomes [22] and is included in all four scores.
Mean age ranged from 44 to 65 years for the CALL, CHOSEN, HA(2)T(2) and ANDC scores in the original publications whereas the mean age in our population was 67 years. However, even when comparing the scores among the 67 patients who had all parameters required for all scores, the HA(2)T(2) score showed the best discriminative power (data not shown). Apart from the small sample size, further limitations in this comparison arise from the fact that the study populations also differed in their origins. The CALL and ANDC scores were based on Chinese patients while the CHOSEN and HA(2)T(2) scores were derived in US patients. Interestingly, the other currently available external validations of the CALL score in Italian and Turkish patients resulted in AUCs that were very similar to our own (original AUC for disease progression 0.91 vs. Italian AUC 0.62, Turkish AUC 0.59, our AUC 0.61) [14, 23]. Hence, compatibility and comparability of these scores across different populations cannot be assumed.

Fig. 2 Survival time analysis for a CALL score, b CHOSEN score, c HA(2)T(2) score, d ANDC score and their respective cut-off subgroups

Further difficulties are rooted in the novelty of COVID-19. Much is still unknown about the disease, including which factors best predict progression or mortality. This is reflected in the very different factors included in the scores. Still, these more recent approaches are already an improvement over initial scores, which included up to 12 different items, making them difficult to use in a clinical setting [24]. In a busy environment such as the emergency department, ease of use is crucial. The scores discussed here all use no more than four variables that are relatively readily available in middle- to high-income countries. There also exists a simplified version of the CHOSEN score that does not rely on laboratory values but also did not perform as well in the original cohort [11].
All scores were significantly associated with mortality and their respective discriminative capacities were moderate to good, but calibration was poor due to considerable population differences. Furthermore, the negative predictive value of the CALL score was particularly high and could thus help identify patients who are not at risk. The CHOSEN score, whose explicit aim was to differentiate between patients who needed hospitalisation and those who could be sent home safely, also had a high negative predictive value and, in addition, showed a relatively balanced relation between sensitivity and specificity, making it a potentially valuable tool for risk stratification. Since we did not include outpatients in our study, our results are likely to underestimate the true value of the CHOSEN score.

There are certain limitations to our study. First, our findings are limited to hospitalised patients in a single centre in Switzerland, limiting generalisability. In addition, baseline parameters of our population were markedly different from the original study populations, including ethnicity and important predictors such as age. Unfortunately, regression coefficients could not be updated based on the available data. Similarly, we could not calculate calibration for the CALL and CHOSEN scores. Internal validity is also limited due to the retrospective design, which meant that a considerable proportion of patients had to be excluded from certain score cohorts because the required data were missing. Additional validation analyses should be conducted in larger data sets. Furthermore, troponin and d-dimer values (required for the HA(2)T(2) and ANDC scores, respectively) were usually available for sicker patients who reached the primary and secondary endpoints more often, which not only limited study population sizes but also comparability between scores. Finally, we had to exclude four patients due to missing outcome data, thus increasing the risk for selection bias.
In our independent validation, the four analysed scores performed worse than in their original cohorts regarding prediction of mortality and disease progression. However, all scores were significantly associated with mortality. While the HA(2)T(2) score identified high risk patients, the negative predictive values of the CALL and CHOSEN scores allowed reliable identification of patients at low risk, which may make them suitable for outpatient management.

1. Clinical characteristics of coronavirus disease 2019 in China
2. Presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with COVID-19 in the New York City Area
3. Characteristics, interventions, and longer term outcomes of COVID-19 ICU patients in Denmark-A nationwide, observational study
4. Clinical characteristics and outcomes of 905 COVID-19 patients admitted to Imam Khomeini hospital complex in the capital city of Tehran
5. Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy Region
6. Characteristics, predictors and outcomes among 99 patients hospitalised with COVID-19 in a tertiary care centre in Switzerland: an observational analysis
7. Clinical characteristics and mortality predictors of COVID-19 patients hospitalized at nationally-designated treatment hospitals
8. Patient characteristics and predictors of mortality in 470 adults admitted to a district general hospital in England with Covid-19
9. Baseline phenotype and 30-day outcomes of people tested for COVID-19: an international network cohort including > 3.32 million people tested with real-time PCR and > 219,000 tested positive for SARS-CoV-2 in South Korea, Spain and the United States. medRxiv
10. Prediction for progression risk in patients with COVID-19 pneumonia: the CALL score
11. Derivation of a clinical risk score to predict 14-day occurrence of hypoxia, ICU admission, and death among patients with coronavirus disease 2019
12. Troponin and other biomarker levels and outcomes among patients hospitalized with COVID-19: derivation and validation of the HA2T2 COVID-19 mortality risk score
13. ANDC: an early warning score to predict mortality risk for patients with Coronavirus Disease
14. The CALL score for predicting outcomes in patients with COVID-19
15. Comparison of characteristics, predictors and outcomes between the first and second COVID-19 waves in a tertiary care centre in Switzerland: an observational analysis
16. Clinical management of severe acute respiratory infection when novel coronavirus (nCoV) infection is suspected: interim guidance
17. Clinical management of COVID-19: interim guidance
18. Repurposed antiviral drugs for Covid-19-Interim WHO solidarity trial results
19. Living Guidance 2
20. Validation of a combined comorbidity index
21. Clinical frailty scale in an acute medicine unit: a simple tool that predicts length of stay
22. Clinical features of COVID-19 mortality: development and validation of a clinical prediction model
23. Application of CALL score for prediction of progression risk in patients with COVID-19 at university hospital in Turkey
24. A clinical risk score to identify patients with COVID-19 at high risk of critical care admission or death: an observational cohort study

We thank all participating patients, their families and all healthcare workers at the Cantonal Hospital Aarau for their help and dedication to reduce the burden of the ongoing pandemic.

Author contributions PS and SW conceived of the study and its design. SW performed the statistical analysis and wrote the first draft of the paper. SW, CG, DK, LB, AH, CH, CM and SH collected and compiled the data. CG, PS, AK, CF, TS and BM critically revised the manuscript.
All authors read and approved the final manuscript.

Funding This study was funded by the Research Council KSA (Kantonsspital Aarau). The funding agency had no bearing on the study design, data collection and analysis or writing of the manuscript.

The datasets used during the current study are available from the corresponding author on reasonable request.

The authors declare that they have no competing interests.

Ethics approval and consent to participate This study was approved by the local ethics committee (EKZN, 2020-01306).

Consent for publication Not applicable.

The online version contains supplementary material available at https://doi.org/10.1007/s15010-021-01728-0.