key: cord-0951233-v4ooc3vo authors: Au-Yong, Iain; Higashi, Yutaro; Giannotti, Elisabetta; Fogarty, Andrew; Morling, Joanne R.; Grainge, Matthew; Race, Andrea; Juurlink, Irene; Simmonds, Mark; Briggs, Steve; Cruikshank, Simon; Hammond-Pears, Susan; West, Joe; Crooks, Colin J.; Card, Timothy title: Chest Radiograph Scoring Alone or Combined with Other Risk Scores for Predicting Outcomes in COVID-19 date: 2021-09-14 journal: Radiology DOI: 10.1148/radiol.2021210986 sha: 52283283b676aaca17af6b77b81728c24946d267 doc_id: 951233 cord_uid: v4ooc3vo BACKGROUND: Radiographic severity may predict patient deterioration and outcomes from COVID-19 pneumonia. PURPOSE: To assess the reliability and reproducibility of three chest radiograph reporting systems (RALE, Brixia, and percentage opacification) in proven SARS-CoV-2 and examine the ability of these scores to predict adverse outcomes both alone and in conjunction with two clinical scoring systems: NEWS2 and ISARIC-4C mortality. MATERIALS AND METHODS: This retrospective cohort study used routinely collected clinical data of PCR-positive SARS-CoV-2 patients admitted to a single UK center from February 2020 until July 2020. Initial chest radiographs were scored for RALE, Brixia, and percentage opacification by one of three radiologists. Intra- and inter-rater agreement was assessed with Intraclass correlation coefficients. The rate of ICU admission or death until 60 days after scored chest radiograph was estimated. NEWS2 and ISARIC-4C mortality, on hospital admission were calculated. Daily risk of admission to ICU or death was modelled with Cox proportional hazards models, incorporating the chest radiograph scores adjusted for NEWS2 or ISARIC-4C mortality. 
RESULTS: Admission chest radiographs of 50 patients (mean age, 74 years +/- 16 [SD]; 28 men) were scored by all three radiologists, with good inter-rater reliability for all scores (ICCs [95% CIs] of 0.87 [0.80, 0.92] for RALE, 0.86 [0.76, 0.92] for Brixia, and 0.72 [0.48, 0.85] for percentage opacification). Of 751 patients with a chest radiograph, those with >75% opacification had a median time to ICU admission or death of just 1-2 days. Among 628 patients with data (median age, 76 years [IQR, 61-84]; 344 men), 50-75% opacification increased the risk of ICU admission or death twofold (1.6-2.8), and over 75% opacification fourfold (3.4-4.7), compared with 0-25% opacification when adjusted for NEWS2 score. CONCLUSION: Brixia, RALE, and percentage opacification scores all reliably predicted adverse outcomes in SARS-CoV-2. See also the editorial by Little.

The reference standard for diagnosis of SARS-CoV-2 is reverse transcription polymerase chain reaction (PCR) testing, owing to its high specificity, despite limitations in its sensitivity [1]. Chest radiography has been shown to have limited sensitivity and specificity for identification of patients with SARS-CoV-2 [3] but can help in identifying patients with the disease [3-5]. In addition, a number of studies have demonstrated that the severity of lung involvement on chest radiography in SARS-CoV-2 is closely correlated with key patient outcomes, including intensive care unit (ICU) admission and death [6-20]. Several scoring systems have been described in the literature for assessing the severity of lung involvement in SARS-CoV-2. The most widely employed and studied systems include Brixia [7, 12, 21] and radiographic assessment of lung oedema (RALE/modified RALE) [6, 13, 14, 16, 22]. A large study comparing all three of Brixia, RALE, and percentage opacification has not yet been performed. During the pandemic, severity of illness and risk of need for escalation of care could be predicted using clinical scoring systems. In the UK, two examples are the ISARIC-4C mortality score and NEWS2 [24, 25]. The utility of these scores has been established independent of chest radiology in previously published studies [23-27]. We aimed to assess the reliability, reproducibility, and ability to predict intensive care admission or death (adverse outcomes) of three chest radiograph reporting systems (RALE, Brixia, and percentage opacification) in patients with SARS-CoV-2.
We further aimed to examine the ability of these scores to predict adverse outcomes of SARS-CoV-2 both alone and in conjunction with two clinical scoring systems: NEWS2 and ISARIC-4C mortality.

A retrospective, observational cohort study was conducted at Nottingham University Hospitals (NUH) NHS Trust, UK. NUH serves a population of approximately 2.5 million with 1,500 inpatient hospital beds. Using electronic health records within NUH, all patients admitted with RT-PCR proven SARS-CoV-2 and a chest radiograph were selected consecutively between February 2020 and July 2020, as described elsewhere [29]. Patients whose first chest radiograph was performed after ICU admission were excluded as they had already reached a study end point. There were no additional exclusions, and a sample size calculation indicated that more than 500 patients were required for a multivariate prediction model with more than 10 parameters [29]. We had follow-up for discharge, subsequent admissions to NUH, and death both in and outside of hospital (via the NHS Patient Demographics Service).

At NUH a plain, single view, PA or AP chest radiograph was routinely performed for patients admitted with proven or suspected SARS-CoV-2 infection. The first chest radiograph obtained during the admission relating to the PCR was selected for analysis. During the early part of the pandemic, reporting was not uniform and no specific scoring system was used. Thus, all chest radiographs were reassessed using the RALE [22], Brixia [7, 21], and percentage opacification scores [18]. Appendix E1 and Table 1 gastrointestinal and breast radiology, measured the time to score images (totaled across multiple chest radiographs) using a stopwatch. For validation of intra-rater reliability, a second read of each chest radiograph was performed by each of the radiologists, independent of the first read, at least one week later.
Once reliability of scoring was established, the entire cohort of consecutive SARS-CoV-2 PCR positive patients was divided so that each chest radiograph was scored by one of the radiologists: each radiologist therefore applied all three scores to approximately one third of the chest radiographs. The radiologists were aware of the PCR status but were blinded to any other clinical information.

Collection of other data

Data collection beyond the scoring of chest radiographs has been fully described elsewhere [29]. Briefly, data were extracted from electronic records including demographics, clinical decisions on eligibility for escalation (an assessment with no fixed rules, based upon a physician's assessment of the patient's frailty, co-morbidity, and the patient's wishes), ward type (standard inpatient ward, ICU), and all other data needed to calculate ISARIC-4C mortality and NEWS2 scores (Table E1). Data were extracted from the date of the relevant hospital admission (or of suspicion of SARS-CoV-2 if preceding the date of admission from an ED attendance) until admission to ICU, discharge from hospital, or in-hospital death. The times between the chest radiograph and both the earliest suspected diagnosis and the earliest confirmed diagnosis were also calculated. These data were anonymized before delivery to the research team and used to calculate NEWS2 and ISARIC-4C mortality scores. A P value of < 0.05 was used as a threshold for significance where relevant.

Reproducibility and reliability of scoring

One author (TC) examined inter- and intra-observer reliability using the intraclass correlation coefficient (ICC) for each of the RALE, Brixia, and percentage opacification scores. The ICC for inter-rater agreement was calculated using a two-way random effects model for absolute agreement and individual raters. A similar model with mixed effects was used for assessing intra-rater agreement. One author (CC) conducted the cohort analysis.
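The inter-rater ICC described above (two-way random effects, absolute agreement, individual raters, often written ICC(2,1)) can be sketched from the standard two-way ANOVA mean squares. This is a minimal illustration only: the function name `icc_2_1` and the example matrix are assumptions introduced here, the matrix is not study data, and the statistical software actually used by the authors is not stated in the text.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows: one row per radiograph, one column per rater.
    (Hypothetical helper for illustration; not the authors' code.)
    """
    n = len(ratings)       # number of radiographs
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete n x k matrix with n >= 2 and k >= 2")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for radiographs (rows), raters (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement: systematic rater differences (ms_cols) lower the ICC.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative matrix only (6 subjects scored by 4 raters); not study data.
EXAMPLE_RATINGS = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

if __name__ == "__main__":
    print(round(icc_2_1(EXAMPLE_RATINGS), 2))
```

Because the absolute-agreement form penalizes raters who score consistently higher or lower than their colleagues, the example matrix yields a low value (about 0.29) even though the raters rank the subjects similarly.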
Follow-up started on the date of the reviewed chest radiograph and continued for 60 days, with complete follow-up for death or admission to ICU. Baseline demographics on the day of the chest radiograph were tabulated against the worst outcome observed over the follow-up period and stratified by the initial decision on the eligibility of the patient for escalation, since potential patient outcomes are altered by this assessment. The median and interquartile range of each chest radiograph score were tabulated, and the scores were categorized (0 = none, 1 = <25%, 2 = 25-50%, 3 = 50-75%, 4 = >75% for percentage opacification; 0-12, 13-24, 25-36, 37-48 for RALE; and 0-5, 6-10, 11-15, 16-18 for Brixia). Kaplan-Meier survival curves were plotted and stratified by the categories of each score. 95% confidence intervals were calculated, and estimated median survival times were presented where plausible for each category. Differences in overall survival by opacity score quantile were tested using the log rank test. Finally, we assessed the additional predictive contribution of the opacity scores compared with previously published scores. Cox proportional hazards models were fitted using either ISARIC-4C mortality or NEWS2 calculated on the day of the reviewed chest radiograph, and then each opacity score was added to the model. The Akaike information criterion (AIC) was used to compare models, and Harrell's C-statistic to measure changes in discrimination. Models were stratified by the initial escalation decision.

The initial 50 selected chest radiographs were performed in 50 different patients between February 22, 2020 and May 27, 2020. The patients ranged in age from 40 to 95 years (mean, 74 years +/- 16 [SD]), and 28 of 50 (56%) were men. Inter-rater reliability was good for all three scores, with ICCs (95% CIs) of 0.87 (0.80-0.92) for the RALE score, 0.86 (0.76-0.92) for the Brixia score, and 0.72 (0.48-0.85) for percentage opacification (Table E2).
Intra-rater reliability was also good with ICCs being 0.86

For the further adjusted survival analysis by ISARIC-4C mortality or NEWS scores, years), and 344 (57%) were men. The adjusted analysis was also stratified by the decision for escalation; 266 patients were eligible and 362 were ineligible. Patients were followed up for 60 days from the date of their initial chest radiograph. The median absolute time between the chest radiograph and the earliest suspected diagnosis was 6 hours 40 minutes (IQR, 72 minutes to 69 hours). The median absolute time between the chest radiograph and confirmed diagnosis was 10 hours 43 minutes (IQR, 1 hour 38 minutes to 60 hours). Patients' baseline demographics are shown by their worst recorded outcomes during this follow-up period in Table 1. Among the patients not eligible for escalation who eventually died, the median percentage opacification was 20% (IQR 4-45%), and among those who survived the median percentage opacification was lower at 10% (IQR 0-30%). Overall, median percentage opacification was higher among patients who were eligible for escalation: 20% (IQR 5-30%) for those who did not need escalation, and 40% both for those who were escalated to ICU and for those who died (IQR 22-62% and 10-65%, respectively). There was a similar pattern for the other opacity scores (Table 1). Discrimination increased with the addition of each of the percentage opacification, Brixia, and RALE scores. For example, adding the percentage opacification, Brixia, and RALE scores to the ISARIC-4C mortality score increased the C statistic from 0.58 to 0.65, 0.65, and 0.66, respectively, and improved the corresponding goodness of fit (R²) from 0.06 to 0.13, 0.15, and 0.15, respectively, as shown in Table 2.
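The C statistic quoted here is Harrell's concordance index: among pairs of patients in which the one with the shorter follow-up time had the event, it is the proportion in which that patient also had the higher predicted risk. A minimal sketch follows, assuming a hypothetical function name `harrell_c`, a simple convention of counting tied risk scores as one half, and ignoring pairs with equal follow-up times; it is not the authors' implementation and uses no study data.

```python
def harrell_c(times, events, risks):
    """Harrell's C for right-censored data (illustrative sketch).

    times  : follow-up time for each patient (e.g. days to ICU/death or censoring)
    events : 1 if ICU admission or death was observed, 0 if censored
    risks  : model risk score; higher means worse predicted outcome
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("times, events and risks must have the same length")

    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:      # patient i had the event first
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0    # earlier event, higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5    # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy values only: risk falls as follow-up time rises, so C is perfect.
    print(harrell_c([1, 2, 3, 4], [1, 1, 1, 1], [4, 3, 2, 1]))
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering, which is the scale on which the changes from 0.58 to roughly 0.65 reported above should be read.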
This was also seen when stratified by the decision of eligibility for escalation in Tables 3 and 4: for example, in patients eligible for escalation, for each score adjusted for ISARIC-4C mortality, the C statistic improved from 0.63 to 0.72, 0.72, and 0.71, respectively, and the corresponding R² from 0.07 to 0.19, 0.20, and 0.21, respectively. Models and opacity scores had greater discrimination and goodness of fit in patients eligible for escalation than in those who were ineligible; by comparison, in patients ineligible for escalation, adjusted for the ISARIC-4C mortality score, the C statistic improved from 0.56 to 0.60, 0.60, and 0.61, respectively, for each score, and the corresponding R² from 0.03 to 0.07, 0.09, and 0.11, respectively. The highest discrimination was among patients eligible for escalation using RALE and NEWS2 scores (C statistic 0.77 and R² 0.27).

The identification of SARS-CoV-2 patients at high risk of deterioration is important for management. Such prediction may aid triage, enable timely prevention of further deterioration, and aid resource allocation. This cohort study in a single English city aimed to examine the impact of combining established clinical scores predicting adverse outcomes with a chest radiograph severity score in order to predict key outcomes. We demonstrate that chest radiograph severity in SARS-CoV-2 patients can be rapidly and reliably reported with the RALE, Brixia, and percentage opacification scoring systems. All three scores were calculated in under one minute, percentage opacification being the fastest. We found that higher scores were strongly associated with ICU admission or death up to 60 days after SARS-CoV-2 diagnosis, with an overall adjusted 3-4 fold increased risk of ICU admission or death associated with over 75% opacification compared with 0-25% opacification on a radiograph. Those in the highest categories of each score had a median time to ICU admission or death of just 1-2 days.
Our data demonstrate that all methods of chest radiograph severity scoring can improve prediction of admission to ICU or death when added to the ISARIC-4C mortality and NEWS2 scores. In our cohort, the best approach by discrimination for predicting outcome on admission with SARS-CoV-2 was to combine a chest radiograph severity score with the NEWS2 score (C statistic 0.77 for patients eligible for escalation using RALE and NEWS2 score). Our cohort had a full 60-day follow-up without loss of any patients to follow-up, stratified by eligibility for escalation of care, in contrast to other published literature [6, 7, 12, 13, 14, 16, 21, 22]. Notably, the chest radiograph scores were less discriminative in patients who were not eligible for escalation. This presumably relates to a greater contribution of other factors such as increasing age, frailty, and comorbidity in this group. Scoring systems such as NEWS [30, 31] and ISARIC-4C mortality [32] can help to predict the need for escalation in these patients. But neither they nor the closely related ISARIC-4C deterioration score [33] (assessing requirement of ventilatory support or critical care, or death) incorporate a chest radiograph scoring system.

There has been recent interest in the potential of scoring radiographic severity of SARS-CoV-2 pneumonitis changes to predict important patient outcomes. A number of studies report such data [6-20]. Our study focusses on percentage opacification, a simple and readily applicable scoring system, comparing it directly with the Brixia and RALE scores that have been studied more extensively. To our knowledge, only one other study has examined a percentage opacification score in relation to patient outcomes [18]. One advantage of percentage opacification is that it is quick and straightforward to calculate, as reflected in its speed of calculation compared with Brixia and RALE.
The latter scores are also more complex to apply in clinical practice owing to the incorporation of zonal scoring and the need to assess the nature and intensity of the parenchymal opacification. Thus, RALE and Brixia may be more challenging for junior medical staff to apply in practice, and increased complexity may add to existing pressures on staff given the increasingly recognized risks of burnout [34]. Two other studies have also examined the impact of radiological severity scoring systems in predicting patient outcome. Maroldi et al [12] examined the Brixia score in a cohort of similar size. That study incorporated radiographic scoring for

Finally, there has been interest in artificial intelligence (AI) in the scoring of chest radiographs [6, 35], comparing the performance of AI systems with severity scores in predicting patient outcomes. The studied AI algorithm incorporates scoring which appears similar to percentage opacification. The results of these AI studies suggest a potential future role for AI when calculating radiographic severity of SARS-CoV-2 pneumonitis, particularly if it could be combined with the calculation of an early warning score like NEWS2.

Our study had limitations. It was a retrospective single center study, all patients were rRT-PCR-positive (clinical diagnoses of SARS-CoV-2 were not included), and only admitted patients who underwent chest radiography were studied, which may limit the generalizability of our results, particularly to mild cases. Also, the participating radiologists were not all chest radiologists, but all were highly experienced in chest radiograph reporting, especially during the pandemic, thus reflecting real world practice.

In conclusion, we show that three scoring systems for assessing chest radiograph severity can be calculated quickly for SARS-CoV-2 with good reliability and reproducibility. Overall, the higher the score, the worse the outcome for patients with SARS-CoV-2.
Incorporation of all three scores into well-described risk prediction models substantially improved the ability to predict adverse outcomes of SARS-CoV-2.

Note. IQR = interquartile range; Brixia = a chest radiograph severity scoring system; RALE = radiographic assessment of lung edema (a chest radiograph severity scoring system); NEWS = National Early Warning Score.

The lungs are divided into six zones, and the degree of opacification is scored as follows: interstitial opacities, interstitial and alveolar opacities (interstitial predominate), and interstitial and alveolar opacities (alveolar predominate), scored as 1, 2, and 3, respectively. The patient has a Brixia score of 11 (1+2+2+1+2+3). The highest possible Brixia score is 18. (D) Percentage opacification is a simple visual estimate of the total percentage of lung parenchymal opacification.

Brixia, RALE, and percentage opacification are chest radiograph scoring systems; details are given in Appendix 1.
Should RT-PCR be considered a gold standard in the diagnosis of COVID-19
Frequency and Distribution of Chest Radiographic Findings in Patients Positive for COVID-19
Chest x-ray in the COVID-19 pandemic: Radiologists' real-world reader performance
Development and Validation of Risk Prediction Models for COVID-19 Positivity in a Hospital Setting
Clinical and Epidemiological Characteristics of 1,420 European Patients with mild-to-moderate Coronavirus Disease 2019
Initial chest radiographs and artificial intelligence (AI) predict clinical outcomes in COVID-19 patients: analysis of 697 Italian patients
Chest X-ray severity index as a predictor of in-hospital mortality in coronavirus disease 2019: A study of 302 patients from Italy
Performance of a Severity Score on Admission Chest Radiograph in Predicting Clinical Outcomes in Hospitalized Patients with Coronavirus Disease (COVID-19)
Chest radiograph at admission predicts early intubation among inpatient COVID-19 patients
The association of chest radiographic findings and severity scoring with clinical outcomes in patients with COVID-19 presenting to the emergency department of a tertiary care hospital in Pakistan
Prognostic factors in patients admitted to an urban teaching hospital with COVID-19 infection
Which role for chest x-ray score in predicting the outcome in COVID-19 pneumonia?
Clinical Features and Chest Imaging as Predictors of Intensity of Care in Patients with COVID-19
Clinical and Chest Radiography Features Determine Patient Outcomes In Young and Middle Age Adults with COVID-19
Chest radiographs may assist in predicting the outcome in the early phase of Covid-19. UK district general hospital of Covid-19 first wave
Racial/Ethnic Disparities in Disease Severity on Admission Chest Radiographs among Patients Admitted with Confirmed COVID-19: A Retrospective Cohort Study
Chest x-ray severity score in COVID-19 patients on emergency department admission: a two-centre study
Chest X-ray for predicting mortality and the need for ventilatory support in COVID-19 patients presenting to the emergency department
Proposed Scoring System for Evaluating Clinico-radiological Severity of COVID-19 using Plain Chest X-ray (chest radiograph) changes (CO X-RADS): Preliminary results
Correlation of chest radiography findings with the severity and progression of COVID-19 pneumonia
COVID-19 outbreak in Italy: experimental chest X-ray scoring system for quantifying and monitoring disease progression
Severity scoring of lung oedema on the chest radiograph is associated with clinical outcomes in ARDS
National Early Warning Score 2 (NEWS2) on admission predicts severe disease and in-hospital mortality from Covid-19 - A prospective cohort study
Predicting In-Hospital Mortality in COVID-19 Older Patients with Specifically Developed Scores
Predicting severe COVID-19 in the Emergency Department
Utility of established prognostic scores in COVID-19 hospital admissions: Multicentre prospective evaluation of CURB-65, NEWS2 and qSOFA
NEWS2 is a valuable tool for appropriate clinical management of COVID-19 patients
Chest X-ray in new Coronavirus Disease 2019 (COVID-19) infection: findings and correlation with clinical outcome
Predicting the need for escalation of care or death from repeated daily clinical observations and laboratory results in patients with SARS-CoV-2 during 2020: A retrospective population-based cohort study from the United Kingdom

Opacity scores have been divided into strata for analysis (percentage opacification: 0-25%, 26-50%, 51-75%, and 76-100%; Brixia: 0-5, 6-10, 11-15, and 16-18; RALE: 0-12, 13-24, 25-36, and 37-48), with the lowest category in each case as the reference group for analysis. (95% confidence interval)

to be reproducible and correlate with outcomes [22]. There is also some evidence of its utility in SARS-CoV-2 [28]. It involves dividing the chest radiograph into quadrants. Each quadrant was assessed for the density and extent of opacities. Density was described as hazy, moderate, or dense and scored as 1, 2, or 3. The extent of opacities was scored as follows: 0 = none, 1 = <25%, 2 = 25-50%, 3 = 50-75%, 4 = >75%. These two scores were multiplied together, giving the overall score for the quadrant. This was repeated for the four quadrants and the quadrant scores were summed, to give a maximum possible score of 48.

The Brixia score (Fig 3C) was developed for assessment of severity of SARS-CoV-2 lung involvement and shown to predict mortality in a cohort of Italian patients undergoing chest radiography [7]. It involves dividing each lung into three equal sections, which were assessed for the presence of infiltrates. Each section was scored as follows: interstitial infiltrates, interstitial and alveolar infiltrates (interstitial predominate), and interstitial and alveolar infiltrates (alveolar predominate), scored as 1, 2, and 3, respectively. This was repeated for each section to give a maximum total score of 18.

Percentage opacification (Fig 3D) was a visual estimate of the overall percentage of the visualized lungs involved by any form of opacification, in 1% increments.
There was some evidence of the utility of this simple score in SARS-CoV-2 patients [18], and it was straightforward to implement in clinical practice.
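The arithmetic of the RALE and Brixia scores described in this appendix can be sketched as follows. The function names `rale_score` and `brixia_score` are hypothetical, the inputs are the per-quadrant or per-zone grades already assigned by a reader, and a Brixia zone grade of 0 for a zone with no abnormality is an assumption made here, since the text above lists only grades 1 to 3.

```python
def rale_score(quadrants):
    """RALE total from four (extent, density) pairs, one per quadrant.

    extent : 0 = none, 1 = <25%, 2 = 25-50%, 3 = 50-75%, 4 = >75%
    density: 1 = hazy, 2 = moderate, 3 = dense (ignored when extent is 0)
    """
    if len(quadrants) != 4:
        raise ValueError("RALE needs exactly four quadrants")
    total = 0
    for extent, density in quadrants:
        if extent not in (0, 1, 2, 3, 4):
            raise ValueError("extent must be 0-4")
        if extent == 0:
            continue  # no opacity in this quadrant, contributes nothing
        if density not in (1, 2, 3):
            raise ValueError("density must be 1-3")
        total += extent * density  # quadrant score = extent x density
    return total  # maximum 4 quadrants x (4 x 3) = 48


def brixia_score(zones):
    """Brixia total from six zone grades (three zones per lung).

    Grade 0 (no abnormality) is an assumption of this sketch; 1-3 follow
    the interstitial/alveolar grading described in the text.
    """
    if len(zones) != 6:
        raise ValueError("Brixia needs exactly six zones")
    if any(z not in (0, 1, 2, 3) for z in zones):
        raise ValueError("zone grades must be 0-3")
    return sum(zones)  # maximum 6 zones x 3 = 18


if __name__ == "__main__":
    # Zone grades from the figure legend example: 1+2+2+1+2+3 = 11.
    print(brixia_score([1, 2, 2, 1, 2, 3]))
    # Toy RALE input (not study data): 0 + 1*1 + 2*2 + 3*3 = 14.
    print(rale_score([(0, 1), (1, 1), (2, 2), (3, 3)]))
```

Percentage opacification needs no such calculation, since it is a single visual estimate, which is consistent with it being the fastest of the three scores to record.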