key: cord-0927850-e1j2auxg authors: Burdick, Hoyt; Lam, Carson; Mataraso, Samson; Lynn-Palevsky, Anna; Braden, Gregory; Dellinger, R. Phillip; McCoy, Andrea; Vincent, Jean-Louis; Green-Saxena, Abigail; Barnes, Gina; Hoffman, Jana; Calvert, Jacob; Pellegrini, Emily; Das, Ritankar title: Prediction of respiratory decompensation in Covid-19 patients using machine learning: The READY trial date: 2020-08-06 journal: Comput Biol Med DOI: 10.1016/j.compbiomed.2020.103949 sha: 1447fd3038ea3680baec971e0f5080d349ffadcf doc_id: 927850 cord_uid: e1j2auxg BACKGROUND: Currently, physicians are limited in their ability to provide an accurate prognosis for COVID-19 positive patients. Existing scoring systems have been ineffective for identifying patient decompensation. Machine learning (ML) may offer an alternative strategy. A prospectively validated method to predict the need for ventilation in COVID-19 patients is essential to help triage patients, allocate resources, and prevent emergency intubations and their associated risks. METHODS: In a multicenter clinical trial, we evaluated the performance of a machine learning algorithm for prediction of invasive mechanical ventilation of COVID-19 patients within 24 h of an initial encounter. We enrolled patients with a COVID-19 diagnosis who were admitted to five United States health systems between March 24 and May 4, 2020. RESULTS: 197 patients were enrolled in the REspirAtory Decompensation and model for the triage of covid-19 patients: a prospective studY (READY) clinical trial. The algorithm had a higher diagnostic odds ratio (DOR, 12.58) for predicting ventilation than a comparator early warning system, the Modified Early Warning Score (MEWS). The algorithm also achieved significantly higher sensitivity (0.90) than MEWS, which achieved a sensitivity of 0.78, while maintaining a higher specificity (p < 0.05). 
CONCLUSIONS: In the first clinical trial of a machine learning algorithm for ventilation needs among COVID-19 patients, the algorithm demonstrated accurate prediction of the need for mechanical ventilation within 24 h. This algorithm may help care teams effectively triage patients and allocate resources. Further, the algorithm is capable of accurately identifying 16% more patients than a widely used scoring system while minimizing false positive results.

Coronavirus disease 2019 (COVID-19), caused by the novel coronavirus SARS-CoV-2, remains a public health emergency in the United States. The rapidly evolving evidence surrounding pharmaceutical treatments and the lack of established preventive resources have made the effective triage of COVID-19 patients challenging. Prognostic scores such as the Modified Early Warning Score (MEWS) [1] guide decision-making for the non-COVID-19 critically ill population [2]. However, literature examining the ability of these scoring systems to predict COVID-19 patient prognosis and mortality is limited, and recent research has suggested that the discriminatory ability of such rules-based scores is moderate to poor [3]. Epidemiologic predictions indicate that hospitals will continue to see large numbers of COVID-19 patients in the coming months [4] [5] [6]. Patient triage will remain important to facilitate the effective allocation of limited resources. Early identification of patients who are at risk of decompensation and who are likely to need mechanical ventilation would enable physicians to monitor these patients more aggressively, which may facilitate a more controlled environment for intubation. Inadequate lead time and subsequent emergency intubation of critically ill patients are associated with known risks, including peri-intubation hypoxia, hypotension, arrhythmia, and cardiac arrest [7].
In an effort to address this growing need, researchers have begun to develop machine learning (ML)-based models for risk prediction of critical illness development in these patients. Liang et al [8] developed such a model and achieved strong performance in predicting a composite outcome including admission to the intensive care unit (ICU), invasive ventilation, or death, and reported an area under the curve (AUC) of 0.88. However, the model was only evaluated retrospectively. Although retrospective studies are useful for providing preliminary data and for guiding future research, many of these analyses are subject to threats to internal validity [9, 10]. Studies often fail to be replicated in prospective clinical settings, leaving uncertainty regarding the performance and the utility of the intervention in a live clinical setting [11, 12].

To assess how ML risk prediction models may assist with caring for COVID-19 patients in a clinical setting, we have performed the first prospective validation of a machine learning algorithm for the prediction of mechanical ventilation requirements in a COVID-19 positive population. In the READY clinical trial, we assessed the performance of a previously developed algorithm at five US health systems. All predictions were made two hours after the start of the patient encounter using patient data obtained within the first two hours of an emergency department (ED) visit. If the patient did not originate in the ED, data were used from the first two hours of hospital admission. The algorithm predicted the need for mechanical ventilation within the next 24 hours. Performance was compared to patient evaluation using MEWS, a score commonly used to identify likely patient deterioration and mortality. The primary endpoint of the study was mechanical ventilation within 24 hours of the prediction.

The remainder of this paper is organized as follows.
Section 2 contains the study methods, including patient enrollment and data processing. Section 3 contains the study results. Section 4 contains the discussion, including study limitations. Section 5 contains the conclusion of the study.

Patients enrolled in the READY clinical trial visited the emergency department or were admitted to the hospital at five U.S. hospitals between March 24, 2020 and May 4, 2020. Patients were eligible for inclusion in the READY clinical trial if their first set of vital sign and lab measurements was taken within two hours of ED arrival or admission, and if they tested positive for COVID-19 by polymerase chain reaction (PCR) during their visit (Figure 1). In total, 197 patients were eligible for inclusion in our study. We enrolled all eligible patients who were admitted during the study period. Upon admission of an eligible patient to the ED or hospital, data collection of available vital sign and lab measurements began automatically. The first two hours of data were used to calculate both the machine learning algorithm risk prediction score and the comparator MEWS.

This study is considered to be of minimal risk for human subjects as data collection was passive and did not pose a threat to the subjects involved. All patient data were maintained in compliance with the Health Insurance Portability and Accountability Act (HIPAA). The project was approved by the Pearl Institutional Review Board with a waiver of informed consent under study number 20-DASC-122, and is registered on ClinicalTrials.gov under study number NCT04390516.

The model was created using the XGBoost Classifier method for fitting "boosted" decision trees in Python [13]. Gradient boosting, which XGBoost implements, is an ensemble learning technique that combines results from multiple decision trees to create prediction scores. Each tree splits the patient population into smaller and smaller groups, successively.
Each branch splits the patients who enter it into two groups, based on whether their value of some feature is above or below some threshold. For instance, a branch might divide patients according to whether they are male or female, then on the female branch according to whether their creatinine is above or below 0.97 mg/dl, the average creatinine level for women. If creatinine is above average, the patient continues down the higher risk branch. If the creatinine value is absent, the algorithm chooses the default branch that results in more correctly classified patients in the training data; this may be the low risk or the high risk branch depending on the training data. After some number of branches, the tree ends in a set of "leaves." Each patient falls into exactly one leaf, according to the values of his or her measurements.

Missing values were left as "Not a Number" or empty placeholders, which are valid inputs to the model. Specifically, each node in the decision tree has a default direction that is traversed when the feature in that node is missing. Model prediction scores could therefore be calculated in the presence of missing data, and imputation of missing measurements was not performed.

In total, 197 patients who received a positive diagnosis of COVID-19 were included in the study. Of these patients, 10 were placed on mechanical ventilation within 24 hours of the algorithm's prediction. Compared to the general patient population, those who tested positive for COVID-19 were more likely to be older, more likely to be male, and more likely to receive an in-hospital diagnosis of acute respiratory distress syndrome (ARDS) or pneumonia (Table 1). Additional clinical information is presented in Supplementary [14].
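The tree traversal and the default-direction handling of missing values described in the methods can be illustrated with a minimal Python sketch. This is not the trial's model and not XGBoost's implementation. The class and function names, the second feature (`respiratory_rate`), its threshold, every leaf score, and both default directions are hypothetical; only the 0.97 mg/dl creatinine threshold comes from the example in the text. The sketch also assumes that leaf scores are summed across trees and passed through a logistic function, which is how gradient-boosted classifiers commonly produce a risk score.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    score: float  # additive contribution of this leaf to the log-odds


@dataclass
class Split:
    feature: str          # name of the measurement tested at this node
    threshold: float      # values >= threshold go to `above`
    below: "Node"
    above: "Node"
    default_above: bool   # direction taken when the measurement is missing


Node = Union[Leaf, Split]


def is_missing(value) -> bool:
    """Treat absent keys (None) and NaN placeholders as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def tree_score(node: Node, patient: dict) -> float:
    """Walk one tree from the root to the single leaf the patient falls into."""
    while isinstance(node, Split):
        value = patient.get(node.feature)
        if is_missing(value):
            go_above = node.default_above  # no imputation: follow the default
        else:
            go_above = value >= node.threshold
        node = node.above if go_above else node.below
    return node.score


def predict_risk(trees: list, patient: dict) -> float:
    """Sum leaf scores over all trees and map the total to a 0-1 risk score."""
    margin = sum(tree_score(tree, patient) for tree in trees)
    return 1.0 / (1.0 + math.exp(-margin))


# Two hypothetical one-split trees (all values except 0.97 are made up).
trees = [
    Split("creatinine", 0.97, below=Leaf(-0.5), above=Leaf(0.5),
          default_above=False),
    Split("respiratory_rate", 24.0, below=Leaf(-0.4), above=Leaf(0.8),
          default_above=True),
]

# Respiratory rate was never measured, so the second tree uses its default.
patient = {"creatinine": 1.2, "respiratory_rate": float("nan")}
print(f"risk score: {predict_risk(trees, patient):.3f}")
```

In a trained model the default direction at each node is learned from the training data rather than set by hand, which is why a missing measurement can push a patient toward either the low risk or the high risk branch.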
The algorithm achieved higher sensitivity (0.90) than MEWS (0.78) while maintaining a higher specificity (p < 0.05).

The READY study is the first clinical trial of a machine learning algorithm for the prediction of ventilation requirements among COVID-19 patients. We found that the ML algorithm predicted the need for mechanical ventilation within 24 hours among COVID-19 patients with high sensitivity and specificity. This work builds upon our prior work developing algorithms to predict patient outcomes including sepsis [15], acute kidney injury [16], mortality [17], and patient stability and decompensation [18]. While machine learning algorithms have been applied to retrospective COVID-19 patient data, no equivalent algorithms have yet been validated in a prospective setting, despite urgent need.

The high sensitivity and specificity achieved by the algorithm demonstrate that it is capable of accurate discrimination between COVID-19 patients at high risk versus low risk of requiring ventilation within 24 hours. The high sensitivity, in particular, suggests the algorithm is unlikely to provide false negative classifications and that patients in need of mechanical ventilation are therefore unlikely to be missed by the algorithm. Further, the algorithm's improvements in sensitivity as compared to the traditional scoring system show that the algorithm is capable of detecting 16% more patients who will be in need of mechanical ventilation; this is a meaningful improvement that can allow for effective patient triage and resource allocation. The algorithm also achieved this increase in sensitivity while demonstrating a higher specificity as compared to MEWS. This suggests the algorithm will produce a reduced false positive rate, which may enable more efficient allocation of clinician time and of resources. Physicians have reported difficulty in predicting the disease course of hospitalized COVID-19 patients, as well as difficulties in the identification of patients at high risk of rapid decompensation [19, 20].
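For readers less familiar with the performance measures discussed above, the following Python sketch shows how sensitivity, specificity, and the diagnostic odds ratio (DOR) follow from a 2x2 confusion matrix. The function name is our own. The true positive and false negative counts (9 and 1) are inferred from the 10 ventilated patients and the reported sensitivity of 0.90. The false positive and true negative counts (78 and 109) are hypothetical: the study's confusion matrix is not given in this text, and these values were chosen only because they sum to the 187 non-ventilated patients and give a DOR close to the reported 12.58.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard diagnostic accuracy measures from confusion-matrix counts."""
    if fp == 0 or fn == 0:
        # The diagnostic odds ratio divides by fp * fn and is undefined here.
        raise ValueError("DOR is undefined when fp or fn is zero")
    sensitivity = tp / (tp + fn)   # fraction of ventilated patients flagged
    specificity = tn / (tn + fp)   # fraction of non-ventilated patients cleared
    dor = (tp * tn) / (fp * fn)    # odds of a positive alert in cases vs non-cases
    return {"sensitivity": sensitivity, "specificity": specificity, "dor": dor}


# tp and fn follow from 10 events at sensitivity 0.90; fp and tn are hypothetical.
example = diagnostic_metrics(tp=9, fp=78, fn=1, tn=109)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

With only 10 outcome events, a single additional missed patient would lower sensitivity from 0.90 to 0.80, which is one reason the small number of events is relevant when interpreting these figures.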
Without the benefit of timely warnings, rapid and unexpected deterioration in patient condition comes with a high risk of emergency transfer to the ICU and emergency intubation. Emergency intubations, in particular, have well-documented risks [7, 21, 22], with at least one complication occurring in 22-54% of all intubations performed in critically ill patients [23]. Cook et al found that intubations in the ICU are associated with a more than 4-fold higher risk of death or brain damage as compared to intubations performed in the operating room; this may be attributable to a lack of preparedness due to the increased need for emergency intubations in the ICU setting [24]. Complications related to intubation are more likely in patients with limited pulmonary reserve, in patients with poor physiological status, and in patients for whom preoxygenation was not possible [22]. Receiving advance notice of patients for whom deterioration is more likely may allow care teams to better prepare for intubation procedures and minimize risk to the patient.

Further, early identification of patients for whom ventilation will be required may allow physicians to minimize the risk of patient self-inflicted lung injury (P-SILI). Vigorous breathing and associated high transpulmonary pressures in patients with respiratory distress may contribute to the development of P-SILI [25]. Early intubation of patients requiring mechanical ventilation, when performed with sedation and physician control of mechanical power applied to the lung (determined by transpulmonary pressures and other ventilator-setting determined variables), may minimize the risk of P-SILI due to vigorous spontaneous breathing [26, 27]. Accurate and early predictions of risk of patient deterioration may improve patient triage procedures and resource allocation.

The model predicted the need for mechanical ventilation using only routinely available labs and vital sign data.
Demographic data was not required as in similar work [8]. Of note, the measurements used as inputs to our model were taken during the first two hours after ED arrival or hospital admission. Our model was also able to generate predictions in the absence of certain inputs. Because the algorithm was developed from real world EHR data that contained missing values, we do not anticipate missing values to have significantly affected the output of the model. This is because some missing data may reflect a clinician's judgment that it was not important to measure that particular vital sign or lab value. This can provide useful information about the patient in the form of "informative missingness" [28]. This model could therefore be used to identify which patients should be considered for direct admission to an area of more intensive monitoring, even if they appear stable at admission, to prevent emergency transfers and minimize patient morbidity.

It is possible that patients at a high risk of requiring mechanical ventilation within 24 hours have progressed further along their disease course as compared to patients who were at low risk or, alternatively, are experiencing a more intense host response to the virus. High-risk patients may therefore benefit more from supportive or immunologic therapies than their low-risk counterparts, who may need only antiviral medications. Effective discrimination between these two groups may therefore have broad implications for future research into patient care beyond triage and admission decisions [29].

Several studies have explored the potential utility of machine learning for diagnosing and detecting COVID-19, largely using imaging data [35, 36], though the area of patient decompensation prediction remains less explored. This study builds upon existing evidence about the ability of algorithms to successfully provide clinical decision support [15] [16] [17] [18].
However, there are several limitations to this study. First, while we included patients from several medical centers in our sample, the total sample remained relatively small and the outcome of mechanical ventilation within 24 hours of model prediction was rare in our sample. Building models in the emerging stages of a pandemic is difficult due to data limitations and uncertainty in the data. This is true of ultimately diagnosed with COVID-19. The focus of this study was to validate the performance of the predictive algorithm, and our study protocol therefore did not directly examine physician response to algorithm alerts. Therefore, we cannot draw conclusions on the effect of algorithm alerts on clinician actions or on patient outcomes.

In this multicenter clinical trial, we evaluated a machine learning algorithm for the prediction of invasive mechanical ventilation of COVID-19 patients within 24 hours of their initial hospital encounter. We found that our algorithm achieved significantly higher sensitivity (0.90) than MEWS, a scoring system commonly used to assess patient status and assign levels of care, while maintaining a higher specificity (p < 0.05). This accurate advance warning of the need for mechanical ventilation of COVID-19 patients is important, as physicians have reported difficulty with predicting which patients are at high risk of rapid respiratory decompensation. Inadequate lead time and subsequent emergency intubation of critically ill patients are associated with significant known risks, including peri-intubation hypoxia, hypotension, arrhythmia, and cardiac arrest.
Accurate advance warning can help improve COVID-19 patient outcomes, and our algorithm is capable of detecting 16% more patients who will require invasive mechanical ventilation than MEWS.

[1] Validation of physiological scoring systems in the accident and emergency department
[2] Early Warning System Scores for Clinical Deterioration in Hospitalized Patients: A Systematic Review
[3] Comparing Rapid Scoring Systems in Mortality Prediction of Critically Ill Patients With Novel Coronavirus Disease
[4] Projecting hospital utilization during the COVID-19 outbreaks in the United States
[5] Centers for Disease Control and Prevention
[6] Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period
[7] Guidelines for the management of tracheal intubation in critically ill adults
[8] Development and Validation of a Clinical Risk Score to Predict the Occurrence of Critical Illness in Hospitalized Patients With COVID-19
[9] Research methods: The concise knowledge base
[10] Threats to validity in retrospective studies
[11] Generalizing about Public Health Interventions: A Mixed-Methods Approach to External Validity
[12] External Validity: The Next Step for Systematic Reviews?
[13] XGBoost: A Scalable Tree Boosting System
[14] Clinical tests: sensitivity and specificity. Contin Educ Anaesth Crit Care Pain
[15] Effect of a machine learning-based severe sepsis prediction algorithm on patient survival and hospital length of stay: a randomised clinical trial
[16] Prediction of Acute Kidney Injury With a Machine Learning Algorithm Using Electronic Health Record Data
[17] Multicenter validation of a machine-learning algorithm for 48-h all-cause mortality prediction
[18] Discharge recommendation based on a novel technique of homeostatic analysis
[19] COVID-19 Respiratory Failure: Targeting Inflammation on VV-ECMO Support
[20] Extracorporeal Membrane Oxygenation in the Treatment of Severe Pulmonary and Cardiac Compromise in COVID-19: Experience with 32 patients
[21] Tracheal intubation in the ICU: Life saving or life threatening?
[22] Emergent airway management outside of the operating room - a retrospective review of patient characteristics, complications and ICU stay
[23] Strategies to improve first attempt success at intubation in critically ill patients
[24] Fourth National Audit Project. Major complications of airway management in the UK: results of the Fourth National Audit Project of the Royal College of Anaesthetists and the Difficult Airway Society. Part 2: intensive care and emergency departments
[25] Patient self-inflicted lung injury: implications for acute hypoxemic respiratory failure and ARDS patients on noninvasive support
[26] Mechanical Ventilation to Minimize Progression of Lung Injury in Acute Respiratory Failure
[27] Management of COVID-19 Respiratory Distress
[28] Recurrent Neural Networks for Multivariate Time Series with Missing Values
[29] Treatment of SARS-CoV-2: How far have we reached? Drug Discov Ther
[30] Comparison of CRB-65 and quick sepsis-related organ failure assessment for predicting the need for intensive respiratory or vasopressor support in patients with COVID-19
[31] Acute Physiology and Chronic Health Evaluation II Score as a Predictor of Hospital Mortality in Patients of Coronavirus Disease
[32] Machine Learning to Predict Mortality and Critical Events in COVID-19 Positive New York City Patients. medRxiv
[33] Validating a Widely Implemented Deterioration Index Model Among Hospitalized COVID-19 Patients. medRxiv
[34] Forecasting Models for Coronavirus Disease (COVID-19): A Survey of the State-of-the-Art
[35] Deep Transfer Learning Based Classification Model for COVID-19 Disease
[36] Classification of COVID-19 patients from chest CT images using multi-objective differential evolution-based convolutional neural networks
[37] Finding an accurate early forecasting model from small dataset: A case of 2019-ncov novel coronavirus outbreak
[38] Composite Monte Carlo decision making under high uncertainty of novel coronavirus epidemic using hybridized deep learning and fuzzy rule induction