key: cord-1004668-409bic22
authors: Castro, Victor M.; Sacks, Chana A.; Perlis, Roy H.; McCoy, Thomas H.
title: Development and External Validation of a Delirium Prediction Model for Hospitalized Patients With Coronavirus Disease 2019
date: 2021-03-05
journal: J Acad Consult Liaison Psychiatry
DOI: 10.1016/j.jaclp.2020.12.005
sha: 5dd97bfbf76086c139ab1a9ea18c3e79a2fa5f48
doc_id: 1004668
cord_uid: 409bic22

BACKGROUND: The coronavirus disease 2019 pandemic has placed unprecedented stress on health systems and has been associated with elevated risk for delirium. The convergence of pandemic resource limitation and clinical demand associated with delirium requires careful risk stratification for targeted prevention efforts. OBJECTIVES: To develop an incident delirium predictive model among coronavirus disease 2019 patients. METHODS: We applied supervised machine learning to electronic health records data available at the start of coronavirus disease 2019 inpatient admissions at three hospitals to build an incident delirium predictive model. We validated this model in three different hospitals. Both hospital cohorts included academic and community settings. RESULTS: Among 2907 patients across 6 hospitals, 488 (16.8%) developed delirium. Applying the predictive model in the external validation cohort of 755 patients, the c-index was 0.75 (0.71–0.79) and the lift in the top quintile was 2.1. At a sensitivity of 80%, the specificity was 56%, negative predictive value 92%, and positive predictive value 30%. Equivalent model performance was observed in subsamples stratified by age, sex, race, need for critical care, and care at community vs. academic hospitals. CONCLUSION: Machine learning applied to electronic health records available at the time of inpatient admission can be used to risk-stratify patients with coronavirus disease 2019 for incident delirium.
Delirium is common among patients with coronavirus disease 2019, and resource constraints during a pandemic demand careful attention to the optimal application of predictive models. The neuropsychiatric consequences of coronavirus disease 2019 (COVID-19) are increasingly evident and include a consequential incidence of delirium. [1] [2] [3] [4] [5] [6] [7] [8] Delirium is a heterogenous neuropsychiatric syndrome characterized by acute changes in cognition and awareness, leading to fluctuations in attention, memory, or consciousness. [9] [10] [11] Delirium, regardless of underlying illness, is associated with multiple serious adverse clinical and functional outcomes, including increased rates of medical comorbidity, longer hospital stays, higher health care costs, and higher risk of postdischarge mortality. [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] Although treatment of delirium is symptomatic, multicomponent prevention is possible. [25] [26] [27] [28] [29] [30] Given the consequence and potential preventability of delirium, recognition is essential; however, in routine care, delirium is underrecognized. 
[31] [32] [33] [34] Systematic screening is an important approach to recognition of prevalent delirium, whereas risk stratification for incident delirium is a key component of a targeted delirium prevention program. [35] [36] [37] An electronic health record (EHR)-based approach is one means of both retrospectively studying delirium and, through clinical prediction, routing prevention efforts toward those who are vulnerable to delirium. [38] [39] [40] [41] Although these efforts are hampered by variability in delirium occurrence rates across patient and provider characteristics, 31, [42] [43] [44] delirium can be studied in the EHR to provide insights into both epidemiology and biology. 38, [45] [46] [47] [48] [49] [50] [51] COVID-19-associated delirium recapitulates the challenges of delirium in the context of any other underlying illness, including a more challenging posthospital course, under-recognition, and treatment uncertainty. [52] [53] [54] [55] Pandemic circumstances compound the challenges inherent in routine delirium management and prevention. Isolation precautions for respiratory illness make a multicomponent prevention program harder to implement at the same time that rising demand for hospital beds increases the need for prevention; together, these pressures call for precise targeting of delirium prevention efforts. Recognizing both the clinical novelty of COVID-19-associated delirium and the unique dynamism of prevention efforts in the context of a pandemic, we sought to extend previous work on EHR-based delirium prediction and present an EHR-based machine learning approach to predict COVID-19-associated delirium. The aim of this study is to develop a statistical model that can be applied to EHR data to predict the probability of a hospitalized patient with COVID-19 developing delirium. 
To achieve this aim, we extracted health records data from the EHR of 3 hospitals and applied a basic form of machine learning that both selects the most important variables and defines a regression model through which those variables can be mathematically combined to produce a risk score. After developing the model by identifying the most predictive variables and the weights associated with each, we applied it to health records data drawn from 3 separate hospitals and evaluated how well it identified delirium in this second "testing" group. That evaluation considered 2 separate notions of performance: the ability to rank patients as higher and lower risk, and the ability to predict the risk in a given patient accurately. The study used 2 cohorts, one for model development and the second for model evaluation, each composed of 1 academic medical center (AMC) and 2 community hospitals. These cohorts were assigned in a cluster-randomized fashion to ensure each cohort had 1 of the 2 AMCs. Although drawn from different hospitals, data handling and cohort definitions were the same in both cohorts. Both cohorts included all adults hospitalized between March 1, 2020 and May 31, 2020 at the 6 hospitals with polymerase chain reaction-confirmed severe acute respiratory syndrome coronavirus within 5 days of admission. For all individuals in the study, all clinical data, including vital signs, diagnostic codes, laboratory results, and medications, up through and including the hospital encounter, were extracted from the hospitals' EHRs and used to generate a study data mart. 56 Registration and demographic data were also extracted, including body mass index at the time of hospitalization, lifetime smoking status, and zip code (as area deprivation index). 57, 58 The study protocol was approved by the Partners Health Care Human Research Committee. 
No participant contact was required in this study, which relied on secondary use of data produced by routine clinical care, allowing waiver of the requirement for informed consent as detailed by 45 CFR 46.116. For laboratory tests and vital signs, we considered the first instance of each feature that occurred during the admission, inclusive of any time in the emergency department before admission. As an aggregate measure of comorbidity, the Charlson comorbidity index was calculated using diagnostic codes from all available records before hospitalization. 59 To represent differential risk associated with specific co-occurring medical conditions, prior diagnostic codes were collapsed to the second level of the Healthcare Cost and Utilization Project Clinical Classifications Software hierarchy. 60 As with our prior approach to encoding clinical history, the log-transformed count of the total number of prior diagnostic codes within each category was added as a possible feature. 61, 62 As a wide range of laboratory tests were available, only those tests available in 80% of individuals were included in subsequent analysis as continuous measures. This limited subset of continuous values was capped at the 99th percentile to diminish the impact of extreme values but otherwise used in native units without transformation. In addition to continuous laboratory values, laboratory-specific reference flags of normality or abnormality were included as logical predictors. Vital signs, body mass index, smoking status, age, race, and ethnicity were all taken from predefined structured fields within the health record, and those features that are numerical were analyzed in their clinical units (e.g., kg/m² for body mass index). Consistent with prior work, 63, 64 medications were encoded at the UMLS RxNorm ingredient level as the log-transformed count of ingredient orders and prescriptions over the 30 days before admission. 
65, 66 Recognizing the particular importance of dementia as a risk factor for delirium in hospitalized patients, a previously published specific dementia feature was calculated for descriptive and secondary analysis. 67 Finally, the primary outcome of delirium at any point during the studied COVID-19 hospitalization was ascertained based on a previously published EHR definition, ported using the Centers for Medicare & Medicaid Services General Equivalence Mappings. 39, 68 Because delirium can be underreported in coded data, an alternative delirium definition, which includes natural language processing of provider-authored notes to identify patients who were delirious during their hospital course, was used as a secondary outcome definition for sensitivity analysis on the main result. 31, 38
Study Design and Analysis
Solely for purposes of description, the 2 cohorts of patients were pooled. This pooled cohort was characterized using appropriate summary statistics, with differences between the delirium group and nondelirium group tested through univariate comparison (i.e., chi-square test for binary variables, Student's t-test for continuous measures). For the primary predictive analysis, 2 distinct cohorts were used (one for model development or training, the other for independent validation). Each cohort included a single academic medical center and 2 community medical centers. All model development occurred in the training cohort. Once the final model was specified and fitted, that model's performance was characterized in the independent testing cohort. To identify a compact model that nevertheless considered the full range of available predictor features, we used L1-penalized regression, also known as the least absolute shrinkage and selection operator (Lasso). 
69 This form of linear regression allows selection of the most relevant features for clinical prediction from among a large set of potential predictors and ultimately produces a simple linear model that can be readily reimplemented from tabled coefficients. Model fitting used all individuals at the training sites through median imputation of missing data. In the testing sites, participants with complete data were considered. The accuracy of the final model was characterized using conventional quantile-by-quantile comparison of predicted and observed outcome rate using the Hosmer-Lemeshow method, 70, 71 area under the receiver operating characteristic curve discrimination, 72-74 evaluation of cumulative probability distributions, 75, 76 decision curve analysis, [77] [78] [79] and analysis of the classification confusion matrix. 80 Optimization of Youden's index was used for reproducible selection of high-risk thresholds that balance sensitivity and specificity except where otherwise noted. 81 Given the novelty of COVID-19, the wide range of technical capacity across medical centers, and the wide range of areas under the receiver operating characteristic curve (AUCs) found in delirium prediction studies, 47 we elected to take 2 approaches to developing references for comparison: raw age and a logistic regression including only age and dementia history. First, and primarily, we used raw age in years on admission (older age is reliably identified as a strong risk factor for delirium 82-84) as a trivial risk ranking for comparison. Use of age for risk stratification is within any clinician's ability to apply on the fly in the ward, as it requires only knowledge of birthdays. Although simple, age has merit: it offers much greater granularity of risk stratification than systems based on one, or a small number of, categorical traits (e.g., history of a given diagnosis) without requiring any computational work to implement. 
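The discrimination measure and the Youden-based threshold selection described above can be illustrated with a small sketch. This is not the authors' code (their analysis was done in R); it is a minimal pure-Python illustration that treats raw age as the risk score, as in the trivial comparator. The function names and the ten-patient toy data are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import List, Tuple


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a random case outranks a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    noncases = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for n in noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(noncases))


def youden_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) maximizing J = sens + spec - 1.

    A patient is flagged as high risk when score >= threshold.
    """
    n_cases = sum(labels)
    n_noncases = len(labels) - n_cases
    best_j, best = float("-inf"), (0.0, 0.0, 0.0)
    for thr in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        sens = true_pos / n_cases
        spec = true_neg / n_noncases
        if sens + spec - 1.0 > best_j:
            best_j, best = sens + spec - 1.0, (thr, sens, spec)
    return best


# Hypothetical toy cohort: age in years and whether delirium occurred (1) or not (0).
ages = [34, 45, 52, 58, 63, 67, 71, 76, 80, 88]
delirium = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]

toy_auc = auc(ages, delirium)
cut, sens, spec = youden_threshold(ages, delirium)
print(f"AUC={toy_auc:.3f}, cut point={cut}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because the threshold is chosen after scoring, the same scores can be re-thresholded (for example, at a fixed sensitivity of 0.80) without refitting anything.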
As a secondary comparator, we developed a logistic regression model with 2 predictors: prior diagnosis of dementia and age. This 2-feature regression model has the advantage of calibrated predictions over raw age rank and requires fewer features to be engineered than the full Lasso model does; beyond the number of features, the implementation effort is equivalent. All analysis used R, version 4. The training cohort included 2152 patients, of whom 1151 were treated at the academic medical center (Table 1). The testing cohort included 755 patients, of whom 406 were treated at the academic medical center. The training cohort included 345 cases of delirium, for a 16.0% case rate. The testing cohort included 143 cases of delirium, for an 18.9% case rate. Pooling the 2 otherwise separate cohorts for descriptive purposes only (Table 2) yields an average age of 62.9 years in a cohort that is 52.8% men (n = 1536) and 51.7% white (n = 1503). In the pooled cohort, delirium cases had an average age of 71.5 years, whereas those without a diagnosis of delirium had an average age of 61.1 years (Supplemental Figure 1). In the pooled cohort, 35.5% (n = 173) of delirium cases required care in an intensive care unit (ICU), whereas only 19.9% (n = 482) of cases without a diagnosis of delirium required ICU care. Similarly, 28.3% (n = 138) of the pooled cohort delirium cases had a prior diagnosis of dementia, whereas only 7.9% (n = 190) of those who did not develop delirium had a diagnosis of dementia. We used L1-penalized regression to train a delirium prediction model based on admission patient characteristics, vital signs, laboratory values, and medication and diagnostic history in the training cohort, which consisted of 1 academic medical center and 2 community hospitals. Of the 783 features entered into the Lasso, 34 had nonzero coefficients. 
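To make concrete how an L1 penalty reduces a large feature set to a handful of nonzero coefficients, the following is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent (ISTA). It is an illustration only, not the authors' R implementation: the logistic form of the model, the optimizer, the penalty values, and the synthetic data are all assumptions made for this example.

```python
import math
import random
from typing import List, Tuple


def soft_threshold(value: float, amount: float) -> float:
    """Shrink value toward zero by amount; values within +/- amount become exactly 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X: List[List[float]], y: List[int], lam: float,
                    step: float = 0.5, n_iter: int = 200) -> Tuple[float, List[float]]:
    """Minimize mean logistic loss + lam * sum(|w_j|); the intercept is not penalized."""
    n, p = len(X), len(X[0])
    b, w = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_b, grad_w = 0.0, [0.0] * p
        for row, target in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - target
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= step * grad_b
        # Gradient step followed by soft-thresholding is what produces exact zeros.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return b, w


# Synthetic data (hypothetical): 6 candidate features, only the first drives the outcome.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(120)]
y = [1 if row[0] > 0 else 0 for row in X]

b_sparse, w_sparse = fit_l1_logistic(X, y, lam=0.1)
b_zero, w_zero = fit_l1_logistic(X, y, lam=10.0)
print("moderate penalty:", [round(c, 3) for c in w_sparse])
print("heavy penalty:   ", w_zero)
```

A heavier penalty drives more coefficients to exactly zero; in practice the penalty strength is usually chosen by cross-validation within the training data, which this sketch omits.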
Although coefficients of a penalized model are of limited explanatory use, the final selected variables of the full model are tabled in Supplemental Table 1 and are clinically notable for the inclusion of a wide range of clinically plausible factors, including central nervous system diagnostic codes, antibiotic orders, and age. We then applied the resulting model in the independent testing sample. This wholly independent testing sample was used to evaluate In the independent test set, AUC for the delirium prediction model was 0.75 (95% confidence interval 0.71-0.79; Figure 1 To contextualize the quality of the primary predictive model developed here, we treated age on admission as a naïve risk score. Age on admission produced an AUC of 0.65 (0.60-0.70). The optimal high-risk age cut point occurred at 63 years of age. Using 63 years of age as the risk threshold produced a sensitivity of 0.76 (0.68-0.83) and specificity of 0.46 (0.42-0.50). A sensitivity of 0.80 was achieved with a risk cut point of 61 years of age, which produced a specificity of 0.42 (0.38-0.46). Age naturally lends itself to ranking higher- and lower-risk patients, but it does not have a natural calibration for risk of delirium at a given age, and thus only discrimination was evaluated as context. To contextualize the benefit of additional feature engineering work, a logistic regression model using age on admission and prior diagnosis of dementia was fitted in the training set and used to predict delirium risk in the independent test set. In the independent test set, the model was calibrated to the observed case rate (χ²(23) = 23.0, P = 0.46) in quantile-by-quantile comparison and resulted in an AUC of 0.68 (0.63-0.72). 
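The quantile-by-quantile calibration comparison reported above can be sketched as a Hosmer-Lemeshow-style statistic: patients are sorted by predicted risk, split into equal-sized groups, and the observed number of cases in each group is compared with the number the model expected. The sketch below is illustrative only; the number of groups used in the paper is not stated in this text, so `groups` is left as a parameter, and converting the statistic to a P value (which needs a chi-square distribution function) is omitted.

```python
from typing import List


def hosmer_lemeshow_statistic(predicted: List[float], observed: List[int],
                              groups: int = 10) -> float:
    """Sum over risk-ordered groups of (observed - expected)^2 / (n * p * (1 - p)).

    predicted: model probabilities; observed: 0/1 outcomes; p: mean predicted
    risk within the group. Groups with zero variance are skipped.
    """
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    statistic = 0.0
    for g in range(groups):
        members = order[g * n // groups:(g + 1) * n // groups]
        if not members:
            continue
        expected_cases = sum(predicted[i] for i in members)
        observed_cases = sum(observed[i] for i in members)
        mean_risk = expected_cases / len(members)
        variance = len(members) * mean_risk * (1.0 - mean_risk)
        if variance > 0:
            statistic += (observed_cases - expected_cases) ** 2 / variance
    return statistic


# Toy checks (hypothetical data): a well-calibrated and a badly calibrated example.
calibrated = hosmer_lemeshow_statistic([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0], groups=2)
miscalibrated = hosmer_lemeshow_statistic([0.5, 0.5, 0.5, 0.5], [1, 1, 1, 1], groups=1)
print(calibrated, miscalibrated)
```

A small statistic relative to its degrees of freedom, as in the result quoted above, indicates that predicted and observed rates agree within sampling error.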
The optimal high-risk age cut point by In secondary analysis of the primary predictive model stratified by patient characteristic, to understand whether it was likely to perform differentially in clinical cohorts with different patient distributions, AUCs were similar across subgroups (Figure 3). Among patients who were less than 65 years of age, the AUC was 0.77 (0.69-0.85), whereas among those who were 65 years of age or older, the AUC was 0.69 (0.63-0.75; Figure 3A). Among male patients, the AUC was 0.74 (0.68-0.80), whereas among female patients, the AUC was 0.76 (0.70-0.82; Figure 3B). Among black patients, the AUC was 0.76 (0.68-0.84), among white patients, the AUC was 0.71 (0.64-0.79), and among those of any other race, the AUC was 0.80 (0.73-0.86; Figure 3C). Among patients who required management in an ICU, the AUC was 0.67 (0.59-0.76), whereas the AUC was 0.77 (0.71-0.82) among those who did not require ICU care (Figure 3D). The AUC was 0.74 (0.69-0.80) among those patients cared for at a community hospital, whereas the AUC was 0.75 (0.68-0.81) among those patients cared for at the testing set academic medical center (Figure 3E). Given the potential for rapidly evolving circumstances in the setting of pandemic illness and possible need to implement crisis standards of care, a decision curve analysis considered the full range of risk thresholds under both opt-in (Supplemental Figure 3) and opt-out models (Supplemental Figure 4) for the full model in the independent testing set. In this study of individuals with COVID-19, we developed a risk prediction model using clinical data available at time of inpatient hospitalization which yielded promising discrimination and calibration in an independent cohort. The AUC of 0.75 reached in this independent evaluation cohort with COVID-19 falls squarely in the previously reported range (0.52 to 0.94) of delirium prediction captured in a recent systematic review of the topic. 
47 Given the previously described risk of predictive model bias, 85, 86 it is also important to note that model performance was consistent across patient characteristics (age, sex, and race) and clinical context (ICU vs. ward and tertiary care vs. community hospital). The full machine learning model consistently outperformed raw age as a simple means of risk ranking for comparison. However, the naïve age model's AUC of 0.65 is similarly within the band of previously reported delirium prediction models, which raises the possibility that age may be of differing importance as a delirium risk factor in different underlying disease states. This possibility warrants further focused research, as widely varying delirium rates would need to be considered. 31, 42, 44 Although the direct use of age as a trivial means of ranking higher and lower risk between patients comes at the expense of prediction calibration, and thus the use of age as a predictive reference is limited, Supplemental Figure 1 provides a clinically applicable intuition for the age effect that may be of epidemiological value to the consulting psychiatrist caring for patients with COVID-19. As a middle ground between uncalibrated direct use of raw age and the full predictive model, the logistic regression using only age and dementia history produces intermediate results with an AUC of 0.68. The full model and the limited age-and-dementia-only model had similar sensitivities at their respective optimized cut points (0.73 for the full model vs 0.79 for the limited model), whereas the specificity of the full model was superior (0.69 vs 0.46). We note multiple limitations in the present work. The data are drawn from an open system, and as such, we cannot exclude the possibility of a patient receiving medical care outside the studied network of hospitals before the studied COVID-19 admission. 
These missing data likely limit the predictive accuracy of this model, which is, as a matter of applicability, limited to prehospital and early-hospital facts. The delirium outcome was based on health records, not a gold standard reference diagnosis or structured screening approach. The literature suggests delirium is underrecognized; as such, the use of recognition through routine care as an outcome likely biases results toward the null, that is, the noncase group is likely to contain true cases that were not recognized at the time. Important work remains to be performed characterizing the true rate of delirium, as established by reference diagnosis, among patients with COVID-19 as well as establishing both the feasibility and accuracy of screening during pandemic contact minimization efforts. Finally, because the model building process used L1-penalized regression, the resulting coefficients cannot be interpreted as coefficients in a conventional prespecified regression model. On the other hand, the list of variables selected and the operation of the model are wholly inspectable, such that many of the objections to black box artificial intelligence do not apply. Beyond limitations, the present approach may not maximize predictive accuracy, as it is possible that a focused ICU prediction effort would be more accurate than the pooled equivalency found here. 48 The present work has important practical strengths. In the setting of a pandemic, the value of zero-contact EHR-based risk stratification increases. Contactless evaluation eliminates the need for patient exposure and with it any associated infection exposure risk or consumption of scarce personal protective equipment. Identification of low-risk patients has the potential to further minimize exposure risk. 
At the other end of the risk spectrum, identification of high-risk individuals targets both staff exposure and associated personal protective equipment utilization to those patients with the greatest potential to benefit from multicomponent delirium prevention efforts (efforts which may in turn reduce length of stay and free scarce acute care resources in the setting of pandemic surge). Concretely, selecting a high-risk cutoff by maximization of Youden's index and applying this model in the test cohort would have allowed 72.7% of cases to be intervened on through contact with only 38.9% of the cohort. As the selection of risk threshold is independent of risk scoring, application of a model of this kind could be modified in real time to track shifting resource scarcity, from scarce acute care space, which might favor broader intervention, to scarce staff and protective equipment, which might favor a more stringent intervention threshold. We note in particular 2 aspects of the present approach which were motivated by the desire for simplicity, scalability, and generalizability. First, the data considered for prediction were limited to those available shortly after hospital admission. This allows early risk stratification, at the cost of missing the opportunity to retarget individuals as high risk based on data that became available over the course of a hospitalization. The practical importance of this limitation, and associated predictive accuracy penalty, has been previously noted. 89 Second, the model used is a simple linear model. The advantage of this approach is ease of reimplementation in the production transactional EHR (a practical requirement if value is to be realized); however, limiting to simple linear models likely sacrifices optimal predictive accuracy, and thus further work on more sophisticated approaches to prediction is warranted. 
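As a rough illustration of what reimplementing a simple linear model involves, the sketch below encodes a few features, applies coefficients, sums, and then applies a separately chosen threshold. Every number and feature name here is hypothetical and for illustration only; the actual selected variables and coefficients are those tabled in Supplemental Table 1. The logistic link and the use of log(1 + count) for the log-transformed counts are likewise assumptions of this sketch, since the text does not state how zero counts were handled.

```python
import math
from typing import Dict

# HYPOTHETICAL values for illustration; not the published coefficients.
INTERCEPT = -4.0
COEFFICIENTS: Dict[str, float] = {
    "age_years": 0.03,
    "cns_dx_log_count": 0.40,       # log-transformed count of prior CNS diagnostic codes
    "antibiotic_log_count": 0.25,   # log-transformed count of antibiotic orders
}


def log_count(count: int) -> float:
    """Log-transform a count; log(1 + count) is assumed so that zero maps to zero."""
    return math.log1p(count)


def cap_at(value: float, ceiling: float) -> float:
    """Cap a continuous value at a ceiling (e.g., a training-set 99th percentile)."""
    return min(value, ceiling)


def predicted_risk(features: Dict[str, float]) -> float:
    """Weighted sum of encoded features passed through the logistic function.

    Features absent from the input are treated as zero.
    """
    z = INTERCEPT + sum(coef * features.get(name, 0.0)
                        for name, coef in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def is_high_risk(risk: float, threshold: float) -> bool:
    """Thresholding is separate from scoring, so it can change without refitting."""
    return risk >= threshold


patient = {
    "age_years": 70.0,
    "cns_dx_log_count": log_count(2),
    "antibiotic_log_count": log_count(1),
}
risk = predicted_risk(patient)
print(f"predicted risk: {risk:.3f}, flagged at 0.20: {is_high_risk(risk, 0.20)}")
```

Keeping `is_high_risk` separate from `predicted_risk` mirrors the point made above: the intervention threshold can be tightened or loosened as resources shift while the scoring itself stays fixed.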
Although the linear model is algorithmically simple to reimplement (encode the features as described, apply the coefficients from Supplemental Table 1, and sum), the number of features and lack of statistical equivalence to conventional predefined explanatory linear models limit direct clinical interpretation of the resulting model. Regardless of the approach taken, careful application and validation within a new application context are critical components of predictive medicine. 90, 91 Finally, we note that the choice to evaluate the model in independent clinical sites, instead of pooling and randomly dividing, is a strength of the present work, as past experience suggests that stable calibration and accuracy across clinical sites is a more significant challenge than across random partitions within a pooled group of sites or time-based cohorts. 63, 64, 67, 92
CONCLUSION
Machine learning applied to EHR facts available on inpatient admission can be used to predict the subsequent emergence of delirium among patients hospitalized with COVID-19. This prediction, developed in a cohort drawn from 3 hospitals and validated in a separate cohort of 3 different hospitals, was found to be well calibrated to observed occurrence rates, discriminate better than direct use of patient age, and demonstrate clinical utility over a wide range of risk thresholds. Supplementary data to this article can be found online at https://doi.org/10.1016/j.jaclp.2020.12.005. Funding: This study was funded by the National Institute of Mental Health (grant numbers 1R01MH120991 and 5R01MH116270). The sponsors had no role in study design, writing of the report, or data collection, analysis, or interpretation. 
Sudden and complete olfactory loss function as a possible symptom of COVID-19
Magnetic resonance imaging alteration of the brain in a patient with coronavirus disease 2019 (COVID-19) and anosmia
Neurologic features in severe SARS-CoV-2 infection
Neurologic manifestations of hospitalized patients with coronavirus disease 2019 in Wuhan, China
Neurological features of COVID-19 and their treatment: a review
COVID-19-associated hyperactive intensive care unit delirium with proposed pathophysiology and treatment: a case report
Spectrum of neurological manifestations in COVID-19: a review
Delirium as the first clinical presentation of the coronavirus disease 2019 in an older adult
Delirium pathophysiology: an updated hypothesis of the etiology of acute brain failure
Pathoetiological model of delirium: a comprehensive understanding of the neurobiology of delirium and an evidence-based approach to prevention and treatment
Association of delirium with long-term cognitive decline: a meta-analysis
Outcome of delirium in critically ill patients: systematic review and meta-analysis
The importance of delirium: economic and societal costs
One-year health care costs associated with delirium in the elderly population
The cost of ICU delirium and coma in the intensive care unit patient
What does delirium cost? An economic evaluation of hyperactive delirium
Poorer outcomes and greater healthcare costs for hospitalised older people with dementia and delirium: a retrospective cohort study
Delirium in older medical inpatients and subsequent cognitive and functional status: a prospective study
Prognosis of delirium in elderly hospital patients
Delirium predicts 12-month mortality
Long-term effects of postoperative delirium in patients undergoing cardiac operation: a systematic review
Persistent delirium predicts greater mortality
Delirium is a robust predictor of morbidity and mortality among critically ill patients treated in the cardiac intensive care unit
Effectiveness of multicomponent nonpharmacological delirium interventions: a meta-analysis
Haloperidol and ziprasidone for treatment of delirium in critical illness
Effect of the Tailored, Family-Involved Hospital Elder Life Program on postoperative delirium and function in older adults: a randomized clinical trial
Evaluating the effects of the pharmacological and nonpharmacological interventions to manage delirium symptoms in palliative care patients: systematic review
Pharmacological interventions for prevention and management of delirium in intensive care patients: a systematic overview of reviews and meta-analyses
Pharmacological treatments of nonsubstance-withdrawal delirium: a systematic review of prospective trials
Underreporting of delirium in statewide claims data: implications for clinical care and predictive modeling
Undiagnosed delirium is frequent and difficult to predict: results from a prevalence survey of a tertiary hospital
Unrecognized delirium is prevalent among older patients admitted to general medical wards and lead to higher mortality rate
Delirium in the intensive care unit: an under-recognized syndrome of organ dysfunction
Validation of the Stanford Proxy Test for Delirium (S-PTD) among critical and noncritical patients
Does this patient have delirium?: value of bedside instruments
Assessment scales for delirium: a review
Enhancing delirium case definitions in electronic health records using clinical free text
Effect of delirium motoric subtypes on administrative documentation of delirium in the surgical intensive care unit
A chart-based method for identification of delirium: validation compared with interviewer ratings using the confusion assessment method
Evaluation of algorithms to identify delirium in administrative claims and drug utilization database
Delirium in an adult acute hospital population: predictors, prevalence and detection
Estimating patients' risk for postoperative delirium from preoperative routine data - Trial design of the PRe-Operative prediction of postoperative DElirium by appropriate SCreening (PROP-DESC) study - a monocentre prospective observational trial
Characterizing and predicting rates of delirium across general hospital settings
Genome-wide association identifies a novel locus for delirium risk
Delirium misdiagnosis risk in psychiatry: a machine learning-logistic regression predictive algorithm
Systematic review of prediction models for delirium in the older adult inpatient
Risk prediction models for delirium in the intensive care unit after cardiac surgery: a systematic review and independent external validation
Delirium prediction in the intensive care unit: comparison of two delirium prediction models
Multinational development and validation of an early prediction model for delirium in ICU patients
Evaluation of emergency department derived delirium prediction models using a hospital-wide cohort
COVID-19: ICU delirium management during SARS-CoV-2 pandemic
Delirium: a missing piece in the COVID-19 pandemic puzzle
Functional and cognitive outcomes after COVID-19 delirium
COVID-19 inpatients with psychiatric disorders: real-world clinical recommendations from an expert team in consultation-liaison psychiatry
Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2)
Introduction of an area deprivation index measuring patient socioeconomic status in an integrated health system: implications for population health
Area deprivation and widening inequalities in US mortality, 1969-1998
Validation of a combined comorbidity index
Agency for Healthcare Research and Quality
Efficient genome-wide association in biobanks using topic modeling identifies multiple novel disease loci
Polygenic loading for major depression is associated with specific medical comorbidity
Stratification of risk for hospital admissions for injury related to fall: cohort study
Validation of a risk stratification tool for fall-related injury in a state-wide cohort
Utilizing RxNorm to support practical computing applications: capturing medication history in live electronic health records
The Unified Medical Language System: an informatics research collaboration
Stratifying risk for dementia onset using large-scale electronic health record data: a retrospective cohort study
The ICD-10 General Equivalence Mappings. Bridging the translation gap from ICD-9
Regression shrinkage and selection via the Lasso
Applied Logistic Regression, 3rd edition
A tutorial on calibration measurements and calibration models for clinical prediction models
Signal detection theory and psychophysics
An introduction to ROC analysis
The meaning and use of the area under a receiver operating characteristic (ROC) curve
Evaluating Kolmogorov's distribution
EDF statistics for goodness of fit and some comparisons
Using relative utility curves to evaluate risk prediction
Decision curve analysis: a novel method for evaluating prediction models. Med Decis Mak Int
Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers
Prediction modeling methodology
Index for rating diagnostic tests
Mapping the delirium literature through probabilistic topic modeling and network analysis: a computational scoping review
Risk factors for incident delirium among older people in acute hospital medical units: a systematic review and meta-analysis
A systematic review of risk factors for delirium in the ICU
Potential biases in machine learning algorithms using electronic health record data
Dissecting racial bias in an algorithm used to manage the health of populations
Development and validation of PRE-DELIRIC (PREdiction of DELIRium in ICu patients) delirium prediction model for intensive care patients: observational multicentre study
Recalibration of the delirium prediction model for ICU patients (PRE-DELIRIC): a multinational observational study
Prediction of ICU delirium: validation of current delirium predictive models in routine clinical practice
Prediction models - development, evaluation, and clinical application
Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal
Assessment of time-series machine learning methods for forecasting hospital discharge volume
Enclave teams for their support in making EHR data available to study the COVID-19 pandemic.