key: cord-264942-9u2k5ctm authors: Lusczek, E. R.; Ingraham, N. E.; Karam, B.; Proper, J.; Siegel, L.; Helgeson, E.; Lotfi-Emran, S.; Zolfaghari, E. J.; Jones, E.; Usher, M.; Chipman, J.; Dudley, R. A.; Benson, B.; Melton, G. B.; Charles, A.; Lupei, M. I.; Tignanelli, C. J. title: Characterizing COVID-19 Clinical Phenotypes and Associated Comorbidities and Complication Profiles date: 2020-09-14 journal: medRxiv : the preprint server for health sciences DOI: 10.1101/2020.09.12.20193391 sha: doc_id: 264942 cord_uid: 9u2k5ctm Background: There is limited understanding of heterogeneity in outcomes across hospitalized patients with coronavirus disease 2019 (COVID-19). Identification of distinct clinical phenotypes may facilitate tailored therapy and improve outcomes. Objective: Identify specific clinical phenotypes across COVID-19 patients and compare admission characteristics and outcomes. Design, Settings, and Participants: Retrospective analysis of 1,022 COVID-19 patient admissions from 14 Midwest U.S. hospitals between March 7, 2020 and August 25, 2020. Methods: Ensemble clustering was performed on a set of 33 vitals and labs variables collected within 72 hours of admission. K-means based consensus clustering was used to identify three clinical phenotypes. Principal component analysis was performed on the average covariance matrix of all imputed datasets to visualize clustering and variable relationships. Multinomial regression models were fit to further compare patient comorbidities across phenotype classification. Multivariable models were fit to estimate the association between phenotype and in-hospital complications and clinical outcomes. Main outcomes and measures: Phenotype classification (I, II, III), patient characteristics associated with phenotype assignment, in-hospital complications, and clinical outcomes including ICU admission, need for mechanical ventilation, hospital length of stay, and mortality. 
Results: The database included 1,022 patients requiring hospital admission with COVID-19 (median age, 62.1 [IQR: 45.9-75.8] years; 481 [48.6%] male, 412 [40.3%] required ICU admission, 437 [46.7%] were white). Three clinical phenotypes were identified (I, II, III); 236 [23.1%] patients had phenotype I, 613 [60%] patients had phenotype II, and 173 [16.9%] patients had phenotype III. When grouping comorbidities by organ system, patients with respiratory comorbidities were most commonly characterized by phenotype III (p=0.002), while patients with hematologic (p<0.001), renal (p<0.001), and cardiac (p<0.001) comorbidities were most commonly characterized by phenotype I. The adjusted odds of respiratory (p<0.001), renal (p<0.001), and metabolic (p<0.001) complications were highest for patients with phenotype I, followed by phenotype II. Patients with phenotype I had far greater odds of hepatic (p<0.001) and hematological (p=0.02) complications than the other two phenotypes. Phenotypes I and II were associated with 7.30-fold (HR: 7.30, 95% CI: 3.11-17.17, p<0.001) and 2.57-fold (HR: 2.57, 95% CI: 1.10-6.00, p=0.03) increases in the hazard of death, respectively, when compared to phenotype III. Conclusion: In this retrospective analysis of patients with COVID-19, three clinical phenotypes were identified. Future research is urgently needed to determine the utility of these phenotypes in clinical practice and trial design. The data source for this study included EHR reports from 14 U.S. Midwest hospitals and 60 primary care clinics. Patient and hospital-level data were available for 7,538 patients with PCR-confirmed COVID-19. Of these, 1,022 required hospital admission and were included in this analysis. The database included all comorbidities reported for each patient since March 29, 1997 and prior to their COVID-19 diagnosis. 
The database also included home medications, laboratory values, clinic visits, social history, and patient demographics (age, gender, race/ethnicity, language spoken, zip code, socioeconomic status indicators Table 3). All comorbidities were identified based on ICD-9, ICD-10, or problem list documentation within the electronic health record. An indicator variable was created for each comorbidity to denote the presence of the selected ICD-9, ICD-10, or problem list documentation at any time in the medical record. To facilitate analysis, comorbidities were grouped by organ system into the following categories: cardiac, respiratory, hematologic, metabolic, renal, hepatic, autoimmune, cancer, and cerebrovascular disease. We selected 30 in-hospital complications measured during each patient's hospital stay for COVID-19, categorized into the following systems: cardiovascular, respiratory, hematologic, renal, hepatic, metabolic, and infectious (Supplemental Table 4). Where applicable, a complication could be counted under more than one organ system. For example, ventilator-associated pneumonia was included in both infectious and respiratory complications. Additional clinical outcomes included hospital length of stay (LOS), need for intensive care unit (ICU) admission, need for mechanical ventilation, and mortality. Mortality was defined as any in-hospital or out-of-hospital death based on death certificate data. All complications and outcomes were followed for a minimum of 2 weeks following hospital admission. The overall rate of missingness of the 33 variables used for phenotyping, which included the first vitals and labs recorded for each inpatient within 72 hours of admission, was 19% (range 0%-50% Figure 5). While phenotypes II and III overlap substantially, phenotype I is more clearly defined on the right-hand side of the score plot of the first two principal components (Figure 1). 
Notably, this figure shows that distinctions between phenotypes are primarily driven by variation in PC1 as opposed to PC2. The variable contributions to PC1
Differences across phenotypes with respect to patient demographics, admission vitals and labs, complications, comorbidities, and clinical outcomes are presented in Table 1. Patients with
All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted September 14, 2020.
When grouping comorbidities by organ system, cardiac (p<0.001), respiratory (p=0.002), hematologic (p<0.001), and renal (p<0.001) comorbidities were found to be significantly associated with phenotype. Cancer, hepatic, autoimmune, cerebrovascular, and metabolic comorbidities were not significantly associated with phenotype (Table 1).
Clinical phenotypes I and II were associated with increased odds of respiratory Table 2). There was a trend towards increased odds of hematologic complications among patients with phenotype I (OR: 2.11, 95% CI: 0.99-4.48, p=0.05) compared to phenotype III. Phenotype was associated with hepatic complications (p<0.001); however, while
Phenotype II had a 2.57-fold (HR: 2.57, 95% CI: 1.10-6.00, p=0.03) increase in the hazard of death compared to phenotype III. We performed a sensitivity analysis to assess the impact of mortality as a competing risk by fitting the LOS model before and after removing the 127 patients who died. The estimated effect sizes were similar between these two models (data not shown).
This is one of the first studies to report on clinical phenotypes associated with COVID-19. 
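As a rough consistency check on the hazard ratios reported above for phenotypes I and II, the p-value can be approximately recovered from the 95% confidence interval. The Python sketch below assumes, and this is not stated in the text, that each interval is a Wald-type interval symmetric on the log scale; the function names are hypothetical and this is not the authors' analysis code.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of log(HR) implied by a 95% CI that is symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def two_sided_p(estimate, lower, upper):
    """Approximate two-sided Wald p-value for a ratio estimate and its 95% CI."""
    z = math.log(estimate) / se_from_ci(lower, upper)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# Hazard ratios and intervals as reported in the text (reference: phenotype III).
p_phenotype_i = two_sided_p(7.30, 3.11, 17.17)
p_phenotype_ii = two_sided_p(2.57, 1.10, 6.00)
print(f"phenotype I:  p = {p_phenotype_i:.2g}")
print(f"phenotype II: p = {p_phenotype_ii:.2g}")
```

Under this assumption the recovered values agree with the reported p<0.001 for phenotype I and p=0.03 for phenotype II, so the interval and p-value in each pair are mutually consistent.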
We identified three clinical phenotypes for patients with COVID-19 on hospital presentation. Most patients presented with phenotype II, which is associated with a moderate course and approximately 10% mortality. A subset of patients presented with the more severe phenotype I, which is associated with a staggering 27% mortality. Patients with cardiac, hematologic, and renal comorbidities were most likely to be characterized by phenotype I. Surprisingly, respiratory comorbidities appeared less related to phenotypes I or II and were most associated with phenotype III, which had the most indolent course. Despite this indolent course, patients with phenotype III had the highest rate of readmission, which is likely due in part to the high survival rate. This also suggests that patients with pre-existing respiratory comorbidities, while not at highest risk for mortality, may be at highest risk for long-term sequelae following COVID-19. Patients who presented with phenotype I were most likely to develop respiratory, hematologic, renal, metabolic, hepatic, and infectious complications. Surprisingly, cardiovascular complications did not significantly differ between phenotypes. Elucidating patient risk factors and severe COVID-19 disease markers may allow early treatment implementation that may improve the patient's outcome. Multiple studies have documented COVID-19 risk factors; however, most have done so through a homogeneous lens. 
For example, a prospective cohort study from New York City identified that the most considerable risks for hospital admission were age, male sex, heart failure, chronic kidney disease, and high BMI.22 A large observational study conducted in the UK reported that increasing age, male gender, and comorbidities such as cardiac disease, chronic lung disease, chronic kidney disease, and obesity were associated with higher mortality in COVID-19 positive patients admitted to the hospital.14 A study from China found that increased odds of in-hospital death due to COVID-19 were associated with older age, higher SOFA score, and D-dimer > 1.0 µg/mL on admission.23 Another retrospective study reported that patients with severe COVID-19 disease and diabetes had increased leukocyte and neutrophil counts and increased C-reactive protein (CRP), D-dimer, and fibrinogen levels.24 A systematic review and meta-analysis found that the biomarkers associated with increased mortality include higher CRP, higher D-dimer, increased creatinine, and lower albumin levels.25 However, it is well known that patients do not have a singular natural history of disease. Multiple studies, including this study, found that only half of patients suffer a primarily respiratory disease. 26, 27 Similar to our analysis, they identified three distinct clinical phenotypes. Their low mortality cluster, which they called cluster 1, was very similar to our phenotype III, with a predominance of females, a lower mortality rate, and lower D-dimer and CRP levels. Similarly, their high mortality cluster was predominantly male, with elevated inflammation markers on ICU presentation. 
In this study, we not only characterized three clinical phenotypes, but extended findings outside of the ICU by characterizing the association of comorbidities with clinical phenotype and the association of clinical phenotypes with in-hospital complications and clinical outcomes. Phenotype I can be termed the "Adverse phenotype" and was associated with the worst clinical outcomes. LDH, absolute neutrophil count, D-dimer, AST, and CRP were most influential in phenotype I determination. The strong association of RDW with phenotype I was interesting. RDW was strongly associated with genetic age, which is hypothesized to be a risk factor in COVID-19.30 As people age, variability in red blood cell volumes increases. Similarly, Gamma Gap, a marker of immunoglobulin levels, was elevated in all three phenotypes (median > Gap. In this scenario, elevated Gamma Gap was likely an indicator of systemic inflammation and has been associated with prognosis in other inflammatory disease processes. Other groups have previously reported on the importance of the absolute neutrophil count to absolute lymphocyte count ratio (ANC/ALC); here we noted that ANC/ALC was lowest for phenotype III and highest for phenotype I, in line with previous reports. Patients with cardiac, hematologic, and renal comorbidities were most prone to develop phenotype I. Phenotype I was associated with numerous complications (hematologic, hepatic, metabolic, renal, respiratory, and infectious) when compared to other phenotypes. It is interesting to note that, despite a higher rate of baseline cardiac comorbidities, phenotype I was not associated with increased cardiac complications. 
Phenotype III was associated with the best clinical outcomes and can be termed the "Favorable phenotype". Surprisingly, patients with phenotype III had a very high rate of respiratory comorbidities and the best clinical outcomes. What is most surprising is that, despite the lowest complication rate and mortality, this phenotype was associated with a greater than 10% rate of hospital readmission. It is possible that patients' pre-existing respiratory comorbidities predisposed them to longer-term sequelae, which may have resulted in this readmission rate, although additional studies are needed to better elucidate these findings, specifically controlling for differences in survival. Patients with respiratory comorbidities such as asthma and COPD routinely use medications that may be protective in SARS-CoV-2 pathogenesis, which may explain the favorable outcomes in this phenotype. For example, our group has previously identified reduced mortality in COVID-19 for patients with asthma treated with beta2-agonists. 16 Patients with phenotype III were more likely to use inhaled steroids, nasal fluticasone, albuterol, and antihistamines. Ultimately, a deeper investigation into clinical phenotypes and associated genomic, transcriptomic, and proteomic profiles is needed. The ability to classify patients into clinical phenotypes can facilitate the linkage of exome data to better understand SARS-CoV-2 pathogenesis and natural history. Understanding COVID-19 severity, biomarkers, and risk factors is paramount during the COVID-19 pandemic. 
Our study has several limitations, including that this is a retrospective study and therefore results may be biased or subject to residual confounding. Second, patients were followed for
Relative risk ratios of comorbidities of phenotypes I and II compared to the reference group phenotype III.
Cumulative distribution functions (CDF) for a randomly selected imputed dataset are shown. A range of phenotypes (2-7) were considered, and the optimal choice of phenotypes is 3.
The relative change in delta area under the cumulative distribution function is shown for the range of phenotypes (k=2-7) for a randomly selected imputed dataset. The optimal choice of phenotypes is 3. Abbreviations: CDF (cumulative distribution function)
A consensus matrix heatmap is shown for a randomly selected imputed dataset clustered into 3 phenotypes. The heatmap allows visualization of consensus cluster assignments to evaluate cluster stability. Darker shades of green indicate higher stability.
A consensus matrix heatmap is shown for a randomly selected imputed dataset clustered into 4 phenotypes. The heatmap allows visualization of consensus cluster assignments to evaluate cluster stability. Darker shades of green indicate higher stability. The choice of 4 clusters shows less stability than 3 clusters (see Supplemental Figure 3).
The proportion of variance explained by each principal component is summed over all principal components. For example, PC1 and PC2 cumulatively explain 20% of the variation in the dataset. Abbreviations: PC1 (principal component 1); PC2 (principal component 2)
Chord diagram illustrates the prevalence of comorbidities (% observed) for the three clinical phenotypes.
Chord diagram illustrates the prevalence of complications (% observed) for the three clinical phenotypes.
Chord diagram illustrates the prevalence of clinical outcomes (% observed) for the three clinical phenotypes. Abbreviations: ICU (intensive care unit); Vent (mechanical ventilation); Readmit (readmission to hospital or ICU); ECMO (extracorporeal membrane oxygenation).
Tables: Table 1
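The consensus matrix and CDF captions above describe the usual diagnostics for choosing the number of clusters in consensus clustering. The Python sketch below illustrates the idea under stated assumptions: it uses synthetic two-dimensional data rather than the 33 clinical variables, a plain Lloyd's k-means with farthest-first seeding (chosen here only to keep the toy example stable), and hypothetical helper names (`kmeans_labels`, `consensus_matrix`, `cdf_area`). It is not the authors' pipeline, which is not given in this text.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_labels(points, k, rng, iters=10):
    """Lloyd's k-means with farthest-first seeding (an assumption of this sketch)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Next seed: the point farthest from every center chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(points, k, n_resamples=50, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        idx = rng.sample(range(n), int(frac * n))  # subsample without replacement
        labels = kmeans_labels([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += int(labels[a] == labels[b])
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


def cdf_area(matrix, bins=100):
    """Riemann-sum area under the empirical CDF of pairwise consensus values."""
    n = len(matrix)
    vals = [matrix[i][j] for i in range(n) for j in range(i + 1, n)]
    return sum(sum(v <= t / bins for v in vals) for t in range(1, bins + 1)) / (bins * len(vals))


# Synthetic stand-in data: three well-separated 2-D groups of 10 points each.
data_rng = random.Random(1)
points = [(cx + data_rng.uniform(-1, 1), cy + data_rng.uniform(-1, 1))
          for cx, cy in [(0, 0), (10, 0), (0, 10)] for _ in range(10)]

areas = {k: cdf_area(consensus_matrix(points, k)) for k in range(2, 6)}
# Relative change in CDF area from k-1 to k; little further gain suggests stopping.
rel_change = {k: (areas[k] - areas[k - 1]) / areas[k - 1] for k in range(3, 6)}
m3 = consensus_matrix(points, 3)
print(areas)
print(rel_change)
```

On this toy data the three groups are separated so clearly that every pair of points in `m3` has a consensus value of exactly 0 or 1. Real clinical data would give intermediate values, which is what the shading in the consensus heatmaps conveys.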
Score Plot: PC2 vs. PC1
Figure 3
Figure 4
Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72314 Cases From the Chinese Center for Disease Control and Prevention
Immunomodulation in COVID-19
Fact Versus Science Fiction: Fighting Coronavirus Disease 2019 Requires the Wisdom to Know the Difference
Dexamethasone in Hospitalized Patients with Covid-19 - Preliminary Report. The New England journal of medicine
Transmission, Diagnosis, and Treatment of Coronavirus Disease
Acute respiratory failure in COVID-19: is it "typical
Respiratory mechanics and gas exchanges in the early course 14
On using multiple imputation for exploratory factor analysis of incomplete data
Observational Study of Metformin and Risk of Mortality in Patients Hospitalized with Covid-19. medRxiv
Gender Differences in Patients With COVID-19: Focus on Severity and Mortality
Racial/Ethnic Disparities in Hospital Admissions from COVID-19 and Determining the Impact of Neighborhood Deprivation and Primary Language. medRxiv
Comorbidity measures for use with administrative data
circlize implements and enhances circular visualization in R
R: A language and environment for statistical computing. R Foundation for Statistical Computing
Factors associated with hospital admission and critical illness among 5279 people with coronavirus disease 2019 in New York City: prospective cohort study
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
Dynamic changes of D-dimer and neutrophil-lymphocyte count ratio as prognostic biomarkers in COVID-19
Predictors of mortality in hospitalized COVID-19 patients: A systematic review and meta-analysis
Association of Cardiac Injury With Mortality in Hospitalized Patients With COVID-19 in Wuhan, China
Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study
Understanding the Renin-Angiotensin-Aldosterone-SARS-CoV-Axis: A Comprehensive Review
Antihypertensive drugs and risk of COVID-19?
COVID-19 severity is predicted by earlier evidence of accelerated aging
Clinical phenotypes of critically ill COVID-19 patients
The gamma gap predicts 4-year all-cause mortality among nonagenarians and centenarians
The authors thank Eric Murray and Fairview IT for collection of data.
Author Contribution
Concept and design: All authors
Acquisition, analysis, or interpretation of data: Lusczek, Proper, Siegel, Helgeson, Usher, Tignanelli
Drafting of the manuscript: All authors
Critical revision of the manuscript for important intellectual content: All authors