key: cord-0951818-tr1l31vq authors: Spotnitz, M. E.; Hripcsak, G.; Ryan, P. B.; Natarajan, K. title: Characterizing Post-Acute Sequelae of SARS-CoV-2 Infection across Claims and Electronic Health Record Databases date: 2021-03-24 journal: nan DOI: 10.1101/2021.03.19.21253756 sha: a89de4f74e4e2a8df1833a674ee7152a2193b703 doc_id: 951818 cord_uid: tr1l31vq Structured Abstract Importance: Post-acute sequelae of SARS-CoV-2 infection (PASC) is emerging as a major public health issue. Objective: We characterized the incidence of PASC, or related symptoms and diagnoses, for COVID-19 and influenza patients. Design: Retrospective cohort study. Setting: Our data sources were the IBM MarketScan Commercial Claims and Encounters (CCAE), Optum Electronic Health Record (EHR) and Columbia University Irving Medical Center (CUIMC) databases that were transformed to the Observational Medical Outcome Partnership (OMOP) Common Data Model (CDM) and were part of the Observational Health Sciences and Informatics (OHDSI) network. Participants: The COVID-19 cohort consisted of patients with a diagnosis of COVID-19 or positive lab test of SARS-CoV-2 after January 1st 2020 with a follow up period of at least 30 days. The influenza cohort consisted of patients with a diagnosis of influenza between October 1, 2018 and May 1, 2019 with a follow up period of at least 30 days. Intervention: Infection with COVID-19 or influenza. Main Outcomes and Measures: Post-acute sequelae of SARS-CoV-2 infection (PASC), or related diagnoses, for COVID-19 and influenza patients. Results: In aggregate, we characterized the post-acute experience for over 440,000 patients who were diagnosed with COVID-19 or tested positive for SARS-COV-2. The long term sequelae that had a higher incidence in the COVID-19 compared to Influenza cohorts were altered smell or taste, myocarditis, acute kidney injury, dyspnea and alopecia. Additionally, the long term incidences of respiratory illness, musculoskeletal disease, and psychiatric disorders for the COVID-19 population were higher than expected. Conclusions and Relevance: The long term sequelae of COVID-19 and influenza may be different. Further characterization of PASC on large scale observational healthcare databases is warranted. The global pandemic of SARS-CoV-2 infection and impact of COVID-19 disease has resulted in major morbidity and mortality worldwide. While substantial research has sought to characterize the disease natural history and the acute management of COVID-19, comparatively fewer studies have focused on the post-acute sequelae of SARS-CoV-2 infection (PASC) [1] [2] [3] [4] [5] [6] . Much of the publicly available data on PASC come from case reports and single-institution prospective cohort studies [7] [8] [9] [10] [11] [12] [13] . Patient reported symptom data have shown that prolonged fatigue, headache, dyspnea and anosmia are PASC symptoms 14 . However, there are fewer claims and electronic health record (EHR) data about the incidence of PASC symptoms. Therefore, the public knowledge about PASC symptom presentation is evolving. The National Institutes of Health has encouraged the study of PASC on large-scale observational databases in order to better understand the condition and its public health impact 15 . In alignment with that effort, we present data on PASC patients from EHR and claims databases with an aim to characterize the natural history of patients with SARS-CoV-2 who developed symptoms or diagnoses related to PASC. Additionally, we characterized related long-term sequelae of influenza to provide a comparison. Three observational health databases were used for the analysis: a private-payer administrative claims database (IBM MarketScan Commercial Claims and Encounters-CCAE), a database of inpatient and outpatient electronic health records (Optum© de-identified Electronic Health Record Dataset-Optum EHR), and an electronic health record system from an academic medical center (Columbia University Irving Medical Center-CUIMC). All databases were transformed to the Observational Medical Outcome Partnership (OMOP) Common Data Model (CDM), and described further in Supplementary Appendix#1. We defined a COVID-19 cohort as patients with a diagnosis of COVID-19 or positive lab test of SARS-CoV-2 after January 1 st 2020 with a follow up period of at least 30 days. An influenza cohort was defined as patients with a diagnosis of influenza between October 1, 2018 and May 1, 2019 with a follow up period of at least 30 days. We performed characterizations of COVID-19 and influenza patients who had at least one of the following symptoms or diagnoses that were related to Post-Acute Sequelae of SARS-CoV-2 Infection (PASC), according to the Centers for Disease Control and Prevention (CDC) 15 : altered smell or taste, myocarditis, acute kidney injury, dyspnea, alopecia, tachycardia, chest pain, lung disorder, myalgias, dementia or cognitive Impairment, malaise, fatigue, stress disorder, depression, anxiety, joint pain, mood changes, cough, rash, and fever. We calculated the number of patients who had each and any of these diagnoses between 30 and 180 days after the index event. Diagnoses were based on ICD-10-CM and SNOMED codes, while lab tests included LOINC codes. The full list of codes for the symptoms and diagnoses (Supplementary Appendix #2) and our calculation of relative risk (Supplementary Appendix#3) are provided in the supplementary information. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted March 24, 2021. ; In aggregate, we characterized the post-acute experience for over 440,000 patients who were diagnosed with COVID-19 or tested positive for SARS-COV-2. We identified 119,510 patients with COVID-19 in the Optum EHR database. Of those, 42,991 (36.28%) had at least one long term sequela. In the IBM MarketScan CCAE database, we identified a total of 306,142 patients with COVID-19, 74,320 (24.28%) of whom had at least one long term sequela. In the CUIMC database, 6,198 (27.52%) of 22,524 patients of patients with COVID-19 had a PASCrelated observation within the following 6 months. Table 1 shows the number of patients from the COVID-19 and influenza cohorts in each database, and subgroups of patients with all or any PASC diagnoses. Five PASC diagnoses had higher relative risk in COVID-19 compared to Influenza patients: altered smell or taste, myocarditis, acute kidney injury, dyspnea and alopecia. Additionally, the proportions of patients who had post-acute diagnoses or symptoms of lung disorder, chest pain, depression, anxiety or joint pain in the COVID-19 cohort were greater than 2% in each database. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted March 24, 2021. ; Table 1 : The sizes of the COVID-19 and influenza cohorts for the Optum-EHR, IBM MarketScan CCAE and CUIMC 2020q4 databases are shown. The numbers and percentages of patients who had any or all post-acute sequelae of SARS-CoV-2 (PASC) infection diagnoses between 30 and 180 days after the index event are shown. Those subgroups are ranked in descending order of relative risk for COVID-19 compared to influenza patients. The red cells show data for the group of symptoms and conditions that had the highest weighted relative risk, yellow cells show data for the next to highest group, and green cells show the lowest group. N, number; %, percentage; EHR, Electronic Health Record; CCAE, Clinical Claims and Encounters; CUIMC, Columbia University Irving Medical Center; 2020q4, 2020 4 th Quarter; RR, Relative Risk. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted March 24, 2021. ; We have reported one of the largest studies about the incidence of PASC, and related symptoms or conditions, in a cohort of patients infected with COVID-19 using claims and EHR data. The incidences of some outcomes were greater for the COVID-19 cohort compared to the influenza cohort. Therefore, long term sequelae of COVID-19 may be different from influenza. For example, we expect that the long-term anosmia prevalence will likely be greater for COVID-19 patients. Additionally, we observed symptoms and diagnoses in the COVID-19 cohort at nonnegligible rates. Given the global prevalence of the COVID-19 pandemic, those conditions could have a major public health impact. Specifically, the global burden of respiratory illness, musculoskeletal disease, and psychiatric disorders may increase as a consequence of the COVID-19 pandemic. These findings suggest that PASC is likely composed as a heterogeneous constellation of continuing symptoms that do not resolve, rare but unusual symptoms, and prevalent serious symptoms. Alternative approaches to characterizing PASC have used self-reported symptom data. Although those approaches are comprehensive, self-reported data may be different from patient assessments by healthcare providers. We expect that the differences in these kinds of data may help explain why the incidences of outcomes in our study are lower than incidences reported in patient self-assessment studies 9,10,13 . A limitation of our analysis is that we did not validate our phenotypes, and therefore measurement error is possible; however, we used the CDC description for PASC in developing our phenotype 15 . Also, patient attrition to primary care sites out of network may have contributed to bias in the EHR database. We have demonstrated the feasibility of characterizing the natural history of post-acute COVID-19 infections on both EHR and claims databases. The implications of our analysis may lead to public health interventions that can reduce the global burden of long-term sequelae from the COVID-19 pandemic. We have presented one of the largest characterizations of post-acute sequelae of SARS-CoV-2 (PASC) to date, examining data from multiple disparate populations. Electronic healthcare record data can be used to characterize PASC; additional replication of our analysis on other databases could strengthen our findings. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China Clinical characteristics of covid-19 in New York City Clinical characteristics of coronavirus disease 2019 in China Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study Deep phenotyping of 34,128 adult patients hospitalised with COVID-19 in an international network study Characterization and clinical course of 1000 patients with coronavirus disease 2019 in New York: retrospective case series Autonomic dysfunction in 'long COVID': rationale, physiology and management strategies Postacute inpatient rehabilitation for COVID-19 Persistent Symptoms in Patients After Acute COVID-19 Long-COVID': a cross-sectional study of persisting symptoms, biomarker and imaging abnormalities following hospitalisation for COVID-19 Sequelae in Adults at 6 Months After COVID-19 Infection Respiratory and Psychophysical Sequelae Among Patients With COVID-19 Four Months After Hospital Discharge Attributes and predictors of long COVID