key: cord-0739819-sifsoo7n
authors: Fu, Julie; Reid, Sonya A.; French, Benjamin; Hennessy, Cassandra; Hwang, Clara; Gatson, Na Tosha; Duma, Narjust; Mishra, Sanjay; Nguyen, Ryan; Hawley, Jessica E.; Singh, Sunny R. K.; Chism, David D.; Venepalli, Neeta K.; Warner, Jeremy L.; Choueiri, Toni K.; Schmidt, Andrew L.; Fecher, Leslie A.; Girard, Jennifer E.; Bilen, Mehmet A.; Ravindranathan, Deepak; Goyal, Sharad; Wise-Draper, Trisha M.; Park, Cathleen; Painter, Corrie A.; McGlown, Sheila M.; de Lima Lopes, Gilberto; Serrano, Oscar K.; Shah, Dimpy P.
title: Racial Disparities in COVID-19 Outcomes Among Black and White Patients With Cancer
date: 2022-03-28
journal: JAMA Netw Open
DOI: 10.1001/jamanetworkopen.2022.4304
sha: c6f965cd7961651d4eaa841f25314037f4f65ae1
doc_id: 739819
cord_uid: sifsoo7n

IMPORTANCE: Non-Hispanic Black individuals experience a higher burden of COVID-19 than the general population; hence, there is an urgent need to characterize the unique clinical course and outcomes of COVID-19 in Black patients with cancer. OBJECTIVE: To investigate racial disparities in severity of COVID-19 presentation, clinical complications, and outcomes between Black patients and non-Hispanic White patients with cancer and COVID-19. DESIGN, SETTING, AND PARTICIPANTS: This retrospective cohort study used data from the COVID-19 and Cancer Consortium registry from March 17, 2020, to November 18, 2020, to examine the clinical characteristics and outcomes of COVID-19 in Black patients with cancer. Data analysis was performed from December 2020 to February 2021. EXPOSURES: Black and White race recorded in patient’s electronic health record. MAIN OUTCOMES AND MEASURES: An a priori 5-level ordinal scale including hospitalization, intensive care unit admission, mechanical ventilation, and all-cause death. RESULTS: Among 3506 included patients (1768 women [50%]; median [IQR] age, 67 [58-77] years), 1068 (30%) were Black and 2438 (70%) were White. Black patients had higher rates of preexisting comorbidities compared with White patients, including obesity (480 Black patients [45%] vs 925 White patients [38%]), diabetes (411 Black patients [38%] vs 574 White patients [24%]), and kidney disease (248 Black patients [23%] vs 392 White patients [16%]). Despite the similar distribution of cancer type, cancer status, and anticancer therapy at the time of COVID-19 diagnosis, Black patients presented with worse illness and had significantly worse COVID-19 severity (unweighted odds ratio, 1.34 [95% CI, 1.15-1.58]; weighted odds ratio, 1.21 [95% CI, 1.11-1.33]). CONCLUSIONS AND RELEVANCE: These findings suggest that Black patients with cancer experience worse COVID-19 outcomes compared with White patients. 
Understanding and addressing racial inequities within the causal framework of structural racism is essential to reduce the disproportionate burden of diseases, such as COVID-19 and cancer, in Black patients.

The novel SARS-CoV-2 virus and its resulting illness, COVID-19, have led to a global pandemic resulting in over 12 million cases worldwide and over 3 million cases in the United States (US).1 Initial reports implicate age, sex, and comorbid conditions as critical factors in determining the outcome from this illness. Most studies assessing outcomes of patients with cancer and COVID-19 have been limited by small sample size. One of the early reports from Wuhan, China reviewed 28 COVID-19-infected cancer patients, with more than half experiencing severe outcomes and death in 28% of patients.2 It is postulated that cancer-directed treatment may be associated with severe events. These observations underscored the severity of illness among COVID-19-infected patients with cancer and led to recommendations on COVID-19 screening and avoidance or dose modification of immunosuppressive treatments in these patients.2 Albeit limited, data from the US have corroborated worse outcomes following COVID-19 in patients with cancer.3,4 Based on recent disease-tracking dashboards, COVID-19 has been reported to affect Blacks at disproportionately higher rates compared to Non-Hispanic Whites (NHW).5 Blacks also have higher rates of hospitalization and death after contracting COVID-19.6

(a) Describe all statistical methods, including those to be used to control for confounding

Objective 1 will assess differences in baseline demographic and socioeconomic characteristics, comorbidities, clinical characteristics (including status of cancer and anti-cancer treatment), ECOG performance status, and severity of presentation of COVID-19 between each of the racial group comparisons. After checking the accuracy, integrity, and distribution of the data, all characteristics and outcomes will be presented using descriptive statistics. We will provide the median and interquartile range (IQR) for continuous variables. Counts and percentages will be used to describe the binary and categorical variables.
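The descriptive summaries named above (median and IQR for continuous variables, counts and percentages for categorical ones) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the "inclusive" quartile method are assumptions, not the software or quantile definition used by the study.

```python
from collections import Counter
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3) for a list of numbers.

    The 'inclusive' quartile method is an assumption; other quantile
    definitions give slightly different Q1 and Q3 on small samples.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


def counts_percent(labels):
    """Return {category: (count, rounded percent)} for a list of labels."""
    total = len(labels)
    return {k: (n, round(100 * n / total)) for k, n in Counter(labels).items()}


# Hypothetical toy data, not study data.
ages = [58, 60, 67, 70, 77]
race = ["Black"] * 3 + ["White"] * 7
print(median_iqr(ages))
print(counts_percent(race))
```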

1b. Descriptive table restricted to hospitalized patients: laboratory measurements

2a. For multivariable modeling, the primary outcome measure will be an ordinal variable and the secondary outcome will be 30-day all-cause mortality. All a priori variables (but not baseline severity) and significant interactions will be included in the final multivariable (MV) model.

We will use the E-value to quantify sensitivity to unmeasured confounding.
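The protocol does not state how the E-value will be computed. As a hedged sketch, the standard VanderWeele-Ding formula for a risk ratio point estimate is E = RR + sqrt(RR × (RR − 1)), applied to the reciprocal when RR < 1. The function name is an assumption, and the conversion needed when the estimate is an odds ratio for a common outcome is not shown.

```python
import math


def e_value(rr):
    """E-value for a risk ratio point estimate.

    For estimates below 1, the reciprocal is used so the formula is
    always applied to a ratio >= 1.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


# Hypothetical input: a risk ratio of 2.0.
print(e_value(2.0))
```

A null estimate (RR = 1) gives an E-value of 1, meaning no unmeasured confounding is needed to explain it away.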

We will perform analysis based on inverse probability of treatment weighted (IPTW) methods. First, we will estimate propensity scores from a logistic regression model for which the outcome is a binary indicator of non-Hispanic Black versus non-Hispanic White race and the predictors are the prespecified covariates. For each patient, a weight will be calculated equal to the reciprocal of the probability of "receiving the treatment" (that is, race) that was "actually received," which will be estimated from the propensity score model. Next, we will use graphics and summary statistics to evaluate the propensity score model. The empirical distributions of the propensity scores stratified by race will be plotted, to evaluate their overlap between groups. Mean propensity scores will be calculated stratified by race across quintiles of the propensity scores in the overall cohort, to evaluate balance in the propensity scores between groups. Unweighted and weighted absolute standardized mean differences for demographic and clinical characteristics at COVID-19 diagnosis between non-Hispanic Black and non-Hispanic White patients will be calculated, to evaluate whether the two groups are balanced on their observed characteristics; an absolute standardized mean difference <0.1 will indicate balance. Finally, we will estimate IPTW differences in COVID-19 severity between non-Hispanic Black and non-Hispanic White patients from an ordinal logistic regression model that includes an offset for (log) follow-up time. Between-group IPTW differences in 30-day all-cause mortality will be estimated from both a logistic regression model (to estimate odds ratios) and a modified Poisson regression model (to estimate relative risks). All models will include race as the sole covariate, will be weighted by the reciprocal of the probability of "receiving the treatment" (that is, race) that was "actually received," and will use a robust (a.k.a. sandwich) variance estimator to account for the uncertainty due to estimation of the weights (and for the modified Poisson model, to account for misspecification of the variance structure). Results will be reported as odds ratios (or relative risks) with 95% confidence intervals.
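The weight definition above (the reciprocal of the probability of the group that was actually observed) can be illustrated with a short sketch. It assumes the propensity scores have already been estimated by some logistic regression model; the function and argument names are hypothetical.

```python
def iptw_weights(is_black, propensity):
    """Inverse probability weights from estimated propensity scores.

    is_black   : list of 0/1 indicators (1 = non-Hispanic Black)
    propensity : list of estimated P(indicator = 1 | covariates), in (0, 1)
    """
    weights = []
    for a, p in zip(is_black, propensity):
        # Probability of the group the patient was actually observed in.
        prob_observed = p if a == 1 else 1 - p
        weights.append(1 / prob_observed)
    return weights


# Hypothetical propensity scores for two patients.
print(iptw_weights([1, 0], [0.25, 0.25]))
```

Patients whose observed group was unlikely given their covariates receive larger weights, which is why overlap of the propensity score distributions is checked before the weights are used.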

The proportional odds assumption will be tested.

2b. Simple summary table stratified by race that gives n (%) for:
1. clinical systemic complications (see appendix I)
2. total hospitalization
3. total mechanical ventilation
4. total ICU admission
5. overall death

(b) Describe any methods that will be used to examine subgroups and interactions

We will also examine interactions between (1) race and all comorbidities (cardiovascular, pulmonary, renal, diabetes), (2) race and cancer status, and (3) race and obesity, to understand the synergistic impact of these factors on mortality.

(c) Explain how missing data will be addressed

Multiple imputation will be used to impute missing and unknown data for all variables included in the analysis, with some exceptions: unknown ECOG performance score and unknown cancer status will not be imputed and will instead be treated as a separate category in analyses; and laboratory values will not be imputed.

Imputation will be performed on the largest dataset possible (that is, after removing test cases and other manual exclusions, but before applying specific exclusion criteria). At least 10 imputed datasets will be used. 
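The plan does not say how estimates from the imputed datasets will be combined. A common choice is Rubin's rules, sketched here purely as an assumption: the pooled estimate is the mean of the per-dataset estimates, and the total variance adds the between-imputation variance, inflated by (1 + 1/m), to the mean within-imputation variance.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates (e.g. log odds ratios) by Rubin's rules.

    estimates : point estimates, one per imputed dataset
    variances : squared standard errors, one per imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log odds ratios from three imputed datasets.
print(pool_rubin([0.1, 0.2, 0.3], [0.01, 0.01, 0.01]))
```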

Adjusted odds ratios (ORs) for COVID-19 severity were estimated from multivariable ordinal logistic regression models. 2 Because the ordinal outcome was assessed over patients' total follow-up period, the model included an offset for (log) follow-up time. Adjusted ORs and relative risks (RRs) for 30-day mortality were estimated from logistic and modified Poisson regression models, respectively. 3 In addition to models minimally adjusted for age and sex, we included all pre-specified covariates in fully adjusted models, given a sufficient number of events (and corresponding degrees of freedom) to enable full multivariable models. Coefficients and standard errors from models with different levels of adjustment, variance inflation factors, and clinical judgement were used to assess model stability. Exploratory analyses with smoothing splines were used to determine the association of age (as a continuous variable) with outcomes, which appeared non-linear (eFigure 2). Linear and quadratic terms for age (centered at 40 years) provided an adequate fit. All other covariates were categorical and were adjusted for using indicator variables for each category other than the reference category. These specifications reflected the assumed functional form for covariates. Note that these unweighted models quantified conditional differences in outcomes between non-Hispanic Black and non-Hispanic White patients, conditional on covariate values.
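The covariate coding described above (linear and quadratic age terms centered at 40 years, and indicator variables for every category except the reference) can be sketched as below. The function names and the smoking categories in the example are hypothetical.

```python
def age_terms(age_years, center=40.0):
    """Linear and quadratic terms for age centered at `center` years."""
    c = age_years - center
    return c, c * c


def indicators(value, categories, reference):
    """0/1 indicators for each category other than the reference."""
    return [1 if value == cat else 0 for cat in categories if cat != reference]


# Hypothetical patient: age 67, former smoker, with "never" as reference.
print(age_terms(67))
print(indicators("former", ["never", "former", "current"], "never"))
```

Centering keeps the linear and quadratic terms less correlated and makes the intercept refer to a 40-year-old patient in the reference categories.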

Upon revision, we performed analyses based on inverse probability of treatment weighted (IPTW) methods. 4 While some authors advocate for the use of methods based on causal inference to assess disparities, 5 others do not recommend these methods when the exposure of interest is intrinsic and not modifiable, which therefore does not allow a meaningful definition for counterfactual outcomes. 6 Because race as recorded in medical records and utilized in this analysis is a social and political construct, it is in theory a modifiable risk factor. 7 First, we estimated propensity scores from a logistic regression model for which the outcome was a binary indicator of non-Hispanic Black versus non-Hispanic White race and the predictors were the minimum sufficient adjustment set of covariates, 5 including age, sex, region of patient residence, smoking status, obesity, cardiovascular and pulmonary comorbidities, renal disease, diabetes mellitus, type of malignancy, ECOG performance status, cancer status, timing and modality of anticancer therapy, month of COVID-19 diagnosis, and calendar time, and without (primary) and with (sensitivity) insurance (with missing or unknown included as an "unknown" category). For each patient, a weight was calculated equal to the reciprocal of the probability of "receiving the treatment" (that is, race) that was "actually received," which was estimated from the propensity score model. Next, we used graphics and summary statistics to evaluate the propensity score model. 8 The empirical distributions of the propensity scores stratified by race were plotted, to evaluate their overlap between groups. Mean propensity scores were calculated stratified by race across quintiles of the propensity scores in the overall cohort, to evaluate balance in the propensity scores between groups. 
Unweighted and weighted absolute standardized mean differences for demographic and clinical characteristics at COVID-19 diagnosis between non-Hispanic Black and non-Hispanic White patients were calculated, to evaluate whether the two groups were balanced on their observed characteristics; an absolute standardized mean difference <0.1 indicated balance.
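As an illustration of the balance check, the sketch below computes the absolute standardized mean difference for a binary characteristic, with optional weights. It uses the common definition |p1 − p0| / sqrt((p1(1 − p1) + p0(1 − p0)) / 2); the source does not give its exact formula, so this definition and the function names are assumptions.

```python
import math


def weighted_mean(x, w):
    """Weighted mean of x with non-negative weights w."""
    return sum(xi * wi for xi, wi in zip(x, w)) / sum(w)


def abs_smd_binary(x, group, weights=None):
    """Absolute standardized mean difference for a 0/1 covariate.

    x       : 0/1 covariate values (e.g. diabetes yes/no)
    group   : 0/1 group indicators (1 = non-Hispanic Black)
    weights : optional IPTW weights; unweighted if omitted
    """
    if weights is None:
        weights = [1.0] * len(x)
    p = {}
    for g in (0, 1):
        xs = [xi for xi, gi in zip(x, group) if gi == g]
        ws = [wi for wi, gi in zip(weights, group) if gi == g]
        p[g] = weighted_mean(xs, ws)
    pooled_sd = math.sqrt((p[1] * (1 - p[1]) + p[0] * (1 - p[0])) / 2)
    return abs(p[1] - p[0]) / pooled_sd


# Hypothetical toy data: 4 patients per group.
x = [1, 1, 0, 0, 1, 0, 0, 0]
group = [1, 1, 1, 1, 0, 0, 0, 0]
print(abs_smd_binary(x, group))                             # unweighted
print(abs_smd_binary(x, group, [1, 1, 1, 1, 3, 1, 1, 1]))   # weighted
```

A value below 0.1 after weighting would be read as balance under the threshold stated above.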

Finally, we estimated IPTW differences in COVID-19 severity between non-Hispanic Black and non-Hispanic White patients from an ordinal logistic regression model that included an offset for (log) follow-up time. Between-group IPTW differences in 30-day all-cause mortality were estimated from both a logistic regression model (to estimate odds ratios) and a modified Poisson regression model (to estimate relative risks). 3 All models included race as the sole covariate, were weighted by the reciprocal of the probability of "receiving the treatment" (that is, race) that was "actually received," and used a robust (a.k.a. sandwich) variance estimator to account for the uncertainty due to estimation of the weights (and for the modified Poisson model, to account for misspecification of the variance structure). Results were reported as odds ratios (or relative risks) with 95% confidence intervals. Note that these weighted models quantified marginal differences in outcomes between non-Hispanic Black and non-Hispanic White patients.
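Because race is the sole covariate in the weighted mortality models, the point estimates can be illustrated without fitting software: with a single binary covariate, the weighted logistic and weighted log-link Poisson fits reproduce the weighted risk in each group, so the odds ratio and relative risk follow from two weighted proportions. The sketch below shows only these point estimates under that reasoning; the robust (sandwich) confidence intervals are not computed, and all names are hypothetical.

```python
def weighted_risk(outcome, weights):
    """Weighted proportion of patients with the event (outcome coded 0/1)."""
    return sum(y * w for y, w in zip(outcome, weights)) / sum(weights)


def iptw_rr_or(outcome, group, weights):
    """Weighted relative risk and odds ratio for group 1 versus group 0."""
    risk = {}
    for g in (0, 1):
        ys = [y for y, gi in zip(outcome, group) if gi == g]
        ws = [w for w, gi in zip(weights, group) if gi == g]
        risk[g] = weighted_risk(ys, ws)
    rr = risk[1] / risk[0]
    odds_ratio = (risk[1] / (1 - risk[1])) / (risk[0] / (1 - risk[0]))
    return rr, odds_ratio


# Hypothetical toy data with unit weights.
outcome = [1, 0, 0, 0, 1, 1, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
print(iptw_rr_or(outcome, group, [1.0] * 8))
```

The gap between the two measures in the toy data shows why both are reported: the odds ratio overstates the relative risk when the outcome is common.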

We evaluated the proportional odds assumption by fitting a set of univariable logistic regression models for all possible cut points of the ordinal COVID-19 severity outcome, with:

Death from any cause: 4
Received mechanical ventilation: 3
Admitted to an intensive care unit: 2
Admitted to the hospital: 1
No complications: 0

That is, for each covariate, we fit a univariable logistic regression model with an offset for (log) follow-up time for each of the four binary outcomes of 4 versus <4, 3 versus <3, 2 versus <2, and 1 versus 0. 9 From each logistic regression model, we obtained the estimated logits (i.e., the log odds of the outcome) for all levels of the covariate. The estimated logits obtained from the 4 versus <4, 3 versus <3, and 2 versus <2 models were compared to those obtained from the 1 versus 0 model via subtraction, plotted, and visually inspected. If the proportional odds assumption was satisfied, then these logit differences would be similar (that is, "proportional") across all covariate levels. There did not appear to be systematic violations of the proportional odds assumption (eFigure 3), including for race; there was a suggestion that the assumption might not be satisfied for Eastern Cooperative Oncology Group (ECOG) performance status.
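A simplified sketch of this check is given below. It makes two assumptions that differ from, or go beyond, the description above: each binary outcome is taken to be severity at or above the cut point versus below it, and the offset for (log) follow-up time is omitted, in which case the fitted logits of a univariable model with one categorical covariate equal the empirical logits within each level. Function names are hypothetical.

```python
import math


def cutpoint_logits(severity, level, cuts=(1, 2, 3, 4)):
    """Empirical log odds of severity >= cut within each covariate level.

    Requires every level to have some patients on each side of every cut,
    otherwise the log odds are undefined.
    """
    out = {}
    for lv in sorted(set(level)):
        ys = [s for s, l in zip(severity, level) if l == lv]
        out[lv] = {}
        for k in cuts:
            p = sum(1 for s in ys if s >= k) / len(ys)
            out[lv][k] = math.log(p / (1 - p))
    return out


def logit_differences(logits, baseline_cut=1):
    """Subtract the baseline-cut logit from the logit at every other cut."""
    return {
        lv: {k: v - by_cut[baseline_cut] for k, v in by_cut.items() if k != baseline_cut}
        for lv, by_cut in logits.items()
    }


# Hypothetical toy data: two covariate levels with identical severity mixes.
sev = [0, 0, 0, 0, 1, 1, 2, 2, 3, 4]
logits = cutpoint_logits(sev + sev, ["A"] * 10 + ["B"] * 10)
print(logit_differences(logits))
```

Under proportional odds, the differences for a given cut point should be roughly the same across covariate levels, which is what the plotted comparison inspects visually.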

Missing or unknown data for prognostic factors and other covariates could arise due to respondent non-response for optional survey questions or a response of unknown; an unknown category was provided for all survey questions. Therefore, we assumed that any missing or unknown data were, at worst, missing at random (i.e., missingness depends on observed data only); these missing or unknown data were imputed as described below. However, unknown ECOG performance status and cancer status could be due to the lack of reassessment after initiating new anti-cancer therapy, mixed findings on scans, and lack of surveillance, among other reasons. Therefore, unknown status could be related to unobserved data (that is, missing not at random) and was not appropriate to impute. Instead, unknown ECOG performance status and unknown cancer status were included as "unknown" categories.

Multiple imputation using additive regression, bootstrapping, and predictive mean matching was used to impute missing and unknown data. 10

0.5396 0.5177

a Percentiles (i.e., quintiles) of propensity scores in the total cohort.

a Propensity scores were estimated from a logistic regression model for race that included age, sex, region of patient residence, smoking status, obesity, cardiovascular and pulmonary comorbidities, renal disease, diabetes mellitus, type of malignancy, Eastern Cooperative Oncology Group performance status, cancer status, timing and modality of anti-cancer therapy, and month of COVID-19 diagnosis. ECOG, Eastern Cooperative Oncology Group.

a Weighted absolute standardized mean differences were weighted by the reciprocal of the probability of "receiving the treatment" (that is, race) that was "actually received," which was estimated from a propensity score model for race that included age, sex, region of patient residence, smoking status, obesity, cardiovascular and pulmonary comorbidities, renal disease, diabetes mellitus, type of malignancy, Eastern Cooperative Oncology Group performance status, cancer status, timing and modality of anti-cancer therapy, and month of COVID-19 diagnosis, and without (primary) and with (sensitivity) insurance.

               Targeted  Endocrine  Immunotherapy  Local  Other
Cyto                812        812            613    686    527
Targeted              -        741            590    713    465
Endocrine             -          -            502    591    368
Immunotherapy         -          -              -    452    188
Local                 -          -              -      -    330

1. The COVID-19 and Cancer Consortium. A systematic framework to rapidly obtain data on patients with cancer and COVID-19: CCC19 governance, protocol and quality assurance
2. Estimation of the probability of an event as a function of several independent variables
3. A modified Poisson regression approach to prospective studies with binary data
4. The central role of the propensity score in observational studies for causal effects
5. Disparities in defining disparities: Statistical conceptual frameworks
6. Causal inference based on counterfactuals
7. Epidemiologic analysis of racial/ethnic disparities: some fundamental issues and a cautionary example
8. Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies
9. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis
10. Multiple imputation using chained equations: Issues and guidance for practice

a Non-analytic records are case reports that did not meet data-quality metrics (eTable 1).

Quality score  Definition
0              No problems identified
1              1 minor problem
2              2 minor problems
3              3 minor problems or 1 moderate problem
4              4 minor problems or 1 moderate problem and 1 minor problem
5              5 minor problems or 1 moderate problem and 2 minor problems or 1 major problem
6              As above with additional problems

Minor problems were valued at 1 point, moderate problems at 3 points, and major problems at 5 points. Reports with a quality score of >4 were excluded from the analysis.

Data presented as n (%).
a Respondents were instructed to report the earliest measured laboratory measurements during COVID-19 course. Except for low absolute lymphocyte count (ALC), which was centrally defined as ALC < 1500/µL, ascertainment of upper and lower limits of normal was left to the discretion of respondents. Laboratory measurements were summarized among hospitalized patients only due to common clinical practice to avoid a laboratory blood draw for outpatients.
b Low absolute lymphocyte count is defined as less than 1500/µL.
c As defined by the reporting institution's normal laboratory value ranges.
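The quality-score rule described above (minor problems 1 point, moderate 3 points, major 5 points, with reports scoring above 4 excluded) can be sketched as a simple sum. The names below are hypothetical, and whether the registry caps the score at 6 is not stated, so no cap is applied here; the exclusion decision is unaffected either way.

```python
# Points per problem severity, as stated in the table note.
POINTS = {"minor": 1, "moderate": 3, "major": 5}


def quality_score(problems):
    """Sum the points for a list of problem severities."""
    return sum(POINTS[p] for p in problems)


def is_analytic(problems, max_score=4):
    """True if the report is kept (quality score not greater than 4)."""
    return quality_score(problems) <= max_score


# Hypothetical reports.
print(is_analytic(["moderate", "minor"]))  # score 4, kept
print(is_analytic(["major"]))              # score 5, excluded
```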

Total         Non-Hispanic Black  Non-Hispanic White
N a  n (%)    N a  n b (%)        N a  n b (%)

ARDS, acute respiratory distress syndrome; NOS, not otherwise specified.
a Number of patients with non-missing data.
b Groups with fewer than 5 patients were masked (i.e., <5) to minimize the risk of reidentification as per CCC19 policy.
c These are collected as separate complications but given the difficulty in radiographically distinguishing pneumonia from pneumonitis, they are combined here.