key: cord-305297-ync3dhyz
authors: Flanders, W. Dana; Flanders, William D.; Goodman, Michael
title: The Association of Voter Turnout with County-level COVID-19 Occurrence Early in the Pandemic
date: 2020-07-01
journal: Ann Epidemiol
DOI: 10.1016/j.annepidem.2020.06.011
sha:
doc_id: 305297
cord_uid: ync3dhyz

PURPOSE The ongoing coronavirus disease 2019 (COVID-19) pandemic has severely impacted both health and the economy. Absent an effective vaccine, the preventive measures used, some of which are being relaxed, have included school closures, restriction of movement, and banning of large gatherings. Our goal was to estimate the association of voter turnout with county-level COVID-19 risks.

METHODS We used publicly available data on voter turnout in the March 10 primary in three states, COVID-19 confirmed cases by day and county, and county-level census data. We used zero-inflated negative binomial regression to estimate the association of voter turnout with COVID-19 incidence, adjusted for county-level population density and the proportions of the population who were over age 65 years, female, Black, college educated, high school educated, poor, obese, and smokers.

RESULTS COVID-19 risk was associated with voter turnout, most strongly in Michigan during the week starting 3 days postelection (risk ratio, 1.24; 95% confidence interval, 1.16-1.33). For longer periods, the association was progressively weaker (risk ratio 0.98-1.03).

CONCLUSIONS Despite increased absentee-ballot voting in the primary, our results suggest that, in at least one state, voter turnout was associated with a detectable increase in risk, perhaps due to greater exposures related to the primary.

Although coronavirus-induced epidemics occur periodically, 1 coronavirus disease 2019 (COVID-19) has had an especially severe public health impact.
2, 3 Compared to earlier coronavirus epidemics, such as severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS), a greater proportion of COVID-19 cases are infectious while still asymptomatic. 4, 5 As a result, unlike SARS and MERS, which were mainly associated with nosocomial spread, the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), which causes COVID-19, is more easily transmitted in the community. 6, 7 Based on experience with other infectious disease epidemics, community transmission typically displays pronounced spatial heterogeneity that depends on two key factors: where people live and how they move or gather. 8, 9 The latter consideration is the main justification for social distancing measures, like school and university closures, cancelation of planned events, and restriction of movement. 10-13 Despite these measures, population gatherings, including family events like funerals and birthdays, facilitate SARS-CoV-2 transmission in various settings. 14, 15

While the current wave of the COVID-19 pandemic is expected to subside, lower-level transmission will likely continue and a second wave in the fall is possible. 16 If so, it could directly affect the presidential election, scheduled for November 3, 2020. 17 In planning for the November election, it may be helpful to consider the recent experience in the states that held primary elections on March 10, 2020, as planned. Data from these states offer an opportunity to investigate the potential impact of the reduction in social distancing that might be caused by in-person voting. The impact, if any, may inform the administrative and logistical measures that need to be considered for November. To estimate the impact of elections on county-level COVID-19 incidence, we considered voter turnout differences in the March 10 primary, by county, across the three states for which the required information was available.
The purpose of the current analysis is to investigate whether the increase in cases that was seen in all states after the election was greater in counties with higher voter turnout, after accounting for other relevant county-level population characteristics.

Cases: We obtained information on confirmed COVID-19 cases by county from two sources: USAFacts 18 (Table 1) and the Johns Hopkins University Center for Systems Science and Engineering project 19 (JHU-CSSE; Table 1). Both sources report confirmed case counts by county and day, using data from the US Centers for Disease Control and Prevention (CDC) and from state and local governments. (More recently, CDC began reporting both confirmed and probable cases, but these changes do not affect the present report.) Although numbers from USAFacts and JHU-CSSE differ (see Table 3), results using either source led to the same conclusions, and therefore the main analyses use the JHU-CSSE data. We restricted analyses to the three states that held primary elections on March 10, 2020 and had the required information as of May 5, 2020: Michigan, Mississippi, and Missouri. Although they also held a primary on March 10, we excluded Idaho because not all counties had completed tallies; Washington because voters cast ballots by mail; and North Dakota because it has caucuses, not primary elections.

Voter turnout and Covariates: Information on voter turnout was gathered from CNN's Election 2020 Primary database, which provides the results of the Republican and Democratic primaries at the county level. County-level turnout in Missouri and Mississippi is used as a surrogate, albeit imperfect, measure of in-person voting and the potential for increased exposure to COVID-19. In Michigan, we had counts of by-mail and in-person voting by county, which we obtained from the Michigan Bureau of Elections. Most of the demographic covariates were gathered from county-level US Census data.
County-specific obesity and smoking prevalence estimates were obtained from the CDC Behavioral Risk Factor Surveillance System (Table 1).

Statistical Analyses: To estimate the association between voter turnout on March 10 and county-level COVID-19 infection risk, we considered the time period during which excess cases, if any, would be expected to occur. Therefore, we considered published estimates of the incubation period (the time from infection to symptom onset) and of the interval from symptom onset to the development of dyspnea or hospitalization. These estimates are relevant because CDC, the main information source for JHU-CSSE (and USAFacts), linked some cases with the date reported, not necessarily with the symptom onset date. Estimates of the median incubation period range from about 5 days 20-24 to 6.5 or more days. 25 Based on this information, in the main analyses we considered COVID-19 cases reported through March 10 to be "pre-vote" and considered cases reported on March 13 and after as "post-vote". We calculated case counts for each of a series of risk periods, ranging in length from 5 to 12 days, as indicated in Tables 3 and 4. We chose these risk periods to allow for the development of symptoms (incubation period) plus time for obtaining and reporting a test result after symptom onset. Collectively, the risk periods considered (e.g., Tables 3 and 4) could include poll-associated cases that were tested as soon as 3-4 days and as long as 15 days after voting. These periods allow for the median incubation period (5-6 days) plus up to an additional 10 days for testing and reporting, because, for part of this time, CDC apparently reported counts by reported date (not necessarily the symptom onset date). 28 Cases reported on March 11 and March 12 were not included in the analyses because these individuals could have become infected pre-vote. In sensitivity analyses, we considered alternative risk periods to define "post-vote" cases, as described under "alternative" outcomes.
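The windowing rule described above (pre-vote through March 10, March 11-12 excluded, post-vote from March 13) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the assumption that every risk period starts on March 13, and the toy cumulative series are all hypothetical and serve only to show how counts per risk period could be derived from cumulative daily counts.

```python
from datetime import date, timedelta

VOTE_DAY = date(2020, 3, 10)
POST_VOTE_START = date(2020, 3, 13)  # cases reported March 11-12 are excluded


def new_cases(cumulative, start, end):
    """Cases reported from start to end (inclusive), given cumulative counts by date."""
    day_before = start - timedelta(days=1)
    return cumulative[end] - cumulative.get(day_before, 0)


def risk_period_counts(cumulative, lengths=range(5, 13)):
    """Map each risk-period length in days (5 to 12) to its post-vote case count.

    Assumption for illustration: every period starts on March 13.
    """
    counts = {}
    for n_days in lengths:
        end = POST_VOTE_START + timedelta(days=n_days - 1)
        counts[n_days] = new_cases(cumulative, POST_VOTE_START, end)
    return counts


# Hypothetical county: cumulative confirmed cases, one new case per day from March 9.
series = {date(2020, 3, 1) + timedelta(days=i): max(0, i - 7) for i in range(31)}
pre_vote = series[VOTE_DAY]              # cumulative cases reported through March 10
post_vote = risk_period_counts(series)   # counts for the 5- to 12-day risk periods
```

Cases reported on March 11 and 12 fall in neither `pre_vote` nor `post_vote`, which mirrors the exclusion stated in the text.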
To control for potential confounding of the voter turnout-COVID-19 association, we generated a list of variables that were suspected risk factors for COVID-19 infection. The list included state and 10 county-level population characteristics: density, pre-vote COVID-19 risk, percent female, percent black, percent older than 65 years, percent "poor" (living below the Federal poverty line), percent obese, percent smokers, percent with a college education, and percent with a high school education. From this list we selected, a priori, the most important risk factors for COVID-19 that may also influence county-level voter turnout; these were, in addition to state, county population density, percent female, percent black, percent older than 65, and percent poor.

Descriptive statistics were computed to characterize the distributions of the counts of post-polling cases, voter turnout, and covariates. Simple linear regression was used to describe the association of voter turnout with the post-vote COVID-19 risk (Figure 1). The primary analyses utilized a zero-inflated negative binomial regression model to account for a more-than-expected number of counties with a 0 count and because of improved fit compared to Poisson models (based on corrected AIC and Pearson's chi-square). The logarithm of the population size was an offset in all models.

Alternative outcomes: We reasoned that if the association of voter turnout with COVID-19 occurrence in the post-vote period was attributable to uncontrolled confounding, then that association should persist even if we redefined the outcome as cases occurring in risk periods (e.g., in early April) that did not overlap substantially with the time interval of interest (the incubation period plus some allowance for testing and reporting). That is, if confounding (e.g., behavioral patterns, pre-existing disease and other risk factors) explained the observed association, we would expect the association to persist, likely not substantially weaker, even long after the polling. In contrast, if the association were causal, we would expect a meaningfully weaker or no association for intervals that started well after the incubation period. The online supplement further discusses "alternative" outcomes and the similarity to, and relationship with, negative control outcomes that motivates them. 29 Events during the alternative risk periods are "alternative outcomes" rather than "negative control outcomes", because these "alternative outcomes" could still be weakly affected by voting patterns. These alternative outcomes are the COVID-19 counts during each of six periods beginning April 3, more than three weeks after the elections. In Michigan, we use the by-mail voter turnout as a negative control exposure, as discussed in the Online Supplement. 29-31

Zero-inflated models allow for the possibility that a separate statistical process can account for an excess of counties with no cases, that is, more than expected under a negative binomial distribution alone. In the JHU-CSSE data, only one county (in Missouri) reported pre-vote cases and it also had post-vote cases, making pre-vote cases a strong predictor of not being in the zero-class (part of the zero-inflated model). Therefore, we excluded that county from the main analyses, although when it was included in supplemental analyses, results were much like those obtained without it (Supplemental Table 3C). The results of each analysis are expressed as a risk ratio (RR) representing the average difference in post-election COVID-19 risk per one percent difference in voter turnout. Risk ratios are accompanied by 95% confidence intervals (CI) and corresponding p-values.
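To make the model structure described above concrete, the following is a minimal sketch of a zero-inflated negative binomial likelihood with a log-population offset, written in plain Python. It is an illustration under stated assumptions, not the authors' SAS or R code: the parameterization (mean `mu`, dispersion `alpha`, a single structural-zero probability `pi_zero`, matching an intercept-only zero-inflation part), the function names, and all numeric values are hypothetical.

```python
import math


def nb_log_pmf(y, mu, alpha):
    """Log pmf of a negative binomial with mean mu and variance mu + alpha * mu**2."""
    r = 1.0 / alpha
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def zinb_log_pmf(y, mu, alpha, pi_zero):
    """Log pmf of the zero-inflated model; pi_zero is the structural-zero probability."""
    if y == 0:
        # A zero can come from the structural-zero class or from the count process.
        return math.log(pi_zero + (1.0 - pi_zero) * math.exp(nb_log_pmf(0, mu, alpha)))
    return math.log(1.0 - pi_zero) + nb_log_pmf(y, mu, alpha)


def expected_count(population, turnout_pct, b0, b_turnout):
    """Count-model mean with offset: log(mu) = log(population) + b0 + b_turnout * turnout."""
    return population * math.exp(b0 + b_turnout * turnout_pct)


def risk_ratio_ci(beta, se, z=1.96):
    """Risk ratio per one-point turnout difference, with a Wald-type 95% interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

Because population enters only through the offset, the coefficient on turnout acts on risk (cases per person), so `exp(b_turnout)` is the risk ratio per one-point difference in turnout. In practice the parameters would be estimated by maximizing the sum of `zinb_log_pmf` over counties with a statistical package; the sketch only shows what is being maximized.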
Regression diagnostics included assessing correlation between the independent variables, residual analyses, sensitivity analyses and identification of potentially influential points. Analyses were done using SAS version 9.4 (SAS Institute, Cary, NC) and R version 4.0.

To assess sensitivity of results to model specification and residual confounding, we conducted two main types of sensitivity analyses (please see Online Supplement). First, we used a different model in which we defined ordinal outcomes by grouping county-specific COVID-19 risks and assumed a multinomial distribution with a cumulative logit link. Second, we allowed for within-state correlation of county rates and controlled for additional covariates using a random-effects, zero-inflated negative binomial model.

COVID-19 counts obtained from the two sources, JHU-CSSE and USAFacts, were highly correlated (e.g., r>0.99) but not identical. For example, from March 13 to March 23, 1668 cases were reported using JHU-CSSE data and 704 using USAFacts data. Our results are similar regardless of the data source, so we primarily focus on the JHU-CSSE data. Only St. Louis County, MO had reported COVID-19 cases prior to March 10 in the JHU-CSSE data. In the post-election risk periods (e.g., March 13-March 23), Michigan had more cases than Missouri and Mississippi, averaging more than 15 per county. States had different county-level population densities (Table 2). Voter turnout also varied noticeably across counties (Table 2), and was often higher in Michigan and Mississippi counties than in Missouri counties.

Voter turnout was associated with COVID-19 risks in the unadjusted analyses (Figure 1). After adjustment for population density, the association differed by state (Tables 3A-3D; see next paragraph for test results). The association was strongest in Michigan, especially for the more restricted intervals soon after the election, and was progressively weaker for longer intervals (Table 3B).
The association was weaker in Missouri than in Michigan, and essentially null in Mississippi (Tables 3C-3D), where most of the risk ratios were 1.0 or less for both the main and the alternative outcomes.

After additionally controlling for percent female, percent over age 65 years, percent black, percent below the poverty line, and state, voter turnout was associated with risk, considering all states together (Table 4A). However, the association with turnout varied by state (for most of the main outcomes, the p-value for interaction was less than 0.05 by likelihood ratio test). Therefore, we present state-specific results. With control for these same a priori variables, Michigan turnout was associated with higher COVID-19 risk in most of the risk periods soon after the election (Table 4B). Many risk ratios were as large as or larger than before this additional adjustment, although confidence intervals were wider. Nevertheless, the lower bound of the confidence interval was greater than one for most risk periods, and the previously observed pattern, with risk ratios tending to be closer to 1.0 for longer at-risk intervals, persisted after the additional adjustment. In Mississippi, the association was consistent with the null (Table 4D), and in Missouri risk ratios for the main outcome ranged from 0.99 to over 1.1 (Table 4C). In contrast to Michigan, the RRs in Missouri tended to be larger with the longer at-risk intervals (e.g., March 13-March 23).

To assess the pattern of risk, we grouped counties into tertiles based on voter turnout. In Michigan, risk increased monotonically in most intervals with the tertile of voter turnout (Supplemental Table 1). In Missouri and Mississippi, patterns were not monotonic but were consistent with no association (Online Supplement). The patterns were consistent with an approximately linear relationship, supporting use of a linear term to represent voter turnout in the models (further detail in the Online Supplement).
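The tertile check described above can be sketched as follows. This is an illustrative Python example rather than the authors' procedure: the cut-point method (the default of `statistics.quantiles`), the pooling of cases and population within each tertile, and the nine toy counties are assumptions made for the sketch.

```python
from statistics import quantiles


def tertile_risks(counties):
    """counties: list of (turnout_pct, cases, population) tuples.

    Returns the pooled risk per 100,000 within each turnout tertile (1 = lowest).
    """
    low_cut, high_cut = quantiles([t for t, _, _ in counties], n=3)
    totals = {1: [0, 0], 2: [0, 0], 3: [0, 0]}
    for turnout, cases, population in counties:
        tertile = 1 if turnout <= low_cut else 2 if turnout <= high_cut else 3
        totals[tertile][0] += cases
        totals[tertile][1] += population
    return {k: 1e5 * c / p for k, (c, p) in totals.items()}


def is_monotonic_increasing(risks):
    """True when risk does not decrease from the lowest to the highest tertile."""
    return risks[1] <= risks[2] <= risks[3]


# Hypothetical counties: turnout 10%..90%, equal populations, cases rising with turnout.
toy = [(10 * k, k, 100_000) for k in range(1, 10)]
risks = tertile_risks(toy)
```

A roughly even step in risk from one tertile to the next, as in this toy data, is the kind of pattern that would support representing turnout with a single linear term.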
Alternative Outcomes: In the alternative outcome analyses, we defined post-voting cases as those occurring in several periods starting on April 3. Controlling for the same variables as in the a priori model, each of the risk ratios was close to 1.0 (bottom rows of Tables 4A-4C) and all were consistent with no association.

Results of the supplemental analyses were generally consistent with the main results (Online Supplement). For example, for the March 13-March 20 risk period in Michigan, using ordinal logistic regression, the odds ratio relating voter turnout and COVID-19 risk was 1.13 (95% CI: 0.97, 1.32; p=0.11). In other words, for this risk period, the estimated odds that a county had a particular risk level or higher, relative to those odds for a county where the voter turnout was about 1 percentage point lower, was 1.13, consistent with a substantial positive association. Alternatively, we adjusted for additional covariates (proportions with college education, with a high school education, who smoke and who are obese) using random-effects, zero-inflated, negative binomial models and obtained risk ratios consistent with those in Tables 4A-4B. (Some supplemental analyses, however, involved models with relatively few observations per parameter, so we view them as secondary; Online Supplement.)

Our results differ by state. For Michigan, to a lesser extent for Missouri, and not at all for Mississippi, they suggest that counties with a larger voter turnout had higher COVID-19 risks over an approximately one- to two-week period beginning a few days after the voting. Our negative control exposure analyses do not permit a definite conclusion. Initial analyses identified positive associations of risk with the negative control, which could indicate confounding, but the associations disappeared or reversed with exclusion of one influential county (please see Online Supplement).
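The odds-ratio interpretation given above for the ordinal (cumulative logit) sensitivity analysis can be written compactly. The notation here is ours and is only a sketch: $Y_i$ is the ordinal risk level of county $i$, $x_i$ its voter turnout in percentage points, $\alpha_j$ the level-specific intercepts, and covariates are omitted for brevity.

```latex
\operatorname{logit} P(Y_i \ge j \mid x_i) = \alpha_j + \beta x_i, \qquad j = 2, \dots, J,
\qquad\text{so that}\qquad
\frac{\operatorname{odds}(Y_i \ge j \mid x_i + 1)}{\operatorname{odds}(Y_i \ge j \mid x_i)} = e^{\beta}.
```

Under this proportional-odds form the ratio is the same for every level $j$, so the reported odds ratio of 1.13 corresponds to $\hat\beta = \ln 1.13 \approx 0.12$ per percentage point of turnout.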
Results are compatible either with some residual confounding or with circumstantial events (cases) in a single county and no residual confounding.

It is important to consider possible reasons that might explain the heterogeneity of the association. One possibility is that Mississippi, where no association was seen, and Missouri, where a weak or minimal association was seen, had few infectious cases on March 10. If so, we would expect to see little or no association with voting in these states. This hypothesis is plausible because the number of cases in the JHU-CSSE data was 5- to more than 10-fold greater in Michigan than in the other two states during the first weeks after the election. However, we cannot dismiss other possibilities such as greater confounding in Michigan, use of overall voter turnout as a surrogate for in-person voting in Mississippi and Missouri, differences in social distancing, or the role of chance.

Our analyses for Michigan suggest that a substantial increase in risk, perhaps as much as 20%, may be associated with higher voter turnout for part of the two-week risk period after the election (the median RR for the main outcomes in Table 4B is 1.2). We view this estimate as very approximate, due to fairly wide confidence intervals, more moderate RRs in some models (Online Supplement), and its model dependency. Our results also suggest no increase in Mississippi and little, if any, increase in Missouri. The increase in risk we observed in Michigan was predominantly restricted to the 1-2 weeks after March 13, lending strength to our findings. Importantly, the increase was small or absent either when we included additional days or when we defined the outcome using periods starting in April (alternative outcome analyses).
Although the chain of transmission attributable to the voting would be expected to continue after the early post-vote period, the effect of voting should be less apparent because its contribution relative to other factors influencing community spread should diminish. We expect the increases would become more diffuse and less county-specific with increasing time, a pattern compatible with our observations. In analogy with negative control outcomes, 29 the weak to null association with the alternative outcomes provides support for the interpretation that residual confounding was not important.

An important limitation of our study is the absence of individual-level data and contact tracing. However, our purpose was not to assess the association between an individual's participation in voting and COVID-19 risk. Rather, we sought to assess population-level differences in COVID-19 risks that can be attributed to different extents of population participation in a single event. We acknowledge that without individual-level data, we cannot adjust for patterns of risk within counties that depend on the joint distribution of covariates or for unknown or unmeasured confounders that differ across counties. 32, 33 A second limitation is that voter turnout imperfectly measures the proportion of the population going to the polls, and our direct measure of in-person voting is available only in Michigan. (However, this difference in type of information may have had a modest impact: when we used overall voter turnout in place of in-person turnout in Michigan, the risk ratios for the main outcomes were roughly 25% closer to the null, but still meaningfully elevated; data not shown.) A further limitation is that, particularly for state-specific estimates, our ability to control for confounders was limited by the number of observations (counties).
These limitations notwithstanding, the weak to absent associations with the alternative outcomes in Michigan and the pattern of effects, with greater increases observed soon after the election followed by smaller increases as the intervals of study expanded, provide some evidence, although indirect, that residual confounding or ecologic bias may not be important threats to the validity of the observed results. In Missouri, the lack of a consistent pattern and an association of turnout with an alternative outcome in supplemental analyses (Online Supplement) suggest that the observed associations there may reflect chance, residual confounding or some other phenomenon driving the epidemic.

In summary, we reiterate concerns noted in an open letter from multiple public health officials to the US Senate and House of Representatives 34 that going to the polls in an election can be associated with increased risk of SARS-CoV-2 transmission. Although our study of the March 10 primary elections had limitations, the results are consistent with the concern that higher in-person voter turnout may have led to increases in local risk of infection in at least one state. Depending on the situation as the next vote nears, voters may wish to consider taking advantage of absentee ballots or other available voting options.

The SARS-CoV-2 outbreak: what we know
An interactive web-based dashboard to track COVID-19 in real time. The Lancet Infectious diseases
Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72 314 cases from the Chinese Center for Disease Control and Prevention
Clinical characteristics of 2019 novel coronavirus infection in China
Presymptomatic Transmission of SARS-CoV-2-Singapore
COVID-19, SARS and MERS: are they closely related? Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases
The reproductive number of COVID-19 is higher compared to SARS coronavirus
Differential mobility and local variation in infection attack rate
What is a Hotspot Anyway? The American journal of tropical medicine and hygiene
COVID-19 control in China during mass population movements at New Year. The Lancet
Only strict quarantine measures can curb the coronavirus disease (COVID-19) outbreak in Italy, 2020. Euro surveillance : bulletin Europeen sur les maladies transmissibles = European communicable disease bulletin
Timing of Community Mitigation and Changes in Reported COVID-19 and Community Mobility -Four
Association of public health interventions with the epidemiology of the COVID-19 outbreak in Wuhan
Community Transmission of SARS-CoV-2 at Two Family Gatherings
COVID-19 -the role of mass gatherings. Travel medicine and infectious disease
Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period
Mass casualty event scenarios and political shifts: 2020 election outcomes and the U.S. COVID-19 pandemic
CSSE: JHU-CSSE. Time series summary (csse_covid_19_time_series)

Table 4A. Risk ratios for all states, measuring the association of voter turnout with COVID-19 risk. We defined the outcome (case count) using different at-risk periods and adjusted for population density, state and demographic covariates. †
Columns: At-risk period used to define the case count; Analyses ‡ using COVID-19 case counts from JHU-CSSE 1
1 Counts of confirmed COVID-19 cases from GitHub.
† Adjusted for state, logarithm of population density, proportion female, proportion over age 65 years, proportion black and proportion living below the poverty line, in the count model. The zero-inflation model included the intercept only.
‡ RR is the risk ratio; CI is the 95% confidence interval.

Table 4B. Risk ratios in Michigan, measuring the association of voter turnout with COVID-19 risk. We defined the outcome (case count) using different at-risk periods and adjusted for population density, state and demographic covariates. †
Columns: At-risk period used to define the case count; Analyses ‡ using COVID-19 case counts from JHU-CSSE 1
1 Counts of confirmed COVID-19 cases from GitHub.
† Adjusted for logarithm of population density, proportion female, proportion over age 65 years, proportion black and proportion living below the poverty line, in the count model. The zero-inflation model included the intercept only.
‡ RR is the risk ratio; CI is the 95% confidence interval.

Table 4C. Risk ratios in Missouri, measuring the association of voter turnout with COVID-19 risk. We defined the outcome (case count) using different at-risk periods and adjusted for population density, state and demographic covariates. †
Columns: At-risk period used to define the case count; Analyses ‡ using COVID-19 case counts
Counts of confirmed COVID-19 cases downloaded from GitHub.
† Adjusted for logarithm of population density, proportion female, proportion over age 65 years, proportion black and proportion living below the poverty line, in the count model. The zero-inflation model included the intercept only.
‡ RR is the risk ratio.

Table 4D. Risk ratios in Mississippi, measuring the association of voter turnout with COVID-19 risk. We defined the outcome (case count) using different at-risk periods and adjusted for population density, state and demographic covariates. †
Columns: At-risk period used to define the case count; Analyses ‡ using COVID-19 case counts (one surviving entry: 95% CI = (0.91, 1.04))
Counts of confirmed COVID-19 cases from GitHub.
† Adjusted for logarithm of population density, proportion female, proportion over age 65 years, proportion black and proportion living below the poverty line, in the count model.
The zero-inflation model included the intercept only.
‡ RR is the risk ratio; CI is the 95% confidence interval.