key: cord-347079-1zbsbcdd authors: Silverman, Justin D.; Hupert, Nathaniel; Washburne, Alex D. title: Using influenza surveillance networks to estimate state-specific prevalence of SARS-CoV-2 in the United States date: 2020-06-22 journal: Sci Transl Med DOI: 10.1126/scitranslmed.abc1126 sha: doc_id: 347079 cord_uid: 1zbsbcdd Detection of SARS-CoV-2 infections to date has relied heavily on RT-PCR testing. However, limited test availability, high false-negative rates, and the existence of asymptomatic or sub-clinical infections have resulted in an under-counting of the true prevalence of SARS-CoV-2. Here, we show how influenza-like illness (ILI) outpatient surveillance data can be used to estimate the prevalence of SARS-CoV-2. We found a surge of non-influenza ILI above the seasonal average in March 2020 and showed that this surge correlated with COVID-19 case counts across states. If 1/3 of patients infected with SARS-CoV-2 in the US sought care, this ILI surge would have corresponded to more than 8.7 million new SARS-CoV-2 infections across the US during the three-week period from March 8 to March 28, 2020. Combining excess ILI counts with the date of onset of community transmission in the US, we also show that the early epidemic in the US was unlikely to have been doubling slower than every 4 days. Together these results suggest a conceptual model for the COVID-19 epidemic in the US characterized by rapid spread across the US with over 80% infected patients remaining undetected. We emphasize the importance of testing these findings with seroprevalence data and discuss the broader potential to use syndromic surveillance for early detection and understanding of emerging infectious diseases. The ongoing severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2) pandemic continues to cause substantial morbidity and mortality around the world [1, 2] . 
Regional preparation for the pandemic requires estimating the growth rate of the epidemic, the timing of the epidemic peak, the demand for hospital resources, and the degree to which current policies may curtail the epidemic, all of which benefit from accurate estimates of the true prevalence of the virus within a population [3] . Confirmed cases are thought to be underestimates of true prevalence due to some unknown combination of patients not reporting for testing, testing not being conducted, and false-negative test results. Estimating the true prevalence of SARS-CoV-2 would inform the scale of upcoming surges in hospital demand, the proportion of individuals who remain susceptible to contracting the disease, and estimates of key epidemiological parameters such as the epidemic growth rate and the fraction of infections that are subclinical. The current literature suggests that the predominant symptoms associated with COVID-19 are fever, cough, and sore throat; that is, patients often present with an influenza-like illness (ILI) yet test negative for influenza [4, 5] . As COVID-19 often presents with similar symptoms to influenza, existing surveillance networks in place for tracking influenza could be used to help track COVID-19. Outpatient ILI surveillance has proven to be a useful tool for assessing the impact of influenza [6, 7] . When combined with the number of providers and patients in a given region, ILI surveillance allows estimation of influenza prevalence and severity [8, 9, 10, 11, 12, 13, 14] . Studies of outpatient ILI have repeatedly demonstrated that confirmed influenza case rates underestimate disease burden, likely due to preferential testing of more severe cases [8, 14, 9, 13 ] . Together these features suggest that ILI surveillance could provide a crucial tool for estimating COVID-19 prevalence within the US. 
Here, we quantified the baseline prevalence of non-influenza ILI in the US over the past 10 years and identified a recent surge of non-influenza ILI starting the first week of March, 2020. This surge of excess ILI correlated with known patterns of SARS-CoV-2 spread across states within the US yet was orders of magnitude larger than the number of confirmed COVID-19 cases reported by the end of March. [...] admitted to the hospital to decrease. However, although the daily number of ILI visits to emergency departments across New York City increased in March 2020, the proportion of those patients who went on to be admitted also increased by as much as 3-fold compared to the baseline rate prior to March (fig. S3A). This observation suggests that patients with mild ILI presented less often to hospital emergency departments. Such a decrease in care-seeking behavior for mild ILI, if similar across the US, could deflate the estimated size of the ILI surge in the later weeks of March by a factor of approximately 3. If non-ILI patients were less likely to seek medical care, then we would expect that the number of patients complaining of other symptoms not typically associated with COVID-19 (for example vomiting) would also decrease compared to prior years. In the month of March, the daily number of patients presenting with vomiting decreased by as much as a factor of 3 compared to the baseline rate in prior years (fig. S3B). Assuming that all non-ILI conditions were similarly decreased during March, this would suggest that our estimates of the ILI surge could be inflated by as much as a factor of 3. This assumption is conservative as it assumes that even individuals with severe conditions (such as severe trauma) would avoid seeking health care in response to COVID-19 at the same rate as those with milder conditions such as vomiting. 
However, the potential 3-fold decreased care-seeking behavior for non-ILI conditions cancels out the potential 3-fold decreased care-seeking behavior of mild ILI, suggesting that our estimates of prevalence based on the ILI surge may be insensitive to recent changes in care-seeking behavior (fig. S3C). Overall these estimates suggest a conceptual model in which health care utilization for both mild ILI and non-ILI conditions declined at similar rates as COVID-19 increased in the US. To estimate the proportion and magnitude of the March 2020 US ILI surge attributable to SARS-CoV-2 infections, we made the following three assumptions: (1) that the patient population reported by sentinel providers is representative of their state each week; (2) that changes in care-seeking behavior of ILI patients is occurring at a similar rate as that of other non-ILI patients; and (3) that the total number of patients in the US who require medical care over the course of a year has not substantially changed since 2018. Our first assumption is common and underlies prior studies which have used ILI to estimate influenza prevalence [8, 14] . Our second assumption is supported by our New York City analysis which suggests that both mild ILI and non-ILI conditions have seen similar changes in healthcare seeking behavior. Our third assumption is based on the observation that the increasing need for health-care between March 8 and March 28, 2020 due to COVID-19 is likely small compared to the approximately 1 billion outpatient encounters that occur annually [18, 19] . 
These assumptions, together with surveys describing the average number of patients seen by providers [19], the number of providers in each state [20], and the total number of outpatient visits per year [21, 18], allowed us to estimate that, if outpatient clinics remained open during the COVID-19 epidemic, there would have been approximately 2.8 million patient encounters with ILI due to COVID-19 between March 8 and March 28, 2020 (95% credible set 2.6 million to 3.0 million). Not all patients infected with SARS-CoV-2 will present to a health-care provider with ILI. Although we cannot directly measure the rate of such sub-clinical cases, a number of prior studies on asymptomatic rates of COVID-19 and the care-seeking behavior of ILI patients in the US suggest a lower bound on the subclinical rate of patients with ILI. A recent study of passengers on the Diamond Princess cruise ship accounted for right-censoring of the patients sampled and estimated that 18% of patients infected with SARS-CoV-2 are asymptomatic for the course of their infection (95% credible set 16% to 20%). This estimate likely represents an underestimate given that the majority of passengers were over 60 years old, a demographic thought to have a lower asymptomatic rate than younger individuals [22]. Beyond asymptomatic individuals, a large study of adult health-care seeking behavior in the United States found that, of a random sample of over 17,000 individuals with ILI, 40% went on to seek health care [23]. Together these additional contributions from sub-clinical cases correspond to a mean clinical rate of 32% (the overall rate at which SARS-CoV-2 cases seek medical care) and a lower bound of 8.7 million SARS-CoV-2 infections between March 8 and March 28 (95% credible set 8.0 million to 9.4 million). Prevalence estimates for each state within this time period are shown in fig. S4. 
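The arithmetic behind the adjustment above can be sketched in a few lines. This is a minimal point-estimate illustration using only the figures quoted in the text (18% asymptomatic, 40% care-seeking, 2.8 million encounters). The function names are hypothetical, and multiplying the two rates is an assumed reading of how they combine. The paper propagates uncertainty by Monte Carlo simulation, so its reported 8.7 million need not match this simple calculation exactly.

```python
# Values quoted in the text; how they are combined below is an assumption.
asymptomatic_rate = 0.18        # Diamond Princess estimate
care_seeking_rate = 0.40        # share of symptomatic ILI patients who seek care
unadjusted_encounters = 2.8e6   # ILI encounters attributed to COVID-19, March 8-28


def clinical_rate(asymptomatic, care_seeking):
    """Fraction of all infections that are symptomatic AND seek care."""
    return (1.0 - asymptomatic) * care_seeking


def adjusted_infections(encounters, rate):
    """Scale observed clinical encounters up to total infections."""
    if not 0.0 < rate <= 1.0:
        raise ValueError("clinical rate must lie in (0, 1]")
    return encounters / rate


rate = clinical_rate(asymptomatic_rate, care_seeking_rate)
infections = adjusted_infections(unadjusted_encounters, rate)
print(f"clinical rate: {rate:.3f}")
print(f"implied infections: {infections / 1e6:.2f} million")
```

The point estimate lands close to, but slightly below, the reported posterior mean, which is what one would expect when a ratio is averaged over uncertain inputs rather than evaluated at their central values.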
We define the syndromic case detection rate as the number of confirmed COVID-19 cases in a week divided by the size of the ILI surge that week. The syndromic case detection rate varied by state and over time (fig. S5). Our estimated syndromic case detection rates increased over the month of March; this was expected given increases in testing capacity across the US since the February 28 detection of community transmission in Washington State. For the week ending March 14, COVID-19 cases in the states with the highest estimated syndromic case detection rate (Washington, Nevada, and Michigan) only captured approximately 1% of ILI surges in those states. In the last week of the month ending on March 28, the syndromic case detection rate across the US increased to 12.5% (95% credible interval 9.5%-18.3%). The true prevalence of SARS-CoV-2 is unknown at the time of this writing. However, if we assume the excess non-influenza ILI is almost entirely due to SARS-CoV-2, an assumption that becomes more valid as SARS-CoV-2 becomes more prevalent, we can use the excess non-influenza ILI to define lower bounds on the exponential growth rate of the US SARS-CoV-2 epidemic. By estimating the number of patients visiting clinics for COVID-19 in the US in March, we can also identify the mutual dependence of exponential growth rates, the rate of sub-clinical infections, and the time between the onset of infectiousness and a patient reporting as ILI (Fig. 2). Using stochastic Susceptible, Exposed, Infectious and Recovered (SEIR) simulations of US COVID-19 epidemics with a January 15 start date [24], we find that an initial epidemic doubling time longer than 4 days is unlikely to explain the ILI surge. Doubling times longer than 4 days fail to produce enough infected individuals to match the observed excess ILI. Doubling times shorter than 4 days can explain the observed excess ILI with a clinical rate that depends on the growth rate. 
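The syndromic case detection rate defined at the start of this passage is a simple ratio, sketched below. The function name and the example counts are hypothetical and chosen only to show the calculation; they are not data from the study.

```python
def syndromic_case_detection_rate(confirmed_cases, ili_surge):
    """Confirmed COVID-19 cases in a week divided by that week's ILI surge."""
    if ili_surge <= 0:
        raise ValueError("ILI surge must be positive to define a rate")
    return confirmed_cases / ili_surge


# Hypothetical state-week: 1,000 confirmed cases against a surge of 100,000.
example_rate = syndromic_case_detection_rate(1_000, 100_000)
print(f"detection rate: {example_rate:.1%}")
```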
Here, we define the clinical rate as the proportion of infected individuals who present to a health care provider. In keeping with our sub-4 day doubling times, we found that across the entire US, new deaths due to COVID-19 doubled every 3.01 days over the month of March (±0.001, p-value of test that doubling rate is less than 4 days approximately 0). If there was only a 1-day lag from onset of infectiousness to presentation with ILI and the entirety of the first week of the US ILI surge is comprised of patients with COVID-19, then an epidemic starting January 15th and growing at the rate of deaths in the US would imply a 12% clinical rate ( Fig. 2A) . A four-day lag between the onset of infectiousness and presentation with ILI yields a clinical rate of 25% among the 87% of simulations which could account for the ILI surge. The 25% overall clinical rate estimated from a January 15 start date and the doubling time of US COVID-19 deaths is in close agreement with the 32% clinical rate we estimated independently based on a 18% asymptomatic rate and 40% symptomatic clinical rate. Although our epidemic model suggests the first week of the ILI surge is consistent with the US epidemic start date and growth rate, the ILI surge across the US peaked the week ending March 21, much earlier than our epidemic models, suggesting the epidemic in the US differed from the SEIR model through some combination of factors. Such factors could include successful interventions, even faster decreases in care-seeking than observed in New York, heterogeneity in susceptibility [25] , or an early epidemic doubling faster than every 3 days. Faster growth rates require lower clinical rates to explain the ILI surge. 
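The link between doubling time and exponential growth rate used throughout this passage is r = ln(2) / T. The sketch below converts between the two and shows how strongly the doubling time affects epidemic size between January 15 and March 8. It assumes pure exponential growth from a single seed infection, which is a simplification and not the stochastic SEIR model used in the study; the function names are hypothetical.

```python
import math
from datetime import date


def growth_rate(doubling_time_days):
    """Exponential growth rate per day implied by a doubling time."""
    return math.log(2) / doubling_time_days


def doubling_time(rate_per_day):
    """Doubling time in days implied by an exponential growth rate."""
    return math.log(2) / rate_per_day


def exponential_size(days, doubling_time_days, seed=1.0):
    """Size after `days` of pure exponential growth (assumed single seed)."""
    return seed * 2 ** (days / doubling_time_days)


# Days from the assumed January 15 start to the first surge week (March 8).
days = (date(2020, 3, 8) - date(2020, 1, 15)).days

for td in (4.0, 3.01, 2.65):  # doubling times discussed in the text
    print(f"T={td} d  r={growth_rate(td):.3f}/d  "
          f"size={exponential_size(days, td):,.0f}")
```

Under these assumptions a modest change in doubling time shifts the early-March epidemic size by orders of magnitude, which is why faster growth requires a lower clinical rate to match the same ILI surge.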
Epidemic curves growing at the rate of deaths in Italy, doubling every 2.65 days, could better match the curvature of the ILI surge by peaking around mid to late March, but would imply a clinical rate of 4.7% the second week of March with a 4-day lag between onset and recording as ILI (Fig. 2B and C). If the entirety of the ILI surge was attributable to COVID-19, the slowest possible doubling time for the US epidemic that can explain the ILI surge would be 4 days. Any evidence of significant secondary introductions, super-spreading, or rapid transmission events in early transmission chains will decrease these estimated clinical rates [26]. Evidence of slow initial spread would increase the estimated clinical rates. Last, estimating the infection fatality rate from the ILI surge requires knowing the clinical rate and the delay from clinical presentation with ILI to death. If patients present with ILI at the onset of their illness, exhibit a 16-day median lag between onset and death [27], and have a 32% clinical rate as estimated from the 18% asymptomatic rate and 40% clinical rate of symptomatic COVID-19 cases, then the observed ILI surge corresponds to an infection fatality rate of 0.29%. We stress that estimating the infection fatality rate from this ILI surge is highly sensitive to both the lag from presentation with ILI to death and the clinical rate (fig. S6). Consequently, the ILI surge is compatible with fatality rates ranging from 0.07% to 1.4% depending on the unknown sub-clinical rate and lag from presentation with ILI to death. 
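The dependence of the fatality estimate on lag and clinical rate can be sketched as follows. This is an illustration under stated assumptions: the infection fatality rate is taken as deaths in a lag-shifted window divided by infections, and the clinical-rate adjustment is taken as a simple multiplication. The daily death series and surge size below are toy values, not study data, and all function names are hypothetical.

```python
from datetime import date, timedelta


def deaths_in_lagged_window(daily_deaths, start, end, lag_days):
    """Sum deaths from start+lag to end+lag (inclusive)."""
    lo = start + timedelta(days=lag_days)
    hi = end + timedelta(days=lag_days)
    return sum(n for day, n in daily_deaths.items() if lo <= day <= hi)


def infection_fatality_rate(deaths, infections):
    """Deaths per infection."""
    return deaths / infections


def adjust_for_clinical_rate(raw_ifr, clinical_rate):
    """If infections = encounters / clinical_rate, the IFR scales by the rate."""
    return raw_ifr * clinical_rate


# Toy inputs (hypothetical): a flat 10 deaths per day and a surge of 100,000.
toy_deaths = {date(2020, 3, 8) + timedelta(days=k): 10 for k in range(60)}
toy_surge = 100_000

window_deaths = deaths_in_lagged_window(
    toy_deaths, date(2020, 3, 8), date(2020, 3, 28), lag_days=16
)
raw_ifr = infection_fatality_rate(window_deaths, toy_surge)
adjusted_ifr = adjust_for_clinical_rate(raw_ifr, 0.82 * 0.40)
print(f"raw IFR {raw_ifr:.4%}, adjusted IFR {adjusted_ifr:.4%}")
```

Applying the same multiplication to the unadjusted 16-day-lag estimate of 0.89% reported in the Methods, with a clinical rate of 0.82 × 0.40, gives a value consistent with the 0.29% quoted above.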
Under the CDC planning scenarios specifying a 4-day lag from onset of symptoms to presentation to the doctor with ILI [28] and a 15-day lag from onset to death, the resulting 11-day lag from ILI to death produces IFR estimates of 0.57% (0.51-0.68% 95% credible set) for the unadjusted ILI surge and 0.19% (0.17-0.22% 95% credible set) for the ILI surge adjusted to account for asymptomatic and subclinical cases. We use outpatient ILI surveillance data from around the US to estimate the prevalence of SARS-CoV-2. We found a clear, anomalous surge in ILI outpatients during the COVID-19 epidemic that correlated with the progression of the epidemic in multiple states across the US. The surge of non-influenza ILI outpatients was much larger than the number of confirmed cases in each state, providing evidence of large numbers of probable symptomatic COVID-19 cases that remained undetected. This result is also consistent with ILI excess observed in France in late-February/early-March [29]. Additionally, this finding predicts that the slowest epidemic doubling time that could explain the ILI surge would be 4 days, and that this rate could only be achieved with unusually fast early transmission or super-spreading events and a clinical rate near 100%. Consistent with this prediction, we found that deaths due to COVID-19 within the US doubled every 3.0 days and note that this empirical growth rate for the US epidemic can account for the ILI surge with a 25% clinical rate assuming a 4-day lag from the onset of infectiousness to presentation as an outpatient with ILI. Together, these results suggest that SARS-CoV-2 spread rapidly throughout the US since its January 15th start date and was likely accompanied by a large undiagnosed population of potential COVID-19 outpatients with a presumably milder distribution of clinical symptoms than estimated from prior studies of SARS-CoV-2+ inpatients. 
Excess ILI appears to have peaked during the week starting on March 15th, leading the observed ILI dynamics to diverge from the overall epidemic dynamics implied by the growth rate of COVID-19 deaths in the US. If the ILI dynamics were proportional to the epidemic curve then the two could be related via a constant subclinical rate. However, the changing ratio between COVID-19 prevalence estimated by the ILI surge and the epidemic curves parameterized by the growth rate of US deaths suggests additional mechanisms may be behind the ILI slowdown. Mechanisms which can explain the difference between our simulated epidemic curves and the ILI surge include effective social distancing, disproportionate reductions in ILI care-seeking behavior relative to non-ILI care-seeking behavior, or heterogeneity in susceptibility or contact structure not captured in our SEIR model [25]. Our empirical estimate of the size of the ILI surge has several potential limitations. First, the observed ILI surge may represent more than just SARS-CoV-2 infected patients. A second epidemic of a non-seasonal pathogen that presents with ILI could confound our estimates of ILI due to SARS-CoV-2. However, this seems unlikely as additional viral surveillance through the US Centers for Disease Control and Prevention (CDC) suggests that between March 8 and March 28 other monitored respiratory viruses were at low prevalence [30]. Nonetheless, were our approach to be used during winter months, additional steps would be needed to account for concomitant non-influenza seasonal pathogens. Additionally, our assumption that changes in health-care seeking behavior are similar between mild ILI and non-ILI conditions may be incorrect. Although this assumption was supported by New York City emergency department surveillance data, it is possible that differential health-care seeking would be present in other locations or in the outpatient setting. 
Last, it is also possible that our use of ILI data has underestimated the prevalence of SARS-CoV-2 within the US. Although early clinical reports focused on cough and fever as the dominant features of COVID-19 [5] , other reports have documented digestive symptoms as the complaint affecting up to half of patients with laboratory-confirmed COVID-19 [31] , and alternative presentations, including asymptomatic or unnoticeable infections, could result in underestimation of SARS-CoV-2 prevalence. Additionally, our models have several limitations. First, we assumed that ILI prevalence within states can be scaled to case counts at the state level. This is based on the assumption that the average number of cases seen by sentinel providers in a given week is representative of the average number of patients seen by all providers within that state in a given week. Errors in this assumption would cause proportional errors in our estimated case counts and syndromic case detection rate. Second, our US-wide SEIR models vary by growth rate alone and as such may not capture important heterogeneity in susceptibility or transmission as well as regional variation, intervention-induced changes in transmission, or clustering of infection outbreaks. Our models were used to illustrate that the ILI surge is consistent with an estimated growth rate and start date for the US epidemic and to specify the mutual dependency of growth rate, the lag between the onset of infection and presentation to a doctor, and clinical rates. Finer models with regional demographic and case-severity compartments are needed to translate our range of estimated prevalence, growth rate, and clinical rates into actionable models for public health managers. Last, our method of calculating the infection fatality rate relied on assumptions about the clinical rate and the delay from patients recorded as ILI to death. 
Our clinical rate required using patterns of care-seeking for typical seasonal causes of ILI, as did our delay from ILI to death; consequently, neither should be relied on as a definitive source for COVID-19. Estimating the clinical rate and delay from ILI to death for COVID-19 specifically will reduce the large uncertainty around our ILI-estimated infection fatality rates. Despite these potential limitations, the ILI surge identified in syndromic surveillance time-series allowed early estimates of COVID-19 prevalence, estimates that were not possible from confirmed case data due to early logistical delays in SARS-CoV-2 testing in the US. Our prevalence estimates are supported by a serosurvey conducted in New York State. We estimated that over 8.3% of New York State residents were infected by SARS-CoV-2 by March 28; on April 23, 2020, New York State announced that 14% of residents had evidence of past infection by SARS-CoV-2 by March 29, at which time the cumulative PCR-confirmed case counts totaled only 0.3% of New York's population [32]. Although an ILI surge tightly correlated with COVID-19 case counts across the US and consistent with the New York State serology strongly suggests that SARS-CoV-2 has potentially infected millions in the US, further laboratory confirmation of our hypotheses is still needed to guide public health decisions. Our findings make testable predictions that one would find relatively high seroprevalence in other states that have already seen an ILI surge and that seroprevalence of individuals infected in March across states is proportional to relative sizes of the states' ILI surges. A study of ILI patients from mid-March who were never diagnosed with COVID-19 could produce a focused test of our predictions about the number and regional prevalence of undetected COVID-19 cases presenting with ILI during that time. 
If seroprevalence estimates beyond New York State continue to corroborate our prevalence estimates from syndromic surveillance, this would strongly suggest lower case severity rates for COVID-19 than were assumed in late March by comparing PCR-confirmed case counts to deaths. Further corroboration of our estimates of the magnitude of the ILI surge would suggest ILI and other public time-series of outpatient illness allow early and reliable estimates of crucial epidemiological parameters for rapidly unfolding, novel pandemic diseases. As not all novel pandemic diseases are expected to present with influenza-like symptoms, surveillance of other illnesses that commonly present in the outpatient setting could provide a vital tool for rapidly understanding and responding to novel infectious diseases. The goal of our study was to use publicly available data to estimate the number of patients seeking care for non-influenza ILI in excess of seasonal trends during the three weeks spanning March 8 to March 28, 2020 and then use this ILI surge to estimate COVID-19 incidence in March and parameterize epidemiological model growth rates and clinical rates. The ILI surge detection above produced an excess proportion of patients visiting outpatient providers for non-influenza ILI in each week and each state. To scale up the proportion of patients to a national number of COVID-19 cases, we estimated the number of patients per sentinel provider in the CDC dataset, normalized that number of patients per provider to a number of patients per doctor, and scaled that up by an estimated number of practicing doctors in the US. The result was an estimated number of COVID-19 patients visiting doctors in each state for each week -we called this our "unadjusted" ILI surge. The unadjusted ILI surge is an under-estimate of COVID-19 prevalence due to only clinical infections, those that seek medical care. 
We accounted for both asymptomatic infections and symptomatic but sub-clinical infections to produce an "adjusted" ILI surge as our final estimate of COVID-19 incidence in each state and each week. We then used the unadjusted and adjusted ILI surges to estimate syndromic case detection and fatality rates. We also used the unadjusted ILI surge as an empirical observation to evaluate epidemiological modelling of COVID-19 growth rates and clinical rates in the US. Throughout our methods, we let i index the state and t index the week (with t=0 referring to October 3, 2010, the start of state-specific ILINet surveillance). Since 2010 the CDC has maintained ILINet for weekly influenza surveillance. Each week approximately 2,600 enrolled providers distributed throughout all 50 states as well [...] Within the ILINet dataset, New York City and New York were summed into a combined New York variable representing both New York City and the surrounding state. Due to incomplete data in one or more of the data sources described above, the Virgin Islands, Puerto Rico, the Commonwealth of the Northern Mariana Islands, and Florida were excluded from subsequent analysis. In addition, to match the weekly reporting of ILI from ILINet, daily cumulative confirmed COVID-19 cases were converted to weekly counts of new cases by [...] To subtract influenza signal from $y_{it}$, we assumed that the patients with ILI within a state are the same population as those potentially tested for influenza. This assumption allows us to calculate the number of non-influenza ILI cases as [...] Mean imputation based on neighboring states was used to address missing values in laboratory influenza quantification. To assess the impact of this model for extracting non-influenza ILI signal, we calculated COVID-19 prevalence without first removing signal from influenza and found little change in our prevalence estimates (fig. S7). 
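The two data-preparation steps described above can be sketched as follows. The equations themselves are not reproduced in this text, so both functions are assumed readings of the prose: weekly new cases are taken as differences of the cumulative count at successive week ends, and non-influenza ILI is taken as total ILI scaled by the fraction of influenza tests that were negative. All names and the example numbers are hypothetical.

```python
def weekly_new_cases(cumulative_daily, cases_before_start=0):
    """Convert daily cumulative counts to new cases per complete 7-day week.

    Assumes the series starts on the first day of a reporting week and that
    `cases_before_start` cases had accumulated before the series began.
    """
    weekly = []
    previous = cases_before_start
    for week_end in range(6, len(cumulative_daily), 7):
        weekly.append(cumulative_daily[week_end] - previous)
        previous = cumulative_daily[week_end]
    return weekly


def non_influenza_ili(ili_count, flu_positive_fraction):
    """ILI visits not attributable to influenza (assumed proportional split)."""
    if not 0.0 <= flu_positive_fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    return ili_count * (1.0 - flu_positive_fraction)


# Hypothetical two-week cumulative series: 1 case/day, then 2 cases/day.
cumulative = [1, 2, 3, 4, 5, 6, 7, 9, 11, 13, 15, 17, 19, 21]
print(weekly_new_cases(cumulative))
print(non_influenza_ili(100, 0.25))
```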
This likely reflects that influenza also demonstrates strong seasonal patterns that can be addressed as discussed below. [...] To account for variation in the total number of patients, we modeled $y_{it}$ as binomially distributed. To account for correlation in non-influenza ILI over time, we used a Gaussian process model, which assumes that weeks that are closer together will have more similar levels of non-influenza ILI. The following model reflects these modeling choices: [...] where $\mathcal{GP}$ refers to a Gaussian process. We made the following prior specifications: we set the bandwidth parameter for the squared exponential kernel as ρ=3, representing a strong local correlation in time that died off sharply beyond 3 weeks; α=1, representing a signal-to-noise ratio of approximately 1; and ν=1 and ξ=1, representing weak prior knowledge regarding the overall scale of variation in the latent space. [...] collected using the function basset from the R package stray [34]; a total of 4000 such samples were collected, for each state, in this analysis. We defined the prevalence of non-influenza ILI in excess of normal seasonal variation as [...] To investigate whether our results were sensitive to the above model specification, we alternatively used the sample mean and variance from years 2010-2018 as an estimate of typical seasonal non-influenza ILI. Despite not accounting for the binomial count structure of ILI data or correlations in the proportion of ILI between weeks, this simpler model resulted in nearly identical prevalence estimates (fig. S8). Still, we used the GP-derived estimates throughout this paper because they better account for the known binomial count and week-to-week correlation structure of ILI-causing pathogen prevalence. 
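To give a feel for what a squared exponential kernel with bandwidth ρ=3 implies, the sketch below evaluates the kernel at several week lags. The parameterization used here, α² exp(-(t - t')² / (2ρ²)), is a common convention and is an assumption: the exact form used by the authors and by the stray package is not shown in this text. Function names are hypothetical.

```python
import math


def squared_exponential(t1, t2, alpha=1.0, rho=3.0):
    """Squared exponential covariance between two time points (in weeks)."""
    return alpha ** 2 * math.exp(-((t1 - t2) ** 2) / (2.0 * rho ** 2))


def kernel_matrix(weeks, alpha=1.0, rho=3.0):
    """Covariance matrix over a list of week indices."""
    return [[squared_exponential(a, b, alpha, rho) for b in weeks] for a in weeks]


# Covariance between week 0 and weeks at increasing lags.
for lag in (0, 1, 3, 6, 9):
    print(f"lag {lag} weeks: {squared_exponential(0, lag):.3f}")

K = kernel_matrix(list(range(4)))
```

Under this convention the covariance is still substantial at a lag of 3 weeks and close to zero by about 9 weeks, in line with the description of strong local correlation that dies off beyond the bandwidth.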
To exclude variation attributable to unseasonably high rates of other ILI-causing viruses (such as the outbreak of RSV in Washington state in November-December 2019), we only investigated $y^*_{it}$ for weeks after March 7th 2020, as only these later weeks had high correlation to the COVID-19 confirmed case rate (fig. S2). As new COVID-19 case counts $z_{it}$ represent the number of confirmed cases in an entire state and ILINet data represent the number of cases seen by a select number of enrolled providers, we had to estimate scaling factors $w_i$ to enable comparison of ILINet data to confirmed case counts at the state level. Let $\pi^*_{it}$ denote the probability that a patient with ILI in state i has COVID-19 as estimated from ILINet data. Let $p_i$ denote the population of state i and let $b_i$ denote the number of primary care providers per 100,000 people in state i. We translated the inferred proportion of individuals with ILI due to COVID-19 to the state level by considering the average number of patients seen across all providers in the state in a 5-day work-week. In addition, we added a discount factor λ=0.55 to calibrate these estimates with prior reports regarding the total number of outpatient visits per year [18]. This yielded our estimated number of COVID-19 cases (excess ILI at the state level) as [...] where m=20.2 is the mean number of patients seen by physicians per day [19]. To account for the contribution of sub-clinical SARS-CoV-2 infections we used a recent analysis of cohort surveillance from the Diamond Princess [35]. Monte-Carlo simulations were used to propagate error from our uncertainty regarding potential asymptomatic infections affecting the clinical rate. Assuming that the majority of SARS-CoV-2 testing within the US has been directed by patient symptoms [36], the pool of newly diagnosed SARS-CoV-2+ patients is a subset of the pool of SARS-CoV-2+ patients who are identified as having ILI. 
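The state-level scaling described above can be sketched as follows. The scaling equation is not reproduced in this text, so the product below is only one plausible reading of the stated ingredients (state population, providers per 100,000 people, m=20.2 patients per provider per day, a 5-day work-week, and the discount λ=0.55). The function names and the example state are hypothetical.

```python
def state_scaling_factor(population, providers_per_100k,
                         patients_per_day=20.2, workdays=5, discount=0.55):
    """Assumed weekly outpatient encounters in a state.

    providers = population / 100,000 * providers_per_100k
    encounters = providers * patients_per_day * workdays * discount
    """
    providers = population / 100_000 * providers_per_100k
    return providers * patients_per_day * workdays * discount


def excess_ili_cases(excess_probability, population, providers_per_100k):
    """Scale the excess-ILI proportion to a weekly state-level case count."""
    return excess_probability * state_scaling_factor(population, providers_per_100k)


# Hypothetical state: 10 million residents, 250 providers per 100,000 people,
# and 2% of outpatient visits attributed to excess ILI in a given week.
weekly_encounters = state_scaling_factor(10_000_000, 250)
weekly_cases = excess_ili_cases(0.02, 10_000_000, 250)
print(f"{weekly_encounters:,.0f} encounters, {weekly_cases:,.0f} excess ILI cases")
```

Keeping the discount factor as an explicit argument makes it easy to check how sensitive the state-level counts are to the calibration against annual outpatient visit totals.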
Therefore, we calculated the probability that a SARS-CoV-2+ patient with ILI who seeks medical care will be iden- [...] The exact lag from an outpatient being recorded as ILI to death is unknown, but estimated lag times from onset to death and from hospitalization to death [27] can be used to understand the range of implied infection fatality rates from the ILI surge. We calculated the infection fatality rate implied by the ILI surge as a function of the unknown lag from patients being recorded as ILI to death, and we repeated this calculation for both the raw and subclinical-rate-adjusted ILI estimates. For a lag of l days from ILI reporting to death, the infection fatality rate was estimated by dividing all new deaths occurring within the dates (2020/03/08 + l, ..., 2020/03/28 + l) by the magnitude of the adjusted or raw ILI surge. A plot of the fatality rate by lag for raw and adjusted ILI surges revealed a large range of fatality rates compatible with the ILI surge and highly sensitive to the estimates of lag and clinical rate. One study [27] estimated a median 11.2 days from hospitalization to death and 16.1 days from symptom onset to death. For the raw ILI surge estimate, 11-day and 16-day lag times would produce median infection fatality rate estimates of 0.57% and 0.89%, respectively, without adjusting for any subclinical infections; for the subclinical-adjusted ILI surge estimate, these lag times would produce median infection fatality rate estimates of 0.19% and 0.29%, respectively. As of April 6, 2020, deaths from the SARS-CoV-2 epidemic were still growing nearly exponentially, as evidenced by nearly linear growth on a log y axis. Early in the epidemic, estimating exponential growth rates by Poisson regression with a log link function produces accurate estimates of the true growth rate [37], and so we estimated growth rates for the US and Italy by Poisson generalized linear models predicting new deaths using date as a quantitative explanatory variable. 
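The growth-rate estimation described above can be illustrated with a small Poisson regression fitted by Newton's method. This is a dependency-free sketch, not the authors' code: the model is log E[deaths] = a + b·day, the data are synthetic counts generated with a known growth rate of 0.23 per day, and the function name is hypothetical. In practice one would likely use an established GLM routine.

```python
import math


def fit_poisson_growth(days, counts, iterations=50, tol=1e-10):
    """Fit log E[count] = a + b * day by Newton's method; returns (a, b)."""
    n = len(days)
    # Starting values from least squares on log(count + 0.5).
    logs = [math.log(c + 0.5) for c in counts]
    t_bar = sum(days) / n
    l_bar = sum(logs) / n
    sxx = sum((t - t_bar) ** 2 for t in days)
    b = sum((t - t_bar) * (l - l_bar) for t, l in zip(days, logs)) / sxx
    a = l_bar - b * t_bar
    for _ in range(iterations):
        mu = [math.exp(a + b * t) for t in days]
        # Score vector of the Poisson log-likelihood.
        g0 = sum(y - m for y, m in zip(counts, mu))
        g1 = sum((y - m) * t for y, m, t in zip(counts, mu, days))
        # Fisher information (2x2), inverted in closed form.
        h00 = sum(mu)
        h01 = sum(m * t for m, t in zip(mu, days))
        h11 = sum(m * t * t for m, t in zip(mu, days))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Synthetic daily deaths growing at 0.23 per day (not real data).
days = list(range(28))
counts = [round(2.0 * math.exp(0.23 * t)) for t in days]
intercept, growth = fit_poisson_growth(days, counts)
print(f"growth rate {growth:.3f}/day, doubling time {math.log(2) / growth:.2f} days")
```

The fitted slope is the exponential growth rate per day, and ln(2) divided by that slope gives the doubling time reported in the text.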
US COVID-19 deaths from March 5, 2020 to April 1, 2020, were summed by date to calculate national-level statistics. Initially, April 2-5 were included but were found to have anomalously high leverage and were hence excluded from our analysis. We applied the same procedure to COVID-19 deaths in Italy, focusing on deaths from February 24 until March 12. We used the slope from Poisson regression as the estimated exponential growth rate, which yielded a US growth rate of [...] The following SEIR models [41, 42] combined with persistence of high loads of SARS-CoV-2 that can be cultured up to 7 days after symptom onset [43], resulting in our use of a 7.3-11 day 95% credible interval for the infectious period. Finally, we parameterized β to ensure I(t) grew with a specified exponential growth rate early in the epidemic. We ran a total of 2,000 simulations for each of the two growth rate distributions (US and Italy) analyzed. Growth rates were drawn at random from a normal distribution with standard deviation of 0. [...]
Figure S1. Excess ILI for each US state.
Figure S2. Excess ILI correlates strongly with patterns of newly confirmed COVID-19 cases.
Figure S3. Surveillance data from New York City emergency departments.
Figure S4. Prevalence of SARS-CoV-2 infections between March 8 and March 28, 2020.
Figure S5. Syndromic case detection rates by state.
Figure S6. Estimating the infection fatality rate (IFR) of COVID-19 based on the unadjusted ILI surge.
Figure S7. Investigating model sensitivity when ILI is modeled without first removing signal from influenza.
Figure S8. Investigating model sensitivity when seasonal trends in non-influenza ILI are identified using an alternative statistical model.
Fig. 1. An early surge of ILI visits across the US. 
The proportion of patients presenting with ILI that could not be explained by influenza or typical seasonal variation (that is, excess ILI) is shown for four states (blue line and ribbons represent the posterior median as well as 95% and 50% credible sets; results from all analyzed states are shown in fig. S1). ILI that could not be attributed to influenza was calculated based on influenza laboratory surveillance data (2019-2020 flu season shown in red, prior seasons shown in black). A time-series model was used to infer seasonal variation of non-influenza ILI. Excess ILI was then calculated as the difference between non-influenza ILI from 2019-2020 and the seasonal baseline of non-influenza ILI. Excess ILI after March 7th is highlighted in darker blue as these data correlated strongly with observed COVID-19 case counts (fig. S2).
Epidemiological models were either stochastic (simulated via tau-leaping) or deterministic (solved by numerical integration). In addition to our raw estimates of the ILI surge size (unadjusted), we provide adjusted prevalence estimates accounting for sub-clinical cases by assuming an 18% asymptomatic rate and a 40% rate of health-care seeking among symptomatic ILI patients (adjusted). Epidemic trajectories were simulated using an SEIR model (black lines). The increasing gap between ILI prevalence estimates and SEIR trajectories (orange) suggests the presence of additional factors such as social distancing, changes in care-seeking behavior, or heterogeneity in susceptibility or transmission. (C) More generally, the size of the clinical population estimated from ILI data imposes a dependence between epidemic doubling time, the clinical rate, and the lag between onset of infectiousness and ILI reporting. Combinations of these three variables that are consistent (black) or inconsistent (gray) are shown as well as a smoothed estimate of clinical rate as a function of doubling time.
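The deterministic SEIR trajectories mentioned in the caption above can be sketched as follows. This is an illustration under stated assumptions, not the authors' model: the relation β = (r + σ)(r + γ) / σ comes from linearizing a basic SEIR system while nearly everyone is susceptible, the function names are hypothetical, integration uses a simple fixed-step Euler scheme, and the latent rate σ = 1/5 per day and the growth rate are placeholders. The removal rate γ = 1/9 per day is picked from inside the 7.3-11 day infectious-period interval quoted in the methods.

```python
import math


def beta_for_growth_rate(r, sigma, gamma):
    """Transmission rate giving early exponential growth rate r.

    Derived from the linearized SEIR system with S close to N:
    (r + sigma) * (r + gamma) = beta * sigma.
    """
    return (r + sigma) * (r + gamma) / sigma


def simulate_seir(beta, sigma, gamma, population,
                  e0=0.0, i0=1.0, days=60, dt=0.01):
    """Euler integration of a basic SEIR model; returns I(t) once per day."""
    s, e, i = population - e0 - i0, e0, i0
    steps_per_day = round(1 / dt)
    infectious = [i]
    for _day in range(days):
        for _step in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
        infectious.append(i)
    return infectious


if __name__ == "__main__":
    # Placeholder parameters (assumptions, not values from the paper).
    r = math.log(2) / 3           # growth rate for a 3-day doubling time
    sigma, gamma = 1 / 5, 1 / 9   # latent and removal rates per day
    beta = beta_for_growth_rate(r, sigma, gamma)
    traj = simulate_seir(beta, sigma, gamma, population=3.3e8, days=40)
    observed = math.log(traj[40] / traj[30]) / 10
    print("target growth rate:", r, "simulated growth rate:", observed)
```

Once the initial transient has decayed, the simulated growth rate of I(t) should sit close to the target until depletion of susceptibles slows the epidemic. A stochastic variant would replace the Euler increments with Poisson-distributed event counts per step (tau-leaping).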
China Novel Coronavirus Investigating and Research Team, A Novel Coronavirus from Patients with Pneumonia in China
World Health Organization
Fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the SARS-CoV-2 epidemic
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: A descriptive study
Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus-Infected Pneumonia in
Surveillance for influenza-United States, 1997-98, 1998-99, and 1999-00 seasons
Update: Influenza activity in the United States during the 2017-18 season and composition of the 2018-19 influenza vaccine
Do family physicians make good sentinels for influenza?
Spanish Influenza Surveillance System, Estimating the burden of seasonal influenza in Spain from surveillance of mild and severe influenza disease
Estimating the burden of influenza-associated hospitalizations and deaths in Chile during 2012-2014
Burden of influenza-associated outpatient influenza-like illness consultations in China
Detection of excess influenza severity: Associating respiratory hospitalization and mortality data with reports of influenza-like illness by primary care physicians
Influenza surveillance in New Zealand in 2005
Estimating influenza disease burden from population-based surveillance data in the United States
National Ambulatory Medical Care Survey: 2016 National Summary Tables
United Health Foundation, America's Health Rankings analysis of Special data request for information on active state licensed physicians
The U.S. health system in perspective: A comparison of twelve industrialized nations
Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China
Self-reported influenza-like illness during the 2009 H1N1 influenza pandemic-United States
Washington State 2019-nCoV Case Investigation Team, First case of 2019 novel coronavirus in the United States
Epidemic size and probability in populations with heterogeneous infectivity and susceptibility
Superspreading and the effect of individual variation on disease emergence
High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2. Emerg
Influenza-like illness, the time to seek healthcare, and influenza antiviral receipt during the 2010-2011 influenza season-United States
Excess cases of influenza-like illnesses synchronous with coronavirus disease (COVID-19) epidemic, France
Clinical characteristics of COVID-19 patients with digestive symptoms in Hubei, China: A descriptive, cross-sectional, multicenter study
Cumulative incidence and diagnosis of SARS-CoV-2 infection in New York
Bayesian Multinomial Logistic Normal Models through Marginally Latent Matrix-T Processes
Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship
Coronavirus Test: What You Need to Know
Estimating initial epidemic growth rates
Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data
The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: Estimation and application
Presymptomatic Transmission of SARS-CoV-2 - Singapore
SARS-CoV-2 viral load in upper respiratory specimens of infected patients
Temporal dynamics in viral shedding and transmissibility of COVID-19
Virological assessment of hospitalized patients with COVID-2019
Competing interests: NH and JDS declare that they have no competing interests. ADW owns Selva Analytics LLC.
Data and materials availability: All data associated with this study can be found in the paper or supplementary materials.