key: cord-0295507-lfdfavqr
authors: Granek, R.; Tsori, Y.
title: Comparison of the inhomogeneous SEPIR model and data from the COVID-19 outbreak in South Carolina
date: 2021-08-23
journal: nan
DOI: 10.1101/2021.08.15.21262074
sha: d77480a9780b443deff5fbecbef00b3c5f904c11
doc_id: 295507
cord_uid: lfdfavqr

During the COVID-19 pandemic, authorities have been striving to obtain reliable predictions for the spreading dynamics of the disease. We recently developed an in-homogeneous multi-"sub-populations" (multi-compartment: susceptible, exposed, pre-symptomatic, infectious, recovered) model that accounts for the spatially in-homogeneous spreading of the infection, and showed, for a variety of examples, how the epidemic curves are highly sensitive to the location of epicenters, non-uniform population density, and local restrictions. In the present work we tested our model against real-life data from South Carolina during the period May 22 to July 22 (2020), which were available in the form of infection heat-maps and conventional epidemic curves. During this period, minimal restrictions were employed, which allowed us to assume that the local reproduction number is constant in time. We accounted for the non-uniform population density in South Carolina using data from NASA, and predicted the evolution of infection heat-maps during the studied period. Comparing the predicted heat-maps with those observed, we find high qualitative resemblance. Moreover, the Pearson's correlation coefficient is relatively high and does not drop below 0.8, thus validating our model against real-world data. We conclude that our model accounts for the major effects controlling the spatially in-homogeneous spreading of the disease. Inclusion of additional sub-populations (compartments), in the spirit of several recently developed models for COVID-19, can be easily performed within our mathematical framework.

Infectious disease spreading models are largely based on the assumption of perfect and continuous "mixing", similar to the one used to describe the kinetics of spatially uniform chemical reactions. In particular, the well-known susceptible-exposed-infectious-recovered (SEIR) model builds on this homogeneous-mixing assumption. Some extensions of SEIR-like models [1-5] and other models [6-14] that account for spatial variability employed mainly diffusion processes for the different sub-populations (where the term "sub-population" refers to people at a certain stage of the disease, otherwise termed "compartment"). Yet, clearly, while such processes can effectively describe wildlife motion in some systems, they fail to describe the (non-random) human behavior [15]. To mimic the heterogeneity of human behavior more realistically [16], recent extensions employed diffusion processes of the sub-populations that are limited to contact networks [17]. However, one of the most important artifacts of applying a diffusion process to human populations is its unrealistic tendency to spread all populations to uniformity (be it in real space or on contact networks). Moreover, these models do not naturally involve a spatial dependence of the population density (which could influence the spread), nor a spatial dependence of the infection spreading parameters, which is required to model geographically local quarantine. Thus, implementation of such dependence using homogeneous models requires a division of the geographic region into multiple patches [17].

We have recently advanced a general mathematical framework for epidemiological models that can treat the spatial spreading of an epidemic [18] . The framework accounts for possible spatial in-homogeneity of all associated "compartments", which we termed "sub-populations", such as the infectious and the susceptible sub-populations. To describe the spatial spreading of COVID-19, we have used the general framework in a 5-compartment model: susceptible, exposed, pre-symptomatic, infectious, and recovered (SEPIR). We used the SEPIR model, with parameters typical for COVID-19, to examine several scenarios of COVID-19 spreading, including the effect of localized lockdowns. However, the validation of our model against the COVID-19 spatial spreading in a specific country is still missing.

Vaccination against COVID-19 is now ongoing in many countries, despite complications associated with shortage of supply, anti-vaccination movements, and running a vaccination program during a strongly bursting epidemic. Most strategies use age group and risk factor prioritization, ignoring the density variation of susceptible vs. recovered (hence naturally immune) sub-populations. However, during an outbreak it does make sense to vaccinate first in regions where the outbreak is expected to be stronger, thereby avoiding a catastrophe. Our in-homogeneous SEPIR model can easily be generalized to predict the outcome of vaccination strategies that involve such variation.

In this publication we wish to validate our in-homogeneous SEPIR model against real-world data. For this purpose we have chosen the state of South Carolina, which made COVID-19 infection heat-maps publicly available. Information on the population density variation across the country is also readily obtained from NASA public resources. As we have shown [18], these data are critical for the prediction of realistic infection heat-maps (based on an initial heat-map) and for testing them against those obtained in real-life situations.

(medRxiv preprint, not certified by peer review; this version posted August 23, 2021; doi: https://doi.org/10.1101/2021.08.15.21262074. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint.)

The validation of our model will make it possible to generalize, to the in-homogeneous regime, various advanced multi-compartment homogeneous models that have been proposed in the past year to treat COVID-19.

We briefly review the key features of our in-homogeneous SEPIR model, described in detail in Ref. [18]. To model the spread of COVID-19 and other epidemics, the population is divided into five compartments, termed "sub-populations": susceptible, exposed, pre-symptomatic (who are infectious), infectious (who are symptomatic), and recovered. This choice, and similar approaches, have proven to be effective generalizations of the classical SEIR models. The basic chain of infection dynamics is as follows. Healthy individuals who are not immune are initially susceptible. They can become exposed when in contact with pre-symptomatic (who are also infectious) or infectious-symptomatic people. These individuals stay in the exposed stage for a certain period of time $\gamma_0^{-1}$, during which they are not infectious to others. After this initial incubation period they become pre-symptomatic and also infectious to healthy people. They stay pre-symptomatic for another period of time, denoted by $\gamma_1^{-1}$, after which they become symptomatic, remaining infectious. This sickness stage lasts for a period of time $\gamma_2^{-1}$. At the end of this period, these individuals become recovered and also immune. According to the literature (concerning here only the parent, wild-type, SARS-CoV-2 virus), the mean time for the appearance of symptoms is 5 days, setting $\gamma_0^{-1} + \gamma_1^{-1} \approx 5$ days. It is accepted that people become infectious about 3 days before the appearance of symptoms, that is, $\gamma_1^{-1} = 3$ days, and hence $\gamma_0^{-1} \approx 2$ days [19]. The time constants $\gamma_1^{-1}$ and $\gamma_2^{-1}$, describing the transitions from pre-symptomatic (infectious) to symptomatic (infectious), and from symptomatic-infectious to recovered, respectively, must obey $\gamma_1^{-1} + \gamma_2^{-1} = \tau_I$, where $\tau_I \approx 16.6$ days is the average infectious period. It follows that $\gamma_2^{-1} \approx 13.6$ days.
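The arithmetic behind these time constants can be checked with a few lines of Python. This is only a bookkeeping sketch: the variable names are illustrative and not taken from Ref. [18], and the numerical values are the ones quoted in the paragraph above.

```python
# Time constants of the SEPIR chain, in days, as quoted in the text.
time_to_symptoms = 5.0        # 1/gamma_0 + 1/gamma_1
presymptomatic_period = 3.0   # 1/gamma_1
infectious_period = 16.6      # tau_I = 1/gamma_1 + 1/gamma_2

# Remaining stage durations follow by subtraction.
exposed_period = time_to_symptoms - presymptomatic_period        # 1/gamma_0
symptomatic_period = infectious_period - presymptomatic_period   # 1/gamma_2

# Corresponding transition rates, in 1/day.
gamma_0 = 1.0 / exposed_period
gamma_1 = 1.0 / presymptomatic_period
gamma_2 = 1.0 / symptomatic_period

print(f"1/gamma_0 = {exposed_period:.1f} d, 1/gamma_2 = {symptomatic_period:.1f} d")
```

The rates `gamma_0`, `gamma_1`, and `gamma_2` are the quantities that enter the rate equations; the durations are their inverses.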

The most distinctive feature of our model, which is absent in most epidemiological models, is its ability to account naturally for spatial in-homogeneity. The geographical area under consideration is divided into a lattice that can be square or hexagonal, with inter-node spacing $\delta$. At each node the sum of the five sub-populations equals the nodal population number (varying from node to node), i.e. the total number of people living in the area corresponding to that node. The spatial dependence of the number of people in each sub-population is accounted for by inclusion of infection kinetics between neighboring nodes ("interactions"). On a scale much longer than the inter-node spacing, all densities become continuous. In this limit, inter-nodal interactions give rise to diffusion-like infection terms in the rate equations. After proper rescaling, one obtains the model rate equations, Eq. (1), which are derived in Ref. [18].

The variables $h$, $b$, $w$, $f$, and $r$ denote the number area densities of the susceptible (healthy), exposed, pre-symptomatic, symptomatic, and recovered sub-populations, respectively. They are dimensionless due to scaling by the average population density in the region under study. These variables depend on the spatial location $x$ (scaled by $\delta$). Their sum obeys

$$h(x) + b(x) + w(x) + f(x) + r(x) = n(x),$$

where $n(x)$ is the total (fixed) population density at location $x$. We emphasize that the above rate equations describe diffusion of the epidemic and not of people; at any point $x$ the sum $h(x)+b(x)+w(x)+f(x)+r(x)$ remains constant in time.

The spatially-dependent parameters $k(x)$ and $D_k(x)$ originate both from infections within each node (influencing $k$ alone), and from infections between neighboring nodes (influencing both $k$ and $D_k$) [18, 20].

$k$ describes the rate at which susceptible people ($h$) become exposed ($b$) when they meet pre-symptomatic ($w$) or symptomatic ($f$) people. It is given by $k = R_0/\tau_I$, where $R_0$ is the well-known basic reproduction number. The epidemic diffusion coefficient is estimated to be $D_k \approx k/5$ for a square lattice. Our model thus consists of five nonlinear partial differential equations in two spatial dimensions. They can be solved numerically with given initial conditions and a given total population density $n(x)$, as we present next using real-world data from South Carolina.
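For orientation, the structure described in this section can be written compactly. The display below is a sketch that is merely consistent with the verbal description (infection of $h$ by $w+f$ at rate $k$, a diffusion-like infection term with coefficient $D_k$, and linear transitions at rates $\gamma_0$, $\gamma_1$, $\gamma_2$). The exact placement of $D_k$ inside the divergence and the rescaling conventions are assumptions made here; the authoritative form is the one given in Ref. [18].

```latex
\begin{aligned}
\frac{\partial h}{\partial t} &= -k\,h\,(w+f) - h\,\nabla\cdot\left[D_k\,\nabla(w+f)\right],\\
\frac{\partial b}{\partial t} &= k\,h\,(w+f) + h\,\nabla\cdot\left[D_k\,\nabla(w+f)\right] - \gamma_0\,b,\\
\frac{\partial w}{\partial t} &= \gamma_0\,b - \gamma_1\,w,\\
\frac{\partial f}{\partial t} &= \gamma_1\,w - \gamma_2\,f,\\
\frac{\partial r}{\partial t} &= \gamma_2\,f .
\end{aligned}
```

Adding the five lines gives $\partial_t (h+b+w+f+r) = 0$ at every point $x$, which matches the statement that the local total population density is fixed in time and that it is the epidemic, not the people, that diffuses.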

In the year 2020 the COVID-19 epidemic spread in many places around the world. DHEC provides areal maps of the density of infected people with daily resolution [22]. The "movie clips" appear on the DHEC website and on YouTube and are a source of data. These movie clips were used to extract the density of infected people as a function of time. The second source of data is the "Worldometer" website [23], which includes various total (integrated) numbers, such as the number of active cases, daily new cases, and accumulated cases. The third data source we used is NASA's Socioeconomic Data and Applications Center (SEDAC) [24], from which we obtained the South Carolina population density $n(x)$.

We restrict ourselves to the time window between May 22, 2020 (initial time t = 0) and July 22, 2020; see Fig. 1 for the observed initial (May 22, 2020) and final (July 21, 2020) COVID-19 infection heat-maps.

During this period no major South Carolina government orders were issued [25] . After a long period of restrictions, a series of "opening" executive orders commenced on April 21 and ended with the following four relevant opening orders: (i) on May 8, (ii) on May 11, (iii) on May 22 [26] , and (iv) on June 12 [27] .

Whereas the orders on May 22 and June 12 can possibly have some influence on the transmission of COVID-19 during the period of study, May 22 to July 22 (noting that infection data may lag the actual infection events by up to 7-10 days), we believe they are of minor consequence. The next SC government executive order was issued on July 10 and is of the restriction type; due to the data lag, it may affect the published infection data only in the last few days of the studied period. Therefore, it appears that we can assume that $R_0$ remained almost constant during the studied period and that time-dependent adjustments are not required. At the same time, this two-month period is a very strong test of our model, as common predictions using homogeneous models usually require parameter adjustments after a few weeks at most.

Infection data were extracted from the DHEC movies as follows. The mp4 movie was divided into frames, each corresponding to a period of one day. Each frame was read as an image, cropped to the relevant area, and digitized to red-green-blue (RGB) pixel triplets. We cleaned the images of text labels, roads, and noise. Each pixel corresponds to a unique location $x$ in our model. We obtained the pixel area, 3.9 km$^2$, by counting the number of pixels within the South Carolina borders and dividing the state area, 82,930 km$^2$, by this number.
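As a quick consistency check on these numbers, the pixel count implied by the quoted areas can be recovered by division. The sketch below additionally assumes square pixels (an assumption, not stated in the text) to estimate the linear size of a pixel; the names are illustrative.

```python
import math

STATE_AREA_KM2 = 82930.0   # South Carolina area quoted in the text
PIXEL_AREA_KM2 = 3.9       # pixel area quoted in the text

# Number of pixels inside the state borders implied by the two areas.
pixels_in_state = STATE_AREA_KM2 / PIXEL_AREA_KM2

# Linear pixel size, assuming square pixels (assumption).
pixel_side_km = math.sqrt(PIXEL_AREA_KM2)

print(f"about {pixels_in_state:.0f} pixels, each about {pixel_side_km:.2f} km wide")
```

Under these assumptions the map contains on the order of twenty thousand pixels within the state borders, each roughly 2 km across.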

The RGB values of the pixels were converted to number of infected people in several consecutive steps as follows. First, a pixel's RGB triplet was converted to a hue-saturation-value (HSV) triplet.

To solve the set of coupled rate equations (1), we divided the two-dimensional space into nodes. Each node corresponded to one pixel in the clip. The gradient and divergence operators were discretized according to standard finite difference formulations. To propagate the equations in time we used an explicit scheme with a sufficiently small time step.
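A minimal sketch of such an explicit finite-difference scheme is given below in plain Python. It is not the authors' code. It assumes spatially constant $k$ and $D_k$, so that the diffusion-like infection term reduces to $D_k \nabla^2 (w+f)$ acting on the susceptibles, a five-point Laplacian with unit node spacing, mirrored (no-flux) edges, and forward Euler time stepping. The value of `R0`, the grid size, the seed, and all function names are hypothetical.

```python
def laplacian(u, i, j):
    """Five-point Laplacian (unit node spacing); missing neighbours mirror the node."""
    ny, nx = len(u), len(u[0])
    centre = u[i][j]
    north = u[i - 1][j] if i > 0 else centre
    south = u[i + 1][j] if i < ny - 1 else centre
    west = u[i][j - 1] if j > 0 else centre
    east = u[i][j + 1] if j < nx - 1 else centre
    return north + south + west + east - 4.0 * centre


def euler_step(state, k, d_k, gammas, dt):
    """Advance the five fields 'h', 'b', 'w', 'f', 'r' (2D lists) by one step dt."""
    g0, g1, g2 = gammas
    h, b, w, f = (state[name] for name in "hbwf")
    ny, nx = len(h), len(h[0])
    infectious = [[w[i][j] + f[i][j] for j in range(nx)] for i in range(ny)]
    new = {name: [row[:] for row in state[name]] for name in "hbwfr"}
    for i in range(ny):
        for j in range(nx):
            # Infection pressure from the node itself and from its neighbours.
            pressure = k * infectious[i][j] + d_k * laplacian(infectious, i, j)
            newly_exposed = h[i][j] * pressure
            new["h"][i][j] -= dt * newly_exposed
            new["b"][i][j] += dt * (newly_exposed - g0 * b[i][j])
            new["w"][i][j] += dt * (g0 * b[i][j] - g1 * w[i][j])
            new["f"][i][j] += dt * (g1 * w[i][j] - g2 * f[i][j])
            new["r"][i][j] += dt * g2 * f[i][j]
    return new


# Hypothetical demo: 5 x 5 nodes, uniform density n = 1, one seeded node.
R0 = 2.0                 # assumed value for illustration, not taken from the paper
TAU_I = 16.6             # average infectious period in days (from the text)
k = R0 / TAU_I
d_k = k / 5.0            # square-lattice estimate quoted in the text
gammas = (1 / 2.0, 1 / 3.0, 1 / 13.6)   # gamma_0, gamma_1, gamma_2 in 1/day

size = 5
state = {name: [[0.0] * size for _ in range(size)] for name in "hbwfr"}
state["h"] = [[1.0] * size for _ in range(size)]
state["h"][2][2], state["w"][2][2] = 0.99, 0.01

dt = 0.05                # days; small enough for this explicit scheme
for _ in range(round(60 / dt)):
    state = euler_step(state, k, d_k, gammas, dt)
```

With $D_k = k/5$ the infection pressure at a node is $(k/5)$ times the sum of $w+f$ over the node and its four neighbours, so it never becomes negative. Each update moves people between compartments at the same node only, so the local sum of the five fields stays equal to $n(x)$ up to rounding.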

The calculation of the Pearson's correlation coefficient (CC) [32] between observed and predicted heat-maps was performed in several steps. Hue color values were calculated from the raw RGB format of the movie clip. If a pixel's hue value was $c$, this value was transformed as $c \to 1 - c$, so that red hues have higher values. The very faint baseline bluish hue $\approx 0.3$, which was due to the superposition of the background map and the DHEC data, was removed in all places where $c$ was smaller than 0.7. Finally, the infected matrix $I_{\rm data}(x)$ was defined as $c$ and rescaled such that its values would lie between 0 and 1. For the simulation, $I_{\rm sim}(x)$ was defined as the sum $f + r + w$, scaled to lie in the range between 0 and 1. To follow the color truncation employed in the DHEC movie clip, $I_{\rm sim}(x)$ was further modified such that all values above 0.25 were set to $I_{\rm sim} = 0.25$. The Pearson's CC was then calculated as

$$\rho(I_{\rm data}, I_{\rm sim}) = \frac{1}{N} \sum_{i=1}^{N} \frac{\left(I_{{\rm data},i} - \bar{I}_{\rm data}\right)\left(I_{{\rm sim},i} - \bar{I}_{\rm sim}\right)}{\sigma_{I_{\rm data}}\, \sigma_{I_{\rm sim}}},$$

where the index $i$ runs over all pixels ($x$ coordinates), $1 \le i \le N$, and $N$ is the total number of pixels in the images. $\bar{I}_{\rm data}$ and $\bar{I}_{\rm sim}$ are the averages of $I_{\rm data}$ and $I_{\rm sim}$, respectively, and $\sigma_{I_{\rm data}}$ and $\sigma_{I_{\rm sim}}$ are their standard deviations.
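The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' pipeline: "removing" the baseline is interpreted here as setting those pixels to zero, the rescaling is taken to be min-max scaling, population (1/N) standard deviations are used, and all function names and the toy pixel values are hypothetical.

```python
import colorsys
import math


def rescale(values):
    """Min-max rescaling to the range [0, 1] (assumed form of the rescaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hue_to_infection(rgb_pixels, threshold=0.7):
    """Build I_data from 0-255 RGB triplets: hue c -> 1 - c, baseline removed."""
    values = []
    for red, green, blue in rgb_pixels:
        hue, _sat, _val = colorsys.rgb_to_hsv(red / 255.0, green / 255.0, blue / 255.0)
        c = 1.0 - hue                                    # red hues get high values
        values.append(c if c >= threshold else 0.0)      # assumed: baseline set to 0
    return rescale(values)


def simulated_infection(f, r, w, cap=0.25):
    """Build I_sim from f + r + w, rescaled to [0, 1] and truncated at the cap."""
    total = rescale([fi + ri + wi for fi, ri, wi in zip(f, r, w)])
    return [min(v, cap) for v in total]


def pearson_cc(a, b):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / n)
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / n)
    if sd_a == 0.0 or sd_b == 0.0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / (sd_a * sd_b)


# Toy example with four pixels: red, yellow, and two baseline-blue pixels.
pixels = [(255, 0, 0), (255, 255, 0), (0, 0, 255), (0, 0, 255)]
i_data = hue_to_infection(pixels)
i_sim = simulated_infection(f=[0.3, 0.1, 0.0, 0.0],
                            r=[0.4, 0.1, 0.0, 0.0],
                            w=[0.1, 0.0, 0.0, 0.0])
rho = pearson_cc(i_data, i_sim)
print(f"Pearson CC = {rho:.3f}")
```

In practice the pixel lists would hold one entry per pixel inside the state borders, taken in the same order for the observed frame and for the simulated fields.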

Figs. 3 and 4 compare the observed and predicted infection heat-maps for several dates from the period under study. The top two panels in Fig. 3 are the same as those appearing in Fig. 1 and, as explained in the Methods section, serve to define our initial conditions for the simulations. The visual similarity between the South Carolina heat-maps and our simulations for the subsequent dates, including the very late date of July 21, is evident. In some respects our predictions are better than the observations, since our numerical scheme, as it should, does not allow the spreading of infection to regions where the human population is missing or very low. In contrast, the observed heat-maps include such internal errors (presumably due to an image processing algorithm that is used to create low-resolution images, thereby avoiding conflict with privacy issues).

To make this comparison more quantitative, we calculated the Pearson's CC; see the Methods section for details. In Table 1 we report the Pearson's CC values, $\rho(I_{\rm data}, I_{\rm sim})$, for all six dates corresponding to the heat-maps presented in Figs. 3 and 4. The initial conditions (top pair in Fig. 3) may be regarded as a check on this numerical criterion, where theoretically the Pearson's CC should be very close to unity.

Indeed we find $\rho(I_{\rm data}, I_{\rm sim}) = 0.972$, where the very small difference from unity is associated with the numerical error emerging from the image processing algorithm. For the subsequent dates, the CC values slowly decrease, which is quite reasonable as errors should grow in time, yet they remain close to unity.

For July 21 (2020), that is, 60 days after initiation of the simulation, we find $\rho(I_{\rm data}, I_{\rm sim}) \approx 0.8$, which is still quite high, recalling that we ran our simulation for 60 days without any intermediate tweaking of the model parameters.


In this paper we have tested our in-homogeneous SEPIR model against real spatial data from South Carolina and found remarkable agreement of the infection heat-maps, which is visible to the eye and confirmed by the large Pearson's CC between the model-predicted and observed heat-maps. This suggests that our approach can be employed for the COVID-19 pandemic in other parts of the world. This will require extended and high-resolution data on the geographic spread of the pandemic, in order to accurately set the initial conditions of the five sub-populations associated with our model. In addition, the geographic [...]

(This preprint is made available under a CC-BY-NC 4.0 International license.)

Fig. 4: Same as Fig. 3, for relative times $t = 36$, $t = 48$, and $t = 60$ (dates June 27, July 9, and July 21, respectively).

[2] Y. Mammeri, "A reaction-diffusion system to better comprehend the unlockdown: Application of SEIR-type model with diffusion to the spatial spread of COVID-19 in France," 2020.

[9] B. Ivorra and A. M. Ramos, "Be-CoDiS: A mathematical model to predict the risk of human diseases spread between countries. Validation and application to the 2014-15 Ebola virus disease epidemic," 2014. https://arxiv.org/abs/1410.6153

[10] B. Ivorra and A. M. Ramos, "Application of the Be-CoDiS mathematical model to forecast the international spread of the 2019-20 Wuhan coronavirus outbreak," 2020. https://www.researchgate.net/profile/Benjamin_Ivorra/publication/338902549_Application_of_the_Be-CoDiS_mathematical_model_to_forecast_the_international_spread_of_the_2019-20_Wuhan_coronavirus_outbreak/links/5e40746e458515072d8dce67/Application-of-the-Be-CoDiS-mathematical-model-to-forecast-the-international-spread pdf

the covid-19 epidemic and implementation of population-wide interventions in italy

Temporal dynamics in viral shedding and transmissibility of covid-19

Adequacy of seir models when epidemics have spatial structure: Ebola in sierra leone

Spatial heterogeneity can lead to substantial local variations in covid-19 timing and severity

Cross-diffusion-induced patterns in an sir epidemic model on complex networks

Epidemiological model for the inhomogeneous spatial spreading of covid-19 and other diseases

Temporal dynamics in viral shedding and transmissibility of covid-19

Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences

Heatmaps progression in time of south carolina

these heat-maps correspond to cumulative infections density

Gridded population of the world, version 4 (gpwv4): Population density, revision 11. columbia university, center for international earth science information network, ciesin

Impact of opening and closing decisions by state

The governor of south carolina issued an executive order, which declares a new state of emergency, allows bowling alleys to open immediately, and lifts restrictions on the occupancy of retail establishments

The governor issued an executive order, which declares a new state of emergency, allows bowling alleys to open immediately, and lifts restrictions on the occupancy of retail establishments

Estimation Without Representation: Early Severe Acute Respiratory Syndrome Coronavirus 2 Seroprevalence Studies and the Path Forward

Prevalence of asymptomatic sars-cov-2 infection: a narrative review

Nationwide seroprevalence of antibodies against sars-cov-2 in israel

Infection units: A novel approach to the modeling of covid-19 spread