key: cord-0704850-p61mrumk
authors: El Deeb, O.
title: Spatial autocorrelation and the dynamics of the mean center of COVID-19 infections in Lebanon
date: 2020-10-25
journal: nan
DOI: 10.1101/2020.10.21.20217398
sha: 8a1d4f8587f7e07e73179dc0f38a8683ab3d6eff
doc_id: 704850
cord_uid: p61mrumk

In this paper we study the spatial spread of the COVID-19 infection in Lebanon. We inspect the spread of daily new infections across the 26 administrative districts of the country and apply Moran's I statistic to analyze the spatio-temporal clustering of the infection under weightings parameterized by adjacency, proximity, population, population density, poverty rate and poverty density. We find that, except for the poverty rate, the spread of the infection is clustered and associated with these parameters, with varying magnitude, over the period from July (geographic adjacency and proximity) or August (population, population density and poverty density) through October. We also determine the temporal dynamics of the geographic location of the mean center of new and cumulative infections since late March. The results allow for regionally and locally adjusted health policies and measures that would provide higher levels of public health safety in the country.

[...] the spatial mechanisms of this spread and its dependence on proximity, demographics and social characteristics of infected areas. Spatial analysis provides a better understanding of the routes of transmission of infections [4]; consequently, it allows decision-makers to draft and implement effective health and mitigation measures to reduce the risks associated with the pandemic.

In Lebanon, the first case was registered on February 21, 2020 [5], and by October 12, 54624 cases and 466 deaths had been registered [6]. The first few weeks witnessed a relatively rapid increase, but it declined sharply as a result of the strong mitigation measures enforced by the beginning of March.
The lifting of the international travel ban and the partial easing of measures led to a revival of higher spread rates from July onward. Only 1788 cases had been registered by July 1, 2020, before a sharp rise from July through October. The cases were mainly concentrated in Beirut, its suburbs and its neighboring areas in Mount Lebanon.

On August 4, a huge explosion rattled the port of Beirut and destroyed thousands of houses and buildings in the surrounding areas. People were rushed to hospitals, with thousands of injuries recorded on that day [7]. In the aftermath, hundreds of volunteers and civil defense teams were involved in rescue work for several days, and social distancing measures were largely neglected in the emergency. The spread accelerated in the following weeks, with a sharp rise in Beirut and its surroundings and a nationwide spread reaching all regions and major towns and cities [8].

Related Literature: Spatial autocorrelation is the statistical analysis of data studied in space or in space-time, aiming at the identification and estimation of spatial processes [9, 10]. It has been used to study the spread of various diseases and infections, including cancer, diabetes, SARS, influenza and COVID-19 [11, 12, 13, 14, 15]. Recent studies also inspected the effect of city size, population, transportation systems and demographics on the disease spread and its mortality rate [16, 17, 18, 19, 20]. The determination of the mean center of a population (centroid) was discussed in [21, 22, 23], and extending the concept to the mean center of wealth and infections allowed for a spatial analysis of the temporal dynamics of wealth distribution, economic growth and infectious diseases [24]. The dynamics of the COVID-19 outbreak in Lebanon and of its reproduction number were studied in [25, 26, 27].
In this paper, we study the clustering and spatial progression of new infections in Lebanon by applying the methods of spatial autocorrelation with different model parameterizations of geographic, demographic and social variables including adjacency, proximity, population, population density, poverty rate and poverty density. Locating the mean center of the epidemic spread as a function of time is used to analyze the temporal geographic development of the spread. The obtained results provide a solid basis for the concerned policy makers to draw well-grounded and scientifically based local and regional measures that would contribute to controlling the infection spread.

The paper is organized as follows: in section 2 we introduce the implemented analytic mathematical and statistical methods and tools. Results are presented and discussed in section 3, and section 4 concludes the paper.

medRxiv preprint doi: https://doi.org/10.1101/2020.10.21.20217398; this version posted October 25, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Moran's I index is an inferential statistic used to measure spatial autocorrelation based on both locations and feature values simultaneously. It is defined as [9]:

$$I = \frac{N}{\sum_i \sum_j W_{ij}} \, \frac{\sum_i \sum_j W_{ij} (X_i - \bar{X})(X_j - \bar{X})}{\sum_i (X_i - \bar{X})^2}$$

where $W_{ij}$ represents different types of adjacency between region $i$ and region $j$, corresponding to different models of infectious spread.
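As a minimal sketch of how this statistic can be evaluated, the Python function below assumes the standard form of Moran's I, $I = \frac{N}{\sum_{ij} W_{ij}} \frac{\sum_{ij} W_{ij}(X_i - \bar{X})(X_j - \bar{X})}{\sum_i (X_i - \bar{X})^2}$, with $N$ the number of regions and $\bar{X}$ the mean of the counts. The function name `morans_i`, the four-district chain and the infection counts are hypothetical illustrations and are not taken from the paper's data.

```python
def morans_i(x, w):
    """Moran's I for values x (length N) and an N x N weight matrix w."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]                 # X_i - X_bar
    w_sum = sum(sum(row) for row in w)            # sum_i sum_j W_ij
    cross = sum(w[i][j] * dev[i] * dev[j]         # sum_i sum_j W_ij dev_i dev_j
                for i in range(n) for j in range(n))
    var_sum = sum(d * d for d in dev)             # sum_i (X_i - X_bar)^2
    return (n / w_sum) * (cross / var_sum)


# Hypothetical toy data: four districts in a row, where W_ij = 1 for
# districts sharing a border and W_ij = 0 otherwise.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [10, 8, 2, 0]      # similar counts sit next to each other
alternating = [10, 0, 10, 0]   # high and low counts alternate

print(morans_i(clustered, chain))    # positive value: clustering
print(morans_i(alternating, chain))  # negative value: dispersion
```

Any of the weight matrices considered in the paper could be passed as `w`; only the binary border case is shown here.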
$N$ is the number of regions under consideration and $X_i$ represents the number of new daily infections in district $i$. $\bar{X}$ is the average number of new daily infections per region, given by

$$\bar{X} = \frac{1}{N} \sum_i X_i$$

The $z_I$-score associated with this statistic is defined by:

$$z_I = \frac{I - E[I]}{\sqrt{V[I]}}$$

where the expected value $E[I]$ and the variance $V[I]$ are defined in the Appendix. The z-score or the corresponding p-value of the statistic is used to reject the null hypothesis and to eliminate the possibility that a random pattern led to the obtained value of Moran's I statistic. In this paper, we take a 95% confidence level, corresponding to $|z_I| > 1.96$ or equivalently to $p < 0.05$, in order to confirm the outcome of clustering or dispersion of our spatial data indicated by $I$. In this case we say that the result is statistically significant, and based on the value of $I$ we can determine the pattern of the distribution (positive $I$ indicating clustering, negative $I$ indicating dispersion).

We consider a model with six different cases of parameterization of the adjacency matrix $W_{ij}$, corresponding to geographic adjacency (case I), proximity (case II), population (case III), population density (case IV), poverty rate (case V) and poverty density (case VI). Table 1 summarizes relevant data from the Lebanese districts.

In case I, we take $W_{ij} = 1$ for districts sharing common borders and $W_{ij} = 0$ otherwise, while in case II we set $W_{ij} = 1/d_{ij}$, where $d_{ij}$ is the driving distance between the administrative centers of regions $i$ and $j$. These two cases study the effect of administrative adjacency and of distance proximity between districts on the geographic clustering of new infections in Lebanon. In case III and case IV, we apply the methods used in [4, 28] to analyze the effects of
population and population density on the spread of the disease, since the virus is carried by people and its spread is expected to be related to their population and interaction. We sort the districts by the number of their residents (obtained from [29]) and then by the density of their residents relative to their areas. In these two cases, districts that are consecutive in the population or population density ranking are assigned $W_{ij} = 1$, and $W_{ij} = 0$ otherwise. This provides a statistic about the clustering of infections according to population and population density, respectively.

Lastly, in cases V and VI, we introduce new parameters, namely the poverty rate and the poverty density of the districts, and we analyze their effect on infection clustering. We sort the districts by their poverty rate and poverty density [29] and assign $W_{ij} = 1$ for regions of consecutive order of poverty rate or poverty density, and $W_{ij} = 0$ otherwise, following the same methodology as in cases III and IV, in order to infer the effect of poverty rate and poverty density on the geographical patterns of infection spread.

Mean center of infection

We denote the district number of infections (new or cumulative) by $X_i$ as defined above, and the Cartesian positions of the administrative centers by $(x_i, y_i, z_i)$. Then, the Cartesian position of the weighted mean of infections $\vec{r} = (\bar{x}, \bar{y}, \bar{z})$ is given by:

$$\vec{r} = \frac{1}{\sum_i X_i} \left( \sum_i X_i x_i, \; \sum_i X_i y_i, \; \sum_i X_i z_i \right)$$

As suggested by [21], the precise position on the surface of a sphere can be determined from the normalized position vector defined by

$$\hat{r} = (\hat{x}, \hat{y}, \hat{z}) = \frac{\vec{r}}{|\vec{r}|} = \frac{(\bar{x}, \bar{y}, \bar{z})}{\sqrt{\bar{x}^2 + \bar{y}^2 + \bar{z}^2}}$$
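The mean center construction of this subsection (infection-weighted Cartesian mean, normalization back onto the sphere, and conversion to latitude and longitude) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's implementation: it works on a unit sphere, takes the z-axis along the polar axis and the x-axis through the prime meridian, and the function name `mean_center` and the sample coordinates and counts are hypothetical.

```python
import math


def mean_center(lats_deg, lons_deg, counts):
    """Infection-weighted mean center on a unit sphere, as (lat, lon) in degrees."""
    total = sum(counts)
    x = y = z = 0.0
    for lat, lon, c in zip(lats_deg, lons_deg, counts):
        phi, lam = math.radians(lat), math.radians(lon)
        # Cartesian position of one administrative center, weighted by its count
        x += c * math.cos(phi) * math.cos(lam)
        y += c * math.cos(phi) * math.sin(lam)
        z += c * math.sin(phi)
    # Weighted mean position; in general it lies inside the sphere
    x, y, z = x / total, y / total, z / total
    # Normalize so that the point is projected back onto the surface
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    # Recover latitude and longitude from the unit vector
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    lon = math.degrees(math.atan2(y, x))
    return lat, lon


# Hypothetical example: two locations on the same parallel with equal counts.
lat, lon = mean_center([34.0, 34.0], [35.0, 37.0], [50, 50])
print(lat, lon)  # longitude is 36 by symmetry
```

Calling the function once per day with that day's counts (new or cumulative) would give the trajectory of the mean center over time.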
Consequently, we can recover the spherical position of the mean center of infections by calculating the mean latitude $\phi$ and longitude $\lambda$ as:

$$\phi = \arctan\left(\frac{\hat{z}}{\sqrt{\hat{x}^2 + \hat{y}^2}}\right), \qquad \lambda = \operatorname{atan2}(\hat{y}, \hat{x})$$

where $(\hat{x}, \hat{y}, \hat{z})$ are the components of the normalized position vector, with the usual convention that the z-axis points along the polar axis and the x-axis through the prime meridian. The latitude and the longitude can be located and plotted on maps and geographic information systems. We employ the spherical coordinates of geographic locations of [...]

[...] is the highest among all six studied cases.

The results of case V (Figure 4) show that the spatial spread cannot be attributed to adjacent ranking of poverty rates among the districts, since the p-values remain above the 5% significance level up until October 2020; hence no spatial clustering occurs. But when we consider the poverty density in case VI, we obtain positive values of Moran's I since the end of August, with $p < 0.05$ except for five days. Hence, spatial clustering among regions with adjacent ranking of poverty density occurs. The maximum attained $I$ in this case is 0.666.

In comparison, we find that clustering of new infections occurs starting on different dates between July and August for all considered cases except for case V, corresponding to the poverty rate. The strongest level of spatial clustering (highest $I$) occurs for case IV of population density after mid-August, while clustering associated to geographic [...]
In this paper we used Moran's I index with its associated z-score and p-value to study the spatial autocorrelation of registered new infections of COVID-19 in Lebanon. We introduced six different cases of parameterization of the spread, related to adjacency, proximity, population, population density, poverty rate and poverty density. We found that the poverty rate is not statistically relevant to the spatial spread of the disease, while geographic bordering, distance between district centers, number and density of residents and poverty density lead to clustering of the disease, with varying strengths and levels of confidence, from July or August through October. We also introduced methods to determine the geographic coordinates of the mean center of the infection, determined this center since April 2020, and plotted its variation over time up until October. The understanding of the spatial, demographic and geographic aspects of the disease spread
over time provides an essential basis for the relevant authorities to take more efficient decisions on local, inter-regional and intra-regional measures, thus contributing to increased social and health safety and security in the fight against the pandemic.

Archived: WHO Timeline - COVID-19, World Health Organization
Understanding the spatial diffusion process of severe acute respiratory syndrome in Beijing, Public Health
Disaster Risk Management Unit
Beirut explosion: What we know so far
Spatial Autocorrelation
Spatial Processes: Models and Applications
The spatial autocorrelation of cancer mortality
Spatial Analysis and Correlates of County-Level Diabetes Prevalence
Understanding the Spatial Clustering of Severe Acute Respiratory Syndrome (SARS) in Hong Kong
Local Spatial and Temporal Processes of Influenza in Pennsylvania
Spatiotemporal dynamics of the COVID-19 pandemic in the State of Kuwait
City size and the spreading of COVID-19 in Brazil
Associating COVID-19 Severity with Urban Factors: A Case Study of Wuhan
Spatial and temporal differentiation of COVID-19 epidemic spread in mainland China and its influencing factors
The role of transport accessibility within the spread of the Coronavirus pandemic in Italy
A multicriteria approach for risk assessment of Covid-19 in urban district lockdown
A new method for computing the mean center of population of the United States
New methods of geostatistical analysis and graphical presentation
Where are we? Comments on the Concept of Center of Population
Is the world's economic centre of gravity already in Asia?
The dynamics of COVID-19 spread: Evidence from Lebanon
Forecasting the outbreak of COVID-19 in Lebanon, 2020, medRxiv
Modeling and Simulation of the spread of coronavirus disease (COVID-19) in Lebanon
Spatial epidemic dynamics of the COVID-19 outbreak in China
Coordinate Systems and Map Projections

The author has no acknowledgements. Raw data was obtained from publicly available and cited resources. All data and code used are available upon request.

The expected value of Moran's I statistic is given by:

$$E[I] = -\frac{1}{N-1}$$

while its variance is defined as: [...] and $A$ and $B$ are given by: [...] Consequently, the $z_I$-score is given by

$$z_I = \frac{I - E[I]}{\sqrt{V[I]}}$$
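To illustrate how the $z_I$-score can be obtained from $I$, $E[I]$ and $V[I]$, the Python sketch below uses the textbook variance of Moran's I under the randomization assumption. This is an assumption on our part: the intermediate terms `s0` to `s5` follow the standard formulation and may be organized differently from the $A$ and $B$ terms of the appendix. The function name `morans_i_z_score`, the five-district chain and the counts are hypothetical.

```python
import math


def morans_i_z_score(x, w):
    """Return (I, E[I], V[I], z_I) for values x and an N x N weight matrix w.

    The variance is the standard randomization-assumption form (an assumption
    here, not confirmed to be the exact expression used in the paper).
    """
    n = len(x)
    if n < 4:
        raise ValueError("the variance formula needs at least 4 regions")
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    m2 = sum(d ** 2 for d in dev) / n             # second central moment
    m4 = sum(d ** 4 for d in dev) / n             # fourth central moment
    s0 = sum(sum(row) for row in w)               # total weight
    cross = sum(w[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    i_stat = (n / s0) * cross / (n * m2)          # Moran's I
    expected = -1.0 / (n - 1)                     # E[I]
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2
                   for i in range(n) for j in range(n))
    s2 = sum((sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2
             for i in range(n))
    s3 = m4 / m2 ** 2                             # kurtosis of the counts
    s4 = (n * n - 3 * n + 3) * s1 - n * s2 + 3 * s0 ** 2
    s5 = (n * n - n) * s1 - 2 * n * s2 + 6 * s0 ** 2
    variance = ((n * s4 - s3 * s5)
                / ((n - 1) * (n - 2) * (n - 3) * s0 ** 2)) - expected ** 2
    z = (i_stat - expected) / math.sqrt(variance)  # z_I
    return i_stat, expected, variance, z


# Hypothetical toy data: five districts in a row with binary border weights.
path = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
i_stat, expected, variance, z = morans_i_z_score([9, 7, 5, 1, 3], path)
print(i_stat, expected, variance, z)
print("significant at the 95% level:", abs(z) > 1.96)
```

With so few regions the toy pattern does not pass the $|z_I| > 1.96$ threshold even though $I$ is positive, which is why the paper reports the z-score or p-value alongside $I$.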