key: cord-0851475-phh53kaz authors: Wang, Qian; Dong, Wen; Yang, Kun; Ren, Zhongda; Huang, Dongqing; Zhang, Peng; Wang, Jie title: Temporal and spatial analysis of COVID-19 transmission in China and its influencing factors date: 2021-03-09 journal: Int J Infect Dis DOI: 10.1016/j.ijid.2021.03.014 sha: 61717f70403aaceddfb9e0af649c51c21cb62f73 doc_id: 851475 cord_uid: phh53kaz
OBJECTIVES: The purpose of this study was to explore the temporal and spatial characteristics of COVID-19 transmission and its influencing factors in China from January to October 2020.
METHODS: Data on about 81,000 confirmed COVID-19 cases, Baidu migration index data, air pollutant data, meteorological data, and government response strictness index data were collected for 31 provincial-level regions (excluding Hong Kong, Macao, and Taiwan) and 337 prefecture-level cities. The spatio-temporal characteristics of COVID-19 were explored using spatial autocorrelation, hot spot analysis, and spatio-temporal scanning statistics. Spearman rank correlation analysis and multiple linear regression were then used to explore the relationship between influencing factors and confirmed COVID-19 cases.
RESULTS: The distribution of COVID-19 in China tends to be stable over time, with spatial correlation and obvious clustering regions. Spatio-temporal scanning analysis showed that most high-incidence periods fell between January and March, at the beginning of the epidemic, and that the area with the highest aggregation risk was Hubei Province (RR = 491.57); that is, its aggregation risk was 491.57 times that of other regions. Among the meteorological variables, daily average temperature, wind speed, and precipitation were negatively correlated with new COVID-19 cases. The air pollution concentration and migration index were positively correlated with new confirmed cases, and the government response strictness index was strongly negatively correlated with confirmed COVID-19 cases.
CONCLUSIONS: Environmental temperature has a certain inhibitory effect on the transmission of COVID-19, whereas the air pollution concentration and migration index have a certain promoting effect. The government response strictness index indicates that the greater the intensity of government intervention, the fewer COVID-19 cases occur.

At the end of 2019, an outbreak of pneumonia caused by a novel coronavirus (SARS-CoV-2) appeared in Wuhan, China (Alberti and Faranda, 2020), with an expected incubation period of approximately 2-10 days (Li et al., 2020). Because of the persistence and wide spread of the epidemic, the development of COVID-19 has drawn increasing global attention. According to the COVID-19 Programme of the National Health and Family Planning Commission (version 4), the symptoms of COVID-19 are fever, fatigue, and dry cough, and some patients also have nasal congestion, runny nose, and diarrhea. Early in the COVID-19 outbreak in China, large population flows and gatherings could accelerate the spread of COVID-19, posing a serious threat to human life. The Chinese government quickly adopted emergency measures to reduce and prevent the spread of the virus: on 23 January 2020 it announced the lockdown of Wuhan, halted the city's public transportation, and imposed unprecedented restrictions on personal mobility. One month later, the effect of isolation gradually emerged, showing that strict restrictions on population movement can play a positive role in curbing the spread of the epidemic (Manevski et al., 2020). As of October 2020, there were more than 39.96 million cumulative confirmed cases of COVID-19 globally. The COVID-19 situation in China has been brought under control and has significantly improved, but with confirmed cases rising rapidly worldwide, the global epidemic shows no signs of slowing down.
It is essential to support global cooperation and concerted prevention and control of COVID-19 (Lazarus et al., 2020). Therefore, it has become an urgent scientific issue to grasp the temporal and spatial changes of COVID-19 transmission and clarify its driving mechanism. Shortly after the outbreak of COVID-19, scholars conducted extensive research on the epidemic from the perspectives of pathogenesis, virology, biology, and clinical medicine and achieved fruitful results, providing an important scientific basis for the prevention and control of COVID-19. Xie and Zhu (2020) explored the nonlinear relationship between ambient temperature and confirmed COVID-19 cases using a generalized additive model; the results indicated that higher temperature might not limit the transmission of this novel coronavirus. Zhu Yongjian et al. (2020) explored the relationship between ambient air pollutants and daily newly confirmed cases of COVID-19 and found a statistically significant relationship between air pollution and COVID-19 infection: short-term exposure to high concentrations of PM2.5, PM10, CO, NO2, and O3 was associated with an increased risk of COVID-19 infection. Sannigrahi et al. (2020) found a strong positive correlation between income/population and COVID-19 cases/deaths, suggesting that these two factors may be key control variables for determining overall human casualties caused by COVID-19 in European countries. Wu et al. (2020) used a mathematical model based on confirmed COVID-19 cases and residents' travel by train, plane, and road to predict the trend of the epidemic; the results showed that about 75,815 people in Wuhan would be infected in the early stage of the epidemic.
Chen et al. (2020) studied the correlation between the migration index and the number of confirmed COVID-19 cases from 23 January 2020 to 12 February 2020. The results suggest that the lockdown of Wuhan and the activation of the first-level emergency response to this major public health emergency may have played a positive role in controlling COVID-19 (Saqib, 2020). Although some scholars have revealed the transmission law of COVID-19 from the perspective of geography, they considered too narrow a set of factors when discussing what drives transmission and did not examine the influencing factors comprehensively. As a result, they could not provide an in-depth understanding of the space-time pattern and influencing factors of the COVID-19 epidemic, which is of great significance for the prevention and control of the epidemic. In this study, the number of newly confirmed cases per day in mainland China was first used as the measurement index, and spatial statistics and spatio-temporal scanning methods were used to describe the spatio-temporal distribution of epidemic transmission. Second, traditional statistical methods were used to identify the key factors affecting the spread of the COVID-19 epidemic from the two aspects of social factors and natural factors, to provide a scientific basis for clarifying the spread law of the epidemic and formulating relevant prevention and control measures (Booth et al., 2020).

The study used data sets from five different sources, including confirmed COVID-19 case data, Baidu migration index data, air quality and meteorological data, and government response strictness index data.
Cumulative and daily confirmed COVID-19 case data for 31 provinces (excluding Hong Kong, Macao, and Taiwan) and 337 prefecture-level cities in China from January to October 2020, about 81,000 entries in total, were collected from the National Health Commission, PRC (http://www.nhc.gov.cn/) and from https://ncov.dxy.cn/. ArcGIS software was used to visualize the epidemic situation, with the GCS_Beijing_1954 coordinate system and the LAMBERT_CONFORMAL_CONIC projection. The migration scale index data were derived from Baidu migration data (http://qianxi.baidu.com/). This study uses the population migration scale indicators of 339 prefecture-level cities in China, including intra-city travel intensity, emigration index, and immigration index. Average daily air quality data for the study period were extracted from the online air quality monitoring and analysis platform (https://www.aqistudy.cn/), including the air quality index (AQI), sulfur dioxide (SO2), nitrogen dioxide (NO2), particulate matter with aerodynamic diameter <10 µm (PM10) and <2.5 µm (PM2.5), carbon monoxide (CO), and ozone (O3). The meteorological data come from the China Meteorological Data Service Center (http://data.cma.cn/en) and include average temperature, air pressure, precipitation, and wind speed. The government response strictness index (www.bsg.ox.ac.uk/covidtracker) comes from the Oxford COVID-19 Government Response Tracker (OxCGRT), which provides a systematic set of measures across countries and over time for understanding how government responses evolved over the course of the epidemic (Sannigrahi et al., 2020). It tracks government policies and interventions through a series of standardized indicators, such as school closures, travel restrictions, and bans on public gatherings, covering efforts to contain the spread of the virus, strengthen health systems, and manage the economic consequences of these actions (Kim and Castro, 2020).
The spatial statistics and modeling tools of ArcGIS 10.6 were used to test whether the confirmed provincial and municipal cases had significant global or local spatial autocorrelation. Spatial autocorrelation, which is measured from both feature locations and feature values, can be divided into global spatial autocorrelation and local spatial autocorrelation (Eryando et al., 2020). The global Moran's I statistic was used to assess whether the cumulative number of confirmed COVID-19 cases in each region is spatially correlated. Moran's I is assumed to be approximately normally distributed, and its significance is tested with a Monte Carlo random permutation procedure (Briz-Redon and Serrano-Aroca, 2020). I ranges from -1 (clustering of dissimilar values) to +1 (clustering of similar values), with 0 indicating that there is no spatial autocorrelation. The greater the absolute value of I, the stronger the spatial autocorrelation. In its standard form, the statistic is

$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

where $n$ is the total number of samples, $x_i$ is the number of confirmed cases in region $i$, $\bar{x}$ is the mean, and $w_{ij}$ is the element of the spatial weight matrix $W$ for regions $i$ and $j$. The cross-product $(x_i - \bar{x})(x_j - \bar{x})$ acts as a covariance term, and weighting it by the adjacency matrix $W$ restricts the sum to adjacent regions. The value of I therefore depends on the signs of the deviations from the mean at locations $i$ and $j$: if adjacent values $x_i$ and $x_j$ deviate from the mean in the same direction, the contribution to I is positive; otherwise it is negative. A Z-score test is then carried out, with the score $Z_I$ given by

$$Z_I = \frac{I - E[I]}{\sqrt{\mathrm{Var}[I]}}, \qquad E[I] = -\frac{1}{n-1}$$

Local spatial autocorrelation is usually characterized by local Moran's I and Getis-Ord Gi*. Local Moran's I is the decomposition of global Moran's I into sub-regional units. In this study, ArcGIS was used to detect the local spatial autocorrelation characteristics of COVID-19 cases and to identify areas with significant high or low aggregation.
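The global Moran's I described above can be illustrated with a minimal sketch. This is not the ArcGIS implementation used in the study: the function names, the binary adjacency weights, the one-sided permutation test, and the five-region toy data are all assumptions made for illustration.

```python
import random


def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]               # deviations x_i - mean
    s0 = sum(sum(row) for row in weights)          # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))  # weighted cross-products
    den = sum(d * d for d in dev)                  # total squared deviation
    return (n / s0) * (num / den)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value: share of random relabelings with I >= observed."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Hypothetical example: five regions in a row, neighbors share a border.
weights = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
cases = [100, 80, 20, 5, 1]        # made-up cumulative case counts

i_clustered, p_value = permutation_p_value(cases, weights)
i_alternating = morans_i([1, 0, 1, 0, 1], weights)
print(round(i_clustered, 3), p_value, i_alternating)
```

In this toy layout, counts that decline smoothly along the row give a positive I, while a strictly alternating pattern gives I = -1, matching the interpretation of the two ends of the scale.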
In addition, hot spot analysis (Getis-Ord Gi*) was used to identify statistically significant hot spots and cold spots of COVID-19 cases at different spatial scales. A retrospective spatio-temporal scanning statistical method was also used to bring a time dimension into the analysis and to determine where and when clusters occurred, and for how long (Vadrevu et al., 2020). Because COVID-19 is transmitted from person to person, and more cases are likely to occur in densely populated areas, we selected the Poisson model in SaTScan, which takes the population at risk in each area into account. The scanning window is a cylinder whose circular base is centered on each candidate location in the study area, with the radius changing continuously from zero to a specified maximum, and whose height represents time (Vadrevu et al., 2020). Because the disease is infectious and the number of daily cases increases rapidly, the time interval of this study is one day. To avoid excessively large (and therefore meaningless) clusters, the maximum spatial cluster size was set to 15% of the population at risk, and the maximum temporal cluster size was set to 35%. Second, the Spearman rank correlation analysis method was used to analyze the relationship between confirmed COVID-19 cases and the migration index, air quality and meteorological data, and government response strictness index. Descriptive statistics showed that the COVID-19 case data and related variables did not meet the preconditions of Pearson correlation analysis, mainly because of non-Gaussian distributions, spatial autocorrelation, and possible nonlinear relations. In general, Spearman's rank correlation is an appropriate nonparametric estimator of the correlation between two variables with unknown or non-Gaussian statistical distributions, and the relationship between these variables does not need to be linear.
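To make the rank-based idea concrete, the following sketch computes Spearman's coefficient as the Pearson correlation of the ranks, using average ranks for ties. The function names and the temperature and case values are hypothetical and are not taken from the study's data; the study itself does not state which software computed the coefficients.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up daily values for illustration only.
temperature = [2, 5, 9, 14, 20, 25]
new_cases = [900, 400, 450, 120, 30, 10]

rho = spearman_rho(temperature, new_cases)
print(round(rho, 4))
```

Because only the ordering of the values matters, a monotone but nonlinear relationship (for example, x against x cubed) still yields a coefficient of 1, which is why the method suits variables that are not normally distributed.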
Spearman's rank correlation is usually measured in terms of the rank correlation coefficient ρ, which, when there are no tied ranks, is given by

$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^{2}}{n(n^{2}-1)}$$

where $d_i$ is the difference between the ranks of the two variables for observation $i$ and $n$ is the number of observations.

ArcGIS software was used to classify the cumulative number of confirmed COVID-19 cases in China into the following five categories: 1-50, 51-100, 101-500, 501-1000, and >1000 (Fig. 1). As can be seen from Fig. 1, as of 31 January, at the beginning of the outbreak, 15, 2, and 1 cities fell in the 101-500, 501-1000, and >1000 ranges, respectively. Hubei Province (7037 cases) accounted for 61.43% of the total cases, and Wuhan city (3215 cases) had the most confirmed cases, accounting for 45.69% of the total cases in Hubei Province. In terms of spatial distribution, the regions with a high number of confirmed cases were mostly located around Hubei Province, indicating that the epidemic was concentrated in and around Hubei. By 29 February, the numbers of cities in the 101-500 (29), 501-1000 (7), and >1000 (8) ranges had risen sharply, reflecting a significant expansion of the geographical scope of the epidemic. Except for the 1-50 interval, the share of cities in every other interval was on the rise, indicating that the epidemic had reached its outbreak stage. In terms of spatial distribution, the areas with the highest cumulative numbers of confirmed cases showed a clear trend toward continuous distribution, concentrated mainly in and around Wuhan and in economically developed cities (for example, Shanghai and Beijing). Areas with fewer cases (for example, Qinghai, Tibet, and Xinjiang) remained relatively stable, so the differences between regions were increasing. Owing to the closure of cities in February, the epidemic was effectively brought under control. From 31 March to 30 April, except for Beijing, Shanghai, and Guangdong, the cases in other regions remained basically unchanged, which means that the spread of the epidemic had been initially contained.
The rapid increase of cases in Urumqi, Xinjiang (552 cases as of 31 July) was effectively controlled by 31 August.

To investigate whether confirmed COVID-19 cases in China were spatially correlated, the cumulative number of confirmed COVID-19 cases was taken as the variable, a spatial weight matrix based on geographical adjacency was selected, and the global Moran's I index, P value, and Z score were used to assess how case counts relate to those of neighboring areas (Vadrevu et al., 2020). Fig. 2 shows the Moran's I index and Z scores of the cumulative confirmed COVID-19 cases in China's prefecture-level cities from 15 January to 1 October. On 15 January, the global spatial autocorrelation was not significant (p > 0.05, Z < 1.96), whereas on 23 January there was a significant global spatial autocorrelation (p < 0.05, Z > 1.96). From 31 January to 1 October, the cumulative confirmed cases at the prefecture level showed a significant global spatial autocorrelation (p < 0.0001, Z > 9.58), indicating a very significant spatial dependence. In Fig. 2, the Moran's I index changes in two stages, first increasing and then decreasing from 23 January, which suggests that this date may be a turning point. This means that although the degree of clustering is lower than before, the global spatial correlation is still dominated by clustering, with a tendency toward dispersion. Unlike global spatial autocorrelation, local spatial autocorrelation analysis deals with heterogeneous regions. Clustering and outlier analysis was conducted on the cumulative confirmed COVID-19 cases in China's prefecture-level cities from 15 January to 1 October.
The results were shown in March, Anqing city in Anhui Province became an extremely important hot spot area again, and the hot spot area remained unchanged after April (Fig. 4).

Spatio-temporal clustering of confirmed COVID-19 cases was explored using the spatio-temporal scanning analysis method, and a total of 6 clustering areas covering 17 provinces were detected with SaTScan software (Fig. 5). The actual number of COVID-19 cases was abnormally high compared with the expected number during the period from 27 January 2020 to 1 March 2020, indicating high-incidence aggregation of COVID-19. Scanning analysis on a daily basis showed that 6 April to 19 April was a high-incidence period in Heilongjiang Province and 27 July to 2 August was a high-incidence period in Xinjiang, while the remaining high-incidence periods were concentrated in the early phase of the epidemic from January to March. In addition, the relative risk (RR) and log-likelihood ratio (LLR) of the first three aggregation regions were all >3, with all P < 0.05. The aggregation region with the largest RR (491.57) was located in Hubei Province; that is, the aggregation risk of this region was 491.57 times that of other regions, and its LLR value (280453.82) was also the highest (Table 1).

Numbers and then obtains the average value of these specific policies, such as school closures, business closures, cancellation of public events, and generalized codes to represent the scope of specific policies (Sannigrahi et al., 2020). The index is, therefore, a good indicator of the government's response to the current crisis. Table 2 summarizes descriptive statistics for COVID-19 confirmed cases and government response strictness index variables.

4.79 knots = 2.46 m/s, 0.14 inches = 3.556 mm, and 9.89 knots = 5.08 m/s, respectively. The average daily immigration index (IM), emigration index (EM), and inner-city travel intensity (Inner) were 0.83, 0.83, and 3.99, respectively.
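For readers who want to relate the RR and LLR values reported for the scan clusters (Table 1) to observed and expected counts, the sketch below uses the standard Poisson scan statistic definitions. The function names and the counts are hypothetical and are not taken from the study's data; SaTScan's own output remains the reference for the reported values.

```python
import math


def relative_risk(observed, expected, total):
    """Risk inside the window divided by risk outside it.

    observed: cases inside the candidate cluster
    expected: cases expected inside it under the null hypothesis,
              scaled so that expected counts over all areas sum to `total`
    total:    all cases in the study area and period
    """
    inside = observed / expected
    outside = (total - observed) / (total - expected)
    return inside / outside


def poisson_llr(observed, expected, total):
    """Log-likelihood ratio for a high-rate cluster under the Poisson model."""
    if observed <= expected:
        return 0.0                      # not an excess of cases
    llr = observed * math.log(observed / expected)
    if total > observed:                # the outside term vanishes when empty
        llr += (total - observed) * math.log(
            (total - observed) / (total - expected))
    return llr


# Made-up counts: 300 of 1000 cases fall in a window where 50 were expected.
rr = relative_risk(300, 50, 1000)
llr = poisson_llr(300, 50, 1000)
print(round(rr, 3), round(llr, 2))
```

A scan evaluates this ratio for every candidate cylinder and reports the windows with the largest LLR, with significance obtained by Monte Carlo replication rather than from the LLR value alone.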
As the variables are not normally distributed, the Spearman rank correlation test was performed on the variables. Table 4 shows the correlation coefficients between the daily new COVID-19 cases and the air pollution concentration and meteorological variables. Among the meteorological variables, daily average temperature, wind speed, and precipitation were negatively correlated with daily new cases (Zhu Liting et al., 2020). Maximum wind speed was positively correlated with newly confirmed cases. The air pollution indexes (CO, PM2.5, PM10, and SO2), air quality index (AQI), and migration index were positively correlated with newly confirmed cases. International travel control (C8) in the government response stringency index was strongly negatively correlated with confirmed cases of COVID-19, with P < 0.05, which was statistically significant. Table 5 reports the same analysis with air pollution concentrations and meteorological data lagged relative to the date on which new COVID-19 cases were reported (lag 0, lag 03, lag 07, and lag 014, i.e., covering the incubation period of COVID-19). After collinear variables were excluded, the multiple linear regression results showed that temperature and international travel control were strongly negatively correlated with confirmed COVID-19 cases, maximum wind speed had a weak negative correlation, and inner-city travel intensity was positively correlated. The regression equation is y = -0.749*TEMP - 0.300*gust + 0.434*Inner - 0.774*c8 (Table 6).

China and the key influencing factors of its transmission, which will be of great significance to contain the spread of the COVID-19 epidemic, to provide a reference for the formulation of public health policies, and to promote production recovery.
The mining of geospatial information can not only reveal the spatio-temporal transmission and clustering characteristics of the epidemic but also find the spatial risk factors that strongly influence its spread and identify hot spots with high transmission risk, which is of great significance for the scientific prevention and control of the epidemic. This paper shows the spatial distribution of the cumulative confirmed cases of COVID-19: the areas with more cases have an obvious continuous distribution trend, diffusing outward from Hubei Province, while the areas with fewer cases have a relatively stable coverage. Moran's I is an important index for measuring spatial correlation and can be divided into global Moran's I and local Moran's I. The global Moran's I can only indicate whether the data are spatially clustered, not where the clusters are located, whereas the local Moran's I indicates where the outliers or clusters appear. Hafner (2020) fitted a spatial autoregressive model to the global COVID-19 propagation process and found that the estimated spatial correlation was highly significant, which is consistent with our results. As shown in Section 3.2, the global Moran's I index, P value, and Z score were used to determine that there was a significant spatial association among the confirmed COVID-19 cases in China. Second, local Moran's I helped to detect significant clustering and outlier areas at different spatial levels in China, and hot spot analysis was used to identify hot spots and cold spots of COVID-19 cases at different spatial scales. As far as China is concerned, the first-generation epidemic transmission pattern was from the Huanan (South China) Seafood Market to the whole city of Wuhan.
In the second-generation transmission pattern, the epidemic spread from Wuhan to the counties and cities in Hubei Province, to key cities outside Hubei Province, and abroad, and it then spread further within those key cities. In the third-generation transmission pattern, cases were imported into China from other countries. The results show that the closer an area is to the high-risk area, the greater its risk, especially at the prefecture-level city scale. This finding can help governments at all levels to take effective classified prevention and control measures.

This paper also has some limitations. The factors that affect the spread of the epidemic are very complex. Based on the available data, this paper constructs a multi-factor indicator system for the epidemic (Pei et al., 2020), but non-quantitative indicators may have been overlooked, which limits the evaluation of the study results. Without access to detailed information (such as age, sex, medical history, and smoking status) about the patients diagnosed with COVID-19, it is impossible to determine how underlying health problems may have contributed to COVID-19 infection, which may have weakened the study results. In addition, regional differences in medical capacity and socioeconomic status may also affect the number of COVID-19 patients. Nevertheless, attempts to analyze possible external environmental impacts are important for the protection of healthcare professionals and the containment of the COVID-19 epidemic. It may be helpful for future studies to consider the epidemiological parameters and social context of COVID-19 in more detail.

Based on the multi-scale open data of the COVID-19 epidemic in 337 prefecture-level cities in China, a new geographic database was established for the study of the COVID-19 epidemic.
In addition, this paper used ArcGIS spatial statistical methods to analyze the spatio-temporal pattern of the COVID-19 epidemic and to explore the factors influencing its outbreak and transmission in China from January to October 2020, and reached the following conclusions. (1) Global spatial autocorrelation confirmed that there was a spatial correlation among the confirmed cases of COVID-19 in China, and the correlation first increased and then decreased. In the local spatial autocorrelation, however, the correlation characteristics tended to be stable over time and were mainly composed of high/low aggregation regions. The hot spots also stabilized over time in the provinces surrounding Hubei (Henan, Hunan, Anhui, and Jiangxi). (2) The spatio-temporal clustering, high-incidence areas, and high-incidence periods of confirmed COVID-19 cases were explored using the spatio-temporal scanning analysis method. Hubei Province (27 January to 1 March 2020) had the highest RR (491.57); that is, the clustering risk in this region was 491.57 times that of other regions. In conclusion, the transmission rate of COVID-19 in China has obvious spatial variation, and the spatio-temporal aggregation is also obvious. The results show that population migration, air pollution concentration, and temperature all influenced the spread of COVID-19: migration and air pollution were positively associated with new cases and temperature negatively. The government response strictness index, that is, the related policy-making, had an inhibitory effect on the spread of COVID-19: the greater the intensity of government intervention, the fewer the cases. These findings can help to prevent further spread and to inform mitigation strategies for new outbreaks.

The study was supported by the National Natural Science Foundation of China (Grant No. 41661087).

This study did not require ethical approval as the analysis was based on publicly available data.
On the uncertainty of real-time predictions of epidemic growths: A COVID-19 case study for China and Italy
Development of a prognostic model for mortality in COVID-19 infection using machine learning
A spatio-temporal analysis for exploring the effect of temperature on COVID-19 early evolution in Spain
Correlation between the migration scale index and the number of new confirmed coronavirus disease 2019 cases in China
The Risk Distribution of COVID-19 in Indonesia: A Spatial Analysis
Role of the chronic air pollution levels in the Covid-19 outbreak risk in Italy
The Spread of the Covid-19 Pandemic in Time and Space
Impacts of nationwide lockdown due to COVID-19 outbreak on air quality in Bangladesh: a spatiotemporal analysis. Air Quality Atmosphere and Health
Spatiotemporal pattern of COVID-19 and government response in South Korea (as of May 31, 2020)
COVID-SCORE: A global survey to assess public perceptions of government responses to COVID-19 (COVID-SCORE-10)
Multivariate Analysis of Black Race and Environmental Temperature on COVID-19 in the US
Impact of city lockdown on the air quality of COVID-19-hit of Wuhan city
Impact of meteorological factors on the COVID-19 transmission: A multi-city study in China
Modeling COVID-19 pandemic using Bayesian analysis with application to Slovene data
Response of major air pollutants to COVID-19 lockdowns in China
A global analysis on the effect of temperature, socio-economic and environmental factors on the spread and mortality rate of the COVID-19 pandemic
The changing patterns of COVID-19 transmissibility during the social unrest in the United States: A nationwide ecological study with a before-and-after comparison
Examining the association between socio-demographic composition and COVID-19 fatalities in the European region using spatial regression approach
Forecasting COVID-19 outbreak progression using hybrid polynomial-Bayesian ridge regression model
Spatial and temporal variations of air pollution over 41 cities of India during the COVID-19 lockdown period
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
Association between ambient temperature and COVID-19 infection in 122 cities from China
Meteorological impact on the COVID-19 pandemic: A study across eight severely affected regions in South America
Association between short-term exposure to air pollution and COVID-19 infection: Evidence from China