key: cord-322571-8u4c2xqg
authors: Sannigrahi, Srikanta; Pilla, Francesco; Basu, Bidroha; Basu, Arunima Sarkar; Molter, Anna
title: Examining the association between socio-demographic composition and COVID-19 fatalities in the European region using spatial regression approach
date: 2020-08-01
journal: Sustain Cities Soc
DOI: 10.1016/j.scs.2020.102418
sha:
doc_id: 322571
cord_uid: 8u4c2xqg

Socio-demographic factors have a substantial impact on the overall casualties caused by the Coronavirus (COVID-19). In this study, the global and local spatial association between the key socio-demographic variables and COVID-19 cases and deaths in the European regions was analyzed using spatial regression models. A total of 31 European countries were selected for modelling and subsequent analysis. From the initial 28 demographic variables, a total of 2 (for COVID-19 cases) and 3 (for COVID-19 deaths) key variables were selected for the regression modelling. The spatially explicit regression modelling and mapping were done using four regression models: Geographically Weighted Regression (GWR), Spatial Error Model (SEM), Spatial Lag Model (SLM), and Ordinary Least Square (OLS). Additionally, Partial Least Square (PLS) and Principal Component Regression (PCR) were performed to estimate the overall explanatory power of the regression models. For COVID-19 cases, the local R² values, which indicate the influence of the selected demographic variables on COVID-19 cases and deaths, were highest in Germany, Austria, Slovenia, Switzerland, and Italy. A moderate local R² was observed for Luxembourg, Poland, Denmark, Croatia, Belgium, and Slovakia. The lowest local R² values for COVID-19 cases were found for Ireland, Portugal, the United Kingdom, Spain, Cyprus, and Romania. Among the 2 variables, the highest local R² was calculated for income (R² = 0.71), followed by poverty (R² = 0.45).
For COVID-19 deaths, the highest association was found in Italy, Croatia, Slovenia, and Austria. A moderate association was documented for Hungary, Greece, Switzerland, and Slovakia, and a lower association was found in the United Kingdom, Ireland, the Netherlands, and Cyprus. This suggests that the selected demographic and socio-economic components, including total population, poverty, and income, are key factors in regulating the overall casualties of COVID-19 in the European region. This study found that the demographic composition, as well as the key socio-economic determinants of a country, predominantly controls the high rate of mortality and casualties caused by COVID-19. The influence of other controlling factors, such as environmental conditions, socio-ecological status, and climatic extremity, has not been considered in this study and could be the scope of future research.

The global pandemic of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a global health concern because of its unpredictable nature and the lack of adequate medicines (WHO, 2020; Ma et al., 2020; Gorbalenya et al., 2020). Since no medicine is yet available to treat this novel disease, mortality and casualties due to COVID-19 have risen sharply worldwide since its first emergence in December 2019 in Wuhan, China. However, according to WHO (2020), the rate of COVID-19 deaths depends on the immune system of a person, as most of the [...] in New York City's neighborhoods. That study found that occupation substantially explained the observed COVID-19 patterns, as people with high levels of social outreach and social interaction were more vulnerable to infection by the virus.
Several other studies have also evaluated the association between COVID-19 cases and deaths across the globe and explanatory variables such as neighborhood characteristics (Borjas, 2020); age structure (Dowd et al., 2020; Kulu & Dorey, 2020); psychological interventions (Duan & Zhu, 2020); pre-existing health records; population flows and control measures; and the influence of social and economic ties (Mogi & Spijker, 2020). This study further advances the assessment of the impact of demographic and socioeconomic parameters on the spread of COVID-19 cases and deaths across Europe by adopting spatial regression-based approaches. Spatial regression models have been used extensively in epidemiological studies across a range of scales. Diuk-Wasser et al. (2006) evaluated the spatial distribution of mosquito vectors for West Nile virus in Connecticut, USA, using logistic regression models. Kauhl et al. (2015) evaluated the spatial distribution of Hepatitis C virus infections and associated determinants using a Geographically Weighted Poisson Regression (GWPR) model. Kauhl et al. (2015) also advocated the use of Geographic Information Systems (GIS) and spatial epidemiological methods for providing viable screening interventions by identifying spatial hotspots/clusters as well as demographic and socio-economic determinants that have a strong association with the casualties caused by the virus. Linard et al. (2007), in a study of the geographic distribution of Puumala virus and Lyme borreliosis infections in Belgium, found that environmental and socio-economic factors play a crucial role in controlling the spatial variation in disease risk.
Mollalo et al. (2020) performed GIS-based spatial modelling to evaluate the impact of socioeconomic, behavioural, environmental, topographic, and demographic factors on COVID-19 incidence in the continental United States and found that explanatory variables including income inequality, median household income, the proportion of black females, and the proportion of nurse practitioners largely control the spatial distribution of COVID-19 cases in the USA. Malesios et al. (2020) evaluated the spatiotemporal evolution patterns of the Bluetongue virus outbreak on the island of Lesvos, Greece, and found a strong spatial autocorrelation between the spread of Bluetongue virus and farms located nearby. Traditional statistical approaches, including principal component analysis (Varraso et al., 2012), clustering (Merlo et al., 2006), factor analysis (Meigs, 2000), single/multiple regression (Blyth et al., 2001), and multivariate regression (Lewis and Ward, 2013), have been used extensively in epidemiological studies to identify the determinants that regulate the incidence, prevalence, and overall mortality caused by viruses. However, all these traditional statistical approaches are based on one fundamental assumption: that the samples used in these models are independent of one another (Kauhl et al., 2015). This classical assumption, which ignores spatial dependency in parameter estimates, makes these approaches unreliable when the observations are spatially dependent. On the other hand, spatial regression models (SRM), such as the spatial lag model (SLM), spatial error model (SEM), spatial autoregressive model (SAM), spatial Durbin model (SDM), and geographically weighted regression (GWR), were found to be highly effective and reliable when variables are locally varying, spatially dependent, and autocorrelated.
Unlike ordinary regression, the spatial regression approach considers the spatial autocorrelation among the observations. Moreover, spatial regression models can effectively estimate the influence of independent factors on target variables by differentiating the spatial dependence through the lag and error components of independent features (Kauhl et al., 2015; Yang and Jin, 2015). These functional capabilities make spatial regression models a promising alternative for spatial epidemiological studies. Only a few studies have so far investigated the association between socio-demographic determinants and the spread of COVID-19 using a spatial regression approach. Therefore, this study attempts to address this research gap and to provide effective solutions for future preparedness for COVID-19-like situations. The main objectives of this study are: (1) to identify the key socio-demographic driving factors that have a substantial impact on the overall pattern of COVID-19 casualties; and (2) to implement global and local spatial regression models to assess the spatial association between the driving factors and COVID-19 cases/deaths.

The COVID-19 cases and deaths data were retrieved for the period from 31st December 2019 to 29th April 2020 from the European Union Open Data Portal (https://data.europa.eu/euodp/en/data/dataset/covid-19-coronavirus-data). A few European countries (Albania, Andorra, Bosnia and Herzegovina, Czech Republic, Faroe Islands, Guernsey, Jan Mayen, Jersey, Liechtenstein, Macedonia, Monaco, Montenegro, San Marino, Serbia, and Turkey) were discarded from the analysis due to data unavailability. COVID-19 cases and deaths per 100,000 population were considered for the predictive modelling and subsequent interpretation. The socio-demographic data for the European region were collected from Eurostat. Initially, a total of 28 socio-demographic variables were identified for regression modelling.
The description of the variables is given in Table S1. These variables have gone through all types of horizontal and cross-sectional adjustments to make the data comparable between different countries. The European Union Statistics on Income and Living Conditions (EU-SILC) usually provides two types of data: (i) cross-sectional data, which take a particular time or time frame into consideration, and (ii) longitudinal data, concerning the changes of individual components over time. The detailed background methodology on how these predictors were computed can be found on Eurostat. Using the stepwise forward regression approach, a total of 2 (for COVID-19 cases) and 3 (for COVID-19 deaths) variables were selected for the final analysis. The log transformation approach was adopted to address the scale effect and skewness in the datasets. To further check data normality, four different tests were employed, i.e., Shapiro-Wilk, Anderson-Darling, Lilliefors, and Jarque-Bera. All four tests collectively indicate that the data are consistent with a normal distribution; hence, we did not reject the null hypothesis, which assumes that "the variable from which the sample was extracted follows a normal distribution." For the COVID-19 deaths, a four-parameter model was developed using three explanatory variables: income (Inc), poverty (Pov), and total population (TotPop). For the COVID-19 cases, a three-parameter regression model was developed by incorporating 2 variables, i.e., income (Inc) and poverty (Pov). These filtered variables had acceptable (<2) variance inflation factor (VIF) values and explained substantial model variance. All these variables exhibited spatial non-stationarity and hence produced a spatially dependent output in different modelling set-ups.
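The variable screening described above (log transformation followed by a VIF threshold of 2) can be illustrated with a minimal sketch. This is not the authors' code: the function name `variance_inflation_factors`, the synthetic data, and the random seed are assumptions made purely for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (n_samples x n_features).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing
    column j on all remaining columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    vifs = []
    for j in range(m):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r_squared))
    return np.array(vifs)


# Hypothetical, made-up values standing in for three skewed predictors
# observed in 31 countries (NOT the study's data).
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=3.0, sigma=0.5, size=(31, 3))
logged = np.log(raw)                 # log transform to reduce skewness
vif = variance_inflation_factors(logged)
keep = vif < 2.0                     # threshold quoted in the text
```

Variables whose VIF exceeds the threshold would be candidates for removal before fitting the regression models; the stepwise selection itself is not reproduced here.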
Additionally, partial least square regression (PLSR) and principal components regression (PCR) modelling were conducted to identify the key variables and to develop multivariate regression models for COVID-19 cases and deaths. All four spatial regression models produced statistically significant estimates at different probability levels. Additionally, both global (spatial autocorrelation) and local (Getis-Ord Gi* hotspot/cold spot) analyses were carried out to evaluate the global and local distribution and significance of the features. Different spatial dependence tests, including Moran's I (error), Lagrange Multiplier (lag), Robust LM (lag), Lagrange Multiplier (error), Robust LM (error), and Lagrange Multiplier (SARMA), were performed to evaluate the spatial dependencies in the observations and the relevance of spatial regression modelling in this study. Spatial regression models (SRM) have been used extensively for demographic pattern analysis (Chi & Zhu, 2008), estimating land surface temperature (Jain et al., 2019; Chakraborti et al., 2018), urban air quality monitoring (Fang et al., 2015), and ecosystem service valuation (Sannigrahi et al., 2020a; Sannigrahi et al., 2020b). The specific application of spatial regression models is to understand spatial effects such as spatial autocorrelation, spatial stationarity, and heterogeneity of feature distribution. In this study, a total of four spatial regression models, i.e., Geographically Weighted Regression (GWR), Spatial Error Model (SEM), Spatial Lag Model (SLM), and Ordinary Least Square (OLS), were implemented to evaluate how socio-demographic factors are shaping the pattern of COVID-19 cases/deaths across Europe. Among these four regression models, the global interaction between the demographic factors and COVID-19 cases/deaths was analyzed using the OLS, SEM, and SLM models, as these models are not impacted by spatial autocorrelation or homogeneity in the feature space.
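As an illustration of the global spatial autocorrelation check mentioned above, the following is a minimal sketch of the global Moran's I statistic. It is not the GeoDa implementation used in the study: the helper `chain_weights` builds a toy binary adjacency matrix for units arranged in a line, which is an assumption standing in for the queen contiguity weights of real country polygons.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for a 1-D array of values and an n x n weight matrix.

    I = (n / S0) * (z' W z) / (z' z), where z are deviations from the mean
    and S0 is the sum of all weights.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    s0 = w.sum()
    return (x.size / s0) * (z @ w @ z) / (z @ z)


def chain_weights(n):
    """Hypothetical toy weights: unit i neighbours only units i-1 and i+1."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w


w = chain_weights(6)
clustered = morans_i([1, 2, 3, 4, 5, 6], w)       # similar values sit together
dispersed = morans_i([1, -1, 1, -1, 1, -1], w)    # neighbours always differ
```

Positive values of I indicate that similar values cluster in space, and negative values indicate that neighbours tend to differ; significance testing (for example by permutation) is omitted from this sketch.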
The local association between the control and response variables was calculated using the GWR model. GWR is a local spatial regression model that assumes that traditional 'global' regression models such as OLS, SEM, and SLM may not be effective enough to describe the spatial variation of interactions, especially when the spatial process varies with spatial context (Chen et al., 2018; Oshan et al., 2019, 2020; Mollalo et al., 2020). Unlike the OLS, SEM, and SLM models, the GWR model depends on the assumption of spatial non-stationarity and heterogeneity in feature space and quantifies locally varying parameter estimates (Fotheringham et al., 1996; Brunsdon et al., 2002; Fotheringham and Oshan, 2016). GWR calculates the location-specific interaction among the control and response variables after integrating the spatially referenced data layers (Brunsdon et al., 2002; Lugoi et al., 2019; Fotheringham and Oshan, 2016). The GWR model will be ineffective and may produce biased estimates if spatially autocorrelated regression residuals are statistically significant, or if one or more control variables exhibit unexpected spatial variation among the regression coefficients. The local parameter estimates are computed as:

$$\hat{\beta}(i) = \left(X^{T} W(i)\, X\right)^{-1} X^{T} W(i)\, y$$

where $\hat{\beta}(i)$ refers to the vector of the parameter estimates ($m \times 1$) at location $i$, $X$ denotes the matrix of the explanatory variables ($n \times m$), $W(i)$ is the local spatial weight matrix ($n \times n$), and $y$ is the vector of the response variable (Fotheringham and Oshan, 2016; Mollalo et al., 2020). Brunsdon et al. (1996) suggested that GWR can easily compute locally varying parameter estimates and is thus highly effective for producing detailed, spatially explicit maps of locational variations in relationships. Regarding kernel selection and the definition of the local weight matrix in the GWR model, an adaptive bi-square kernel (based on nearest neighbors), which is found to be more accurate than a fixed distance-based kernel parametrization, was adopted for GWR modelling.
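The locally weighted estimation with an adaptive bi-square kernel can be sketched as follows. This is an illustrative NumPy implementation under stated assumptions, not the software used in the study: the function names, the use of Euclidean distance between point coordinates, and the convention that the bandwidth equals the distance to the k-th nearest neighbour are all choices made here for the example.

```python
import numpy as np


def adaptive_bisquare_weights(coords, i, k):
    """Bi-square weights around location i.

    The bandwidth is the distance from location i to its k-th nearest
    neighbour, so it adapts to the local density of observations.
    """
    d = np.linalg.norm(coords - coords[i], axis=1)
    bandwidth = np.sort(d)[k]            # index 0 is location i itself
    w = (1.0 - (d / bandwidth) ** 2) ** 2
    w[d >= bandwidth] = 0.0              # no influence beyond the bandwidth
    return w


def gwr_fit(coords, X, y, k):
    """Return an (n x p) array of local coefficients, intercept first.

    Each row solves the weighted least-squares problem
    beta(i) = (X' W(i) X)^-1 X' W(i) y for one location.
    """
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    design = np.column_stack([np.ones(n), X])
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        w = adaptive_bisquare_weights(coords, i, k)
        xtw = design.T * w               # X' W(i) without building an n x n matrix
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas


# Synthetic example (made-up data): a spatially constant relationship
# y = 1 + 2x should be recovered at every location.
rng = np.random.default_rng(42)
coords = rng.uniform(0.0, 10.0, size=(30, 2))
x = rng.normal(size=(30, 1))
y = 1.0 + 2.0 * x[:, 0]
local_betas = gwr_fit(coords, x, y, k=10)
```

In practice the number of neighbours `k` would be tuned, for instance by comparing AIC values across candidate bandwidths; local R² values and diagnostics are not computed in this sketch.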
For optimum bandwidth selection using the nearest-neighbor information, the model's built-in golden-section search function was used. Additionally, the bandwidth and the number of nearest neighbours were chosen by verifying the AIC values. The other regression models (OLS, SEM, SLM) are global in nature, and therefore no local spatial weight is parameterized for these models. However, for defining the global spatial weight, the first-order queen contiguity approach was adopted. OLS is a global regression model that examines the (non)spatial relationships between the set of control and response variables with the fundamental assumption of homogeneity and spatial non-variability (Oshan et al., 2019; Mollalo et al., 2020; Ward and Gleditsch, 2018):

$$y_i = \beta_0 + x_i \beta + \varepsilon_i$$

where $y_i$ is the COVID-19 incidence parameter at location $i$, $\beta_0$ is the intercept, $x_i$ is the vector of selected demographic variables, $\beta$ is the vector of regression coefficients, and $\varepsilon_i$ is a random error. The fundamental function of OLS is to optimize the regression coefficients ($\beta$) by reducing the sum of squared prediction errors (Anselin and Arribas-Bel, 2013; Mollalo et al., 2020; Oshan et al., 2019). The usual OLS method assumes that the residual errors are homogeneous and uncorrelated; traditional OLS has therefore proven to be inefficient when the errors are heterogeneous and spatially correlated, which leads to bias in the estimated regression coefficients (Goodchild et al., 1993; Yang & Jin, 2010). The SLM is based on a "spatially-lagged dependent variable" and assumes a close association between the response and control variables. Additionally, the SLM also assumes dependency between the independent variables, which means that an independent variable could be influenced by another independent variable in the neighbouring region.
Therefore, a spatial lag function, which computes the influence of adjacent independent variables on another independent variable, can be used as a new independent variable in spatial regression modelling. The SLM incorporates spatial dependency between the parameters into the regression model (Anselin, 2003; Ward and Gleditsch, 2018; Mollalo et al., 2020):

$$y_i = \beta_0 + x_i \beta + \rho W_i y + \varepsilon_i$$

where $\rho$ is the spatial lag parameter, $W_i$ is a vector of spatial weights (a row of the spatial weights matrix), $y$ is the vector of the response variable, and the remaining terms are as defined for the OLS model. The weight matrix ($W$) of the SLM indicates the neighbors at location $i$ and connects one independent variable to the explanatory variables in feature space (Anselin and Arribas-Bel, 2013; Mollalo et al., 2020). The SEM assumes spatial dependence in the OLS residuals, which arises because OLS often ignores spatially dependent independent variables in the modelling (Mollalo et al., 2020). Therefore, the residuals of OLS are decomposed into two components: a spatial error term and a random error term (for satisfying the assumption in the modelling). [...] GeoDa and GeoDaSpace software. All the statistical analyses were performed in RStudio (an integrated development environment for the R programming language), Python, XLSTAT, and SPSS. Mapping and data visualization were done in ArcGIS Pro, Python, and RStudio. The bivariate local Moran's I and multivariate local Geary's C cluster and outlier analyses were performed using GeoDa.

The spatial distribution of COVID-19 cases and deaths is presented in Fig. 1 [...] Romania, Hungary, Slovenia, respectively (Fig. 1, Fig. 2). These heterogeneous distributions of COVID-19 cases and deaths can be linked with the socio-demographic pattern of each country. The spatially varying local R² and coefficient values for each explanatory variable were computed using the GWR model (Fig. 3, Fig. 4, Fig. 5, Fig. 6) [...] (Fig. 3).
In addition to the local R² approximation, the spatially varying coefficient values for the explanatory variables were also analyzed and are presented in Fig. 4. For the income factor, all the European countries exhibited a positive coefficient with different intensity, which was highest in Germany, Belgium, the Netherlands, Italy, Austria, Slovenia, and Switzerland. In contrast, comparably lower coefficient values were estimated for Spain, Portugal, Ireland, Norway, Sweden, and Finland (Fig. 4). On the other hand, a two-parameter GWR model, developed for evaluating the association between poverty and COVID-19 cases in the European countries, produced a negative coefficient in many cases (Fig. 4). This indicates that poverty and COVID-19 cases are negatively associated. For COVID-19 deaths, four-parameter regression models were developed using the local GWR model (Fig. 5). The highest association between the explanatory variables, i.e., poverty, income, and total population, and COVID-19 deaths was found for Italy (R² = 0.71), Croatia (R² = 0.68), Slovenia (R² = 0.67), Austria (R² = 0.65), Hungary (R² = 0.64), and Greece (R² = 0.64). Conversely, the minimum association between these variables and COVID-19 deaths was found in the United Kingdom (R² = 0.002), Ireland (R² = 0.07), the Netherlands (R² = 0.44), Cyprus (R² = 0.45), Lithuania (R² = 0.46), and Latvia (R² = 0.48) (Fig. 5). In addition to the regression estimates, the spatially varying and autocorrelated coefficient estimates of the three explanatory variables were also measured and are documented in Fig. 6. Among the three variables, income and total population exhibited a positive coefficient in the regression modelling. As expected, a negative association between poverty and COVID-19 deaths was found in the Northern European region, especially in Norway, Finland, Sweden, Estonia, Latvia, and Lithuania (Fig. 6).
The spatial distribution of four confirmatory components, i.e., local correlation coefficient, local condition number, local variance decomposition proportions, and local variance inflation factors, was computed using Python, and is presented in Fig. 7 [...] which lies within the acceptable threshold (Fig. 7). Based on these unbiased explanatory variables, a set of multi-parameter local GWR models was fitted. For COVID-19 cases, the three-parameter regression model explained 88% of the model variance, and for COVID-19 deaths, the four-parameter regression model explained 70% of the model variance (Table 1). Additionally, both models are statistically significant at the 0.05 significance level. The individual impact of the socio-demographic variables (2 for COVID-19 cases and 3 for COVID-19 deaths) on COVID-19 incidence was also evaluated using both global (OLS) and local (GWR) regression models (Table 2, Table 3, Table 4). Among the 2 variables considered for COVID-19 cases, the highest local R² was calculated for income (R² = 0.82), followed by poverty (R² = 0.74) (Table 2). However, the interaction effect of these variables was found to be higher (R² = 0.85) than the individual effects. For COVID-19 deaths, the total population factor exhibited the highest individual effect (R² = 0.68), followed by income (R² = 0.57) and poverty (R² = 0.55). All these values were found to be statistically significant at different probability levels. Considering the combined effect of these variables on COVID-19 deaths, the interaction effect of income and total population was found to be highest (R² = 0.68), followed by income/poverty (R² = 0.62) and poverty/total population (R² = 0.62). Except for income/total population, the interaction effects of the other variable pairs (income/poverty and poverty/total population) were found to be higher than their individual effects.
The global OLS model was also fitted to re-confirm the local modelling estimates, and the results are presented in Table 3 and Table [...]. Using the multi-parameter regression models, COVID-19 cases and deaths were predicted with the local GWR model (Fig. 8). For the cases, the income factor predicted COVID-19 cases with 82% accuracy. On the other hand, the prediction accuracy was lower (77%) for the poverty factor. Together, the poverty and income factors explained substantial model variance and predicted COVID-19 cases with 85% accuracy. For the deaths, the predictive power of all three explanatory variables was measured and found to be highest for total population (75%), followed by poverty (62%) and income (57%). The GWR-based prediction for both cases and deaths therefore suggests the superiority of spatial regression models in explaining the heterogeneous distribution of COVID-19 cases and deaths across Europe. Additionally, spatial regression modelling will be highly effective where spatial dependence among the observations is obvious and omnipresent. Fig. 9 shows the linear association between socio-demographic variables and COVID-19 for different parametric models. For the COVID-19 cases, the coefficient of determination (R²) was 0.74, while for the deaths, the linear model explained 50% of the model variance (Fig. 9). Fig. 10 shows the linear association between each socio-demographic variable and COVID-19 cases and deaths. Among the three socio-demographic variables, the income and total population factors exhibited a strong association with COVID-19 cases/deaths. In comparison, the poverty factor did not show any strong association with COVID-19 cases/deaths. The correlations among the socio-demographic variables and between these variables and COVID-19 cases and deaths are presented in Fig. 11.
Among the driving factors, a high correlation was observed for income; a comparably weak association was found for the total population factor. Moreover, a weak negative correlation was found between poverty and the COVID-19 factors (Fig. 11). In addition, the correlation between the case and death factors (all three explanatory variables considered for cases/deaths were merged to evaluate their collective influence on COVID-19 cases) and COVID-19 counts was found to be higher than their individual impact. The spatial dependency among the observations was tested using the global Moran's I statistic (Fig. 12). For both cases and deaths, a statistically significant spatial cluster was found. The local Moran's I produced a similar pattern, where 4 European countries exhibit a high-high spatial cluster, 3 countries show a low-low spatial cluster, and a high-low spatial cluster was observed for a single country (Fig. S1). The spatial clusters and outliers were measured for the explanatory variables using the local Geary's C multivariate cluster method, and are presented in Fig. S1. A total of 6 statistically significant spatial clusters were identified, among which 4 were significant at the P = 0.05 significance level and 2 at the P = 0.01 significance level. To evaluate the robustness and accuracy of the spatial regression models, a normality check was done for the standardized residuals of the GWR model; for all explanatory variables, statistically significant normality scores were measured, which together suggest that the model estimates are not biased. The overall summary of the four spatial regression models is reported in Table 6. Among the 2 socio-demographic variables chosen for the cases, the highest average R² was observed for income (R² = 0.71), followed by poverty (R² = 0.45).
For deaths, the highest R² value was calculated for income (R² = 0.51), followed by total population (R² = 0.49) and poverty (R² = 0.39). Considering the results of all four spatial regression models, the socio-demographic variables explained 88% of the model variance for the COVID-19 cases and 72% of the model variance for the COVID-19 deaths (Table 1).

The spatial distribution of COVID-19 deaths and confirmed cases was examined in order to understand how the socio-demographic structure of a country can regulate the overall casualties caused by the novel coronavirus. [...] and Northern European region (Norway, Finland, Sweden). However, the above pattern is somewhat different when the proportion or density of COVID-19 cases and deaths is taken into consideration. For instance, the case density (cases per 100,000 persons) was highest in Luxembourg, Belgium, Spain, and Ireland, and lowest in Bulgaria, Greece, and Slovakia. These figures suggest that the rate of COVID-19 infection, which gives more logical and reliable estimates than absolute counts, should be taken into account for unbiased estimation and effective interpretation of results. Additionally, the uneven distribution of both cases and deaths of COVID-19 in the European countries could be linked to the age of the population (old age and median age of the population). As can be seen in Fig. 2, the median age of the population in Italy and Spain is 46.5 and 43.9, respectively, and these two countries were badly affected by the COVID-19 pandemic in terms of the number of cases and deaths. As of 11th July 2020, a total of 34,938 and 28,403 deaths had been recorded in Italy and Spain, respectively [10]. The spatial association between the socio-demographic variables and COVID-19 cases and deaths was found to be highest in the central European regions (Germany, Switzerland, Italy, Austria). All these countries have been badly affected in terms of the total number of cases and deaths caused by COVID-19.
For the cases, a weak association between income and COVID-19 cases was evident in the western European countries (Portugal, Spain, Ireland). The same association was found to be considerably higher in the central European region. This suggests that income does not have a uniform and spatially stationary interaction with COVID-19 cases. Several factors could be responsible for this uneven distribution of spatial association, including the age structure of the population, the ratio of the elderly population, the ratio of the dependent population, pre-existing health records, human mobility, and the socio-economic structure of the society. The intercept values were calculated for the two response variables (cases and deaths), and followed the same pattern as observed for the regression estimates. The individual influences of all the socio-demographic variables on COVID-19 cases and deaths were analysed, and these variables exhibited statistically significant, spatially dependent model estimates. Using all the (non)spatial regression models, including GWR, OLS, SLM, SEM, PCR, PLS, and MLR, the individual and interaction effects of the demographic variables on COVID-19 cases and deaths were analyzed and reported. Among the 2 variables considered for COVID-19 cases, the income factor strongly regulated the COVID-19 cases across the European region. For COVID-19 deaths, the income and total population factors were strongly correlated with the deaths and explained substantial model variance. The positive association between income/total population and COVID-19 cases/deaths indicates that these two factors could be the key controlling variables that determine the overall casualties caused by this pandemic in the European countries. A similar close association between socio-demographic factors and COVID-19 was observed in Wuhan, China.

[10] https://www.ecdc.europa.eu/en/geographical-distribution-2019-ncov-cases
[...] found a close association between pre-existing illness of the patients, including Acute Respiratory Distress Syndrome (ARDS) and pneumonia, and COVID-19, and stated that patients with existing respiratory illness were more susceptible to COVID-19. The psychological status of people, especially elderly people, is closely linked with the diagnosis of COVID-19. Therefore, a combination of effective psychological interventions, including a lower level of psychological pressure and behavioural practices that boost mental health, can be used to improve the psychological status of vulnerable communities. Aging adults (>65) with long-term illness who are incapable of household work were found to be highly vulnerable to COVID-19 (Lakhani, 2020). The proportion of deaths due to COVID-19 in Italy, the Netherlands, Spain, and France was 50%, 58%, 59%, and 59%, respectively, for the population aged >80 (Medford & Trias-Llimos, 2020). A similar observation was documented in Likassa et al. (2020), where the spatial distribution of COVID-19 cases was highly associated with the case-fatality rate, and the linkages between these two variables were much stronger and reached up to 8.0% for patients in the age group of 70 to 79 years and 14.8% for patients aged >80 years. Likassa et al. (2020) also stated that the high infection and death rates in China, Italy, Iran, and the USA could be linked with the spread of previous virus outbreaks. Lippi et al. (2020) stated that three main determinants, namely male sex, age >60, and pre-existing comorbidities such as diabetes, hypertension, chronic respiratory diseases, cancer, and cardiovascular disorders, strongly determine the rate of COVID-19 death and infection.
These statistics signify the inherent connections between the socio-demographic composition and the overall COVID-19 deaths and cases reported so far in the European region (Jia et al., 2020; Dowd et al., 2020; Mollalo et al., 2020; Borjas, 2020; Almagro & Orane-Hutchinson, 2020). Apart from the demographic factors, several climatic factors, including average temperature, minimum temperature, maximum temperature, rainfall, average humidity, wind speed, and air quality, have also regulated the spread and casualties of COVID-19 (Bashir et al., 2020; Ma et al., 2020). The availability of sufficient SARS-CoV-2 testing centers is also important for adopting control strategies and making decisions that minimize the impact of COVID-19 on the overall socio-ecological system (Rader et al., 2020). In addition, Kraemer et al. (2020) found that human mobility was the key factor that aggravated the spread of COVID-19 cases in China, as growth rates became stable or negative in areas where strong control measures were implemented and mandatorily imposed. However, mobility in other regions where stringent regulations were not implemented still poses a severe threat by transmitting the infection to the closest neighbours. Therefore, it has been suggested that controlling (inter)national migration, restricting population flows, modernizing the healthcare system by improving diagnosis and treatment capacity, and upgrading the public welfare system to make it fully functional in a crisis could be priorities for fighting COVID-19-like situations effectively (Su et al., 2020).
Cities in the developing and developed world, including the UNESCO-defined creative cities, which were designed in 2004 to give maximum priority to creativity and sustainable urban development by achieving efficiency in all aspects, have been badly affected by the outbreak of the COVID-19 pandemic. Though the long-term impact of COVID-19 on the urban and city environment is challenging to predict, historical evidence suggests that the long-standing, inelastic, and outdated strategic plans of cities have always been reshaped by strong interventions, such as outbreaks of deadly viral diseases and natural calamities such as floods or earthquakes 11. Therefore, it is highly expected that in the COVID-19 recovery period, the concerned stakeholders, including government bodies, decision-makers, business leaders, city planners, land administrators, etc., will be forced to re-think the viability of existing plans and find more comprehensive solutions that accelerate sustainability in city planning. There are a few specific areas where complete structural reforms can be made, such as promoting a healthier building environment, characterized by clean indoor air for both office and home environments. Healthier indoor environments can enhance a person's cognitive performance, which eventually boosts the person's logical and emotional intelligence. Therefore, more focus on nature-based landscape design, such as open spaces and greenery for meditation and exercise, green materials in living areas, proper ventilation systems, and the use of energy-efficient building materials, will be the need of the day. In this study, the spatial association between the socio-demographic variables and
-a framework for localised exploratory data analysis.
Computers, Environment and Urban Systems
Assessing the dynamic relationship among land use pattern and land surface temperature: A spatial regression approach
Does industrial land price lead to industrial diffusion in China? An empirical study from a spatial perspective
Spatial Regression Models for Demographic Analysis
Global Economic Effects of COVID-19
Modeling the Spatial Distribution of Mosquito Vectors for West Nile Virus in Connecticut, USA. Vector-Borne and Zoonotic Diseases
Demographic science aids in understanding the spread and fatality rates of COVID-19
Psychological interventions for people affected by the COVID-19 epidemic
Estimating the impact of urbanization on air quality in China using spatial regression models
The geography of parameter space: an investigation of spatial non-stationarity
China: A Hospital-Based Case-Cohort Study
Influences of urban spatial form on urban heat island effects at the community level in China
Environmental modeling with GIS
The species and its viruses - a statement of the Coronavirus Study Group
Evaluation of spatially heterogeneous driving forces of the urban heat environment based on a regression tree model
Urban heat island intensity and its mitigation strategies in the fast-growing urban area
Population flow drives spatio-temporal distribution of COVID-19 in China
The Spatial Distribution of Hepatitis C Virus Infections and Associated Determinants - An Application of a Geographically Weighted Poisson Regression for Evidence-Based Screening Interventions in Hotspots
The effect of human mobility and control measures on the COVID-19 epidemic in China
The Contribution of Age Structure to the Number of Deaths from Covid-19 in the UK by Geographical Units
Which Melbourne metropolitan areas are vulnerable to COVID-19 based on age, disability and access to health services? Using spatial analysis to identify service gaps and inform delivery
Improving epidemiologic data analyses through multivariate regression modelling
Determinants of the geographic distribution of Puumala virus and Lyme borreliosis infections in Belgium
Clinical and demographic characteristics of patients dying from COVID-19 in Italy versus China
Spatial pattern of leisure activities among residents in Beijing, China: Exploring the impacts of urban environment
Dynamic spatial spillover effect of urbanization on environmental pollution in China considering the inertia characteristics of environmental pollution
Ecosystem productivity response to environmental forcing, prospect for improved rain-fed cropping productivity in lake Kyoga Basin
Effects of temperature variation and humidity on the death of COVID-19 in Wuhan
A quantitative analysis of the spatial and temporal evolution patterns of the bluetongue virus outbreak in the island of Lesvos
Population age structure only partially explains the large number of COVID-19 deaths at the oldest ages
Invited commentary: insulin resistance syndrome? Syndrome X? Multiple metabolic syndrome? A syndrome at all? Factor analysis reveals patterns in the fabric of correlated metabolic risk factors
A brief conceptual tutorial of multilevel analysis in social epidemiology: using measures Epidemiology & Community Health
The influence of social and economic ties to the spread of COVID-19 in Europe
GIS-based spatial modeling of COVID-19 incidence rate in the continental United States
MGWR: A python implementation of multiscale geographically weighted regression for investigating process spatial heterogeneity and scale
Increased travel times to United States SARS-CoV-2 testing sites: a spatial modeling study
Examining effects of climate change and land use dynamic on biophysical and economic values of ecosystem services of a natural reserve region
Responses of ecosystem services to natural and anthropogenic forcings: A spatial regression based assessment in the world's largest mangrove ecosystem
Influence of socio-ecological factors on COVID-19 risk: a cross-sectional study based on 178 countries/regions worldwide
Estimating local-scale urban heat island intensity using nighttime light satellite imageries. Sustainable Cities and Society, 102125.
United Nations
Assessment of dietary patterns in nutritional epidemiology: principal component analysis compared with confirmatory factor analysis
Immediate psychological responses and associated factors during the initial stage of the 2019 coronavirus disease (COVID-19) epidemic among the general population in China
Spatial regression models
World Health Organization
Risk Factors Associated with Acute Respiratory Distress Syndrome and Death in Patients with Coronavirus Disease
Identifying the influencing factors controlling the spatial variation of heavy metals in suburban soil using spatial regression models
GIS-based spatial regression and prediction of water quality in river networks: A case study in Iowa
A geographically weighted regression model augmented by Geodetector analysis and principal component analysis for the spatial distribution of
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study

Overall summary of spatial regression models that indicates the linkages between the demographic variables and total COVID-19 cases and deaths across Europe