key: cord-0973573-q25ckcyb authors: Kianfar, Nima; Mesgari, Mohammad Saadi; Mollalo, Abolfazl; Kaveh, Mehrdad title: Spatio-temporal modeling of COVID-19 prevalence and mortality using artificial neural network algorithms date: 2021-11-11 journal: Spat Spatiotemporal Epidemiol DOI: 10.1016/j.sste.2021.100471 sha: 8dac1e2ab8dd8e4e14f4c5229728fc8e254f5cb0 doc_id: 973573 cord_uid: q25ckcyb
The outbreak of coronavirus disease (COVID-19) has become one of the most challenging global concerns in recent years. Due to inadequate worldwide studies on spatio-temporal modeling of COVID-19, this research aims to examine the relative significance of potential explanatory variables (n=75) concerning COVID-19 prevalence and mortality using multilayer perceptron artificial neural network topology. We utilized ten variable importance analysis methods to identify the relative importance of the explanatory variables. The main findings indicated that several variables were persistently among the most influential variables in all periods. Regarding COVID-19 prevalence, unemployment and population density were among the most influential variables with the highest importance scores. While for COVID-19 mortality, health-related variables such as diabetes prevalence and number of hospital beds were among the most significant variables. The obtained findings from this study might provide general insights for public health policymakers to monitor the spread of disease and support decision-making.
On January 29, 2020, the World Health Organization (WHO) declared the coronavirus disease (COVID-19) an epidemic, and shortly after, on March 11, 2020, announced it a pandemic [1]. As of October 1, 2021, almost 234 million cases and more than 4.7 million associated deaths related to the disease have been reported globally [2]. The outbreak of this acute respiratory infection has adversely impacted individuals and societies [3].
Although initial cases of COVID-19 were found in China, the transmission pattern of the virus has changed many times, causing irreparable damage worldwide [4]. Understanding the interactions between the determinant variables and health outcomes remains challenging. In recent decades, artificial neural networks (ANNs) have been widely utilized to model the relationship between explanatory factors and infectious diseases [5, 6]. The primary aim of ANNs is to predict the future status or unknown values of a particular dependent variable from a given set of independent variables. However, within ANNs, quantifying the contribution of each input variable in predicting the health outcome is difficult [7]. Previous studies have utilized various ANN topologies to quantify the contribution of explanatory variables to dependent outcomes. Duh et al. [8] proposed multilayer neural networks for evaluating the input weights of ANNs. They validated this technique on three datasets and found that ANNs are effective in epidemiologic problems that require complicated classification techniques. Olden and Jackson [9] examined the neural interpretation diagram, Garson's algorithm, and sensitivity analysis to understand neural network relation weights. They showed that by extending randomization methods to ANNs, the black box mechanics of ANNs could be illuminated. Olden et al. [10] proposed the connection weights approach and argued that this approach is the least biased method that can accurately quantify the variable importance. Ibrahim [11] provided a modification to the connection weights algorithm and the most squares method in multilayer perceptron (MLP) neural networks. They used crop production as a case study and compared this model with the connection weights algorithm, dominance analysis, Garson's algorithm, partial derivatives, and multiple linear regressions. The proposed algorithms' output was evaluated using empirical evidence.
Their findings indicated that the most squares method outperformed other methods, which was consistent with the results of multiple linear regressions in terms of partial R 2 [12] . Because of the complexity of interactions between variables, particularly in large datasets, variable importance analysis (VIA) has gained attention in many practical applications [13] . VIA is a critical task in classification or regression problems to improve model interpretability, computational costs, data storage, and ultimately provide a sparse model without sacrificing prediction capacity [14] . Dealing with various balance scenarios, Dfuf et al. [15] introduced the nonparametric ℎ − 2 variable importance technique, which uses a multivariate continuous response system to select and rank the most influential variables. The method measures the dissimilarities between the distribution of errors caused by the base learner before and after permuting the variable. Casiraghi et al. [16] used a prediction model, "an explainable machine learning decision system based on additive trees", which processed clinical, radiological, and laboratory data of COVID-19 patients to predict the risk of severe outcomes. They combined Boruta and random forest in a 10-fold cross-validation scheme to produce variable importance estimates not affected by the presence of surrogates. Pasha et al. [17] employed multiple linear regression and a nonlinear regression based on 43 socio-economic and meteorological variables of 31 counties in California, United States. They found that the total population, household income, occupation, and transportation are more influential on COVID-19 spread than other variables. Shaffiee Haghshenas et al. [18] applied ANNs based on particle swarm optimization and differential evolution algorithms to prioritize climatic and urban factors. They found that population density and humidity were the most influential variables to predict the confirmed COVID-19 cases. 
In addition to the machine learning algorithms, the geographic information system (GIS) is a robust tool for analysis and visualizing many public health problems [19, 20] . Recent GIS-based research has shown that several factors such as air quality [21] , population flow [22, 23] , and population density [24, 25] could contribute to the higher rates of COVID-19 morbidity and mortality. In the Caribbean, Moonsammy et al. [26] applied spatial lag and linear regression models to identify spatial clusters of COVID-19 and the most influential socio-economic variables. They suggested that COVID-19 cases and deaths in the Caribbean have a spatial connection with mainland countries. They also concluded that population transmission could contribute to higher COVID-19 spread. The consequences of the COVID-19 outbreak on the environment have also been investigated in some studies. For instance, Ambade et al. [27] examined the levels of three air pollutants, namely particulate matter (PM2.5), Black Carbon (BC), and Polycyclic Aromatic Hydrocarbons (PAHs), in Jamshedpur city, India. Their results indicated that the concentrations of the contaminants were reduced during the lockdown compared to unlock down circumstances and regular days. Guatam [28] showed that India experienced a large decrease in aerosol concentration during the lockdown, which led to fewer deaths during the outbreak. Guatam [29] also suggested that lockdowns could help Asian and European countries experience lower levels of NO2. On the other hand, in China, Wang et al. [30] demonstrated that quarantine actions would not be sufficient to prevent severe air pollution despite reductions in transportation and industrial emissions. COVID-19 transmission is not limited to national borders and geographical territories. 
The primary focus of many studies that utilized machine learning methods such as ANNs was limited to a specific geographic location and applied pure spatial analysis with few sets of parameters while disregarding the impact of various potential variables over time. Therefore, to bridge the gap, this study investigates the influence of a broad range of explanatory variables (n=75) on disease prevalence and mortality using VIA methods based on ANNs, across the globe. This research optimized ANNs structure using a weighted information criterion (WIC) index to improve modeling accuracy. Moreover, as COVID-19 has shown various behaviors and mutated several times, different indicators were used to estimate mortality and morbidity rates over time. For this purpose, nine targets have been used to study the neural network's learning process with distinct desires. The daily COVID-19 data were obtained from WHO [5] from the beginning of March 2020 to the end of February 2021. The data contained new confirmed COVID-19 cases and newly confirmed deaths for all countries. Moreover, nine different indicators were used to study the learning process of further modeling. The formula for each indicator can be found in Table 1 (for prevalence) and Table 2 (for mortality). We divided the COVID-19 data into four equal time intervals (3-month periods): early March 2020 to the end of May 2020 (Period 1), early June 2020 to the end of August 2020 (Period 2), early September 2020 to the end of November 2020 (Period 3), and early December 2020 to the end of February 2021 (Period 4). In addition to COVID-19 data, a set of 75 variables, including demographic, environmental, social, economic, cultural, health, and public transportation variables was compiled at the country level as explanatory variables. The category, name, and source of the variables are presented in Table 3 . 
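The four-period split described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the names `PERIODS`, `assign_period`, and `totals_by_period` are hypothetical, the counts are made up, and reading "early March 2020" as 1 March 2020 is an assumption.

```python
from datetime import date

# Inclusive period boundaries. The text says "early March 2020"; taking that
# as 1 March 2020 is an assumption made for this sketch.
PERIODS = {
    1: (date(2020, 3, 1), date(2020, 5, 31)),
    2: (date(2020, 6, 1), date(2020, 8, 31)),
    3: (date(2020, 9, 1), date(2020, 11, 30)),
    4: (date(2020, 12, 1), date(2021, 2, 28)),
}


def assign_period(day):
    """Return the period number (1-4) containing `day`, or None if outside."""
    for period, (start, end) in PERIODS.items():
        if start <= day <= end:
            return period
    return None


def totals_by_period(daily_counts):
    """Sum daily counts (a mapping of date -> count) within each period."""
    totals = {period: 0 for period in PERIODS}
    for day, count in daily_counts.items():
        period = assign_period(day)
        if period is not None:
            totals[period] += count
    return totals


if __name__ == "__main__":
    # Made-up daily new-case counts for one hypothetical country.
    example = {
        date(2020, 3, 15): 10,
        date(2020, 5, 31): 5,
        date(2020, 6, 1): 7,
        date(2021, 2, 28): 3,
        date(2021, 3, 1): 99,  # outside the study window, so it is ignored
    }
    print(totals_by_period(example))
```

Per-period totals of this kind would then feed whichever indicator from Tables 1 and 2 is being computed; the indicator formulas themselves are not reproduced here.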
The existence of many correlated explanatory variables (n=75) may cause multicollinearity, which can in turn reduce the generalizability of the models due to overfitting. To reduce multicollinearity, the variance inflation factor (VIF) was used [35]. Using VIF together with Pearson's correlation analysis, 18 correlated variables were removed, and the least correlated ones were retained as the inputs of the subsequent models.
ANNs are computational systems consisting of a large number of connected nodes called neurons [36]. ANNs can identify the relationships among dependent and independent variables, which helps in understanding system function [37]. Neurons in these networks are structured in different layers, including an input layer, an output layer, and hidden layer(s). The neurons in the input layer are fully connected to those in the hidden layer. Likewise, each neuron in the hidden layer is connected to the neurons in the output layer [6]. Figure 1 shows the topology of a neural network with a single hidden layer, a non-linear sigmoid transfer function in the hidden layer, and a linear function in the output layer. Theoretically, any function with a finite number of discontinuities can be approximated by such a network (Fig. 1) [38]. Therefore, in this study, perceptron neural networks with a single hidden layer and the mentioned characteristics were employed.
The ultimate purpose of this research is to assess the relative importance of various variables in modeling COVID-19 prevalence and mortality over time. For this purpose, we first optimized the ANN structure with respect to the hyperparameters, the number of neurons in the hidden layer, and the learning parameters [39]. We used the Bayesian regularization method to train the network while addressing the overfitting problem and complex interactions between variables [40].
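The multicollinearity screening mentioned earlier in this section can be illustrated with a small NumPy sketch. It computes each variable's VIF as 1 / (1 - R²), where R² comes from regressing that variable on all the others, and then repeatedly drops the variable with the largest VIF. The function names `vif` and `drop_collinear`, the cutoff of 10 (a common rule of thumb), and the drop-one-at-a-time loop are assumptions of this sketch; the paper does not state its cutoff or exactly how VIF and Pearson's correlation were combined.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (rows = observations).

    VIF_j = 1 / (1 - R_j^2) = SS_tot / SS_res, where column j is regressed
    on all other columns plus an intercept. Columns are assumed non-constant.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    scores = np.empty(p)
    for j in range(p):
        y = X[:, j]
        design = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        scores[j] = np.inf if ss_res == 0.0 else ss_tot / ss_res
    return scores


def drop_collinear(X, names, threshold=10.0):
    """Drop the highest-VIF column until every VIF is <= threshold.

    The threshold and the iterative strategy are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        scores = vif(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names


if __name__ == "__main__":
    # Synthetic data: c is almost exactly a + b, so the three are collinear.
    rng = np.random.default_rng(0)
    a = rng.normal(size=200)
    b = rng.normal(size=200)
    c = a + b + 1e-3 * rng.normal(size=200)
    _, kept = drop_collinear(np.column_stack([a, b, c]), ["a", "b", "c"])
    print(kept)
```

Dropping one variable at a time and recomputing is used here because removing a single collinear variable can lower the VIF of all the others.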
Then we determined the optimum number of neurons in the hidden layer using the WIC index [41]. Based on this method, the number of neurons in the hidden layer was systematically increased from one to the number of variables, and the WIC index value of each model was calculated. A lower WIC index indicates a more efficient model [41]. Fig. 2 shows the WIC index model selection process. Because COVID-19 has shown various behaviors and mutated several times, different targets were used as the desired value (system output) to estimate mortality and morbidity rates. For this purpose, nine different targets were used to study the neural network's learning process under different desired outputs. The accuracy for each of these targets was evaluated by ANNs. The target with the highest accuracy was considered the most suitable for determining the importance of variables and thus was selected as the optimum target for modeling. As the indicators are not on the same scale, the resulting models were compared with each other using the root mean square error normalized by the interquartile range (RMSEIQR) [42]. Compared to the RMSE, which is a scale-dependent index and partly sensitive to outliers and extreme values, RMSEIQR can be used as a practical index for comparing models over various concentration scales [42]. Moreover, RMSEIQR was used as a common tool to assess and measure the uncertainty of the results [43]. After variable selection, we assessed the relative importance of the selected variables in modeling COVID-19 prevalence and mortality for each period. The following steps explain the process of determining the relative importance of variables in each period (Fig. 3): Step 1: Different target values from COVID-19 data were generated as described in Section 2. Step 2: The WIC index was used to determine the optimum network architecture for modeling each type of target (model selection).
Nine of them were chosen from the n * m models (n: number of explanatory variables; m: number of targets) in total. Step 3: Models were developed based on the optimum networks, and their RMSEIQR values were computed. Step 4: Two separate models (prevalence and mortality) with the lowest RMSEIQR values for each period were selected. Step 5: The variables were ranked based on relative importance using VIA methods. Ten different methods were used to perform VIA through the MLP artificial neural network. These ten VIA methods are described in the next section.
The relative importance of input variables refers to each variable's contribution to predicting the dependent variable [11]. Ten VIA methods were used to derive the relative importance of variables from these qualified networks: the connection weights (CW) algorithm, modified connection weights (MCW), most squares, Garson, partial derivatives (PD), stepwise, perturb, Lek's profile, modified Lek's profile, and variance-based approaches. The findings of these approaches can be integrated to draw a general inference. For this purpose, the total of the relative weights obtained from the various methods (in percent) was calculated for each variable. This was performed individually for each period, for both infected cases and associated deaths. Below, we briefly explain the VIA techniques used in this study to quantify the relative importance of the selected variables used in ANNs.
The main benefit of the CW algorithm is that the relative contribution of each connection weight is preserved for both magnitude and sign [10, 11]. The relative importance of a given input variable can be defined as Eq. (1):
$$RI_i = \sum_{h=1}^{n_h} w_{ih} w_{ho} \quad (1)$$
where $RI_i$ is the relative importance of input neuron $i$, $n_h$ is the total number of neurons in the hidden layer, $o$ is the output neuron, $w_{ih}$ is the final weight of the connection from input neuron $i$ to hidden neuron $h$, and $w_{ho}$ is the final weight of the connection from hidden neuron $h$ to the output neuron. This method uses the final network weights obtained through network training. These final weights differ depending on the initial weights used at the beginning of the training phase [10].
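The connection weights calculation described above amounts to one sum of products per input. The sketch below shows it for a network with one hidden layer and a single output. The function names and the toy weights are hypothetical. The `to_percent` helper, which normalizes by absolute value, is an assumption added so that scores from different methods can share a percentage scale; it is not taken from the paper.

```python
def connection_weights(w_input_hidden, w_hidden_output):
    """Signed CW importance of each input.

    w_input_hidden[i][h] is the final weight from input i to hidden neuron h;
    w_hidden_output[h] is the final weight from hidden neuron h to the single
    output neuron. For each input, the products of the two weights are summed
    over all hidden neurons.
    """
    return [
        sum(w_ih * w_ho for w_ih, w_ho in zip(row, w_hidden_output))
        for row in w_input_hidden
    ]


def to_percent(importances):
    """Share of each input in the total absolute importance, in percent.

    Normalizing by absolute value is an assumption of this sketch.
    """
    total = sum(abs(value) for value in importances)
    if total == 0:
        return [0.0 for _ in importances]
    return [100.0 * abs(value) / total for value in importances]


if __name__ == "__main__":
    # Toy weights for 3 inputs and 2 hidden neurons (illustrative only).
    names = ["unemployment", "population_density", "surface_temperature"]
    w_ih = [[0.5, -1.0], [2.0, 0.0], [0.25, 0.25]]
    w_ho = [1.0, 2.0]
    scores = connection_weights(w_ih, w_ho)
    ranked = sorted(zip(names, scores), key=lambda item: abs(item[1]), reverse=True)
    for name, score in ranked:
        print(name, score)
```

Keeping the sign in `connection_weights` preserves the direction of each input's effect, which is the property the text highlights for this algorithm. Ranking by absolute value is one simple way to order the inputs.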
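Using the same notation as the CW algorithm, with $w_{ih}$ the final weight of the connection from input neuron $i$ to hidden neuron $h$ and $w_{ho}$ the final weight from hidden neuron $h$ to the output neuron $o$, the MCW algorithm first calculates the sum of the products of the final connection weights for each input. This sum is multiplied by a correction term (partial correlation), and the absolute value is taken. This absolute value is called the corrected sum. The corrected sum of each input is then divided by the total corrected sum to determine the relative importance of each input in the MCW algorithm, which is calculated as Eq. (2) and Eq. (3) [11]:
$$RI_i = \frac{\left| \left( \sum_{h=1}^{n_h} w_{ih} w_{ho} \right) r_{io \cdot k} \right|}{\sum_{i'=1}^{n_i} \left| \left( \sum_{h=1}^{n_h} w_{i'h} w_{ho} \right) r_{i'o \cdot k} \right|} \quad (2)$$
$$r_{io \cdot k} = \frac{r_{io} - r_{ik} r_{ok}}{\sqrt{(1 - r_{ik}^2)(1 - r_{ok}^2)}} \quad (3)$$
where $r_{io \cdot k}$ is the partial correlation of input $i$ with output $o$ after controlling for input $k$, which assesses the degree of association between two random variables, $n_i$ is the number of inputs, and $n_h$ is the number of hidden neurons. Moreover, $r_{io}$ denotes the simple correlation between input $i$ and output $o$.
-Most squares
Using the same notation as the CW algorithm, the most squares approach computes the sum of the squared differences between the initial weights ($w_{ih}^{0}$) and the final weights ($w_{ih}^{f}$) for each input. The sum of squared differences for each input is then divided by the total sum over all inputs. Eq. (4) is used to calculate the relative importance of each input [11]:
$$RI_i = \frac{\sum_{h=1}^{n_h} \left( w_{ih}^{0} - w_{ih}^{f} \right)^2}{\sum_{i'=1}^{n_i} \sum_{h=1}^{n_h} \left( w_{i'h}^{0} - w_{i'h}^{f} \right)^2} \quad (4)$$
In the partial derivatives (PD) method, a negative PD indicates that the output variable decreases as the input variable increases [11]:
$$d_{ji} = S_j \sum_{h=1}^{n_h} w_{ho} I_{hj} \left( 1 - I_{hj} \right) w_{ih} \quad (6)$$
$$SSD_i = \sum_{j=1}^{N} d_{ji}^2 \quad (7)$$
In Eq. (6), $d_{ji}$ is the partial derivative of the output with respect to input $i$ for observation $j$, and $N$ denotes the total number of observations in a network with $n_i$ inputs, one hidden layer with $n_h$ neurons, and one output neuron. $S_j$ is the derivative of the output neuron with respect to its input. $I_{hj}$ is the $h$th hidden neuron's output, and $w_{ho}$ and $w_{ih}$ are the connection weights between the output neuron and the $h$th hidden neuron, and between the $i$th input neuron and the $h$th hidden neuron, respectively. In Eq. (7), $SSD_i$ is the sum of the squared partial derivatives.
The stepwise method involves adding or removing one input variable step by step while considering the effect on the output result. Depending on various arguments, the input variables are ranked according to their significance based on the changes in mean squared error (MSE).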
The largest increases or decreases in MSE due to input deletions are used to rank inputs in order of importance [45].
The perturb method aims to measure how minor changes in each input affect the neural network output. The algorithm modifies one variable's input values while leaving the others unchanged. The output variable's responses to each change in the input variable are recorded. The input variable whose perturbation produces the largest changes in the output is the one with the greatest relative effect. The input variables are ranked according to the impact of the small changes [46].
In Lek's profile method, each input variable is studied while the others are held at fixed values. The basic idea behind this method is to create a fictitious matrix that encompasses the entire range of input variables. Each variable is divided into a set of equal intervals between its minimum and maximum values. Except for one, all variables are set to their minimum, first quartile, median, third quartile, and maximum values at the beginning. The median value is subtracted from these five numbers. The output variable's profile is plotted for the considered values [46].
In the variance-based approach (Eq. 10), the quantities involved are the mean of the values, the corrected sum of squares, and the total number of updates.
Based on the lowest obtained values for RMSEIQR, we selected the prevalence rate in interquartile range (PR-IQR) as the target for modeling the prevalence rates of COVID-19 in each studied period (Table 4). The spatio-temporal variations of prevalence rates in IQR for each period are depicted in Fig. 4. According to Fig. 4, the countries in North and South America had persistently higher prevalence rates in IQR than the rest of the world in all periods. In period 2, the countries in continental Europe and America showed a relatively increasing trend in COVID-19 prevalence compared to period 1, as the prevalence rates in IQR increased in these areas.
The period 3 was the peak of the disease prevalence compared to other periods. During this period, Europe and most countries in north Asia were significantly infected by COVID-19. In period 4, the prevalence rates slightly decreased compared to period 3. This reduction in changes is more visible in America, maybe due to earlier initiation of vaccination programs. However, the countries of Central and South Africa have had no remarkable differences in prevalence rates (in all periods), except for the southernmost ones, including South Africa and Namibia, which have had the highest prevalence rates in IQR over time (Fig. 4) . Regarding COVID-19 deaths, we selected mortality rate (MR) as the target indicator in all periods due to the lowest values of RMSEIQRs (Table 4 ). The spatio-temporal distribution of the MRs is demonstrated in Fig. 5 . According to Fig. 5 , the changes in MR trends is more visible in America and Europe continents. In period 1, the distribution of MR was almost uniform across the world. Moreover, in the first period, most countries experienced lower MR rates compared to the following periods. In the period 2, South American countries including Brazil, Argentina, Bolivia, Peru, and Colombia experienced higher MRs than other countries. The period 3 shows a relatively significant increase in COVID-19 MRs in continental Europe and North America. Although the highest prevalence rates in IQR were found in period 3 (Fig. 4) , period 4 was found to be the peak of mortality rates, especially in the United States, Brazil, South Africa, and some European countries (Fig. 5) . Based on the WIC index, the optimum network architecture for modeling each type of target was identified. Nine models were chosen from a total of n * m (n: number of explanatory variables; m: number of targets) models (step 2). Further, two models with the lowest RMSEIQR were selected for each period, one model for prevalence and the other for mortality (step 4). 
Table 4 lists the models that were selected in steps 2 and 4. The ANN topologies that were selected to perform VIA are represented as bold rows in Table 4. Figs. 6 to 9 depict the twenty most influential explanatory variables on COVID-19 prevalence and mortality for all selected periods, respectively. As can be seen, some of the explanatory variables were among the twenty most important variables across all periods (non-black horizontal bars). Most economic-related variables, such as unemployment, gross national income (GNI) per capita, and GNI per capita growth, have always been among the most influential explanatory variables on COVID-19 prevalence. In addition, other variables related to public transportation, including rail and air transportation, as well as surface temperature, population density, and urban population, were among the most significant variables for cases in all periods. For mortality, diabetes prevalence, the number of hospital beds (per 1,000 people), the number of nurses and midwives (per 1,000 people), negative affect (negative emotions and experiences during life), and air transportation were the most influential explanatory variables for all periods. In addition, Table 5 lists the two most influential variables for each period based on the median of weights.
The outbreak of COVID-19 has adversely affected many countries around the world. Numerous mutations of the SARS-CoV-2 virus have intensified its spread, making the control of the epidemic even more challenging. Identifying the effective variables and their relationship with disease prevalence and mortality over time can be useful for controlling the disease outbreak. ANNs are among the most widely used approaches to model this relationship, particularly as the associated data and computations become more readily available [49].
Because the spread of COVID-19, as a contagious disease, is directly related to the geography of an area, GIS can play an essential role in its planning, management, and modeling [5]. GIS has been used in many studies to manage and plan epidemiological issues from spatial perspectives [50, 51]. It has also been consistently used to analyze health-related data and can be a valuable tool for analyzing the spread of disease in each region [50]. Increasing computing power, improved spatial analysis methods, and advances in artificial intelligence models have led to the development of advanced and modern GIS applications in disease modeling and prediction [52]. Therefore, in this study, we utilized GIS technology to develop a spatio-temporal model for COVID-19 prevalence and mortality. Given that little space-time COVID-19 modeling has been conducted at the global scale, we compiled a geodatabase of potential influential variables on the prevalence and mortality of the disease and ranked them.
Dealing with complicated interactions among variables, we applied ten different VIA methods to evaluate the influence of potential explanatory variables by optimizing the data storage, advancing the model interpretability, and providing a smaller number of influential variables without losing accuracy. VIA techniques can be implemented to untangle the intricate interactions among variables in big datasets [13]. For instance, these techniques were used to determine how strongly each variable influences COVID-19 prevalence. Dfuf et al. [15] implemented a parametric and a nonparametric VIA method and calculated the impact of 35 companies on the political, economic, and social instability captured by two highly regarded Spanish economic newspapers during the COVID-19 outbreak. The results showed that the nonparametric VIA method outperformed its competitors since it incorporates all the information by using the entire distribution of errors.
Economic variables have retained their significant impact on higher rates of COVID-19 prevalence over time. Consistent with our findings, unemployment was found to be strongly correlated with an increased risk of disease prevalence [53]. Since unemployment and poverty reduce people's ability to access health facilities, unemployed people who are infected may interact with others in society without being treated, which may intensify disease transmission. Another hypothesis that can explain this association is that unemployed and uneducated individuals are less likely to get vaccinated because they underestimate the benefits or overestimate the risks of vaccination, which can cause a higher prevalence of COVID-19 in a society [54, 55]. Some other studies, such as [53], have shown that unemployment and inadequate social welfare can increase the disease spread. Demographic variables were other influential variables affecting the COVID-19 spread. Due to the contagious nature of COVID-19, higher population density and overcrowding in an area are associated with a greater likelihood of disease occurrence [56, 57]. Conversely, countries with a lower population density, such as Australia and Russia, showed lower prevalence rates of COVID-19 in all periods. Consistent with our findings, a recent study [4] shows that higher population density in Oman could result in a higher prevalence of COVID-19. A study by Ahmadi et al. [24] suggests that population density and intra-provincial movement are directly associated with the spread of the coronavirus in Iran. Other studies confirm that higher population density increases the chance of transmission of the virus [58, 59] and can alter the prevalence and mortality rates [60]. The use of public transit was persistently found to have a significant influence on COVID-19 prevalence in all periods.
A possible explanation might be that many people in public transportation stand together for a long time in a closed environment especially transportation by plane and train. As a result, the contagious virus can rapidly be transmitted from infected individuals to other passengers, causing the disease to spread more severely. Zheng et al. [61] showed that the infected individuals during the incubation period brought the disease from Wuhan, China to other cities and nations by using public transportation such as flights, trains and buses. In New York, Cordes and Castro [62] suggested that people who rely on public means of transportation might be at higher risks of COVID-19 due to contact with other infected passengers, consistent with our findings. Regarding COVID-19 mortality, diabetes prevalence was found to be a significant variable in all periods. Inadequate and poor immunological responses to viral infections may be among the leading cause of mortality in COVID-19 patients with diabetes [63] . The increased blood sugar level in a person with diabetes can severely damage the beneficial intracellular bacteria, which in turn increases the viral binding affinity and reduces the virus removal [64, 65] . Exploring the spatial variations of COVID-19 in the Caribbean, Moonsammy et al. [26] found that the higher prevalence of diabetes in the Caribbean could increase COVID-19 deaths. A meta-analysis on more than 16,000 patients also found that diabetes in patients with COVID-19 doubled the risk of death [66] . Consistent with our results, other researchers have shown a strong relationship between diabetes prevalence and COVID-19 mortality [67, 68] . There were several caveats and limitations in this study that should be acknowledged. First, due to the worldwide distribution of this study, it is most likely that some countries have not provided accurate statistics about COVID-19 prevalence and deaths, which may bias the results. 
Another limitation of this study was associated with the different lockdown policies and stay-at-home restrictions in each country. Some countries began quarantine policies soon after the pandemic was announced, whereas others did not adopt any specific lockdown policy. Although we tried to find the most influential factors related to COVID-19 prevalence and mortality for all countries at the same time, a study at a higher spatial resolution (subcountry level) can provide more reliable results. Despite the above-mentioned limitations, the findings may help policymakers to track the spread of disease over time based on the most significant variables identified by the employed models.
In summary, we examined ten different VIA methods to estimate the relative importance of potential explanatory variables on COVID-19 prevalence and mortality at a global scale. Due to the numerous mutations of the virus, various targets were considered for modeling to enhance the accuracy of the results. Our findings indicated that the relative importance extracted from different models by VIA methods varies over time. However, several variables were persistently among the most influential variables on the prevalence and mortality of the disease in all periods. Unemployment, population density, air and rail transportation, urban population, GNI per capita, GNI per capita growth, and surface air temperature were among the most significant variables on disease prevalence in all periods. Regarding COVID-19 mortality, diabetes, air transportation, number of hospital beds, number of nurses, and negative affect were among the most influential variables. Better spatial resolution can improve the validity of the results in future studies. Policymakers and epidemiologists can use spatio-temporal analysis to monitor and evaluate COVID-19 prevalence and mortality concerning significant variables.
We would like to thank the anonymous reviewers for taking the time and effort to review the manuscript. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.