key: cord-0230016-hdse98ct authors: Arashi, Mohammad; Bekker, Andriette; Salehi, Mahdi; Millard, Sollie; Erasmus, Barend; Cronje, Tanita; Golpaygani, Mohammad title: Spatial analysis and prediction of COVID-19 spread in South Africa after lockdown date: 2020-05-19 journal: nan DOI: nan sha: 21134e38ac3734206970297aff3f9ab6a6290e83 doc_id: 230016 cord_uid: hdse98ct

What is the impact of COVID-19 on South Africa? This paper aims to assist researchers and decision-makers in battling the COVID-19 pandemic, with a focus on South Africa. It studies the spread of the disease by using heatmaps to identify hotspot areas and by carrying out a spatial analysis with the Moran index. To capture the spatial autocorrelation between the provinces of South Africa, both adjacency and geographical distance measures are used as weight matrices, for absolute as well as relative counts. Furthermore, generalized logistic growth curve modeling is used to predict the COVID-19 spread. We expect this data-driven modeling to provide some insights into hotspot identification and timeous action to control the spread of the virus.

During December 2019, several cases of pneumonia of unknown aetiology were reported in Wuhan, a city in the Hubei province of China ([1]). Within a week, investigators found that the initial cases were all associated with a seafood market where live poultry and wild animals were being sold ([2]). The disease has since been registered, and become known, as the coronavirus disease or COVID-19, which is caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). In the early stages of infection, symptoms of severe acute respiratory infection can occur. These may include a cough, fever and shortness of breath ([3]). Some patients may then develop acute respiratory distress syndrome (ARDS) and other serious complications, which may potentially lead to multiple organ failure ([1]).
Since mid-December, COVID-19 has spread to all seven continents, increasing its prevalence throughout the entire world, and was declared a pandemic by the World Health Organization (WHO) on the 11th of March ([4]). This rapid spread has been fuelled by the fact that the majority of infected people do not experience severe symptoms, which makes them more likely to remain mobile and hence infect others ([5]). Transmission occurs primarily through person-to-person contact, coughing or sneezing, and touching contaminated surfaces ([6]). On the 14th of February the first case of COVID-19 in Africa was reported in the City of Cairo by the Egyptian Ministry of Health and Population. The individual, who travelled between China and Cairo on a business trip, was identified through contact screening ([7]).

The first South African case was confirmed by the National Institute for Communicable Diseases (NICD) on the morning of the 5th of March 2020. The patient, a 38-year-old male, was part of a group of 10 people, including his wife, who arrived back in South Africa from Europe on the 1st of March. Since then the numbers of infections and deaths have risen drastically. President Cyril Ramaphosa was praised by the director-general of the WHO, Dr Tedros Adhanom Ghebreyesus, for his leadership and approach to protecting South Africans during these trying times ([8]). The British Broadcasting Corporation (BBC) also commended President Cyril Ramaphosa for his leadership and for South Africa's "ruthlessly efficient" response to the coronavirus ([9]). On the 15th of March President Cyril Ramaphosa declared a national state of disaster in terms of the Disaster Management Act, which enables the focus to be put on preventing and reducing the risk of the virus spreading ([10]). Only a few days later, on the 23rd of March, the President declared a national lockdown commencing at midnight on the 26th of March.
In South Africa, these extreme measures are absolutely necessary, as the country combines a high-risk population with low-income country characteristics. The main concerns, which are thought to escalate the spread of the coronavirus, are the large and densely populated areas and townships, together with a high level of poverty and movement within these areas. Combined with existing epidemics such as the human immunodeficiency virus (HIV), tuberculosis (TB) and malaria, this might lead to an increase in morbidity and mortality. Non-communicable diseases such as chronic obstructive pulmonary disease (COPD), heart disease, hypertension and diabetes, which are widespread in Africa, are known risk factors for severe cases of COVID-19, and may therefore also increase the death rate in these lower-income countries ([11]). As winter is approaching, overcrowded houses and the large immunocompromised population will contribute to a rise in the number of COVID-19 cases ([12]). To date, as reported in the Coronavirus disease 2019 (COVID-19) Situation Report [13], South Africa has the largest number of COVID-19 cases in Africa. South Africans have generally complied with the lockdown: insights from vehicle-tracking data showed that vehicle activity dropped by 20% even before the lockdown and by 75% after the lockdown was implemented ([14]). This decline in movement directly indicates the effect on the economy, with the closure of businesses in manufacturing, retail and restaurants, to name only a few. With numerous businesses no longer operating, many South Africans are no longer receiving an income.
The Human Sciences Research Council also released a note on the mental health of South Africans, stating that the stage is already set for major mental health implications, and noting that failure to put measures in place to mitigate the psychological impacts of quarantine is likely to lead to an ineffective and slow economic recovery ([15]).

In this study, we make use of the Moran index to spatially identify the spread of COVID-19 infections across the provinces of South Africa. Finding these hotspots will provide insights for identifying, and assist in tracking, the COVID-19 spread. With this information South Africa will be better able to predict local outbreaks and develop public health policies to better manage and update the medical procedures currently in place. Since the strict lockdown (level 5) is being phased out from the 1st of May (with a move to level 4 lockdown), the location of these hotspots could assist in guiding the risk-adjusted strategy and the economic activity plan set out by the South African government. It is thus important to know where these hotspots are and whether they are statistically significant. Further, a generalized logistic model of the growth trend will be employed to show the difference between the hotspot areas and the areas outside them. With the continuing growth and development of COVID-19 in South Africa, this analysis might help political leaders and health authorities to manage the allocation of resources and prepare for future virus control. COVID-19 is still in its early stages in South Africa, but different tendencies have already been observed when compared to the US and European countries ([13]). Understanding these tendencies will be very important in guiding the fight against COVID-19 in South Africa as well as the rest of Africa.
This is an initial study from which many other interesting studies will follow, and it will be very important to continue with analyses as more cases are reported and more data become available. South Africa, with the most confirmed COVID-19 cases, will need to be the leader in guiding the fight against COVID-19 in Africa.

South Africa, formally known as the Republic of South Africa, is situated at the southernmost tip of Africa and covers a surface area of 1 219 602 km². Its coastline stretches more than 3 000 km from the desert border with Namibia on the Atlantic Ocean, around the tip of Africa, to the northern border with Mozambique on the Indian Ocean side. South Africa shares common boundaries with Namibia, Botswana, Zimbabwe and Swaziland, while the Mountain Kingdom of Lesotho is landlocked within South Africa. The Prince Edward and Marion islands lie some 1 920 km south-east of Cape Town ([16]). With a population of more than 59 million, South Africa is the world's 25th most populated nation and consists of nine provinces. South Africa has three designated capital cities: executive Pretoria, judicial Bloemfontein and legislative Cape Town. The largest city and main economic hub is Johannesburg, which is also the main entry point for visitors from other countries via OR Tambo International Airport ([16]). A timeline of the major interventions in South Africa for the COVID-19 outbreak and the corresponding statistics are shown in Figures 1 and 2, respectively.

Distances between the provinces are determined using the main city of each province. The main city is the city most likely to be at the highest risk of COVID-19 infection. In most cases, the main city is also the capital city of the province. The main cities are indicated in Table 1.
To measure the spatial correlation between the 9 provinces of South Africa (Northern Cape, Eastern Cape, Free State, Western Cape, Limpopo, North West, KwaZulu-Natal, Mpumalanga and Gauteng), we use Moran's autocorrelation coefficient, also known as the Moran index (denoted by $\mathcal{I}$) in geographic health science. Furthermore, we make use of the generalized logistic function (GLF) to identify an appropriate growth curve for COVID-19. Hence, this section is devoted to the definition of the Moran index (Moran's I) as well as the GLF.

The Moran index, originally defined by [17], is a measure of spatial association or spatial autocorrelation which can be used to find spatial hotspots or clusters and is available in many software applications. This index has been described as the measure of choice for scientists, specifically in environmental sciences, ecology and public health ([18]). Other indices include Getis' G index, Geary's C, the local $I_i$ and $G_i$, spatial scan statistics and Tango's C index ([19] and [18]). The Moran index has both a local and a global representation. The global Moran's I is a global measure of spatial autocorrelation, while the local Moran's I examines the individual locations, enabling hotspots to be identified based on comparisons with the neighbouring locations ([19]). The local Moran's I has been successfully applied to hotspot identification for infection clusters such as those investigated by [20], who studied bovine tuberculosis (bTB) breakdowns in Northern Irish cattle herds in order to assess the spatial association in the number and prevalence of chronic bTB across Northern Ireland. Other areas where this index has been successfully applied and is commonly used are diseases, mortality rates, environmental planning and environmental sciences. It is important to note that the result can be affected by the definition of the weight function, data transformation and the existence of outliers ([19]).
Until now, little COVID-19 related research has made use of the Moran index, and no research was found for South African specific cases. Some studies that include the use of this index are: 1) [21] explored the spatial epidemic dynamics of COVID-19 in mainland China in order to determine whether a spatial association of the COVID-19 infection existed; 2) [22] applied the Moran index to a spatial panel, which showed that COVID-19 infection is spatially dependent and mainly spread from Hubei Province in Central China to neighbouring areas; 3) [23] used a global dataset of COVID-19 cases as well as a global climate database and investigated how climate parameters could contribute to the growth rate of COVID-19 cases, while simultaneously controlling for potential confounding effects using spatial analysis; 4) [24] used data on all mobile phone users to examine the impact of the coronavirus outbreak under the Swedish regime of mild recommendations and restrictions on individual mobility, and whether the changes in geographical mobility vary over different socio-economic strata; and 5) [25] investigated the influence of spatial proximities and travel patterns from Italy on the further spread of SARS-CoV-2 around the globe.

This index is an extension of Pearson's product-moment correlation coefficient for spatial pattern recognition. Observations in close proximity are more likely to be similar than those far apart ([26], [27]). In order to formulate the Moran index for our purpose, assume we have $N$ provinces and the pair $(x_i, x_j)$ is for the attribute (variable) $x$ in provinces $i, j = 1, \cdots, N$, respectively. Then, the spatial weight $w_{ij}$ quantifies the level of closeness between $i$ and $j$, and the Moran index is defined by

$$\mathcal{I} = \frac{N}{S_0} \, \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}, \tag{1}$$

where $\bar{x} = N^{-1} \sum_{i=1}^{N} x_i$ and $S_0 = \sum_{i=1}^{N} \sum_{j=1; j \neq i}^{N} w_{ij}$. Moran's $\mathcal{I}$ takes values in $[-1, 1]$, and $\mathcal{I} = 0$ indicates no spatial correlation between the provinces for the underlying attribute. According to [28], there are two ways to identify the weights.
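As a minimal sketch of how the global Moran index is computed, the following Python function evaluates the standard form of the statistic for a vector of attribute values and a weight matrix. The function name `morans_i`, the four-region chain adjacency matrix and the toy values are illustrative assumptions and are not data from this study; any weight matrix with non-negative entries can be supplied.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for attribute values x and a spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float).copy()
    np.fill_diagonal(w, 0.0)      # self-pairs (j == i) carry no weight
    n = x.size
    z = x - x.mean()              # deviations from the overall mean
    s0 = w.sum()                  # sum of all off-diagonal weights
    return (n / s0) * (z @ w @ z) / (z @ z)

# Hypothetical toy layout: four regions in a row, each bordering its neighbours.
w_chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
x_toy = [1.0, 2.0, 3.0, 4.0]      # values rise steadily along the chain

i_toy = morans_i(x_toy, w_chain)
expected_null = -1.0 / (len(x_toy) - 1)   # expectation under no autocorrelation
print(i_toy, expected_null)
```

For the toy values, which increase steadily along the chain, the index is positive, whereas a strictly alternating pattern on the same chain gives a value of -1. The quantity `expected_null` is the expected value of the index, $-1/(N-1)$, under the null hypothesis of no spatial autocorrelation.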
In our context, we identify the $(i, j)$-th element of the weight matrix $W$, from the taxonomic level classification viewpoint, as

$$w_{ij} = \begin{cases} 1 & \text{if provinces } i \text{ and } j \text{ are connected,} \\ 0 & \text{otherwise.} \end{cases}$$

Using the phylogenetic tree classification (geographical distance), we assign the weights $w_{ij}$ as a function of $d_{ij}$, a distance threshold and a power-level parameter, where $d_{ij}$ is the distance between the centre of province $i$ and the centre of province $j$. See [29] for more detail and a comparison between different weights.

In this section, we predict some attributes via logistic growth curve modeling. The logistic function/curve is commonly used for dynamic modeling in many branches of science, including chemistry, physics, material science, forestry, disease progression and sociology. For our purpose and for generality, we follow Richards' differential equation (RDE) due to [30], given by

$$Y'(t) = \alpha \left( 1 - \left( \frac{Y}{K} \right)^{\nu} \right) Y,$$

with initial condition $Y(t_0) = Y_0$, where $K$ is the carrying capacity (the maximum capacity, or total population here) and $\alpha, \nu > 0$, to obtain the generalized logistic curve (GLC)

$$Y(t) = \frac{K}{\left( 1 + \left[ \left( \frac{K}{Y_0} \right)^{\nu} - 1 \right] e^{-\alpha \nu (t - t_0)} \right)^{1/\nu}}. \tag{2}$$

The typical logistic curve, which is widely used in modeling, is the special case of the GLC for $\nu = 1$. Further, the Gompertz curve can be obtained in the limiting case $\nu \to 0^{+}$. See [31] for more details and applications of the GLC.

Only a few studies have applied logistic growth models to COVID-19 specific research questions, and only one combined the model with the Moran index to show that the infection is spatially dependent ([22]); no such studies exist for South African data. Studies that applied only the logistic growth model include: 1) [32], who uses the logistic growth equation to describe the process on a macroscopic level, and 2) [33], who reviews the epidemic virus growth and decline curves in China using the phenomenological logistic growth model.

In this section, we start off with a general inspection of the provincial distribution of the total confirmed cases and deaths, given in Figures 3 and 4, respectively.
From these figures it is observed that the heatmaps of confirmed cases and deaths agree, so that more confirmed cases are followed by more deaths. Furthermore, the hotspots are Western Cape and Gauteng, with the former at the highest risk of infection.

In order to test the spatial autocorrelation of COVID-19 in South Africa, the interaction between provinces is estimated using Moran's I from March 21, 2020 to April 25, 2020, based on absolute counts and Eq. (1), and the results are reported in Table 2. Based on the adjacent 0-1 weight matrix, not all Moran coefficients are significant at the 5% significance level (see the fifth column in Table 2), and the values lie in the interval [0, 1], since there is a positive correlation among the confirmed cases according to the geographical structure. By comparison, no significant spatial correlation is detected based on spatial geographic distance (the last column of Table 2), which indicates that the spread in South Africa is mainly from areas to their adjacent neighbours, and does not depend on the distance to the infection centre. Main cities in adjacent provinces are therefore at higher risk. To extend the analysis, we calculated the p-values of Moran's test corresponding to Table 2 over time, shown in Figure 5. These p-values show some deviation in the timeliness of the reported statistics in the main cities of the provinces from March 21, 2020 to April 25, 2020, so some bias in our results may be unavoidable. From April 20 to 25, the spatial autocorrelation is significantly different in terms of adjacency to main cities. This means that the prevalence of COVID-19 varies between main cities. An additional analysis was carried out to validate the results based on the characteristics of the Moran index. The observed Moran index, along with its expected value and standard error, is tabulated in Table 3.
The expected value of the Moran index is $-1/(N-1) = -1/8 = -0.125$ in our case. Using the adjacency weights, we obtain the same result as discussed for Table 2. However, since the null hypothesis of no spatial autocorrelation between the provinces is not rejected for the geographical distance, we can argue that there is no evidence of negative autocorrelation here, as with random data one would expect a negative value more often than a positive one. The impact of President Cyril Ramaphosa's decision to contain the outbreak through strict lockdown regulations is supported by the p-values in Figure 6 (see the dates from 20 April onwards).

In addition, the spatial autocorrelation of COVID-19 in South Africa based on the relative counts has been estimated using Moran's I from March 21, 2020 to April 25, 2020, and the results are reported in Table 3. In this table, for the adjacent 0-1 weights, more coefficients are significant at the 5% level (see the fifth column in Table 4), with values in the interval [0, 1], since there is a positive correlation among the confirmed cases according to the geographical structure. By comparison, very few significant spatial correlations are found based on the spatial geographic distance (the last column of Table 3), which indicates that the spread in South Africa is mainly from areas to their adjacent neighbours, and depends less on the distance to the infection centre. In conclusion, as identified in the heatmaps in Figures 3 and 4, the two main cities are at higher risk. We also calculated the p-values of Moran's test corresponding to Table 3 over time, shown in Figure 7. These p-values show some deviation in the timeliness of the reported statistics in the main cities of the provinces from March 21, 2020 to April 25, 2020, so some bias in the results is unavoidable. From April 20 to 25, the spatial autocorrelation is not significantly different in terms of adjacency to main cities.
This means that the prevalence of COVID-19, based on relative counts, is similar in the main cities.

[Figure caption fragment: ... Table 4 over time, according to the relative counts (absolute counts divided by 1M residents).]
[Figure caption fragment: ... Table 2 over time, according to the relative counts.]

Figure 9 displays the observed cumulative confirmed cases, the fitted logistic growth model given by Eq. (2) with ν = 1 and the corresponding 95% confidence interval for each province. However, the model in Eq. (2) was not fitted to the data for the WC province. As seen from Figure 9, the cumulative number of confirmed cases in the fitted provinces is described very well by a logistic growth model. The high values of the R-squared, shown at the bottom right of each panel, also confirm the goodness of fit of all 8 models, with FS having the lowest R-squared (0.939). Hence, the predictions up to June 5, shown as red lines, can be relied on in proportion to the magnitude of the R-squared for each province.

Despite the inaccuracies associated with medical predictions, identifying hotspots and logistic modelling are still invaluable for a better understanding of the spread in South Africa. The results of the Moran index showed the impact of President Cyril Ramaphosa's decision to contain the outbreak through strict lockdown regulations, and that the inter-provincial travel prohibition played a positive role in tapering the counts. The results indicated that the spread in South Africa is mainly from areas to their adjacent neighbours, and does not depend on the distance to the infection centre. The logistic growth models show a good fit to the provincial data, with R-squared values above 0.9. Visually, however, it is clear that for certain provinces a different modelling strategy could yield even better results ([36]). These provinces are GP, FS and NC, and likely also WC.
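The provincial curve fitting summarized above can be illustrated with a short sketch. The code below assumes the standard Richards parametrization of the generalized logistic curve (carrying capacity `K`, growth rate `alpha`, shape `nu`, initial value `y0` at time `t0`) and fits the ν = 1 case by scanning candidate values of `K` and regressing log(K/y - 1) on time. This linearized least-squares scan is an assumption chosen to keep the example dependency-light; the estimation procedure actually used for Figure 9 is not stated in the text. The data are synthetic and all function names are hypothetical.

```python
import numpy as np

def glc(t, K, alpha, nu, y0, t0=0.0):
    """Generalized logistic curve (assumed Richards form) with y(t0) = y0."""
    t = np.asarray(t, dtype=float)
    q = (K / y0) ** nu - 1.0
    return K / (1.0 + q * np.exp(-alpha * nu * (t - t0))) ** (1.0 / nu)

def r_squared(y, y_hat):
    """Coefficient of determination between observations and fitted values."""
    y = np.asarray(y, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_logistic(t, y, k_grid):
    """Fit the nu = 1 curve: for each candidate K, regress log(K/y - 1) on t."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for K in k_grid:
        if K <= y.max():
            continue  # the carrying capacity must exceed every observation
        slope, intercept = np.polyfit(t, np.log(K / y - 1.0), 1)
        alpha = -slope
        y0 = K / (1.0 + np.exp(intercept))   # implied value at t = 0
        score = r_squared(y, glc(t, K, alpha, 1.0, y0))
        if best is None or score > best["r2"]:
            best = {"K": float(K), "alpha": float(alpha),
                    "y0": float(y0), "r2": float(score)}
    return best

# Synthetic cumulative counts (hypothetical, not study data).
days = np.arange(40)
cases = glc(days, K=1000.0, alpha=0.2, nu=1.0, y0=5.0)

fit = fit_logistic(days, cases, k_grid=np.linspace(500.0, 1500.0, 101))
forecast = glc(np.arange(40, 50), fit["K"], fit["alpha"], 1.0, fit["y0"])
print(fit)
```

Because the synthetic data are generated from an exact logistic curve, the scan recovers the generating parameters and an R-squared close to 1. With real provincial counts the fit would be noisier, and a nonlinear least-squares routine with interval estimates would be a natural replacement for the grid scan.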
With South Africa phasing out the lockdown as of the beginning of May and implementing the risk-adjusted strategy and economic activity plan, workers will be returning to their workplaces and COVID-19 cases are expected to increase. This initial study highlights the importance of continued analysis and showcases the valuable input that can be obtained from the results of such analyses.

Diagnostic utility of clinical laboratory data determinations for patients with the severe COVID-19 COVID-19: a fast evolving pandemic COVID-19 Outbreak on Malawi Perspective WHO Director-General's opening remarks at the media briefing on COVID-19-11 Covert coronavirus infections could be seeding new outbreaks Three Emerging Coronaviruses in Two Decades: The Story of SARS, MERS, and Now COVID-19 Imaging clinic operations in the times of COVID-19: Strategies, Precautions and Experiences OIL 2020 Ruthlessly Efficient' Fight Against COVID-19 SANews President Ramaphosa Announces A Nationwide Lockdown 2020 World Health Organization Update on COVID-19 in the Eastern Mediterranean Region. Media Cent. 2020 Available World Health Organization (2020) Coronavirus disease 2019 (COVID-19) Situation Report DispatchLive Tracker Reveals Whether Motorists Have Kept To Lockdown Rules.
DispatchLive 2020 Living through global trauma: Mental-health implications of COVID-19 from a developing country perspective Government Communications 2020, United Nations: Department of Economic and Social Affairs Comparing implementations of global and local indicators of spatial association Use of local Moran's I and GIS to identify pollution hotspots of Pb in urban soils of Galway Spatiotemporal analysis of prolonged and recurrent bovine tuberculosis breakdowns in Northern Irish cattle herds reveals a new infection hotspot Spatial epidemic dynamics of the COVID-19 outbreak in China Spatial-temporal distribution of COVID-19 in China and its prediction: A data-driven modeling analysis Global COVID-19 transmission rate is influenced by precipitation seasonality and the speed of climate temperature warming 2020 Effects of the COVID-19 Pandemic on Population Mobility under Mild Policies: Causal Evidence from Sweden Change in outbreak epicenter and its impact on the importation risks of COVID-19 progression: a modelling study 2020 Spatial and temporal analysis: autocorrelation in space and time Moran's autocorrelation coefficient in comparative methods, Package ape, cran.r-project.org Adaptation: statistics and a null model for estimating phylogenetic effects On the four types of weight functions for spatial contiguity matrix A flexible growth function for empirical use Features and partial derivatives of Bertalanffy-Richards growth model in forestry Predicting the ultimate outcome of the COVID-19 outbreak in Italy Outbreak analysis with a logistic growth model shows 2020 COVID-19 suppression dynamics in China Generalized logistic functions in modelling emergence of Brassica napus L Coronavirus In SA: WHO Boss Praises South Africa's Response To Modelling the South African 14-day COVID-19 infection rate. To be submitted

The authors declare no conflict of interest.