key: cord-0729862-2rzph4dq
authors: Ardabili, S.; MOSAVI, A.; Band, S. S.; Varkonyi-Koczy, A. R.
title: Coronavirus Disease (COVID-19) Global Prediction Using Hybrid Artificial Intelligence Method of ANN Trained with Grey Wolf Optimizer
date: 2020-10-26
journal: nan
DOI: 10.1101/2020.10.22.20217604
sha: ee69709a4ae9bf2d241db1df4f1a5726bf1bd0cd
doc_id: 729862
cord_uid: 2rzph4dq

An accurate outbreak prediction of COVID-19 can help to gain insight into the spread and consequences of infectious diseases. Recently, machine learning (ML) based prediction models have been successfully employed for the prediction of disease outbreaks. The present study employs an artificial neural network integrated with the grey wolf optimizer (ANN-GWO) for COVID-19 outbreak prediction using the global dataset. Training and testing were performed on time-series data from January 22 to September 15, 2020, and validation was performed on time-series data from September 16 to October 15, 2020. Results were evaluated using the mean absolute percentage error (MAPE) and the correlation coefficient (r). ANN-GWO provided a MAPE of 6.23%, 13.15% and 11.4% for the training, testing and validation phases, respectively. According to the results, the developed model could successfully cope with the prediction task.

COVID-19 broke out in December 2019 in Wuhan, China [1]. COVID-19 is a new disease with no vaccine and no proper medicine for treatment. A proper prediction strategy for the COVID-19 outbreak can draw attention to quarantine strategies and other governmental measures, such as lockdowns, media coverage of social isolation, and improved public hygiene, to control it [2]. Recently, several strategies, including mathematical models, have been successfully employed for pandemic prediction. Singhal et al.
(2020) used the Fourier decomposition method (FDM) and the susceptible-infected-recovered (SIR) model for the prediction of the COVID-19 pandemic. According to the results, the total numbers of cases and deaths were predicted to be 12.7 × 10^6 and 5.27 × 10^5, respectively [3]. Sarkar et al. (2020) employed a mathematical model for the pandemic prediction of COVID-19 in India. The proposed model combined susceptible (S), recovered (R), asymptomatic (A), infected (I), quarantined susceptible (Sq), and isolated infected (Iq) compartments [4]. Anirudh (2020) employed SIR, SEIRU, SEIR, SLIAR, SIRD, ARIMA and SIDARTHE for COVID-19 pandemic prediction [5]. However, mathematical models have disadvantages such as complexity, long computation time and lower reliability [6, 7].

Machine learning (ML) based models have been successfully employed in different fields of science [8]. Pandemic prediction is one of the fields in which ML-based models have provided good accuracy and reliability. Recently, researchers have developed studies for the prediction of COVID-19 cases and death rate using ML-based methods. Tabrizchi [...] COVID-19 in Hungary using time-series data [6].

medRxiv preprint doi: https://doi.org/10.1101/2020.10.22.20217604; this version posted October 26, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

Accordingly, the present study aimed to predict the global COVID-19 outbreak using a robust hybrid artificial neural network integrated with the grey wolf optimizer (ANN-GWO), based on the time series of total and daily COVID-19 cases. The data required for developing the prediction model were obtained from https://www.worldometers.info/. The dataset is a time series.
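Since the dataset is a time series, a supervised learner needs it reshaped into input/target pairs before training. The following is a minimal sketch of that step, assuming seven lagged values X(t-1), ..., X(t-7) as inputs (the lags listed in Table I) and a chronological 70/30 split. The function names `make_lagged` and `chronological_split` are hypothetical, the toy series is not COVID-19 data, and the text does not state whether the split was chronological or random.

```python
import numpy as np


def make_lagged(series, n_lags=7):
    """Build (X, y) so that row i of X holds x(t-1), ..., x(t-n_lags) and y[i] holds x(t)."""
    series = np.asarray(series, dtype=float)
    if series.size <= n_lags:
        raise ValueError("series must be longer than n_lags")
    # Column k (0-based) contains the value k+1 steps before the target.
    columns = [series[n_lags - k - 1: series.size - k - 1] for k in range(n_lags)]
    X = np.column_stack(columns)
    y = series[n_lags:]
    return X, y


def chronological_split(X, y, train_fraction=0.7):
    """Split without shuffling: earliest rows for training, latest rows for testing (assumed)."""
    n_train = int(round(train_fraction * len(y)))
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])


# Toy series used only to show the shapes; replace with the real case counts.
toy_series = np.arange(1, 21)
X, y = make_lagged(toy_series, n_lags=7)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y, train_fraction=0.7)
print(X.shape, train_X.shape, test_X.shape)
```

Each target value is paired only with values that precede it, so no future information leaks into the inputs.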
The total and daily numbers of cases from January 22 to November 15, 2020 are presented in Figure I. Figure I has two vertical axes: one for total cases and the other for daily cases. Because the data are a time series, the inputs were chosen according to Table I, with the global COVID-19 outbreak as the output parameter.

Table I
Inputs: X(t-1), X(t-2), X(t-3), X(t-4), X(t-5), X(t-6) and X(t-7)
Output: Global COVID-19 outbreak

ANN-GWO performed the modeling using the input and output datasets. ANN-GWO can be considered one of the robust hybrid ML-based methods [12]. The modeling phase started by testing different ANN architectures and different GWO population sizes. Finally, according to model accuracy, the ANN with the 7-10-4-1 architecture and a GWO population size of 150 was selected as the best model for global COVID-19 prediction. 70% of the total data were used for the training phase and 30% for the testing phase, and the predicted COVID-19 outbreak for September 16 to October 20, 2020 was used for the validation phase. ANN-GWO is described comprehensively in our previous study [11].

The results were evaluated using MAPE and the correlation coefficient r, Eqs. (1) and (2):

$$\mathrm{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{A_i - P_i}{A_i} \right| \tag{1}$$

$$r = \frac{N \sum_{i=1}^{N} A_i P_i - \sum_{i=1}^{N} A_i \sum_{i=1}^{N} P_i}{\sqrt{\left[ N \sum_{i=1}^{N} A_i^2 - \left( \sum_{i=1}^{N} A_i \right)^2 \right] \left[ N \sum_{i=1}^{N} P_i^2 - \left( \sum_{i=1}^{N} P_i \right)^2 \right]}} \tag{2}$$

where A refers to the target values, P refers to the predicted values and N refers to the number of data.

Table II presents the results of the training, testing and validation phases in terms of MAPE and r. According to the results, the selected ANN architecture with a population size of 150 and a maximum of 500 iterations provided good accuracy for the training, testing and validation phases.

In this paper, the COVID-19 outbreak is modeled as a complex time series.
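The training scheme described in the methods above (a 7-10-4-1 ANN whose weights are tuned by GWO and scored with MAPE and r) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the tanh hidden activation, the linear output, the mean-squared-error fitness, the [-1, 1] initialisation range, the data scaled to roughly [0, 1], and all function names are assumptions, and the toy series is synthetic. The update rule follows the commonly published form of the grey wolf optimizer.

```python
import numpy as np

LAYERS = (7, 10, 4, 1)  # architecture reported in the text


def n_params(layers=LAYERS):
    """Number of weights and biases in the fully connected network."""
    return sum((i + 1) * o for i, o in zip(layers[:-1], layers[1:]))


def forward(theta, X, layers=LAYERS):
    """Evaluate the network for a flat parameter vector theta."""
    a, pos = X, 0
    last = len(layers) - 2
    for k, (i, o) in enumerate(zip(layers[:-1], layers[1:])):
        W = theta[pos:pos + i * o].reshape(i, o)
        pos += i * o
        b = theta[pos:pos + o]
        pos += o
        z = a @ W + b
        a = z if k == last else np.tanh(z)  # tanh hidden, linear output (assumed)
    return a.ravel()


def mape(target, pred):
    """Mean absolute percentage error in percent."""
    return 100.0 * np.mean(np.abs((target - pred) / target))


def pearson_r(target, pred):
    """Correlation coefficient between target and predicted values."""
    return np.corrcoef(target, pred)[0, 1]


def gwo_train(X, y, pop_size=150, max_iter=500, seed=0):
    """Search the ANN parameters with a basic grey wolf optimizer."""
    rng = np.random.default_rng(seed)
    wolves = rng.uniform(-1.0, 1.0, size=(pop_size, n_params()))
    best_theta, best_fit = None, np.inf
    for it in range(max_iter):
        # Fitness: mean squared error on the training set (assumed).
        scores = np.array([np.mean((forward(w, X) - y) ** 2) for w in wolves])
        order = np.argsort(scores)
        leaders = wolves[order[:3]].copy()  # alpha, beta, delta wolves
        if scores[order[0]] < best_fit:
            best_fit, best_theta = scores[order[0]], leaders[0].copy()
        a = 2.0 * (1.0 - it / max_iter)  # decreases linearly from 2 towards 0
        new_positions = np.zeros_like(wolves)
        for leader in leaders:
            A = 2.0 * a * rng.random(wolves.shape) - a
            C = 2.0 * rng.random(wolves.shape)
            D = np.abs(C * leader - wolves)
            new_positions += leader - A * D
        wolves = new_positions / 3.0  # average of the three leader-guided moves
    return best_theta, best_fit


# Synthetic, already-scaled toy series (not COVID-19 data), seven lags as inputs.
series = np.linspace(0.2, 1.0, 60) ** 2
X = np.column_stack([series[6 - k: len(series) - k - 1] for k in range(7)])
y = series[7:]
theta, fit = gwo_train(X, y, pop_size=20, max_iter=50)
pred = forward(theta, X)
print(f"MSE={fit:.4g}  MAPE={mape(y, pred):.2f}%  r={pearson_r(y, pred):.3f}")
```

The demo uses a small population and few iterations to run quickly; the population size of 150 and the 500 iterations reported in the text are the function defaults.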
We have developed a hybrid machine learning model based on the artificial neural network. We trained the network with the grey wolf optimization algorithm to obtain the highest performance. Based on the testing, we projected the outbreak until late May. Training and testing were performed on time-series data from January 22 to September 15, 2020, and validation was performed on time-series data from September 16 to October 15, 2020. Results were evaluated using the mean absolute percentage error (MAPE) and the correlation coefficient (r). ANN-GWO provided a MAPE of 6.23%, 13.15% and 11.4% for the training, testing and validation phases, respectively. According to the results, the developed model could successfully cope with the prediction task. The model provided promising results and can be used for outbreak prediction.

REFERENCES
[1] Analysis and forecast of COVID-19 spreading in China, Italy and France
[2] UV-thermal dual-cured polymers with degradable and anti-bacterial function
[3] Modeling and prediction of COVID-19 pandemic using Gaussian mixture model
[4] Modeling and forecasting the COVID-19 pandemic in India
[5] Mathematical modeling and the transmission dynamics in predicting the Covid-19-What next in combating the pandemic
[6] COVID-19 Pandemic Prediction for Hungary; a Hybrid Machine Learning Approach
[7] Computational Intelligence Load Forecasting: A Methodological Overview
[8] A survey of deep learning techniques: application in wind and solar energy resources
[9] Rapid COVID-19 Diagnosis Using Deep Learning of the Computerized Tomography Scans. 2020.
[10] Dutta, A

ACKNOWLEDGMENT
This research is supported in part by the Hungarian State and the European Union under the EFOP-3.6.2-16-2017-00016 project. The support of the Alexander von Humboldt Foundation is acknowledged.