key: cord-315676-y0qbkszx
authors: Shahid, Farah; Zameer, Aneela; Muneeb, Muhammad
title: Predictions for COVID-19 with Deep Learning Models of LSTM, GRU and Bi-LSTM
date: 2020-08-19
journal: Chaos Solitons Fractals
DOI: 10.1016/j.chaos.2020.110212
sha: 
doc_id: 315676
cord_uid: y0qbkszx

COVID-19, responsible for infecting millions of people and damaging economies across the globe, requires a detailed study of the trend it follows in order to develop adequate short-term prediction models for forecasting the number of future cases. Such forecasts support strategic planning in the public health system to avoid deaths and to manage patients. In this paper, forecast models comprising autoregressive integrated moving average (ARIMA), support vector regression (SVR), long short-term memory (LSTM), gated recurrent unit (GRU) and bidirectional long short-term memory (Bi-LSTM) are assessed for time series prediction of confirmed cases, deaths and recoveries in ten major countries affected by COVID-19. The performance of the models is measured by mean absolute error, root mean square error and r2_score. In the majority of cases, the Bi-LSTM model outperforms the others on these indices. Ranked from best to worst performance across all scenarios, the models are Bi-LSTM, LSTM, GRU, SVR and ARIMA. Bi-LSTM generates the lowest MAE and RMSE values, 0.0070 and 0.0077 respectively, for deaths in China. The best r2_score value is 0.9997, for recovered cases in China. On the basis of its demonstrated robustness and enhanced prediction accuracy, Bi-LSTM can be exploited for pandemic prediction to support better planning and management.

and Western Pacific as (7,515); while confirmed cases are (410,744), (6,125,802), (1,222,070), (2,847,887), (1,032,167), and (234,815) [1].

To be precise, COVID-19 has followed specific patterns, and these patterns are based on the dynamic transmission of the epidemic. When such an outbreak occurs, a range of methods is used to detect and evaluate the infectious disease. An epidemic in any state or country arises with a different magnitude over time, depending in particular on seasonal weather changes and on the spread of the virus over the period, and it is non-linear in nature. To capture these non-linear changes, researchers have designed non-linear systems that describe the abrupt behavior of infectious diseases [2]. Accordingly, mathematical models such as SIR (susceptible-infective-removed) have been introduced for analyzing epidemics [3]. A transmission model with incubation time for malaria [4] and a deterministic model for analyzing the interaction between HIV and tuberculosis have been developed successfully to handle the nonlinear behavior of parameters [5]. Similar models based on discrete-time equations are used to control the infected population [6].

Physical and statistical methods differ in how they learn the temporal behavior of data such as coronavirus case counts and in their use of non-linear functions to predict the dynamics [7] [8].

Statistical approaches are usually based on the autoregressive integrated moving average (ARIMA) model, which has been employed to predict the trend of the COVID-2019 epidemic [9], and on the seasonal autoregressive integrated moving average (SARIMA) model, which estimates the fatality rate through time series analysis of influenza epidemics [10]. These models have also been used to monitor and predict dengue hemorrhagic fever (DHF) cases in southern Thailand [11] and hemorrhagic fever with renal syndrome (HFRS) cases in China, in order to control these diseases more effectively [12]. Another popular approach in health care is based on artificial intelligence (AI); it has been trained on the COVID-19 dataset of Hubei Province in China to predict the epidemic peaks and trend size [13]. In numerous cases, however, these methods fail to fit the actual data well, and their accuracy in predicting the rise of COVID-19 spread is very low.

To improve on the performance of statistical methods, machine learning (ML) models, which have been applied in fields such as power and energy engineering [14], technology [15] and psychology [16], are used for early prediction and for tracking the spread from real-time data. Recently, an ML approach named infection size-aware random forest (iSARF) has been proposed for classification, which takes into account the infection size and the lung fields [17]. Other models include the multilayer perceptron (MLP) and the adaptive network-based fuzzy inference system (ANFIS), which are utilized for evaluating complex variation behavior and predicting COVID-19 transmission [18]. A hybrid approach combining support vector regression (SVR) and ARIMA has been suggested, which takes the confirmed cases and predicts the number of infected persons countrywide [19]. [24].

The novelty of the reported work lies in creating three categories (confirmed cases, death cases and recovered cases) from the dataset and developing a COVID-19 predictor to predict and analyze future trends in these three categories. The experiment is based on the dataset of confirmed COVID-19 cases available until June 27, 2020. Additionally, owing to the dynamic nature of the coronavirus, ML and DL models have been implemented for early predictions. The prominent features of the methodology are summarized as follows:

• A statistical model (ARIMA), the ML technique of SVR with polynomial and RBF kernels, and the DL mechanisms LSTM, GRU and Bi-LSTM are proposed to predict three COVID-19 categories (confirmed cases, deaths and recovered cases) for ten countries.

• The accuracy of the models is measured in terms of three performance measures: MAE, RMSE and r2_score.

• The Bi-LSTM time series model enhances the learning ability and the memorization of long sequences. DL techniques in general, and Bi-LSTM in particular, are proposed for the smallest prediction error and higher accuracy.

The rest of the article is organized as follows: Section II describes the proposed methodologies, dataset and performance metrics; Section III presents detailed results of the designed scheme; and the conclusions are provided in the last section.

In this work, two kinds of methodologies are established for COVID-19 prediction: a statistical model, and machine learning models that include both simple and deep learning techniques. In the first phase, the design of ARIMA and of SVR as a simple ML algorithm is discussed, whereas in the next phase various DL models are described. The three error measures used for performance evaluation, MAE, RMSE and r2_score, are also specified in this section. The graphical overview of the proposed scheme is illustrated in Fig. 1: data in three categories (confirmed cases, death cases and recovered cases) are collected and preprocessed, then passed to the respective models separately, and the performance of the models is measured through the error measures. A detailed description of the proposed models is provided below.

The ARIMA model comprises three processes, namely autoregression, integration and moving average. It is data independent and is employed for model architecture and parameter estimation; the forecast is a linear function of past observations and random errors [25] [26]. The time series form of the underlying process is:
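For reference, the general form commonly given in the time-series literature for such a process is sketched below. The symbols are assumptions and may differ from the authors' notation: $y_t$ is the (differenced) observation at time $t$, $\varepsilon_t$ is the random error, $\phi_i$ and $\theta_j$ are the autoregressive and moving-average coefficients, and $p$, $d$ and $q$ are the model orders.

```latex
% Autoregressive and moving-average parts (assumed notation)
y_t = \theta_0 + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p}
      + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \dots - \theta_q \varepsilon_{t-q}

% Integration part: first-order differencing, applied d times
y'_t = y_t - y_{t-1}
```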

In equation (1) 

Another effective time series implementation of the support vector machine (SVM), proposed by Vapnik, is known as support vector regression (SVR) [9]. Both SVM and SVR minimize the margin error and employ kernel functions for non-separable classes. The results can be improved by optimizing the parameters; for this purpose, grid search and heuristic search are used to obtain the best parameters [27]. SVR for multidimensional data is mathematically formulated as:

This is useful for dealing with nonlinear functions: the data are mapped into a high-dimensional space, known as the kernel space, to obtain more accurate results. Finally, the SVR function is obtained as equation (6):

Here, the primal formula of the kernel function is

represents the features in kernel space. Various kernel functions, such as the RBF and polynomial kernels, are used, and their mathematical formulae are given as:

In equations (8) and (9), the kernel parameters, including d, are tuned.
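As an illustration of the two kernels named above, the following minimal sketch uses their standard textbook definitions. The parameter names `gamma` and `coef0` are assumptions introduced here, and `degree` plays the role of d; the authors' exact parameterization may differ.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2); `gamma` is an assumed parameter name."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, z, degree, coef0=1.0):
    """Polynomial kernel (<x, z> + coef0) ** degree; `coef0` is an assumed offset."""
    dot_product = sum(a * b for a, b in zip(x, z))
    return (dot_product + coef0) ** degree


x, z = [1.0, 2.0], [3.0, 4.0]
print(rbf_kernel(x, x, gamma=0.5))        # identical points give 1.0
print(polynomial_kernel(x, z, degree=2))  # (11 + 1) ** 2 = 144.0
```

The RBF kernel decays towards zero as the two points move apart, while the polynomial kernel grows with their inner product, which is why the tuned values of the kernel parameters strongly affect the fit.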

Recurrent neural networks (RNNs) [28] have been employed for sequential time series applications with temporal dependencies. An unfolded RNN can process current data by using previous data. However, RNNs have difficulty learning long-term dependencies, a problem that is solved by one of the RNN variants. LSTM, proposed by Hochreiter and Schmidhuber [29], is an advanced version of the RNN that overcomes this limitation through hidden layer units known as memory cells. Memory cells have self-connections that store the temporal state of the network and are controlled through three gates: the input gate, the output gate and the forget gate [30]. The input and output gates control the flow of memory cell inputs and outputs into the rest of the network. In addition, the forget gate has been added to the memory cell; it passes information with high weights from the previous neuron to the next. Which information resides in memory depends on the activations: if the input unit has high activation, the information is stored in the memory cell, and if the output unit has high activation, the information is passed to the next neuron. Otherwise, input information with high weights remains in the memory cell.
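The gating described above can be sketched for a single time step as follows. This uses the standard LSTM formulation from the literature with scalar states for readability; it is not necessarily the exact set of equations used in this paper, and the parameter names in `params` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (standard formulation, assumed names)."""
    # Input gate: how much of the new candidate enters the memory cell.
    i = sigmoid(params["w_i"] * x + params["u_i"] * h_prev + params["b_i"])
    # Forget gate: how much of the previous cell state is kept.
    f = sigmoid(params["w_f"] * x + params["u_f"] * h_prev + params["b_f"])
    # Output gate: how much of the cell state is exposed as output.
    o = sigmoid(params["w_o"] * x + params["u_o"] * h_prev + params["b_o"])
    # Candidate value computed from the current input and previous output.
    g = math.tanh(params["w_g"] * x + params["u_g"] * h_prev + params["b_g"])
    c = f * c_prev + i * g   # updated memory cell
    h = o * math.tanh(c)     # output passed to the next step
    return h, c


# With all weights at zero every gate equals 0.5 and the candidate is 0.
zero_params = {name: 0.0 for name in (
    "w_i", "u_i", "b_i", "w_f", "u_f", "b_f",
    "w_o", "u_o", "b_o", "w_g", "u_g", "b_g")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
print(h, c)  # c is 0.5 * 2.0 = 1.0 and h is 0.5 * tanh(1.0)
```

In a trained network the weights are vectors and matrices rather than scalars, but the roles of the three gates are the same.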

An LSTM network computes a mapping between an input sequence and an output sequence, i.e. 

GRU is a simpler variant of LSTM with two gates: an "update gate", which combines the roles of the input and forget gates, and a "reset gate" [32] [33]. GRU has no additional memory cell to keep information; therefore, it can only control the information inside the unit.
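A single GRU step can be sketched in the same way. The sketch follows one common convention for the final interpolation (some references swap the roles of z and 1 - z), uses scalar states for readability, and all parameter names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gru_step(x, h_prev, params):
    """One step of a scalar GRU unit (standard formulation, assumed names)."""
    # Update gate: balances the old state against the new candidate.
    z = sigmoid(params["w_z"] * x + params["u_z"] * h_prev + params["b_z"])
    # Reset gate: how much of the old state feeds the candidate.
    r = sigmoid(params["w_r"] * x + params["u_r"] * h_prev + params["b_r"])
    # Candidate state built from the input and the reset-scaled old state.
    h_candidate = math.tanh(
        params["w_h"] * x + params["u_h"] * (r * h_prev) + params["b_h"])
    # There is no separate memory cell: the hidden state is updated directly.
    return (1.0 - z) * h_prev + z * h_candidate


zero_params = {name: 0.0 for name in (
    "w_z", "u_z", "b_z", "w_r", "u_r", "b_r", "w_h", "u_h", "b_h")}
print(gru_step(x=1.0, h_prev=0.8, params=zero_params))  # 0.5 * 0.8 = 0.4
```

Compared with the LSTM, the unit returns only one state, which reflects the absence of a separate memory cell.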

The novel coronavirus dataset is taken from the link in [34]. The .csv files of confirmed cases, death cases and recovered cases for all countries are organized column-wise. An individual file is created for each of these three categories, covering 22 January 2020 to 27 June 2020. The COVID-19 dataset thus contains 158 daily samples of confirmed cases, deaths and recovered cases; cases from 1/22/2020 to 5/10/2020 are used for training, and cases from 5/11/2020 to 6/27/2020 are predicted. For each country, the data comprise the given cases for 110 days, and the models have to predict the next 48 days. The data are preprocessed before being given to the ML models for training.
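The 110/48 split stated above follows directly from the dates, as the short sketch below confirms. The function names are hypothetical, and the series values are placeholders rather than real case counts.

```python
from datetime import date, timedelta

START = date(2020, 1, 22)      # first day in the dataset
TRAIN_END = date(2020, 5, 10)  # last day used for training
END = date(2020, 6, 27)        # last day to be predicted


def daily_range(first, last):
    """Return every calendar day from `first` to `last`, inclusive."""
    n_days = (last - first).days + 1
    return [first + timedelta(days=offset) for offset in range(n_days)]


def split_by_date(days, values, train_end):
    """Split a daily series into training and test parts at `train_end`."""
    train = [v for d, v in zip(days, values) if d <= train_end]
    test = [v for d, v in zip(days, values) if d > train_end]
    return train, test


days = daily_range(START, END)
placeholder_counts = list(range(len(days)))  # placeholders, not real data
train, test = split_by_date(days, placeholder_counts, TRAIN_END)
print(len(days), len(train), len(test))  # 158 110 48
```

Splitting by date rather than at random keeps the test period strictly after the training period, which matches the forecasting setting.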

Three performance measures are used to evaluate the proposed models: mean absolute error (MAE), root mean square error (RMSE) and r2_score. C denotes the actual value and Ĉ the estimated value. The expected value of MAE is zero for the best model. With N denoting the number of samples, MAE is defined as:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|C_i - \hat{C}_i\right|$$

RMSE is defined as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(C_i - \hat{C}_i\right)^2} \qquad (20)$$

The r2_score demonstrates how much of the variance in the dependent parameter is explained by the independent parameter. It is presented as:

$$r^2 = 1 - \frac{\sum_{i=1}^{N}\left(C_i - \hat{C}_i\right)^2}{\sum_{i=1}^{N}\left(C_i - \bar{C}\right)^2}$$

where $\bar{C}$ is the mean of the $N$ actual values.

The dataset comprises three features: confirmed cases, deaths and recovered cases. Unscaled data slows down the convergence process. MinMaxScaler subtracts the smallest value of a feature and then divides by the feature's range, where the range is the difference between the original maximum and the original minimum. MinMaxScaler preserves the shape of the original distribution of the data. It does not meaningfully change the information embedded in the original data and does not reduce the importance of outliers.

The parameters of SVR, ARIMA and LSTM and their values are shown in Table 1, while the results for actual and predicted cases in the three categories, in terms of the performance measures, are presented in Table 2. It can be observed from this table that none of the three models, ARIMA, SVR_Poly and SVR_RBF, fits the dataset well, and therefore none generates consistent predictions. Judging by the RMSE and MAE values, one model predicts better for some countries and some features, while another gives better results for others. In terms of r2_score, most values are negative, which indicates that the models perform worse than a baseline that always predicts the mean of the actual values. Therefore, it can be inferred that none of these models is able to give reliable and accurate predictions.
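The three measures can be computed as in the following minimal sketch, which assumes the standard definitions (the r2_score here follows the usual coefficient-of-determination formula). The sample values are made up for illustration and are not taken from Table 2.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    return sum(abs(c - c_hat) for c, c_hat in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    squared = sum((c - c_hat) ** 2 for c, c_hat in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r2_score(actual, predicted):
    """Coefficient of determination; assumes the actual values are not all equal."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((c - c_hat) ** 2 for c, c_hat in zip(actual, predicted))
    ss_tot = sum((c - mean_actual) ** 2 for c in actual)
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0]           # illustrative values only
shifted = [2.0, 3.0, 4.0, 5.0]          # every prediction off by one
reversed_trend = [4.0, 3.0, 2.0, 1.0]   # trend predicted the wrong way round

print(mae(actual, shifted), rmse(actual, shifted))
print(r2_score(actual, shifted))
print(r2_score(actual, reversed_trend))  # negative
```

The last call shows how a negative r2_score arises: when the squared prediction error exceeds the spread of the actual values around their mean, the ratio is greater than one and the score drops below zero.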

As a next step, the deep learning techniques LSTM, GRU and Bi-LSTM are compared for the three predicted categories in Fig. 3 in terms of MAE, RMSE and r2_score. Parameter optimization of all methods has been carried out through trial and error, and the values listed in Table 1 have been used to generate all the results in this section. Prediction errors in terms of the performance measures are plotted as bar charts for comparison among the DL techniques. The smallest MAE among the ten countries for confirmed cases is 20.79663, for Israel. Because the number of cases is much higher for the USA and Brazil, the error measures in actual figures are also higher for these countries than for the others.

The r2_score takes values very close to unity without any normalization or inverse transformation, which is a good sign of a consistent, efficient and accurate model for all countries and all cases. Normalized MAE and RMSE values close to zero, together with an r2_score close to unity, are the main criteria for preferring one model over another and for comparing the prediction error of one country with that of others. The DL models generate normalized error measures, which are then converted to actual numbers through the inverse scaler transformation so that these cases can be understood in real-world figures.
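The scaling and its inverse can be sketched in a few lines. This is a plain-Python illustration of min-max scaling under the assumption that the minimum and maximum are taken from the data being scaled; the function names are hypothetical and the numbers are made up.

```python
def fit_min_max(values):
    """Return the minimum and maximum used for scaling."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("min-max scaling needs at least two distinct values")
    return lowest, highest


def scale(values, lowest, highest):
    """Map values into [0, 1] by subtracting the minimum and dividing by the range."""
    return [(v - lowest) / (highest - lowest) for v in values]


def inverse_scale(scaled, lowest, highest):
    """Map scaled values back to the original units."""
    return [s * (highest - lowest) + lowest for s in scaled]


cases = [100.0, 150.0, 300.0]  # illustrative counts, not real data
lowest, highest = fit_min_max(cases)
scaled = scale(cases, lowest, highest)
print(scaled)                                  # [0.0, 0.25, 1.0]
print(inverse_scale(scaled, lowest, highest))  # [100.0, 150.0, 300.0]

# An absolute error measured on scaled values converts back to real units
# by multiplying with the range, because the scaling is linear.
normalized_error = 0.01
print(normalized_error * (highest - lowest))   # 2.0 cases
```

Because the transformation is linear, MAE and RMSE scale with the range of each country's data, whereas the r2_score is a ratio and is unchanged by the scaling.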

Keeping all three performance measures in view, it can be concluded on the basis of the results that, after parameter tuning, Bi-LSTM performs as the best model, giving the highest accuracy. Predicted and actual plots of confirmed cases, death cases and recovered cases for Bi-LSTM are presented in Fig. 4. These scatter plots demonstrate a very good match of predicted cases against actual ones for all three techniques, with much better performance than the baseline regressors. Furthermore, among the DL models, Bi-LSTM performs very well, and its predicted values almost completely overlap the actual number of cases. The convergence of the loss function for the ten countries using GRU has been plotted against the number of days for confirmed, death and recovered cases in Fig. 5. These logarithmic graphs demonstrate a smooth evolution towards the converged value of the fitness function. The convergence value differs for each country, but overall the loss converges very well and remains stable and consistent.

Inferences on the performance of the proposed schemes are listed as follows:

• The COVID-19 dataset has been modelled using various regressors, including ARIMA, SVR with polynomial and RBF kernels, LSTM, GRU and Bi-LSTM, for future predictions of confirmed cases, deaths and recovered cases for ten countries across the globe.

• The performance measures MAE, RMSE and r2_score have been used to compare the models.

• The ARIMA and SVR models are unable to follow the trend of these features, showing higher prediction errors and negative values of r2_score.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Coronavirus disease 2019 (COVID-19) situation report-51

World Health Organization

Presumed Asymptomatic Carrier Transmission of COVID-19

Containing papers of a mathematical and physical character

Global asymptotic properties of a delay SIR epidemic model with finite incubation times

Mathematical analysis of the transmission dynamics of HIV/TB coinfection in the presence of treatment

Epidemic dynamics: discrete-time and cellular automaton models

Bridging the gap between evidence and policy for infectious diseases: How models can aid public health decision-making. International journal of infectious diseases

Forecasting of demand using ARIMA model

Application of the ARIMA model on the COVID-2019 epidemic dataset

Mortality during influenza epidemics in the United States

Forecasting Dengue Haemorrhagic Fever Cases in Southern Thailand using ARIMA Models

Forecasting incidence of hemorrhagic fever with renal syndrome in China using ARIMA model

Jianxing He, Modified SEIR and AI prediction of the epidemics trend of COVID-19 in China under public health interventions

A novel wavenets long short term memory paradigm for wind power prediction

Bio-inspired heuristics for layer thickness optimization in multilayer piezoelectric transducer for broadband structures. Soft Computing

Predicting mental health status on social media

Large-scale screening of COVID-19 from community acquired pneumonia using infection size-aware classification

Covid-19 outbreak prediction with machine learning. Available at SSRN 3580188

The Hybrid Forecasting Method

A python based support vector regression model for prediction of COVID19 cases in India

Prediction of Coronavirus Disease (covid-19) Evolution in USA with the Model Based on the Eyring Rate Process Theory and Free Volume Concept. medRxiv

Machine Learning Approach for Confirmation of COVID-19 Cases: Positive, Negative, Death and Release. medRxiv

Multiple-Input Deep Convolutional Neural Network Model for COVID-19 Forecasting in China

ARIMA models to predict next-day electricity prices

An introductory study on time series modeling and forecasting

Frausto-Solís, and I. Vázquez-Rodarte, Volatility forecasting using support vector regression and a hybrid genetic algorithm

LSTM-EFG for wind power forecasting based on sequential correlation features. Future Generation Computer Systems

Long Short-term Memory

Recurrent neural network regularization

Bidirectional recurrent neural networks

Empirical evaluation of gated recurrent neural networks on sequence modeling

Gated recurrent unit (GRU) for emotion classification from noisy speech
