key: cord-0891174-m84nwqsw
authors: Mohan, Senthilkumar; A, John; Abugabah, Ahed; M, Adimoolam; Kumar Singh, Shubham; kashif Bashir, Ali; Sanzogni, Louis
title: An approach to forecast impact of Covid‐19 using supervised machine learning model
date: 2021-04-01
journal: Softw Pract Exp
DOI: 10.1002/spe.2969
sha: 9a0ef79a5f385afb6b8253de0d146499e9ce339e
doc_id: 891174
cord_uid: m84nwqsw

The Covid‐19 pandemic has emerged as one of the most disquieting worldwide public health emergencies of the 21st century and has thrown into sharp relief, among other factors, the dire need for robust forecasting techniques for disease detection, alleviation, and prevention. Forecasting is one of the most powerful statistical methods employed across disciplines for detecting and analyzing trends and predicting future outcomes, on the basis of which timely mitigating actions can be undertaken. To that end, several statistical methods and machine learning techniques have been harnessed, depending upon the analysis desired and the availability of data. Historically, most predictions thus arrived at have been short term and country‐specific in nature. In this work, a multimodel machine learning technique called EAMA is proposed for forecasting Covid‐19 related parameters in the long term, both within India and on a global scale. The proposed EAMA hybrid model is well‐suited to predictions based on past and present data. For this study, two datasets, from the Ministry of Health & Family Welfare of India and from Worldometers, respectively, have been used. Using these two datasets, long‐term predictions for both India and the world are outlined, and the predicted data are observed to be very similar to real‐time values. Experiments were also conducted for statewise predictions within India and countrywise predictions across the world, and these are included in the Appendix.

The current focus on the analysis of environmental data for the prediction of future trends has made it a prime area of research worldwide. Based on the kind of data prediction and analytical techniques employed, the present as well as the future state of the data can be forecasted or predicted. Different techniques related to modeling, statistics, data mining, artificial intelligence (AI), and machine learning are used for the analysis of data from the past or present in order to forecast future trends. The various stages involved in such analysis and predictions include defining the task, collecting related data from various sources, analyzing the data, statistical analysis, data modeling, deployment of the collected data using multiple techniques, and finally, model monitoring. This kind of predictive analysis is frequently applied to various use case scenarios such as market sales prediction, customer requirement prediction, healthcare status prediction, collection analysis, fraud detection, and so forth. Among these, the analysis and prediction of healthcare data is considered to be an important area of application, especially for predicting the future state of proliferation of highly infectious diseases. In this context, the analysis of Covid-19 related data for predicting its proliferation and containment trends is of utmost importance for arresting this ongoing pandemic across the world. Its highly infectious nature and high mortality rates make every second a valuable one, as its infection and mortality rates continue to burgeon every single day.

Countries across the world have adopted certain protocols to arrest the spread of the disease such as staying indoors, social distancing, hand washing, travel restrictions, lockdowns, and so forth. Some of these measures, such as lockdowns, are quite severe, affect normal human activities in unprecedented ways, and have severe economic ramifications. For instance, the recent spate of lockdowns across the world has severely affected global GDP, making robust forecasting of Covid-19 related parameters even more crucial. To meet this requirement, various kinds of analyses and predictions have appeared, using information gathered from different sources such as daily updated websites, Kaggle, Orange, and Weka. As a result, various techniques and methodologies introduced by different researchers for forecasting the future effects of the Covid-19 pandemic compete with each other, each having its own strengths and weaknesses. Advanced AI techniques such as machine learning and deep learning have also been used to undertake such forecasting, with each technique having its own approach. In machine learning, for instance, different approaches and techniques such as the regression model, the autoregressive model, the classification model, and so forth are used.

The novelty of the work proposed and outlined in this article is that it considers an approach that combines nonlinear transmission and social-spatial and temporal transmission along with the monthwise prediction of future data. Most of the historical data-driven approaches have been linear methods, and do not consider the temporal or time-based transmission methods.

As such, the contribution of the article can be outlined thus:

1. Building a computational hybrid model with long term predictions for India, with statewise and datewise views.
2. Proposing an ensemble learning hybrid model that integrates different machine learning techniques and improves prediction accuracy.
3. Applying an autoregressive correlation model to predict the future behavior of Covid-19 data using past and seasonal data.
4. Using the hybrid model to predict the future Covid-19 status of various countries, along with statewise and datewise data views predicted with the help of seasonal data such as heat, air quality, location, and other dynamically updated inputs.

The organization of the article is as follows: Section 2 provides a brief background on related works and existing research pertaining to the field. Section 3 provides information about the dataset as well as the proposed supervised model. Section 4 provides a gist of the working methodology and Section 5 presents the forecast and prediction results and discussion and is followed by the conclusion as well as the future directions of the research.

The authors Al-qaness et al., 4 proposed a model for forecasting that used an adaptive neuro-fuzzy inference system (ANFIS). In this work, a method based on an enhanced flower pollination algorithm and a salp swarm algorithm was proposed. This method demonstrated the best performance in terms of root mean squared relative error, mean absolute percentage error (MAPE), and coefficient of determination. The implementation of this method involved two datasets from China and the USA. The authors Ardabili et al., 5 proposed models based on machine learning techniques such as a multilayered perceptron and an adaptive network-based fuzzy inference system. The proposed model predicted behavior with nationwise and daywise views. The authors Azarafza et al., 6 produced forecasts using deep learning techniques for the Covid-19 datasets from Iran. The proposed model used the long short-term memory (LSTM) neural network for forecasting scenarios for the entire country. The authors Mouhamadou and Balde proposed comparative forecasting results using machine learning methods. The classical SIR model 7 was used to fit Covid-19 data using different techniques and tools for forecasting, including machine learning with fitting functions. The authors Zlatan et al., 8 proposed a multilayer perceptron and an artificial neural network (ANN) for predicting the spread of Covid-19. This model had 48,384 ANNs of trained data from 16,128 datasets and was also cross-validated using a k-fold algorithm, with the ReLU function used for activation. The authors Dandekar and Barbastathis, 9 proposed a model for Covid-19 prediction with data from four countries, namely, China, Italy, South Korea, and the United States of America, based on a neural network augmented model. Quarantine and isolation-related criteria were used to analyze and forecast the Covid-19 related parameters. The SIR model was used for the assumption of direct transmission.
The authors Baldé, 7 proposed a model for short-term projections and the prediction of the maximum number of active cases. The logistic and SIR growth model was used for the prediction of statewise data, with actual and future predictions for four main states shown in the implementation.

The authors Iwendi et al. 10 proposed a model for cumulative forecasting using a modified stacked autoencoder. Using this method, multiple-step forecasts were produced and the trajectory of the forecast was shown to be high. In this work, AI 11 with an encoder was applied to worldwide Covid-19 data from the WHO to produce forecasts. The modified autoencoder was used for real-time predictions for around 30 countries. The authors Huang et al., 12-15 proposed a model for short-term forecasting based on the trajectory. Combined linear and exponential predictors were used for the forecast, and from this the mortality rates, interval errors, and so forth were calculated. The authors Liu et al., 16 proposed a model using the SIR model and machine learning techniques to analyze and present forecasts based on public data. The SIR model was used for predictions of transmission across populations with the help of parametric and nonparametric methods.

The authors Srivastava and Prasanna 17 proposed a heterogeneous prediction model based on human mobility and quick adaptation trends. The training model of the proposed work was originally fit to the initial values. The results of the heterogeneous model were based on fixed and variable schemes that were set using travel data and movement data. Pandey et al., 18 proposed a model for predictions in India using the SEIR model as well as a regression model with the help of the Johns Hopkins University dataset. The proposed model had a lower prediction error rate and made predictions up to 2 weeks ahead. Papastefanopoulos et al., 19 proposed six time series forecasting methods for the prediction of active case populations. The six time series methods were ARIMA, the Holt-Winters additive model, TBAT, an automatic forecasting procedure, DeepAR, and N-Beat. With the help of these six methods, deaths as well as recovered and confirmed cases were predicted and compared using data from various countries. The authors Pinter et al., 20 proposed a hybrid machine learning approach for pandemic predictions in which ANFIS and a multilayered perceptron-imperialist competitive algorithm were used to predict the time series of infected individuals and the mortality rate. The experimental results were predicted for 1 month. Rustam et al., 21 proposed different supervised machine learning algorithms for forecasting future Covid-19 trends.

Linear regression (LR), the least absolute shrinkage and selection operator, support vector machine, and exponential smoothing were used for forecasting, but the predictions were made for only 5-10 days. The authors of Kumar et al. 13 [...] France, India, South Korea, and the UK were presented. The forecasts reflected the impact of the broad spectrum of social distancing measures implemented by the governments. Four control variables with strong associations with case and fatality rates were also identified using an optimal regression tree. The authors of References 22,23 used Covid-19 data from the USA and Canada, and forecasting was done using deep learning with long short-term memory for only two successive days. Sengupta et al., 24 proposed machine learning with k-means clustering and hierarchical clustering methods to forecast pandemic situations. In this method, Indian statewise and monthwise data were analyzed for predictions. The authors Sharma et al., 25 proposed spatial-based transmission and forecasting with the help of the SEIQRD method. Using this method of spatial heterogeneity, the entire population of India in a region was divided into small distinct geographical subregions. The authors Punn et al., 26 proposed a technique based on machine learning and deep learning algorithms for the analysis of Covid-19 data. This technique used polynomial regression and RMSE methods. Sun et al., 27 proposed a machine learning method for predicting Covid-19 infections with the help of different clinical features such as fever and cough. The authors of Shinde et al. 28 surveyed different forecasting methods using AI, machine learning, and deep learning, and reported that these methods provide the best resolutions for prediction.

Table 1. Recent methods for Covid-19 prediction and their limitations
Method (reference) | Contribution | Limitation
State transition matrix model 29 | Datewise data predicted | Dynamic parameters not included; only short-range predictions
Machine learning 30 | Predicted wide-ranging growth in various countries | Dynamic climate parameters not included; datewise predictions not made
Deep learning using LSTM, GRU, and Bi-LSTM 31 | Long-term predictions using deep learning | Dynamic parameters not included
Long short-term memory approach model 32 | Predictions using socio-economic factors | Socio-economic parameters not described
Drone-based network model for predictions 33 | Body area network and drone-based network model used for prediction | Influencing factors not described

The authors of Zheng et al. 29 presented a state transition matrix model for datewise predictions, with data predicted only for short durations. The authors of Tuli et al. 30 used machine learning to predict Covid-19 data over a wide range of countries. The authors of Shahid 31 proposed a deep learning model for long-term predictions using LSTM, GRU, and Bi-LSTM, but this model did not include any dynamic parameters. The authors of Tuli et al. 32 proposed a Weibull based long short-term memory approach model for predictions across various countries. The authors of Kumar et al. 33 described a drone and body area network model to predict Covid-19 spread in the long term. But this work 34 similarly failed to include various dynamic and influential factors such as heat, climate change, air quality, and so forth. 35 Geographically based parameters influence Covid-19; in particular, air pollution affects lung inflammation. Location-based factors are therefore considered for the prediction of Covid-19. 36 The methods reviewed above predicted only short-term future data and did not consider spatial transmission, and long-term predictions were not proposed. The authors Sharma et al., 25 proposed monthwise and statewise data but did not consider spatial transmission. Clearly, advanced predictions need to include monthwise and spatialwise predictions for the best data-driven decision-making. This article therefore incorporates monthwise predictions based on social-spatial transmissions and multiple changing parameters. 37 In addition, most of the previous works cited have failed to consider dynamically updating location-based parameters. In the proposed work, dynamic parameters such as heat, air quality, and other location-based factors have been duly considered for predictions. The recent methods for Covid-19 prediction and their limitations are presented in Table 1.

A Covid-19 forecasting model has important ramifications since different decisions will be triggered depending on the direction of its predictions. The proposed prediction model is a hybrid model called the ensemble learning, autoregressive, and moving regressive (EAMA) model. The EAMA model consists of an ensemble learning model, an autoregressive model, and a moving average model. The ensemble learning combines multiple features and inputs, and considerably improves the accuracy of the predictions. The autoregressive model makes future predictions based on previous trends and past data measurements. The moving average model similarly forecasts data by using current and past Covid-19 data. The proposed hybrid model therefore consists of several steps that are involved in the forecasting of Covid-19 related trends.

Steps involved in the proposed method: The diagram of the proposed architecture is shown in Figure 1, where the hybrid model based on the ensemble learning is demonstrated. The essence of the hybrid approach is the combination of the sequential and nonsequential data models used to predict the Covid-19 data scenarios. The EAMA model aims to incorporate features such as timing, moving patterns of patients, past data, and past behavior of the state and country data using ensemble learning. The ensemble learning examines the state and location-specific data. The location and timing features are incorporated in a weight matrix in a supervised learning model. The past inputs, past measurements, the time-series based predictions, and so forth are saved in the weight matrix based on the locations. The future inputs and a training model are then used to yield the statewise predictions for a single country or across different countries. The EAMA model continuously incorporates new location and timing-based data and finds the errors in these predictions. The errors are then rectified in subsequent predictive iterations. The machine learning method and the EAMA hybrid model yield different predictions in the datewise and locationwise views. The proposed model focuses on a hybrid statistical time series method to predict future data based on patterns, time, linear predictive models, and nonlinear input and output data. The main advantages of the proposed work are improved prediction rates over a longer time period and increased prediction accuracy. The governing Equation (1) of the proposed hybrid model is as given below:

Z_i(t) denotes the EAMA forecast model and Y_i(t) denotes the current forecast data of the ensemble learning model. X_i(t) denotes the observation of data on a time series at time t and location with various seasonal parameters. e_i(t) is the data error in the model. The main advantages of the proposed hybrid model compared with previously existing works are predictions over a longer time period and increased accuracy. The materials, the combined supervised hybrid model, and the working procedures of the proposed hybrid method are presented in the upcoming sections.

Dataset: The dataset used in this research work played a crucial role in the accurate prediction of Covid-19 cases. We have collected data mainly from two sources:

1. Ministry of Health & Family Welfare, Government of India: 38 Data pertaining to all the Covid-19 cases in India was taken from the website of the Ministry of Health and Family Welfare, which is maintained by the Indian Government.
2. Worldometer: 39 Covid-19 data for the rest of the world was taken from the "Worldometer" website, which is run by an international team of developers, researchers, and volunteers. This website is recognized by the American Library Association.

The aforementioned datasets are updated regularly. Consequently, the nature of this data is highly dynamic. We took the latest data available at the time of writing to plot all the graphs and to create the various tables. For plotting the graphs, data up to the month of July 2020 was used. Furthermore, we predicted the Covid-19 cases for the next 3 months (until October 2020) using different machine learning methods for time-series forecasting.

Duration of Data:

1. India: From 30 January 2020 to July 2020.
2. World: From 22 January 2020 to July 2020.

The data is classified into three categories: confirmed cases, recovered cases, and deaths.

We have incorporated the above three categories of data in our datasets. To calculate the "Active Cases," we added the total "Recovered Cases" and "Deaths" and then subtracted the resulting number from "Confirmed Cases." In addition, we have added two new columns to our dataset for India, namely, "Death rate per 100" and "Cure rate per 100."
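The derived columns described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the class and field names are hypothetical, the numbers are made up, and the two rates are assumed to be expressed relative to confirmed cases, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class DailyCounts:
    """Cumulative counts for one region on one date (illustrative field names)."""
    confirmed: int
    recovered: int
    deaths: int


def active_cases(counts: DailyCounts) -> int:
    # Active = Confirmed - (Recovered + Deaths), as described in the text.
    return counts.confirmed - (counts.recovered + counts.deaths)


def rate_per_100(part: int, confirmed: int) -> float:
    # Assumption: "per 100" means per 100 confirmed cases.
    if confirmed == 0:
        return 0.0
    return 100.0 * part / confirmed


row = DailyCounts(confirmed=1000, recovered=600, deaths=25)  # made-up numbers
print("active:", active_cases(row))
print("death rate per 100:", rate_per_100(row.deaths, row.confirmed))
print("cure rate per 100:", rate_per_100(row.recovered, row.confirmed))
```

Guarding against a zero denominator keeps the rate columns defined for regions that had not yet reported any confirmed cases.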

Different supervised machine learning models have been used to predict and analyze future data. The models mainly used in this study are ensemble learning, the autoregressive model, and the moving average regressive model. Ensemble learning: Ensemble learning is a combination of multiple models, such as experts or classifiers, that are generated in order to solve a particular intelligence problem. The main use of ensemble learning is to improve prediction and classification, and to construct better approximations of the functions that need to be learned. Using this method, the prediction performance is improved and unfavorable circumstances arising from the use of poor predictions are eliminated. This learning model is used for decision-making processes, incremental learning, and error correction. In this learning model, "boosting" is employed to increase the weight of training data that is misclassified so that the existing weak classifier can be strengthened. Use of this boosting concept produces better accuracy. The boosting can be represented as Equations (2) and (3).

where D_1 denotes the base learner or training model, L_1 denotes the weight assigned with the corrected classifier, and L_2 denotes the boosted classifier.

D_2 denotes the second base learner. Using the boosted D_1 and D_2, the best results are achieved with the help of voting between these two base learners or by averaging them.
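To make the re-weighting idea concrete, the following is a minimal AdaBoost-style sketch with two base learners standing in for D_1 and D_2. It is an illustration under stated assumptions, not a reconstruction of the authors' Equations (2) and (3): decision stumps are swapped in as the weak learners, the labels are binary (+1/-1), and the toy data is invented.

```python
import math


def stump_predict(x, threshold, polarity):
    """A decision stump on a 1-D input: returns +1 or -1."""
    return polarity if x >= threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def boost(xs, ys, rounds=2):
    """Train `rounds` base learners (D1, D2, ...) with AdaBoost-style re-weighting."""
    n = len(xs)
    weights = [1.0 / n] * n
    learners = []
    for _ in range(rounds):
        error, threshold, polarity = fit_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - error) / error)  # this learner's vote weight
        learners.append((alpha, threshold, polarity))
        # Misclassified samples receive larger weights for the next learner.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return learners


def ensemble_predict(x, learners):
    """Weighted vote of all base learners."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in learners)
    return 1 if score >= 0 else -1


xs = [1, 2, 3, 4, 5, 6]        # toy inputs
ys = [-1, -1, 1, -1, 1, 1]     # toy labels, not separable by a single stump
learners = boost(xs, ys, rounds=2)
print([ensemble_predict(x, learners) for x in xs])
```

The second learner is fitted on weights that emphasize the sample the first learner got wrong, which is the mechanism the paragraph above refers to as strengthening a weak classifier.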

Autoregressive model: An autoregressive model predicts the data based on time and measurements taken from previous actions. The previous actions and the statistical correlation between the observations in the past are used to predict the data (in this case Covid-19 data) at future instances. 40 The governing equation of the autoregressive model is represented as Equation (4).

P(t + 1) denotes the prediction value at time t + 1, C denotes the constant location, T denotes various logged features, W(t − i) denotes the weight, and t denotes time.
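A generic autoregressive forecast of this kind can be sketched as below. The sketch assumes the textbook form P(t + 1) = C + sum over i of W_i * x(t - i) with the weights already known; it does not fit the weights and is not claimed to match the authors' Equation (4). The function names and example values are hypothetical.

```python
def ar_predict(history, weights, constant=0.0):
    """One-step forecast: constant + sum_i weights[i] * x(t - i), for i = 0..p-1."""
    p = len(weights)
    if len(history) < p:
        raise ValueError("need at least as many observations as weights")
    recent = history[-p:][::-1]  # x(t), x(t-1), ..., x(t-p+1)
    return constant + sum(w * x for w, x in zip(weights, recent))


def ar_forecast(history, weights, constant=0.0, steps=3):
    """Roll the one-step forecast forward, feeding each prediction back in."""
    series = list(history)
    forecasts = []
    for _ in range(steps):
        nxt = ar_predict(series, weights, constant)
        series.append(nxt)
        forecasts.append(nxt)
    return forecasts


# Made-up cumulative counts; weights (2, -1) simply extrapolate the last trend.
print(ar_forecast([100, 110, 120], weights=[2.0, -1.0], steps=3))
```

Because each forecast is fed back as an input, errors compound over long horizons, which is one reason long-range autoregressive forecasts are usually re-estimated as new observations arrive.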

The rate at which Covid-19 spreads changes every day, and with seasonal changes the Covid-19 predictions fluctuate daily. Ensemble learning is used to train the model and update the effects of data automatically. Ensemble learning takes into account the seasonal changes, and the corresponding factors are used to boost the training model automatically. The parameters encapsulating seasonal changes, such as heat, humidity, and air quality, are updated in every round of boosting. Through boosting, the quality of prediction is improved and incorrectly predicted data from previous iterations is also updated. Equations (2) and (3) are used for training and updates every day. Equation (5) represents a fixed region, and the changing seasonal parameters are represented in Equation (6).

X_i(t) denotes the observation of data on a time series at time t and region. Σ(X) represents the different regions, with X treated as constant. Σ(CP) represents the different parameters that change with the seasons.

Σ(CP) represents the different parameters at the initial time, CH_1(t) denotes heat, CH_2(t) denotes humidity, CH_3(t) denotes air quality, CH_n(t) denotes upcoming parameters, and CP_n(t + 1) denotes the continuously updated values at the next time step. In Equation (5), Σ(X) represents different locations and Σ(CP) represents the changing parameters with respect to region and time. These parameters change every day. The final prediction of Covid-19 spread per day is as shown in Equation (7).

As mentioned in Equation (5), Σ(X) represents different locations with X as constants and Σ(CP) represents different dynamic parameters based on seasonal changes. The updated prediction is represented in Equation (8).

P(t + 1) denotes the prediction value at time t + 1, and w(t).f(t) denotes the predicted feature data, in which f(t) is the partial derivative with respect to the weighted data w(t) and f(t) denotes the various features of the seasonal change. Using Equation (4), the past trends in Covid-19 data can be extrapolated to predict future qualitative and quantitative behavior.

Ensemble learning is an automated learning model that supports and combines different parameters. The two main parameters used for training and testing are the location and the dynamic seasonal parameters. X is the location and is essentially constant. CP encodes the dynamic parameters, each of which is updated in every iteration of the computations. The initial data was employed as the training data, but thereafter all the parameters are automatically trained and updated.
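The split between a constant location X and changing parameters CP can be pictured with a small data structure. This is only an illustrative sketch: the class name, the dictionary keys, and the readings are hypothetical and do not come from the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class RegionFeatures:
    """Feature record for one region: a fixed location plus changing parameters."""
    location: str                                   # X: treated as constant
    seasonal: dict = field(default_factory=dict)    # CP: heat, humidity, air quality, ...

    def update(self, **new_values):
        """Overwrite seasonal parameters with the latest readings (CP(t) to CP(t + 1))."""
        self.seasonal.update(new_values)

    def as_vector(self, keys):
        """Return the seasonal parameters in a fixed order for use as model input."""
        return [self.seasonal[k] for k in keys]


# Made-up readings for illustration only.
region = RegionFeatures("Kerala", {"heat": 30.0, "humidity": 70.0, "air_quality": 55.0})
region.update(heat=32.5)  # a new daily reading replaces the old value
print(region.location, region.as_vector(["heat", "humidity", "air_quality"]))
```

Keeping the location separate from the seasonal dictionary makes it explicit which inputs stay fixed across iterations and which are refreshed before each round of training.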

Moving average model: The moving average model is a common model for predicting data based on the linear dependence existing between the current and past values. The forecasting model is represented in Equation (9).

where y_t denotes the weighted moving average, t denotes the error rate, 1 + ... + p denote the different time-series patterns, and q denotes the moving average data.
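As a simple illustration of forecasting from current and past values, the sketch below computes a weighted moving average over the most recent observations. It is a smoothing-style forecast chosen for clarity, not a fitted MA(q) error model and not a reconstruction of Equation (9); the function name, weights, and data are assumptions.

```python
def weighted_moving_average(values, weights):
    """Forecast the next value as a weighted mean of the last len(weights) values.

    weights[-1] applies to the most recent observation.
    """
    q = len(weights)
    if q == 0 or len(values) < q:
        raise ValueError("need at least len(weights) observations and one weight")
    window = values[-q:]
    return sum(w * v for w, v in zip(weights, window)) / sum(weights)


daily_cases = [10, 20, 30, 40]  # made-up series
# Heavier weight on more recent days.
print(weighted_moving_average(daily_cases, weights=[1, 2, 3]))
# Equal weights reduce to the plain moving average of the last two days.
print(weighted_moving_average(daily_cases, weights=[1, 1]))
```

Larger weights on recent days make the forecast react faster to sudden changes, at the cost of smoothing out less of the day-to-day noise.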

Algorithm 1 presents the procedure and steps for the prediction of Covid-19. Initially, the various input dataset sources are included for the training and testing of the model. In the second part of the proposed work, parameters such as location and other features are added incrementally. With the help of D1 and D2, all the features are boosted continuously and simultaneously.

Step 1: The Covid-19 sample data collected from February to July is used as training data.

Step 2: Determine the best sample using the EAMA model based on current sample and training data.

• Ensemble learning combines different inputs such as location, migration data, and past data, and is described in Section 3.2.

• Autoregressive model measures the current and past data and based on this, the future data is predicted as explained in Section 3.2.

• Moving average model forecasts the future data using past and present values, and produces average values from multiple instances of data.

Step 3: Train the model using the series of data generated and produce boosted classifiers with the help of D1 and D2 recursively.

• The future data is correlated using Equation (4), and the weight W(t − i) and time series t are updated continuously.

• Obtain the weighted and fine grained solution using Equations (5)-(8).

The sample is validated and tested, and predictions are generated.

Step 4: Final predictions are generated using the moving average model (with the help of Equation (9)). Continuous prediction and testing are performed in different iterations.
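The iterative predict, compare, and update cycle in the steps above resembles walk-forward evaluation, which can be sketched as follows. This is a generic illustration rather than Algorithm 1 itself: `naive_drift` is a hypothetical placeholder for whichever forecaster is plugged in, and the series is invented.

```python
def walk_forward(series, train_size, predict_next):
    """Predict one step ahead, record the error, then absorb the new observation."""
    history = list(series[:train_size])
    records = []
    for actual in series[train_size:]:
        predicted = predict_next(history)
        records.append((predicted, actual, actual - predicted))
        history.append(actual)  # the new observation becomes training data
    return records


def naive_drift(history):
    # Placeholder forecaster: last value plus the last observed change.
    return history[-1] + (history[-1] - history[-2])


series = [10, 12, 14, 17, 20]  # made-up counts
for predicted, actual, error in walk_forward(series, train_size=3, predict_next=naive_drift):
    print(predicted, actual, error)
```

The recorded errors are what a later iteration would use to correct the model, so any forecaster with the same one-argument signature can be substituted for the placeholder.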

This section describes the achieved results and presents them in a diagrammatic way. For the implementation of the algorithm and the discussion of the results, two datasets are used, and these are described in detail in Section 3.1. The proposed EAMA hybrid model achieved significant predictive accuracy on three parameters, namely, the number of people affected, the number of recoveries, and mortality figures. The predicted results for India and various countries for the months of July to October 2020 are shown in Figures 2 and 3 and are also fully tabulated in the appendix. The proposed hybrid model made its predictions from the January to July data. For implementation, different metrics such as the reproductive number (R0), MAPE, RMSE, and prediction error are employed. The R0 value is used to measure the closest value and to find the related values. Some of the R0 values used in the prediction are as follows: Italy (0.95), France (0.85), Germany (0.85), Spain (0.85), India (0.9-1.25), United States (0.9), and United Kingdom (0.85). Among Indian states, the maximum R value is more than 1, as seen in Kerala. These regressive base values are also used in the implementation of the model for the prediction of Covid-19 cases. Some other metrics used in the prediction are shown in Equations (10)-(13).
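For reference, RMSE and MAPE can be computed as in the sketch below, assuming their standard definitions; whether these coincide exactly with the authors' Equations (10)-(13) cannot be confirmed from the text. The function names and sample values are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; undefined if any actual value is 0."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100, 200, 400]      # made-up observed counts
predicted = [110, 190, 400]   # made-up forecasts
print(round(rmse(actual, predicted), 3), round(mape(actual, predicted), 3))
```

RMSE is expressed in the same units as the case counts and penalizes large misses heavily, whereas MAPE is scale-free, which makes it easier to compare regions with very different case numbers.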

The output of the prediction results is based mainly on the twin parameters of affected and forecast data for India and the world. For the implementation of the proposed EAMA model, we used 80% of the data for training, 10% for testing, and 10% for validation. These proportions produced the best results compared with other proportions of training, testing, and validation data.
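An 80/10/10 partition of a time-ordered dataset can be sketched as below. The text does not say how the rows were assigned to each part, so the chronological ordering used here (earliest rows for training) is an assumption, as are the function and parameter names.

```python
def chronological_split(rows, train_pct=80, test_pct=10):
    """Split time-ordered rows into train, test, and validation parts.

    Integer percentages avoid floating-point rounding surprises; whatever
    remains after the train and test parts goes to validation.
    """
    if train_pct < 0 or test_pct < 0 or train_pct + test_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    n = len(rows)
    n_train = n * train_pct // 100
    n_test = n * test_pct // 100
    train = rows[:n_train]
    test = rows[n_train:n_train + n_test]
    validation = rows[n_train + n_test:]
    return train, test, validation


days = list(range(1, 21))  # 20 made-up daily records, oldest first
train, test, validation = chronological_split(days)
print(len(train), len(test), len(validation))
```

Splitting by time rather than at random keeps future observations out of the training part, which matters when the goal is to forecast ahead.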

The prediction anticipates linear growth in the spread of Covid-19 cases. The prediction model is based on the location, past data for the number of people affected in the particular geographical area, the movement patterns of people, and so forth. The scope of the prediction is gradually expanded and the corresponding real-world data is also gradually increased. The prediction of the corresponding parameters at an international level is shown in Figure 4.

In this prediction, the positive cases decrease in some periods and increase again in others. The overall cases in the world gradually increased and then gradually decreased. In the middle of August, the positive cases spiked suddenly due to the renewed migration of people. Table A1 shows the statewise predictions for India and Table A2 shows daily forecasting data for India as well as the world. The prediction values are seen to have increased gradually (Figure 5). Figure 6 shows the predicted Covid-19 affected cases worldwide from July to October 2020. The worldwide positive cases, deaths, recoveries, and confirmed cases are shown in Figure 4. The confirmed cases and active cases are seen to be high again in the worldwide predictions. In this prediction, recoveries are seen to gradually decrease and deaths to gradually increase because the number of affected people in advanced age groups is very high worldwide. Figure 5 shows the forecast predictions of various parameters in India. For India, predictions are made for the following parameters: cure rate per 100 people, death rate per 100 people, active cases, cured cases, confirmed cases, and so forth. In Figure 7, confirmed Covid-19 cases are always very high when compared with other parameters. The exact values of the cured and affected numbers are given in Appendix A1. The number of deaths in India has been quite low, but at the end of October it is expected to rise sharply due to climatic change, geographical changes, and the number of people affected. In north India and south India, cases increased due to geographical and climatic changes. In north India in particular, a large increase in cases is seen at the end of October and the start of November due to the onset of winter. In south India, however, cases are reduced due to weather changes and reduced migration of people. Consequently, the ratio of affected people is reduced in south India. The active cases are gradually decreasing compared with the number of affected patients. For India, the real data and the forecast are very similar. Table A1 shows the statewise forecast data and, in contrast to previous methods, shows highly accurate predictions. Similarly, Figure 7 shows the worldwide predictions using four parameters, namely, confirmed cases, deaths, recoveries, and active cases. The numbers of confirmed cases, active cases, and deaths are seen to gradually increase, but the recovery rate exhibits a marked reduction. The daily international and Indian forecasting values are shown in Table A2 and the countrywise forecasting values are shown in Table A3. All the predictions are compared with the actual daily data, and the results are shown to be fairly close. Compared with the previous methods, the main advantages of this work are the high accuracy of the long-term predictions and their marked similarity with real-world data.

The proposed hybrid model makes it possible to capture novel features of the data because it predicts Covid-19 cases based on different factors such as location, past data, movement patterns of citizens, and so forth. The proposed hybrid supervised model, EAMA, combines ensemble learning techniques with autoregressive and moving regressive models. This mixture of techniques makes it straightforward to combine multiple features and inputs, predict future data from past trends, and produce averaged aggregate results. The model used 80%, 10%, and 10% of the overall data for training, testing, and validation, respectively. Both the prediction performance and the validation accuracy were seen to be high. For implementation, two datasets were used, the Ministry of Health & Family Welfare of India dataset and the Worldometers dataset, covering February to July 2020. Using the hybrid EAMA model, parameters such as the number of affected cases, confirmed cases, and deaths were predicted at the international and national levels. For India in particular, the numbers of active cases and deaths were predicted at statewise granularity. The main novelty of the proposed work lies in its long-term prediction accuracy, as opposed to other methods that work only over a short duration. These predictions can be used to estimate the future trajectory of Covid-19 cases and to inform decision and policy making. Future work should incorporate other models, such as recurrent neural networks, and use nonlinear methods for predictions tailored to specific geographical locales.
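
As a rough illustration of the ingredients named above, the following sketch splits a series chronologically into 80%, 10%, and 10% portions and averages two simple forecasters. It is not the authors' EAMA implementation: the first-order autoregressive fit, the trailing-window moving average (an assumed reading of the "moving regressive" component), the equal-weight averaging, and all function names are assumptions made for illustration.

```python
def chronological_split(series, train=0.8, test=0.1):
    """Split a time series in order: training, testing, and the rest as validation."""
    n = len(series)
    i = int(n * train)
    j = i + int(n * test)
    return series[:i], series[i:j], series[j:]

def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    x, y = series[:-1], series[1:]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        return mean_y, 0.0  # constant series: predict its level
    phi = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / var_x
    return mean_y - phi * mean_x, phi

def moving_average_forecast(history, window=3):
    """Mean of the most recent `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def ensemble_forecast(history, steps, window=3):
    """Average the AR(1) and moving-average forecasts, rolling forward step by step."""
    c, phi = fit_ar1(history)
    extended = list(history)
    forecasts = []
    for _ in range(steps):
        ar_pred = c + phi * extended[-1]
        ma_pred = moving_average_forecast(extended, window)
        pred = (ar_pred + ma_pred) / 2  # equal weights are an assumption
        forecasts.append(pred)
        extended.append(pred)  # feed the prediction back for the next step
    return forecasts

# Hypothetical cumulative counts, for illustration only.
series = [float(v) for v in range(1, 21)]
train_part, test_part, validation_part = chronological_split(series)
print(len(train_part), len(test_part), len(validation_part))
print(ensemble_forecast(train_part, steps=len(test_part)))
```

The split is chronological rather than random so that the model is never evaluated on dates earlier than those it was trained on, which matters for any forecast that is judged against later real-world data.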

Covid-19: open-data resources for monitoring, modeling and forecasting the epidemic

Artificial intelligence and machine learning to fight COVID-19

Optimization method for forecasting confirmed cases of COVID-19 in China

COVID-19 outbreak prediction with machine learning. medRxiv

COVID-19 infection forecasting based on deep learning in Iran. medRxiv

Fitting SIR model to COVID-19 pandemic data and comparative forecasting with machine learning. medRxiv

Modeling the spread of COVID-19 infection using a multilayer perceptron

Quantifying the effect of quarantine control in Covid-19 infectious spread using machine learning. medRxiv

COVID-19 patient health prediction using boosted random forest algorithm. Front Public Health

A feature extraction based approach to detect Covid-19 related fake news

Multiple-input deep convolutional neural network model for COVID-19 forecasting in China. medRxiv

Forecasting the dynamics of COVID-19 pandemic in top 15 countries in April 2020: ARIMA model with machine learning approach. medRxiv

A machine learning methodology for real-time forecasting of the 2019-2020 COVID-19 outbreak using Internet searches, news alerts, and estimates from mechanistic models

COVID-19 data repository and forecasting county-level death counts in the United States

A machine learning methodology for real-time forecasting of the 2019-2020 COVID-19 outbreak using Internet searches, news alerts, and estimates from mechanistic models

Data-driven identification of number of unreported cases for COVID-19: bounds and limitations

SEIR and regression model based COVID-19 outbreak predictions in India

COVID-19: a comparison of time series methods to forecast percentage of active cases per population

COVID-19 pandemic prediction for Hungary; a hybrid machine learning approach

COVID-19 future forecasting using supervised machine learning models

Time series forecasting of COVID-19 transmission in Canada using LSTM networks

Real-time forecasts and risk assessment of novel coronavirus (COVID-19) cases: a data-driven analysis

Covid-19 pandemic data analysis and forecasting using machine learning algorithms. medRxiv

Spatial network based model forecasting transmission and control of COVID-19. medRxiv

COVID-19 epidemic analysis using machine learning and deep learning algorithms. medRxiv

An interpretable mortality prediction model for COVID-19 patients

Forecasting models for coronavirus disease (COVID 19): a survey of the state of the art

The prediction for development of COVID-19 in global major epidemic areas through empirical trends in China by utilizing state transition matrix model. medRxiv

Predicting the growth and trend of COVID-19 pandemic using machine learning and cloud computing. medRxiv

Predictions for COVID-19 with deep learning models of LSTM, GRU and Bi-LSTM

Modelling for prediction of the spread and severity of COVID-19 and its association with socioeconomic factors and virus types. medRxiv

A drone-based networked system and methods for combating coronavirus disease (COVID-19) pandemic. Futur Gener Comput Syst

Forecasting COVID 19 growth in India using susceptible-infected-recovered (SIR) model

Prediction of criticality in patients with severe Covid-19 infection using three clinical features: a machine learning-based prognostic model with clinical data in Wuhan. medRxiv

Prediction models for diagnosis and prognosis of COVID-19: systematic review and critical appraisal

Critical care utilization for the COVID-19 outbreak in Lombardy, Italy: early experience and forecast during an emergency response

Ministry of Health & Family Welfare Government of India

Artificial intelligence forecasting of COVID-19 in China

Masum and Hossain 1 presented forecasts and predictions for Covid-19 in Bangladesh, using a linear regression (LR) model trained on 25 days of data. The model was validated using root mean square error (RMSE) and produced forecasts for one month. Alamo et al. 2 carried out data forecasting based on mobility, demographic variables, government measures, and weather conditions, and described various datasets linked to different countries. Alimadadi et al. 3 developed a text and data mining technique to predict Covid-19 parameters; this work was analyzed with the help of a machine learning method for predicting the spread, accuracy, and speed of diagnosis.

This work was supported in part by Zayed University, office of research under Grant No. R18088.

The authors declare no potential conflict of interests.