key: cord-0717432-qtew0woe authors: Lu, Hongfang; Ma, Xin; Ma, Minda title: A hybrid multi-objective optimizer-based model for daily electricity demand prediction considering COVID-19 date: 2020-12-11 journal: Energy (Oxf) DOI: 10.1016/j.energy.2020.119568 sha: 6f099398566c173a24aea51a3f086d3577f5dc69 doc_id: 717432 cord_uid: qtew0woe
Electricity consumption has been affected by worldwide lockdown policies against COVID-19. Many countries have pointed out that electricity supply security during the epidemic is critical to ensuring people's livelihood. Accurate prediction of electricity demand therefore plays an even more important role in ensuring energy security for all countries. Although there have been many studies on electricity forecasting, they did not consider the pandemic, and many works considered only prediction accuracy and ignored stability. For these reasons, it is necessary to develop an electricity consumption prediction model that can be applied well during the pandemic. In this work, a hybrid prediction system is proposed that combines data processing, modelling, and optimization. An improved complete ensemble empirical mode decomposition with adaptive noise is used for data preprocessing, which overcomes the shortcomings of the original method; a multi-objective optimizer is adopted to ensure both accuracy and stability; and a support vector machine is used as the prediction model. Taking the daily electricity demand of the United States as an example, the results show that the proposed hybrid models are superior to the benchmark models in both prediction accuracy and stability. Moreover, the selection of input parameters is discussed. The results indicate that the model considering daily infections has the highest prediction accuracy and stability, and that the proposed model has great potential in real-world applications.
In this context, many countries have introduced lockdown policies to limit contact between people and thus control the spread of the epidemic [2]. At the same time, the global energy sector has also been profoundly affected by the epidemic. According to IEA statistics, apart from a slight increase in the demand for renewable energy in the first quarter, demand for all other energy sources declined to varying degrees; oil fell the most, by 9%, and electricity demand fell by 2.5% [3]. Some scholars have suggested that energy and medical care are equally important during the epidemic. Among these energy sources, electricity may be the most relevant to people's lives. Many countries have issued policies requiring power supply departments to provide uninterrupted power and to allow users to delay payment [4]. In this case, the requirements for power supply departments to allocate power resources accurately are higher than usual. Thus, in electricity management, the importance of accurate prediction of electricity demand is self-evident.
In recent years, numerous studies on energy demand forecasting have emerged. Table 1 lists the relevant works from the last two years and gives information on the models used. It reveals that energy demand prediction methods can be roughly divided into data-driven models and physical simulation models. Data-driven models are used more often because physical simulation models need to consider too many external factors, for which relevant data are difficult to collect. Moreover, some data-driven regression models still require data on a large number of related factors. On the other hand, most data-driven models use machine learning or deep learning algorithms, and in recent years many scholars have adopted hybrid models to make predictions because single models have some drawbacks.
Although these models have obtained more accurate prediction results in some cases, electricity demand prediction still faces the following problems: 1) according to the literature review, most scholars considered only the accuracy of the prediction model, not its stability; 2) electricity demand prediction studies did not consider major global events such as COVID-19, and the weather and other factors they considered may not be the essential factors in this particular period. In other words, these models may lack applicability in the context of major global events.
Driven by the problems described in Section 1.2, the purpose of this paper is to develop a model that can be better applied to the prediction of electricity demand during COVID-19. In this work, COVID-19-related factors are considered in the model design, and the applicability of these factors as model inputs is discussed. In the model design, ICEEMDAN is used as a data preprocessing tool, MOGWO is used to optimize the SVM, and both accuracy and stability are considered. Thus, the work is innovative in that it discusses the adaptability of various COVID-19-related factors in the model application, and in that the proposed prediction model takes into account both accuracy and stability. The main contributions of this paper are as follows:
(1) A hybrid model is proposed to predict the daily electricity demand during the COVID-19 pandemic.
(2) The proposed model is compared with benchmark models regarding prediction accuracy and stability.
(3) The influences of the denoising method and the optimizer on prediction are discussed.
(4) The applicability of factors related to COVID-19 as inputs to the prediction model is discussed.
(5) The results of one-step ahead, two-step ahead, and three-step ahead predictions are compared.
The rest of this paper is organized as follows. Section 2 introduces the relevant theories and implementation of the proposed model.
Section 3 describes the collected data and prediction steps. Section 4 gives the prediction results. Section 5 discusses four critical issues related to this work. Finally, the primary conclusions and future works are summarized in Section 6.
The model proposed in this paper, ICEEMDAN-MOGWO-SVM, is a hybrid model with the structure "data cleaning method-optimizer-basic prediction model". The relevant theories associated with the different methods are introduced in this section.
Data decomposition breaks down the raw data into multiple datasets without distorting the original data. The decomposed data are usually smoother, which helps prediction. ICEEMDAN appeared in 2014 [24], and its predecessors include EMD, EEMD, and CEEMDAN [25]. EMD is an adaptive time-frequency signal processing method suitable for nonlinear signals. It decomposes a complex signal into a finite number of IMFs, each of which contains local characteristics of the original signal at a different time scale. EEMD was developed from EMD to overcome the mode mixing problem. CEEMDAN eliminates the noise involved in the reconstructed signal by adding white noise, and improves the efficiency of EEMD. ICEEMDAN is a further improvement on CEEMDAN: it is efficient and avoids generating spurious modes. Its implementation process is as follows:
(1) Perform EMD decomposition I times on the noise-added original signal:
$$x^{(i)} = x + \beta_0 E_1\left(w^{(i)}\right), \quad i = 1, 2, \ldots, I$$
where $x$ is the original signal; $E_k(\cdot)$ is the k-th mode component generated by EMD; $w^{(i)}$ is Gaussian noise; $x^{(i)}$ is the noise-added signal; $\beta_0$ is the noise amplitude.
(2) Calculate the first residue and the first mode:
$$r_1 = \left\langle M\left(x^{(i)}\right) \right\rangle$$
$$d_1 = x - r_1$$
where $r_k$ is the k-th residue; $d_k$ is the k-th mode; $M(\cdot)$ is the local average of the signal; $\langle \cdot \rangle$ denotes the average over the I realizations.
(3) Calculate the second residue and the second mode:
$$r_2 = \left\langle M\left(r_1 + \beta_1 E_2\left(w^{(i)}\right)\right) \right\rangle$$
$$d_2 = r_1 - r_2$$
(4) Calculate the k-th residue and the k-th mode:
$$r_k = \left\langle M\left(r_{k-1} + \beta_{k-1} E_k\left(w^{(i)}\right)\right) \right\rangle$$
$$d_k = r_{k-1} - r_k$$
(5) Repeat step (4) until the termination condition of the decomposition is satisfied.
MOGWO is developed based on the grey wolf optimizer (GWO) [26].
GWO is a meta-heuristic algorithm inspired by the hunting behavior of wolves [27]. Each wolf in the population can be regarded as a solution to the problem. The best, second-best, and third-best solutions and the remaining solutions correspond to the four levels of the wolf hierarchy. When wolves find their prey, they approach it. The position equations are:
$$\vec{D} = \left| \vec{C} \cdot \vec{X}_p(t) - \vec{X}(t) \right|$$
$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D}$$
where $\vec{D}$ is the distance between the wolf and the prey; $\vec{A}$ and $\vec{C}$ are coefficient vectors; $\vec{X}_p$ and $\vec{X}$ are the position vectors of the prey and the grey wolf, respectively; $t$ is the current iteration. GWO keeps the best three solutions and continuously updates the position of the grey wolf by the following formula to find the best solution:
$$\vec{X}_p(t+1) = \frac{\vec{X}_\alpha + \vec{X}_\beta + \vec{X}_\delta}{3}$$
where $\alpha$, $\beta$ and $\delta$ are grey wolves of different levels.
MOGWO differs from GWO in two ways [26]. First, the update method has changed: an archive is introduced to store the current best individuals. After each iteration, each newly generated individual is compared with the individuals in the archive. In addition, to avoid too many similar individuals, all individuals are grouped into hypercubes according to the distance between their objective function values. Second, the selection mechanism of the leader wolf has changed: the leader wolf is selected directly from the archive by roulette-wheel selection, which solves the problem that it is difficult to determine three non-dominated solutions directly through the Pareto method. The probability of each hypercube can be calculated by Eq. (11). More information can be found in the literature [26].
$$P_i = \frac{c}{N_i} \tag{11}$$
where $c$ is a constant; $N_i$ is the number of Pareto optimal solutions in the i-th hypercube; $P_i$ is the probability of the hypercube.
SVM is one of the most popular machine learning models. It has a strong statistical foundation and is well suited to small samples. For the related theory, refer to the literature [28]. SVM has a wide range of applications in energy [29], environment [30], hydrology [31], and economy [32].
SVM is used not only as a target model for research, but also as a benchmark model. In regression problems, the training set can be defined as [33]:
$$\left\{ (x_i, y_i) \mid x_i, y_i \in R^n,\ i = 1, 2, \cdots, N \right\}$$
where $x_i$ and $y_i$ are the input and output, respectively. The specific form of the SVM model is:
$$f(x) = w \cdot \varphi(x) + b$$
where $w$ is the weight vector; $\varphi(x)$ is the nonlinear mapping function; $b$ is the bias. In the SVM model, the penalty factor and the kernel width are two hyperparameters that affect the prediction performance. Many scholars use optimizers to tune the original SVM model. For example, Fan et al. [34] utilized WOA, BA, and PSO to optimize SVM to predict solar radiation; Zhang et al. [35] employed CS to optimize SVM to predict short-term electricity load; Li et al. [36] used MOMVO to optimize LSSVM to predict an air quality indicator.
In this section, the validity of the proposed model is verified through a case study. Considering that the United States is the second-largest energy-consuming country in the world and the most affected in this pandemic (as of May 29, 2020, its number of infected people accounts for about 30% of the world total), the case study is set up for the daily electricity demand of the United States.
In this work, the daily electricity demand data of the United States come from EIA (https://www.eia.gov/). Since the proposed model considers the impact of COVID-19, data on the number of daily infections, the number of daily deaths, and GRSI are collected. The data for these three factors are derived from Our World In Data (https://ourworldindata.org/). It is worth noting that GRSI is an indicator of the degree of lockdown proposed by Oxford University after the outbreak [37]. It is a composite indicator of nine factors, as shown in Fig.2. Its maximum score is 100, and the higher the score, the stricter the lockdown. All four types of data are daily and cover January 19 to May 15, 2020 (see Fig.3).
Their statistical description is shown in Table 2.
(1) Data decomposition
ICEEMDAN is used to decompose the raw data into multiple IMFs, so that the decomposed data fall in smaller fluctuation ranges, as shown in Fig.4 and Table 3. According to the theory of ICEEMDAN, the decomposition terminates when the last IMF has fewer than three local extrema. However, for some data this termination condition may never be met, so the maximum number of decompositions is set to 5000. If the condition is still not met after 5000 decompositions, the decomposition is terminated. In the end, the raw dataset is broken down into six IMFs.
(2) Data normalization
Because the dimensions of the datasets may differ, normalization is applied to eliminate the influence of dimensions and to improve the accuracy and speed of prediction, using the following equation:
$$x_n = \frac{x_r - x_{min}}{x_{max} - x_{min}}$$
where $x_n$ is the normalized data in the interval [0,1]; $x_r$ is the raw data; $x_{min}$ and $x_{max}$ are the minimum and maximum of the raw data, respectively.
The SVM optimized by MOGWO is obtained on the training set and then applied to the test set for prediction. Therefore, before optimization and prediction, the dataset needs to be split. In this work, the ratio of the training set to the test set is 7:3. In addition, one-day ahead prediction is performed in this case study (see Fig.6); the relevant theory is given in the literature [38].
Since the prediction is performed on the normalized datasets, denormalization is required to convert the predictions into real values after the prediction is completed. The equation for denormalization is [39]:
$$p = p_n \left( x_{max} - x_{min} \right) + x_{min}$$
where $p$ is the real prediction value; $p_n$ is the normalized prediction value.
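The normalization, 7:3 split, and denormalization steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation; the function names and the toy demand values are assumptions made for the example, and the split is assumed to be chronological (earliest observations used for training).

```python
def min_max_normalize(series):
    """Scale a series to [0, 1] using its own minimum and maximum."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in series], lo, hi


def denormalize(normalized, lo, hi):
    """Map normalized values back to the original scale."""
    return [v * (hi - lo) + lo for v in normalized]


def chronological_split(series, train_parts=7, test_parts=3):
    """Split a time series in order; integer arithmetic avoids float rounding."""
    cut = len(series) * train_parts // (train_parts + test_parts)
    return series[:cut], series[cut:]


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the case study.
    demand = [410.0, 395.0, 402.0, 388.0, 376.0, 381.0, 369.0, 372.0, 365.0, 358.0]
    scaled, lo, hi = min_max_normalize(demand)
    train, test = chronological_split(scaled)
    print(len(train), len(test))
    print(denormalize(test, lo, hi))
```

In this sketch the minimum and maximum are taken over whatever series is passed in, mirroring the description in the text; the same pair of values must be reused when converting predictions back to real units.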
Because the prediction is made in each IMF after decomposition, according to the principle of ICEEMDAN, the final prediction result is the sum of the prediction results of all IMFs:
$$P_f = \sum_{i=1}^{n} p_{IMF_i}$$
where $P_f$ is the final prediction result; $p_{IMF_i}$ is the prediction result in each IMF; $n$ is the number of IMFs decomposed from the raw data.
In the error indicators, $R_t$ and $P_t$ are the real and prediction values at time t, respectively, and $N$ is the sample size. The results are given in Table 4, where bold denotes the best performance in the current dataset.
Fig.8 shows the relative errors of every point for the six models in the test set. It indicates that the relative errors of the proposed model all lie around y = 0, with a maximum of 1.06% and a minimum of −0.21%. Compared with the proposed model, the relative error points of the benchmark models are more scattered, and their relative error ranges are larger. STDRE is employed to evaluate the stability of the prediction comprehensively. Fig.9 shows that the STDRE of the proposed model is 0.389%, which is much lower than that of the other models. This indicates that the prediction stability of the proposed model is the best among the evaluated models.
Although error indicators can reflect the difference in prediction accuracy between models, the results obtained may be misleading because some of the difference in accuracy is caused by the features of the data. Therefore, a DM test can be used to further measure the difference in accuracy between the models [48]. Suppose the two competing models are model 1 and model 2, and the true series is $y_t$. The prediction result of the first model is $\hat{y}_t^{(1)}$ and the prediction result of the second model is $\hat{y}_t^{(2)}$; then their prediction errors $e_t^{(1)}$ and $e_t^{(2)}$ are:
$$e_t^{(1)} = y_t - \hat{y}_t^{(1)}, \quad e_t^{(2)} = y_t - \hat{y}_t^{(2)}$$
The null hypothesis $H_0$ and the alternative hypothesis $H_1$ are:
$$H_0: E\left[ L\left(e_t^{(1)}\right) \right] = E\left[ L\left(e_t^{(2)}\right) \right]$$
$$H_1: E\left[ L\left(e_t^{(1)}\right) \right] \neq E\left[ L\left(e_t^{(2)}\right) \right]$$
where $L$ is the loss function of the square error. The DM test statistic is calculated according to Eq. (24):
$$DM = \frac{\frac{1}{N} \sum_{t=1}^{N} \left[ L\left(e_t^{(1)}\right) - L\left(e_t^{(2)}\right) \right]}{\sqrt{S^2 / N}} \tag{24}$$
where $S^2$ is an estimation of the variance of $\left[ L\left(e_t^{(1)}\right) - L\left(e_t^{(2)}\right) \right]$.
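As an illustration of the DM test described above, the following sketch computes the statistic for two sets of forecasts with a squared-error loss. It is a simplified version under stated assumptions, not the authors' code: the forecasts are assumed to be one-step ahead, so the variance of the loss differential is estimated without autocovariance terms, and the p-value relies on the asymptotic standard normal distribution. The function names and the toy numbers are hypothetical.

```python
import math


def dm_statistic(actual, pred1, pred2):
    """Diebold-Mariano statistic with squared-error loss (one-step-ahead case)."""
    n = len(actual)
    # Loss differential: squared error of model 1 minus squared error of model 2.
    d = [(a - p1) ** 2 - (a - p2) ** 2 for a, p1, p2 in zip(actual, pred1, pred2)]
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n  # variance estimate of d
    if var_d == 0:
        raise ValueError("loss differential has zero variance")
    return mean_d / math.sqrt(var_d / n)


def two_sided_p_value(dm):
    """Two-sided p-value under the asymptotic N(0, 1) distribution."""
    return math.erfc(abs(dm) / math.sqrt(2))


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the paper.
    actual = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
    model_1 = [9.0, 13.0, 10.0, 14.0, 11.0, 15.0]
    model_2 = [8.0, 14.0, 8.0, 16.0, 10.0, 16.0]
    dm = dm_statistic(actual, model_1, model_2)
    print(dm, two_sided_p_value(dm))
```

With this sign convention, a negative statistic means the first model has the smaller average squared error; the null hypothesis of equal accuracy is rejected when the absolute value exceeds the critical value of the chosen significance level.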
Table 6 shows that the accuracy level of the proposed model differs significantly from that of the benchmark models, which further supports the conclusion that the proposed model is far superior to the benchmark models in prediction accuracy.
The model proposed in this paper is developed from SVM by introducing a denoising method and an optimizer. In this section, the influences of the denoising method and the optimizer on the original model are further discussed. Thus, two other models are considered: MOGWO-SVM and ICEEMDAN-SVM. Table 7 implies that the MAPE of SVM can be reduced by about 9.4% when SVM is combined with ICEEMDAN, and by 72.2% when MOGWO is combined with SVM. A similar pattern holds for STDRE. These results indicate that the multi-objective optimizer improves the prediction performance of the original SVM more than the noise reduction method does. The same conclusion can be obtained by comparing ICEEMDAN-SVM, MOGWO-SVM, and ICEEMDAN-MOGWO-SVM. Nevertheless, the denoising method is still vital in problems that require high accuracy and stability. As shown in Fig.10, after ICEEMDAN is introduced into MOGWO-SVM, the prediction accuracy and stability are greatly improved.
In the case study, ED is the predicted target, and the three factors DI, DD, and GRSI are considered. Correlation analysis proves that these three factors are indeed closely related to ED, as shown in Table 8. To improve the reliability of the results, three correlation coefficients are used, as shown in Eqs. (25)-(27):
$$r = \frac{cov(X, Y)}{\sigma_X \sigma_Y} \tag{25}$$
$$\rho = 1 - \frac{6 \sum_{i=1}^{n} \left( R_{X_i} - R_{Y_i} \right)^2}{n \left( n^2 - 1 \right)} \tag{26}$$
$$\tau = \frac{N_c - N_d}{n(n-1)/2} \tag{27}$$
where $cov(X, Y)$ is the covariance between X and Y; $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y, respectively; $n$ is the number of samples; $R_X$ and $R_Y$ are the rankings of X and Y in their respective column vectors; $N_c$ and $N_d$ are the numbers of concordant pairs and discordant pairs, respectively.
Filtering input variables is critical in prediction. Excess or missing factors may make the prediction model perform poorly.
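For readers who want to reproduce this kind of correlation screening, the three coefficients used in the correlation analysis (Pearson, Spearman, and Kendall) can be computed as in the sketch below. It uses only the Python standard library; the function names and the toy series are assumptions for illustration, Spearman is computed as the Pearson coefficient of the ranks (equivalent to the usual rank-difference formula when there are no ties), and Kendall is computed in its simplest form without a tie correction.

```python
import math


def pearson(x, y):
    """Pearson coefficient: covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient as the Pearson coefficient of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall coefficient from concordant and discordant pair counts."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    # Toy series for illustration only; they are not data from the case study.
    factor = [1.0, 2.0, 3.0, 4.0, 5.0]
    target = [1.0, 3.0, 2.0, 5.0, 4.0]
    print(pearson(factor, target), spearman(factor, target), kendall(factor, target))
```

Using several coefficients side by side, as the text does, guards against a single measure being dominated by outliers or by a nonlinear but monotonic relationship.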
In this section, six more cases are set up to explore the influence of the COVID-related factors on the prediction results, as shown in Table 9. Fig.11 and Table 10 imply that when ED and DI are used as model inputs, the prediction accuracy and stability are the highest. However, according to the correlation analysis, the correlation between GRSI and ED is the strongest, which indicates that using the most strongly correlated factor as model input does not guarantee the best prediction results. For the prediction of electricity demand in the United States during the COVID-19 pandemic, the prediction accuracy ranking of the models considering different factors is shown in Fig.12.
In the practical application of daily electricity demand prediction, the further ahead managers can predict, the greater the benefits for management [49]. Thus, this paper additionally examines the performance of the proposed model in two-day ahead and three-day ahead predictions. As shown in Fig.13, the MAPE for the two-day ahead prediction is 2.06%, and the MAPE for the three-day ahead prediction is 1.86%. Although these results are not as good as the one-day ahead prediction, the MAPE is still about 2%, indicating that the proposed model not only has high accuracy in one-step prediction but also has great application potential in multi-step prediction. In practical applications, single-step and multi-step prediction results can be combined to estimate future short-term electricity consumption. If the forecast result is higher than the planned consumption, a policy to restrict electricity use can be introduced.
Fig.13. One-day ahead, two-day ahead, and three-day ahead prediction results.
The test on real-world data indicates that the model proposed in this work can be used to predict daily electricity consumption in a pandemic.
In real-world applications, real-time prediction can be carried out by establishing a prediction system. The system includes three modules: an input module, a model training module, and a prediction module. Note that the prediction assumes that there is no significant change in energy policy. The model proposed in this paper can be used as a power system management tool, which has the following practical or energy policy-oriented functions:
(1) It can predict the electricity demand during the pandemic, and the power sector can reasonably allocate power resources according to the prediction results (such as one-step ahead, two-step ahead, and three-step ahead), so as to ensure the security of power supply during the pandemic;
(2) The supply and demand of electricity determine the price, and accurate forecasting of electricity consumption can make prices more reasonable. In turn, price setting can balance the relationship between supply and demand, and can also help the government to better formulate policies;
(3) During the pandemic, renewable energy power generation has performed better and proved more flexible, and accurate electricity consumption forecasts are conducive to the integration of renewable energy into the power system.
In this work, a hybrid model that combines ICEEMDAN, MOGWO, and SVM is presented to predict daily electricity demand during the epidemic. Taking the daily electricity demand in the United States as a case study, the analysis results indicate that the proposed model has higher prediction accuracy and stability than the five benchmark models. The DM test further confirms the superiority of the proposed model. In addition, this paper discusses several key issues and draws some valuable conclusions:
(1) In the prediction scenario of electricity demand in the United States, the multi-objective optimizer improves the prediction performance of SVM most significantly.
(2) When the external factor considered by the model is DI, the accuracy and stability of the model are the highest; however, DI is not the factor most correlated with ED, indicating that using the most correlated factor as model input does not guarantee the best prediction accuracy.
(3) The proposed model not only performs well in one-day ahead prediction, but also has high accuracy in two-day ahead and three-day ahead predictions. Therefore, the proposed model has good potential in multi-step prediction, although its performance there is not as good as in one-step prediction.
The model proposed in this paper aims to predict electricity demand accurately during the COVID-19 pandemic or other major global events. Although the case study shows that the proposed model can already obtain high prediction accuracy and stability, some aspects are still worthy of further study. Thus, future works are summarized as follows:
(a) The prediction is made in each IMF, and the final result is obtained by summing up all the results. However, direct addition may not be the best way. Therefore, follow-up research may consider using other result processing methods or developing new denoising methods.
(b) In future work, more external factors can be considered as inputs to the model, and their rationality can be tested.

[1] Aerosol and surface stability of SARS-CoV-2 as compared with SARS-CoV-1
[2] Impact of Covid-19 lockdown on PM10, SO2 and NO2 concentrations in Salé City (Morocco). Science of The Total Environment
[3] Global Energy Review
[4] Energy access is needed to maintain health during pandemics
[5] A hybrid stochastic model based Bayesian approach for long term energy demand managements
[6] Forecasting energy demand, wind generation and carbon dioxide emissions in Ireland using evolutionary neural networks
[7] Deep learning framework to forecast electricity demand
[8] A novel composite electricity demand forecasting framework by data processing and optimized support vector machine
[9] Day-ahead high-resolution forecasting of natural gas demand and supply in Germany with a hybrid model
[10] Forecasting China's total energy demand and its structure using ADL-MIDAS model
[11] A comparison of models for forecasting the residential natural gas demand of an urban area
[12] Regression analysis for energy demand projection: An application to TIMES-Basilicata and TIMES-Italy energy models
[13] Short-term load forecasting for microgrid energy management system using hybrid HHO-FNN model with best-basis stationary wavelet packet transform
[14] A hybrid data mining driven algorithm for long term electric peak load and energy demand forecasting. Energy
[15] A two-step approach to forecasting city-wide building energy demand
[16] Improved day ahead heating demand forecasting by online correction methods
[17] Short-term electricity demand forecasting using machine learning methods enriched with ground-based climate and ECMWF Reanalysis atmospheric predictors in southeast Queensland
[18] Energy load time-series forecast using decomposition and autoencoder integrated memory network
[19] Forecasting day-ahead natural gas demand in Denmark
[20] A novel hybrid forecasting scheme for electricity demand time series
[21] A novel hybrid modelling structure fabricated by using Takagi-Sugeno fuzzy to forecast HVAC systems energy demand in real-time for Basra city
[22] Hybrid short-term forecasting of the electric demand of supply fans using machine learning
[23] Electricity demand forecasting for decentralised energy management
[24] Improved complete ensemble EMD: A suitable tool for biomedical signal processing
[25] A complete ensemble empirical mode decomposition with adaptive noise
[26] Multi-objective grey wolf optimizer: a novel algorithm for multi-criterion optimization
[27] Grey wolf optimizer
[28] Least squares support vector machine classifiers. Neural processing letters
[29] Short-term wind speed forecasting based on the Jaya-SVM model
[30] Predicting permeability changes with injecting CO2 in coal seams during CO2 geological sequestration: A comparative study among six SVM-based hybrid models
[31] Use of support vector machines (SVMs) to predict distribution of an invasive water fern Azolla filiculoides (Lam.) in Anzali wetland, southern Caspian Sea
[32] Analysis of Timeliness of Oil Price News Information Based on SVM
[33] Short-term load forecasting of urban gas using a hybrid model based on improved fruit fly optimization algorithm and support vector machine
[34] Hybrid support vector machines with heuristic algorithms for prediction of daily diffuse solar radiation in air-polluted regions
[35] Short-term electric load forecasting based on singular spectrum analysis and support vector machine optimized by Cuckoo search algorithm
[36] Novel analysis-forecast system based on multi-objective optimization for air quality index
[37] Infected Markets: Novel Coronavirus, Government Interventions, and Stock Return Volatility around the Globe
[38] Prediction of offshore wind farm power using a novel two-stage model combining kernel-based nonlinear extension of the Arps decline model with a multi-objective grey wolf optimizer
[39] Carbon trading volume and price forecasting in China using multiple machine learning models
[40] A fast and elitist multiobjective genetic algorithm: NSGA-II
[41] Optimal scheduling ratio of recycling waste paper with NSGAII based on deinked-pulp properties prediction
[42] The whale optimization algorithm
[43] Hybrid whale optimization algorithm with simulated annealing for feature selection
[44] Particle swarm optimization
[45] Estimation of cetane numbers of biodiesel and diesel oils using regression and PSO-ANFIS models
[46] Orthogonal least squares learning algorithm for radial basis function networks
[47] Short-term prediction of building energy consumption employing an improved extreme gradient boosting model: A case study of an intake tower
[48] Testing the equality of prediction mean squared errors
[49] A novel system for multi-step electricity price forecasting for electricity market management

This article is funded by the National Natural Science Foundation of China (71901184).