key: cord-0941928-ne6ceiw2
authors: Singla, Pardeep; Duhan, Manoj; Saroha, Sumit
title: An ensemble method to forecast 24-h ahead solar irradiance using wavelet decomposition and BiLSTM deep learning network
date: 2021-11-17
journal: Earth Sci Inform
DOI: 10.1007/s12145-021-00723-1
sha: 9a0fd7f36dfb51331edcb45d04a5859aab06c7be
doc_id: 941928
cord_uid: ne6ceiw2

In recent years, the penetration of solar power at residential and utility levels has progressed exponentially. However, due to its stochastic nature, predicting solar global horizontal irradiance (GHI) with high accuracy is a challenging task, yet it is vital for grid management: planning, scheduling & balancing. Therefore, this paper proposes an ensemble model using the extended scope of wavelet transform (WT) and a bidirectional long short term memory (BiLSTM) deep learning network to forecast 24-h ahead solar GHI. The WT decomposes the input time series data into a finite number of intrinsic mode functions (IMF) to extract the statistical features of the input time series. Further, the study reduces the number of IMF series by combining the wavelet decomposed component (D1-D6) series, on the basis of comprehensive experimental analysis, with the aim of improving the forecasting accuracy. Next, a trained standalone BiLSTM network is allocated to each IMF sub-series to execute the forecasting. Finally, the forecasted values of each sub-series from the BiLSTM networks are reconstructed to deliver the final solar GHI forecast. The study performed monthly solar GHI forecasting for a one-year dataset using a one-month moving window mechanism for the location of Ahmedabad, Gujarat, India. For the performance comparison, the naïve predictor as a benchmark model, standalone long short term memory (LSTM), gated recurrent unit (GRU), BiLSTM and two other wavelet-based BiLSTM models are also simulated.
From the results, it is observed that the proposed model outperforms the other models in terms of root mean square error (RMSE), mean absolute percentage error (MAPE), coefficient of determination (R²) and forecast skill (FS). The proposed model reduces the monthly average RMSE by 26.04–58.89%, 5.17–31.35%, 23.26–56.06% & 21.08–57% in comparison with the benchmark, standalone BiLSTM, GRU & LSTM networks respectively. The monthly average MAPE is reduced by 9–51.18%, 12.59–28.14%, 30.43–59.19% & 26.54–58.92% in comparison with the benchmark, standalone BiLSTM, GRU & LSTM respectively. Further, the proposed model obtained an R² value of 0.94 and a forecast skill of 47% with reference to the benchmark model.

Solar energy is one of the leading renewable energy resources (RER) for generating electricity with zero carbon emission, and its market has been growing rapidly over the last decade due to its sustainable & supported characteristics. According to the International Energy Agency (IEA), by the end of 2030 the total capacity of PV installations will reach the 1700 GW level (Labouret and Villoz 2010). This capacity was 8 GW in 2007 and was reported as 402 GW in 2017, as per the global status report (REN21) (Hales and Renewables 2018). Moreover, as per numerous investigations, the power grid will be fully operational on RER by the end of the year 2050 (Jacobson et al. 2017). However, because solar PV output depends directly on climatic & geographical conditions, the stability of solar irradiance is always a subject of discussion with regard to grid management (Lan et al. 2019). Its inherently variable characteristics & uncertainties must be accommodated by resource planners to make a reliable grid-interconnected power system (Dong et al. 2020).
Therefore, precise and accurate forecasting of solar components is one of the prime requirements, for two basic reasons: (a) to make a reliable grid-interconnected system which accommodates the volatile, intermittent & random characteristics of solar generation, and (b) to increase the utilization ratio of solar PV output along with the returns on the capital invested in solar park development (Zang et al. 2020a). However, reliable solar irradiance forecasting is not an easy task, as it is very susceptible to seasonal and climatic effects. A minor fluctuation in the meteorological variables directly affects the stability & reliability of the dependent systems (AlSkaif et al. 2020). In such situations, an appropriate forecasting model is highly desirable.

Several examinations have already been conducted on the development of methods to forecast PV power or the solar irradiance components (Yu et al. 2019). In general, the forecasting methods can be divided into several categories: based on process type, based on prediction method, based on the spatial scale of the forecasting process, time horizon-based forecasting, prediction form-based forecasting, and method/approach-based forecasting. The different categories are shown in Fig. 1. Approach-based forecasting is commonly used over various time horizons, systems, shapes and sizes. The persistence approach takes the value of the target vector at a similar past day/hour as the value for the present day/hour. If 't' is the present time, then the value of the target vector at time 't' is taken as the value at 't + k', where 'k' is the time horizon for which forecasting is being performed (Kumler et al. 2019). On the other hand, the physical approach builds a mathematical relationship between the target GHI and the meteorological & geographical parameters. These models forecast solar GHI from meteorological and geographical variables instead of historical time series data.
However, these models are not popular due to their high computational cost and poor precision (Alonso-Montesinos and Batlles 2015). The European Centre for Medium-Range Weather Forecasts (ECMWF) & Weather Research and Forecasting (WRF) models are two popular models used for atmospheric forecasting & operational services (Richardson et al. 2020; Perez et al. 2012; Yang and Kleissl 2016). Conversely, the statistical approach improves the accuracy and minimizes the error of the model by managing the correlation mapping between input and output parameters (Ruhang 2016). Grey theory (Li and Zhang 2019), regression analysis (Doorga et al. 2019), the fuzzy method (Reza Parsaei et al. 2020), the time-series method (Bigdeli et al. 2017) and machine learning (ML) are the different categories of the statistical approach. Auto Regressive Integrated Moving Average (ARIMA) techniques (Shadab et al. 2020), support vector regression (SVR) (Mohammadi and Aghashariatmadari 2020), Gaussian Process Regression (GPR) (Sheng et al. 2018) and ML-based models have proved to be the best in recent years in terms of forecasting. ML approaches such as the artificial neural network (ANN) (Jahani and Mohammadi 2019), support vector machine (SVM) (Zeng and Qiao 2013) and Elman neural network (ENN) (Dumitru et al. 2016) use historical data to learn patterns, which improves the accuracy of the model. However, a standalone model sometimes relies on limited characteristics of the data and fails to learn its hidden characteristics. This limited learning leads to poor model performance. To overcome this problem, hybrid models have been developed, such as PLA-k-means-HGWO-RF (Liu and Sun).

In addition, the deep learning approach has emerged with great advantages and better forecasting accuracy (Zang et al. 2020a). The WT along with LSTM was utilized by Wang F. et al., and the results proved that the model's accuracy can be improved by WT.
The WT and LSTM network were also used by M. Mishra et al. The study predicted solar power for 1-h to 1-day ahead time horizons; WT was used to decompose the raw input data series into different frequency components and LSTM was used to forecast solar power (Mishra et al. 2020). Along the same line, a GRU network was proposed by B. Gao et al. to forecast day-ahead solar irradiance. The model predicted the day-ahead solar GHI with a combination of time series and meteorological data as input. The GRU network provided a minimum RMSE of 122.45 W/m² with 42.01% forecast skill (Gao et al. 2019). Several other deep learning studies are also available in the literature, and most of them used LSTM or CNN for PV power forecasting & wind forecasting (Kumar et al. 2020; Hu and Chen 2018). However, only a few studies use the BiLSTM network to forecast solar irradiance, although BiLSTM networks have already been used in electric load & price forecasting (Cheng et al. 2019), tourism demand forecasting (Kulshrestha et al. 2020), user's next location forecasting (Bao et al. 2020), stock & demand forecasting (Kim and Moon 2019), wind speed forecasting (Hu and Chen 2018) as well as covid-19 predictions (Zeroual et al. 2020).

With respect to BiLSTM, Li C. et al. predicted solar irradiance using a standalone BiLSTM network. The BiLSTM network utilized meteorological data to predict hourly solar radiation. The results of the study showed that the BiLSTM network is superior to the LSTM, SVR and LR models, with an RMSE of 98.44 W/m² and an MAE of 71.49 W/m² (Li et al. 2021). Rai A. et al. predicted solar radiation for a time horizon of two months using CNN and BiLSTM. The CNN was used to extract the features of the input time series, whereas the BiLSTM exploited the dependencies of the time series. This study showed that the CNN-BiLSTM model performed better than LSTM, GRU, CNN-LSTM and GRU-LSTM (Rai et al. 2021).
Therefore, inspired by the above work, this paper proposes an ensemble model to forecast 24-h ahead solar GHI using wavelet transform (WT) and BiLSTM with the objective of improving forecasting accuracy. To justify the superiority of the proposed model, its performance is compared with the reproduced naïve predictor and other proven deep learning networks: unidirectional LSTM, unidirectional GRU, BiLSTM, traditional wavelet-based BiLSTM (WT-BiLSTM(T)) and modified wavelet-based BiLSTM (WT-BiLSTM(Mod)). In brief, the major contributions are as follows:

• A brief literature review of solar forecasting based on deep learning networks, considering almost all important factors related to forecasting.
• Reproduction of the naïve predictor (benchmark) and three different standalone deep learning models: unidirectional LSTM, unidirectional GRU and the BiLSTM network.
• With the aim of filling the gap in the WT-BiLSTM process, various scenarios of WT-based BiLSTM models are analyzed. For this, the traditional WT is combined with the BiLSTM network (WT-BiLSTM(T)) to forecast the solar GHI. In the other scenario, the scope of WT is extended by combining the different wavelet decomposed subseries (IMF), which are fed to the BiLSTM network (WT-BiLSTM(Mod)) to forecast solar GHI.
• Implementation of the proposed model based on forecasting of the wavelet decomposed components using BiLSTM (WT-BiLSTM(CF)). The proposed model is developed to improve the forecast by making the best use of WT with BiLSTM. The model allocates an individual BiLSTM network to each subseries in the best combination of wavelet decomposed subseries. The forecasted values obtained from the BiLSTM for each subseries are reconstructed to generate the final forecast.
• Performance analysis of the proposed model in comparison with the other reproduced models and the benchmark model. The proposed model is observed to be the best performer among all models, with a lower annual RMSE of 45.61 W/m², MAPE of 6.48% and R² of 0.94.
The forecast skill of the proposed model against the naïve predictor is also observed and found to be 47%.

The paper is structured as follows: the theoretical context of the wavelet transformation, LSTM and BiLSTM deep learning networks is given in Section 2. Section 3 deals with the framework of the suggested model, while Section 4 addresses the various experimental scenarios. All experiments are analyzed and discussed in Section 5. Finally, the conclusion is drawn in Section 6.

This section provides a brief overview of the pre-processing, i.e. WT, and of the deep learning networks related to the proposed forecasting model.

The WT is a tool for processing highly random & time-varying data signals, by which a time domain signal can be directly transformed into the time-scaled frequency domain. In other words, WT decomposes nonlinear and non-stationary data into a set of different frequency profiles. These frequency profiles extract the appropriate statistical characteristics of the input data series. The high frequency components capture the short term variations in the data and can be used to improve the forecasting accuracy for short term prediction. The WT can be categorized into two types according to the input signal: continuous wavelet transform (CWT) and discrete wavelet transform (DWT) (El-Hendawi and Wang 2020). The CWT is expressed in terms of scaled and translated copies of a mother wavelet, whereas the DWT for a time series GHI(t), t = 1, 2, 3, ..., is expressed in terms of a mother wavelet 'ψ_{m,k}' & a father wavelet 'φ_{m,k}' (El-Hendawi and Wang 2020). In DWT, the convolution of the father wavelet 'φ' with the original series 'GHI(t)' gives the approximation components 'A_{m,k}', whereas the convolution of the mother wavelet 'ψ' with the original series 'GHI(t)' gives the detail components 'D_{m,k}'.
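For reference, the CWT and the DWT coefficients discussed above are commonly written in the following textbook form. The notation is an assumption made here for illustration (scale 'a', translation 'b', complex conjugate '*', dyadic scaling 2^m); it is not taken from the source and may differ from the authors' exact expressions.

```latex
% Standard textbook forms; symbols a, b and the dyadic scaling are assumed notation.
\[
CWT(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty}
           GHI(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt
\]
\[
\varphi_{m,k}(t) = 2^{-m/2}\, \varphi\!\left(2^{-m} t - k\right), \qquad
\psi_{m,k}(t)    = 2^{-m/2}\, \psi\!\left(2^{-m} t - k\right)
\]
\[
A_{m,k} = \sum_{t} GHI(t)\, \varphi_{m,k}(t), \qquad
D_{m,k} = \sum_{t} GHI(t)\, \psi_{m,k}(t)
\]
```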
Or in simple terms, for a finite length of series with finite decomposition (Saroha and Aggarwal 2018):

$$GHI(t) = A_n + \sum_{j=1}^{n} D_j$$

where D1, D2, D3, ..., Dn are the detail components of the input time series and An is the approximation component at the final level n (A1, A2, A3, ... being the approximation components at the intermediate levels).

For processing sequence data, J. J. Hopfield proposed a structure in 1982 called the Recurrent Neural Network (RNN). Unlike a conventional ANN, the output of an RNN is connected back to the input through feedback, which acts as dynamic memory (Zang et al. 2020b). This network performs well for short term forecasting but becomes unstable for long term forecasting. This instability is due to exploding gradients, i.e. sudden large variations in the training weights (Hochreiter and Schmidhuber 1997). The LSTM network addresses the exploding gradient problem by introducing memory cells in the hidden layer(s). These memory cells are used to store the relevant information and discard the irrelevant information in the data. The basic architecture of the LSTM is shown in Fig. 2. Each cell of the LSTM consists of a forget gate (f_t), an input gate (i_t) and an output gate (O_t) to accept or discard information (Kulshrestha et al. 2020). In the forward pass, the forget gate determines how much of the previous cell state 'C_{t-1}' is discarded by the network, and the gate activations at the present time 't' decide whether information about the data is discarded or retained (Fischer and Krauss 2018). The final output of the memory cell follows from the output gate and the updated cell state; the gates are parameterized by the weight vectors of the LSTM network, and σ is the sigmoid function, which ranges from '0' to '1'.

The BiLSTM network is composed of a forward and a backward LSTM, so that data can be processed in the forward as well as the backward direction. The backward direction processing captures hidden characteristics and patterns of the data which are generally ignored by LSTM (Yildirim 2018).
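For reference, the gate operations of the LSTM cell described in this section, which each direction of the BiLSTM applies to the sequence, are commonly written as below. This is the standard textbook formulation, not necessarily the authors' exact notation: the input 'x_t', hidden state 'h_t', candidate state, bias terms 'b' and the element-wise product '⊙' are assumed symbols, while f_t, i_t, O_t and C_{t-1} follow the text.

```latex
% Standard LSTM cell; x_t, h_t, \tilde{C}_t and the biases b are assumed notation.
\begin{align}
f_t         &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t         &= \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
\tilde{C}_t &= \tanh\!\left(W_C\,[h_{t-1}, x_t] + b_C\right) \\
C_t         &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
O_t         &= \sigma\!\left(W_O\,[h_{t-1}, x_t] + b_O\right) \\
h_t         &= O_t \odot \tanh\!\left(C_t\right)
\end{align}
```

A forget gate value near 0 discards the corresponding entry of C_{t-1}, while a value near 1 retains it.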
The forward hidden layer 'L_f', the backward hidden layer 'L_b' and the output sequence 'GHI_o(t)' are used to update the network. The network updates iteratively in the backward direction, i.e. from 'T' to '1', and in the forward direction, i.e. from '1' to 'T'. In the update expressions, 'L_f', 'L_b' & 'GHI_o(t)' are the forward pass, backward pass and final output layers respectively, and 'W' denotes the weight coefficients (Cheng et al. 2019).

Figure: BiLSTM architecture

The basic idea of this work is to improve the forecasting accuracy of solar GHI using a WT-based BiLSTM network in which different scenarios of WT pre-processing are applied. Fig. 4 shows a pictorial representation of the proposed model, and a description of each step is given below:

A) Quality control of data: Initially, the data collected from any site is available in raw form and has a great influence on the model's performance. Negative or incomplete data records may be present due to weak pyranometer response. These entries must be removed or corrected before the data is supplied to the forecasting model (Yousif et al. 2013). In addition, the night hours are removed from the dataset because there is no solar irradiation at night. The data just before sunset and just after sunrise also degrades the model's performance, due to the cosine error of the instruments, and must be removed. In short, this process amounts to removing the data with a solar zenith angle greater than 80° (Lauret et al. 2015).

B) Data stationarity: The majority of the solar data collected from a site is random and nonlinear in nature and contains periodicity and seasonality (Singh et al. 2018). Therefore, to enhance the model's efficiency, the data has to be rendered stationary before it is provided to the forecasting model.
This paper uses the clear-sky index (CSI) to render the data stationary and bring it into the "0-1" range. The CSI can be calculated as (Benali et al. 2019):

$$CSI(t) = \frac{GHI(t)}{CS(t)}$$

where CS(t) is the clear-sky GHI.

Fig. 4 Structure of proposed forecasting model

C) Data decomposition: Once the time series has been cleaned through quality control and converted into CSI, the WT is applied to the input data series to decompose it into approximation and detail components. This work observed that the Daubechies (db7) wavelet provides the required smoothness for the considered data. Wavelet decomposition at levels 1 to 10 was examined experimentally, and level 7 with db7 was found to be best for the dataset used in the study. The pictorial representation of the level-7 decomposition is shown in Fig. 5, whereas Fig. 6 presents the wavelet decomposition results for the original solar GHI time series. Here,

$$GHI(t) = D1 + D2 + D3 + D4 + D5 + D6 + D7 + A7$$

D) Selection of hyperparameters: One of the significant tasks for a deep learning algorithm is the tuning of hyperparameters. No rule of thumb is available in the literature for setting these hyperparameters in their optimal range. This study therefore obtained the optimal hyperparameters using the grid search algorithm, varying each parameter's value within a specific bound. The process of searching for the best hyperparameters is shown as a flow graph in Fig. 7, whereas Table 1 provides the best values along with their corresponding bounds. This study kept these fine-tuned hyperparameters constant throughout the work to keep the comparison of the different models fair. The specific rules for selecting the best hyperparameters are as follows:

1) Set the initial hyperparameters to default values.
2) Set the optimum learning rate.
3) Choose the best number of epochs together with the batch size.
4) Choose the relevant optimizer.
5) Select the appropriate activation function.
6) Choose a suitable number of hidden layers.
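The grid-search procedure outlined in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the bounds in `search_space` are hypothetical (the study's actual bounds are those of its Table 1), and `validation_rmse` is a toy stand-in for training a BiLSTM with the given settings and scoring it on held-out data.

```python
from itertools import product

# Hypothetical search bounds (assumption); the study's bounds are in its Table 1.
search_space = {
    "learning_rate": [0.1, 0.01, 0.001],
    "epochs": [50, 100, 200],
    "batch_size": [16, 32, 64],
    "hidden_layers": [1, 2, 3],
}


def validation_rmse(params):
    """Toy objective (assumption): stands in for training a network with
    `params` and returning its validation RMSE, so the sketch runs without
    any deep learning library."""
    return (
        abs(params["learning_rate"] - 0.01) * 100
        + abs(params["epochs"] - 100) / 50
        + abs(params["batch_size"] - 32) / 16
        + abs(params["hidden_layers"] - 2)
    )


def grid_search(space, objective):
    """Evaluate every combination in `space`; return the lowest-scoring one."""
    names = list(space)
    best_params, best_score = None, float("inf")
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = grid_search(search_space, validation_rmse)
print(best_params, best_score)
```

An exhaustive grid is simple and reproducible, but its cost grows multiplicatively with the number of candidate values per hyperparameter, which is consistent with the simulation-time concern a study of this kind has to manage.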
E) Forecasting process: The input data is decomposed at level 7 using the db7 wavelet, which yields eight components: A7 and D7-D1. These eight components, D1 to D7 and A7, are configured as input features with sufficient time lags. This study performed a comprehensive experiment to obtain the best solar GHI forecast for different combinations of decomposed components as input to the predictor, at different decomposition levels (refer to Table 5). For this purpose, the input dataset was divided into two phases: the training phase and the forecasting phase. With the aid of one year of training data, forecasting is performed on a monthly basis for the next 12 months with a moving window mechanism. The proposed model forecasts 24-h ahead hourly data for each month. At this stage, the output sequence is reconstructed to give the CSI. Using the following equation, this sequence is then transformed from CSI into real solar GHI:

$$GHI(t) = CSI(t) \times CS(t) \tag{16}$$

where "CS(t)" & "CSI" are the clear-sky GHI & clear-sky index respectively.

$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{GHI_i - \widetilde{GHI}_i}{GHI_i} \right| \tag{17}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( GHI_i - \widetilde{GHI}_i \right)^2} \tag{18}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( GHI_i - \widetilde{GHI}_i \right)^2}{\sum_{i=1}^{n} \left( GHI_i - \mathrm{mean}(GHI_i) \right)^2} \tag{19}$$

$$FS = 1 - \frac{\text{Error Indicator}_{\text{proposed method}}}{\text{Error Indicator}_{\text{ref. method}}} \tag{20}$$

Here GHI_i is the actual value, \widetilde{GHI}_i is the forecasted value and n is the number of samples.

Because of the large investments in the renewable sector and its exponentially increasing scope in India, this work considers an Indian location dataset for the forecasting. This work utilizes hourly data gathered from NSRDB.

This experiment covers the development of the benchmark & standalone deep learning models: naïve, standalone LSTM, GRU and BiLSTM. The experiment is conducted for month-wise forecasting of 24-h ahead solar GHI, where input initialization and fine-tuning of the hyperparameters of the deep learning networks is one of the important tasks for achieving better accuracy.
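As a minimal sketch of the error measures in Eqs. (17)-(20) and of the naïve (persistence) benchmark, the following Python functions use only the standard library. The function names, the short horizon and the toy numbers are illustrative assumptions, not the authors' code or data; MAPE is returned as a fraction, as in Eq. (17), and assumes non-zero actual values (night hours removed).

```python
import math


def mape(actual, forecast):
    """Eq. (17): mean absolute percentage error, as a fraction (x100 for %)."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Eq. (18): root mean square error, in the units of the data."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def r2(actual, forecast):
    """Eq. (19): coefficient of determination."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def forecast_skill(error_model, error_reference):
    """Eq. (20): skill of a model relative to a reference error indicator."""
    return 1 - error_model / error_reference


def naive_forecast(series, horizon):
    """Persistence: the value at time t is used as the forecast for t + horizon.
    The result aligns element-wise with series[horizon:]."""
    return series[:-horizon]


# Toy GHI values in W/m^2 (hypothetical), with a horizon of 2 steps for brevity;
# a 24-h ahead hourly forecast would use horizon=24.
series = [100.0, 200.0, 300.0, 400.0, 300.0, 200.0]
horizon = 2
actual = series[horizon:]
naive = naive_forecast(series, horizon)
model = [290.0, 410.0, 310.0, 190.0]  # hypothetical model output

skill = forecast_skill(rmse(actual, model), rmse(actual, naive))
print(rmse(actual, model), mape(actual, model), r2(actual, model), skill)
```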
Fifteen time lags of the input data series are provided to these standalone deep learning models as input features, whereas the solar GHI is used as the output feature. The MAPE obtained by the naïve, standalone LSTM, GRU and BiLSTM models ranges from 2.41-30.03%, 3.31-32.97%, 3.66-33.18% and 2.51-18.96% respectively; the RMSE ranges from 32.02-185.44 W/m², 40.33-161.84 W/m², 32.73-164.01 W/m² and 30.94-113.06 W/m² respectively. Moreover, the coefficient of determination R² ranges from 0.33-0.95, 0.40-0.95 and 0.41-0.97 for naïve, standalone LSTM, GRU and BiLSTM respectively. Figure 8 shows the annual average RMSE & MAPE comparison for this scenario. From these observations, it is clear that the BiLSTM performs better overall than the other standalone models (Li et al. 2021; Rai et al. 2021).

The traditional WT is used in this experiment to decompose the input time series. This experiment utilizes all the decomposed components of the WT, along with the fifteen time lags of input data, as input features of the BiLSTM model. The results in Tables 2, 3 and 4 show that WT-BiLSTM(T) achieved lower RMSE and MAPE and improved R² in comparison with the standalone LSTM, GRU and BiLSTM networks. This model achieved a MAPE in the range 2.39-17.77%, an RMSE in the range 25.53-107.17 W/m² and an R² in the range 0.72-0.98. Moreover, the average annual RMSE, MAPE and R² obtained are 51.77 W/m², 7.49% and 0.91 respectively.

# Scenario 3: Forecast using Modified wavelet decomposition and BiLSTM (WT-BiLSTM (Mod))

Unlike the traditional wavelet approach, where all decomposed components are used as input features to the BiLSTM network, this experiment employs different combinations of decomposed components as input features together with the time-lagged input data. A thorough analysis was carried out to observe the effect of these combinations of decomposed components, used as input features, on the performance of the model. Table 5 shows the observations recorded for the different combinations of decomposed components as input features.
Here, the observations from these combinations are shown only for the month of January. It is evident that the BiLSTM network provides a lower RMSE of 31.91 W/m² and MAPE of 3.43% with the combination A7, D7 and sum (D1-D6). In this combination, the detail components D1-D6 are added together and the resultant subseries is treated as a single input feature to the model instead of six subseries (D1 to D6). Finally, A7, D7 and sum (D1-D6), with the time-lagged input series, are used as input features to the BiLSTM network to forecast the solar GHI. As Tables 2, 3 and 4 show, this model obtained a MAPE in the range 2.39-17.46%, an RMSE in the range 23.78-106.65 W/m² and an R² in the range 0.73-0.98. Moreover, the average annual RMSE obtained is 50.36 W/m², the MAPE is 7.28% and the R² is 0.92. From the results, it can be seen that this model improves the forecasting accuracy in comparison with naïve, LSTM, GRU, BiLSTM and WT-BiLSTM (T).

# Scenario 4: Forecast using proposed WT BiLSTM Model (WT-BiLSTM (CF))

The proposed model is implemented in this scenario with the aim of achieving credible improvements in forecasting accuracy. Unlike all the experiments above, this experiment allocates a separate BiLSTM network to each of the resultant decomposed series, i.e. A7, D7 and sum (D1 to D6), to forecast the future values of that subseries. Fifteen time lags of each decomposed component series are used as the input features of its respective network. The final prediction is obtained by the wavelet reconstruction process, in which the outputs of the BiLSTM networks are added together. From Tables 2 and 3, the MAPE obtained by the proposed model ranges from 2.02-14.66%, whereas the RMSE ranges from 21.24-96.38 W/m². Table 4 shows the values obtained for R², which range from 0.78-0.98. Moreover, this model achieved the lowest annual average RMSE (45.61 W/m²) and MAPE (6.48%) and the highest R² (0.94).
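The component-forecasting idea of Scenario 4 (decompose, group the subseries, forecast each group with its own predictor, then add the forecasts) can be sketched as below. Several stand-ins are assumptions made so that the sketch runs with the standard library only: an additive Haar multiresolution replaces the db7 wavelet, three levels replace seven, a mean of the last few lags replaces each trained BiLSTM, and the CSI values are hypothetical. All function names are illustrative.

```python
def haar_mra(series, levels):
    """Additive Haar multiresolution (stand-in for db7, an assumption).
    Returns ([D1, ..., Dn], An) such that D1 + ... + Dn + An equals `series`
    element-wise. Level-j approximations are means over blocks of 2**j samples."""
    n = len(series)
    if levels < 2 or n % (2 ** levels) != 0:
        raise ValueError("need levels >= 2 and len(series) divisible by 2**levels")
    details, approx = [], list(series)
    for level in range(1, levels + 1):
        block = 2 ** level
        coarser = []
        for start in range(0, n, block):
            mean = sum(series[start:start + block]) / block
            coarser.extend([mean] * block)
        # Detail at this level = what the coarser approximation no longer holds.
        details.append([a - c for a, c in zip(approx, coarser)])
        approx = coarser
    return details, approx


def group_components(details, approx):
    """Mirror the grouping A_n, D_n and sum(D1..D_{n-1}) used in the text."""
    summed = [sum(values) for values in zip(*details[:-1])]
    return {"A": approx, "D_top": details[-1], "D_sum": summed}


def placeholder_forecaster(subseries, lags=4):
    """Stand-in for one trained BiLSTM (assumption): mean of the last `lags` values."""
    window = subseries[-lags:]
    return sum(window) / len(window)


def component_forecast(series, levels, lags=4):
    """Forecast each grouped subseries separately and add the results."""
    details, approx = haar_mra(series, levels)
    components = group_components(details, approx)
    forecasts = {name: placeholder_forecaster(sub, lags)
                 for name, sub in components.items()}
    return sum(forecasts.values()), components


csi = [0.62, 0.70, 0.75, 0.71, 0.55, 0.48, 0.66, 0.73]  # hypothetical CSI values
forecast, components = component_forecast(csi, levels=3)
print(round(forecast, 4))
```

Because the placeholder forecaster is linear, the summed forecast here equals what the same forecaster would give on the undecomposed series; any gain from per-component forecasting comes from nonlinear predictors, such as the BiLSTM networks in the text, fitting each frequency band separately.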
This research was conducted to forecast the 24-h ahead solar GHI for the location of Ahmedabad, Gujarat, India, and various experiments were carried out to obtain a precise forecast model with improved forecasts. To show the superiority of the proposed model, its forecasting performance has been compared with the naïve predictor, unidirectional LSTM, unidirectional GRU, BiLSTM and wavelet-based BiLSTM models.

For a deeper look into the results, Fig. 10 provides a graphical analysis of the real and forecasted GHI for three consecutive days (2nd to 4th day) of two months (March & August). These two months were chosen to visualize the results for a month with lower RMSE (March) and a month with larger RMSE (August). For clarity of presentation, only the real GHI and the forecasted GHI curves of the proposed model are presented for the selected months. It is clear from Fig. 10 that large variations in the real GHI curve lead to larger errors in the results. For instance, the month of March has a smooth curve due to clear sky conditions and is easily traced by the model. In contrast, the real GHI varies frequently in the month of August due to cloudy or rainy days; it is hard for the model to trace, which results in maximum errors. From these curve traces, it can be concluded that the higher the variations in the real GHI, the lower the similarity between the real and forecasted GHI; the lower the variations in the real GHI, the higher the degree of similarity between the real and forecasted GHI. However, the proposed model still follows the uncertainties associated with the real GHI within a tolerance band of error. Therefore, these observations indicate that the proposed model is a suitable forecasting model not only for stable seasons but also for unstable seasons. For further explanation of the forecasting performance, Table 6 shows the FS% results of the proposed model and WT-BiLSTM (Mod) for annual RMSE & MAPE with reference to the benchmark model.
Lastly, a comparison validating the performance of the proposed model against previously reported models is presented in Table 7. The superiority of the proposed model is shown in terms of RMSE, MAPE and FS. According to Table 7, the proposed model appears to be the best model for forecasting solar GHI among the recently developed models. The model shows its superiority in terms of lower RMSE & MAPE and higher forecast skill over LSTM, GRU, CNN-LSTM, DFT-PCA-Elman and standalone BiLSTM. Therefore, the overall results suggest that the proposed model can also be a good choice for forecasting solar GHI in practical solar power systems.

In the renewable energy markets, solar energy has its own space & significance. The stability and efficiency of solar forecasting are therefore crucial for the regular operation of interconnected grid systems. Most of the time, standalone models are unable to capture the fluctuating inherent characteristics of solar time series. So, an ensemble model to forecast the 24-h ahead hourly solar GHI is proposed using the WT and the BiLSTM network. The performance of the proposed model is rigorously evaluated by forecasting GHI on a monthly basis for the year 2014 using a moving window. The WT is used to extract the statistical features of the input time series using low and high frequency time series (IMF), while the BiLSTM network is employed for the forecast. The forecasting performance of the proposed model is compared with the naïve predictor, standalone LSTM, GRU, BiLSTM and two different WT-based BiLSTM models. The results show that the proposed model outperforms the other models with the minimum annual average RMSE (45.61 W/m²) and MAPE (6.48%). The forecast skill and the R² for the proposed model are 47% and 0.94 respectively.
Finally, the maximum percentage improvement in RMSE by the proposed model is 58.38%, 58.11%, 56.06% & 31.35% over the benchmark model, standalone LSTM, GRU and BiLSTM respectively, whereas the maximum percentage improvement in MAPE is 51.76%, 58.92%, 59.19% & 28.14% over the benchmark model, standalone LSTM, GRU and BiLSTM respectively. Therefore, the proposed model proved to be the best of the compared forecasting models for 24-h ahead solar forecasts, with consideration of all seasonal as well as monthly aspects. However, the study encountered some challenges while designing the models, such as the selection of precise hyperparameters and the simulation time. By addressing these challenges in future work, more accurate and efficient results can be obtained within a shorter simulation time.

Data availability The data that support the findings of this study are available from the corresponding author upon reasonable request.

The codes of this study are available from the corresponding author upon reasonable request.

Not Applicable.

On behalf of all authors, the corresponding author states that there is no conflict of interest.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Pardeep Singla 1 · Manoj Duhan 1 · Sumit Saroha 2
1 Deenbandhu Chhotu Ram University of Science & Technology, Sonepat, India