Abstract
Nowadays, most of the energy produced globally comes from fossil fuels. However, this type of energy generation harms the environment by releasing toxic residues into the air, soil, and water. In recent years, the production and consumption of clean energy has gained momentum. Solar energy stands out because it is easy to implement and relatively cheap compared to other clean energy sources. Consequently, several projects have been created to generate and use energy from sunlight, and with this growing number of enterprises, the number of applications that help manage energy production and use has also increased. The use of Deep Learning techniques in this industry has likewise gained strength; predictive models for energy consumption have been widely studied to help enterprises with future planning. In this work, we developed a deep learning model using Long Short-Term Memory (LSTM) capable of predicting energy consumption from solar power plant data, training the model on historical data to make inferences about the future. We explored configuration combinations such as data filtering with smoothing techniques, model hyperparameters, number of layers, number of neurons, and the optimal prediction horizon. The achieved results demonstrate the validity and effectiveness of the implemented methodology.
1 Introduction
The growth of the global population has generated an increasing demand for energy [24]. However, most of the energy currently generated comes from fossil fuels [14], which are known for their high emissions of toxic elements into the environment. This results in several problems, such as global warming, pollution of water bodies and the atmosphere, and the decrease of green areas [7, 21]. In response to this situation, there has been a significant increase in the search for clean energy sources in recent years [24].
Data from a 2020 study by the International Renewable Energy Agency point to a 7% growth in the use of clean energy sources worldwide. Of this increase, 90% is related to the increased use of solar energy [13]. Therefore, solar power plays a crucial role in the search for energy sources that do not harm the environment. Another point that draws attention to this form of energy generation is that it is relatively cheap when compared to other sources [31], which is why industry has turned its attention to investments in this area.
Another prominent area in industry is Machine Learning [3]. Machine learning techniques have been used to solve various industrial problems, such as fault diagnosis [12], quality detection [15, 22], performance optimization [5, 18], process control [20], signal processing [25], and outcome prediction [16].
Among these applications, one that stands out in solar energy is the forecast of the amount of energy generated in a given period [17, 33]. Predictions can be used for financial planning, low-generation detection, electrical crisis management, and energy distribution planning. This information can support the general planning of power generation plants, providing a more stable, higher-quality service to customers [17].
Furthermore, another point investigated in the field of solar energy is the prediction of energy consumption [1]. Like generation forecasting, this type of application brings insights and supports better planning in the management of a power plant, mainly in managing and avoiding energy crises [26]. Therefore, consumption prediction applications are of paramount importance for solar energy producers.
Based on this, we propose a methodology using Deep Learning techniques to predict the amount of energy consumed by a solar plant over time. We use data transformation techniques, a search for optimal hyperparameters, and prediction horizons to create a robust tool with a clear purpose. With this tool, we aim to provide an artifact that brings valuable information to industries related to solar energy consumption. Based on the data obtained with such a tool, enterprises can gain insights and manage energy crises, in addition to other business planning.
2 Related Works
This section will examine previous studies and works related to this research topic, providing a broader and deeper view of the subject addressed.
Predicting future energy consumption is an issue frequently explored in the scientific community. This is evident in the study by Amasyali et al. [2], which provides a comprehensive analysis of the methods most commonly employed in temporal forecasting of energy consumption. Their conclusions highlight the strong results obtained by various forecasting approaches.
In the literature on the subject, Ekonomou [9] stands out for using a Multi-Layer Perceptron (MLP) Artificial Neural Network (ANN) to predict energy consumption, incorporating factors such as temperature and Gross Domestic Product (GDP). In contrast, Hu et al. [11] presented the Adaboost-ESN model, developed to predict monthly industrial electricity consumption. In addition to previous consumption, this model included factors such as sales volume and import value, achieving promising results.
A notable approach was proposed by Du et al. [8], who introduced a deep learning model based on an attention-based encoder-decoder architecture with Bi-LSTM to predict multivariate energy consumption time series. The experiments, conducted on five different datasets, revealed that the model achieved robust performance in short- and long-term predictions.
Furthermore, it is essential to highlight studies such as Mahjoub et al. [19] and Olu-Ajayi et al. [23], which go beyond conventional approaches by exploring Deep Learning for time series forecasting. These studies are notable for the results achieved, showing the substantial potential of these techniques to address this problem.
Within this scenario, the long short-term memory (LSTM) architecture has gained prominence, as evidenced by the study by Wang et al. [29]. This work concludes that the LSTM model has a significant potential to predict future energy consumption. Such a model stands out for its ability to accurately capture the data’s periodicity and exhibits a predictive performance superior to traditional models.
In the literature, smoothing techniques have been explored to improve the accuracy of time series predictions. A notable example is the study by Wibawa et al. [30], in which applying smoothing to the data significantly enhanced the performance of their Convolutional Neural Network (CNN), surpassing several state-of-the-art models.
In a study conducted by Xiao et al. [32], the impact of data pre-processing and smoothing techniques on predicting time series of building energy consumption was investigated. The results showed that the smoothing method yielded a 0.91% reduction in the evaluated metric. Furthermore, a gradual decline in model performance over time was observed, suggesting the need to update the model every seven weeks.
3 Methodology
In this section, we detail the methodology that supports our study. Our proposed approach is visually captured in Fig. 1, which encompasses three main blocks: input data, pipeline, and the model evaluation. The following subsections will describe each of the steps in this process.
3.1 Dataset
The data used in this study were collected from a power plant, which we refer to by the pseudonym SRI. The information was collected over approximately 390 days, registered daily by specialized equipment and reflecting the plant's energy consumption during this period. The resulting series is shown in Fig. 2.
Figure 3 illustrates the decomposition of consumption data into its seasonality, trend, and residual components. We observed that the data trend presents variations, not following a continuous linear trajectory. These oscillations suggest the existence of patterns of growth followed by periods of decline in the data, possibly influenced by external factors such as climate change or economic events.
Furthermore, in Fig. 3, it is possible to notice a marked seasonality repeated approximately every 30 days, indicating a monthly pattern. Large residuals are also visible, indicating that the problem could not be solved with linear models.
3.2 Data Preparation
We split the dataset into two subsets, one for training and the other for testing. We performed this split temporally; we separated 80% of the dataset for training, equivalent to approximately 330 days, and we separated the remaining 20% for model validation, about 60 days.
For prediction purposes, we divided the test set into four groups, which covered the horizons of 14, 30, 45 and 60 days. We performed this division of the test set to evaluate the model’s efficiency in different prediction horizons. The first 14-day horizon is chosen because it encompasses two complete weeks, which helps the model capture the typical patterns of weekdays and weekends. The second horizon, representing a whole month, allows the model to account for monthly variation, such as regular events that occur once per month. The third horizon represents a month and a half, making it possible to observe trends that may not be visible at shorter intervals. Finally, the fourth horizon represents two whole months, providing an even broader view and enabling the capture of more seasonal effects or periodic changes.
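The temporal split and horizon subsets described above can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the series is synthetic and the variable names are placeholders.

```python
import numpy as np

# Daily consumption series (~390 days); a synthetic stand-in for the SRI data.
consumption = np.random.default_rng(0).normal(500.0, 50.0, size=390)

# Temporal split: first 80% for training, final 20% for testing.
split = int(len(consumption) * 0.8)
train, test = consumption[:split], consumption[split:]

# Test subsets covering each evaluated prediction horizon (in days).
horizons = {h: test[:h] for h in (14, 30, 45, 60)}
for h, subset in horizons.items():
    print(h, len(subset))
```

Because the split is temporal rather than random, the test subsets always lie strictly after the training period, mimicking a real deployment.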
We used the Z-score method to standardize the data. This procedure centers an attribute by subtracting its mean and then rescales it by dividing by its standard deviation. As a result, the attribute's mean becomes 0 and its standard deviation 1. We apply Eq. 1 to perform this transformation on the data.
In Eq. 1, x denotes a value of a specific attribute taken from the training set. Here, mean(x) represents the mean of these attribute values, and std(x) indicates their standard deviation. Before the preprocessing step, this attribute’s value is expressed as \(x_i\). After undergoing standardization, it transitions to its standardized form, denoted by \(x_{i_{std}}\).
It is important to emphasize that, while both training and test data undergo standardization, the parameters—specifically mean(x) and std(x)—used in Eq. 1 are derived solely from the training set.
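Written out with the symbols defined above, the Z-score transformation of Eq. 1 is:

```latex
x_{i_{std}} = \frac{x_i - \mathrm{mean}(x)}{\mathrm{std}(x)}
```

so every standardized attribute has mean 0 and standard deviation 1 with respect to the training set.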
3.3 Regression
Within the scope of this study, we evaluated different state-of-the-art models to find the architecture that best fits the problem. The models were AutoRegressive Integrated Moving Average (ARIMA) [4], Prophet [27], NeuralProphet [28], eXtreme Gradient Boosting (Xgboost) [6], and LSTM [10].
This model selection represents a comprehensive approach encompassing traditional methods and recent innovations. It aims to achieve a holistic understanding of these models’ performance regarding the complexities of the analyzed time series.
We trained all models initially using default hyperparameters and tested their performance over a thirty-day prediction horizon. Then, we submitted the model that obtained the best results to hyperparameter optimization via grid search.
3.4 Metrics
This subsection will present the metrics used to evaluate the trained models.
Mean Absolute Error (MAE). The MAE quantifies the average magnitude of the prediction errors. The corresponding mathematical formulation is given by Eq. 2.
In this context, n denotes the total number of observations, \(y_i\) refers to the observed value, and \(\hat{y}_i\) refers to the value predicted by the model. The MAE, often used in regression problems, stands out for its direct interpretation: higher values indicate larger average absolute errors and thus lower model performance. Furthermore, the MAE uses the same unit as the analyzed variable. In this study, for example, an MAE of 0.1 implies that, on average, the model estimates differ by 0.1 kWh from the actual values.
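The formulation referenced as Eq. 2 is the standard mean absolute error:

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
```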
Root Mean Square Error (RMSE). The RMSE is calculated from the square root of the mean squared error, as defined by the Eq. 3.
Unlike the MAE, the RMSE incorporates the squared error in its formula, amplifying the influence of larger errors. This characteristic makes the RMSE especially valuable in situations where it is crucial to penalize larger deviations. Thanks to the square root, the RMSE maintains the same unit as the original data. As with the MAE, an RMSE of 0 represents a perfect fit in which all model predictions are exact; higher RMSE values indicate declining model performance.
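The formulation referenced as Eq. 3 is the standard root mean square error:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
```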
3.5 System Configuration
The device used for the tests has the following hardware configurations: Ryzen 5 5600x processor, 12 GB RTX 3060 graphics card, and 16 GB of RAM. Table 1 details the versions of libraries and software adopted in this study.
4 Results and Discussion
In this section, we address the results achieved by the proposed methodology for forecasting energy consumption. Our proposal seeks to combine the best method, the appropriate pre-processing technique for the data, and tuned hyperparameters.
4.1 Modeling of the Regression Method
In Table 2, we present the results of energy consumption prediction using different models, all with default settings. The model that demonstrated the best performance was the LSTM, with the lowest MAE and RMSE values, indicating its effectiveness in the prediction task. The ARIMA model showed the worst results, with an MAE of 84.40 kWh and an RMSE of 96.67 kWh. The other models showed metrics similar to one another but significantly worse than those of the LSTM.
4.2 Smoothing and Horizon
We then performed specific tests and adjustments for the LSTM model. We analyzed two hypotheses: applying a smoothing technique to remove noise, and searching for an optimal prediction horizon threshold. For the smoothing step, we used convolution-based smoothing with a window of 30 days. The forecast horizon is analyzed to observe how many steps (days) the model can predict without seeing new data, thus simulating a real environment. The results of applying the method in combination with different prediction horizons are shown in Table 3.
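A convolution-based smoothing of this kind can be sketched as follows. The uniform (moving-average) kernel is an assumption for illustration, since only the 30-day window is specified above; the input series here is synthetic.

```python
import numpy as np

def smooth(series: np.ndarray, window: int = 30) -> np.ndarray:
    """Convolution-based smoothing with a uniform 30-day kernel (assumed shape).

    mode='same' keeps the output aligned with the input length; edges are
    implicitly zero-padded by np.convolve, so boundary values are attenuated.
    """
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="same")

# Synthetic noisy daily series standing in for the consumption data.
noisy = np.sin(np.linspace(0, 12, 390)) + np.random.default_rng(1).normal(0, 0.3, 390)
smoothed = smooth(noisy)
print(smoothed.shape)  # (390,)
```

Averaging over a 30-day window suppresses high-frequency noise while preserving the monthly seasonality observed in the decomposition.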
Table 3 shows that, without smoothing, the MAE increased significantly at the 30-day and 45-day horizons, then remained relatively stable between the 45-day and 60-day horizons. Based on these results, we adopted 30 days as the optimal prediction horizon for the non-smoothed data; beyond this point, accumulated noise and errors have a more significant impact.
Furthermore, in Table 3, it is noted that data with smoothing were less affected by the increase in the prediction horizon, allowing the choice of longer prediction horizons. We, therefore, chose to use a 45-day forecast horizon with smoothed data, as they demonstrate the lowest MAE among all configurations.
4.3 Hyperparameters Selection
As the last step, we searched for the best hyperparameters of the network using grid search, computing the model metrics for the 45-day horizon with smoothed data. The hyperparameters explored were the number of hidden layers and the number of neurons per hidden layer. The results are displayed in Table 4. The configuration that obtained the best result used five hidden layers with 100 neurons each, achieving an MAE of 45.52 kWh and an RMSE of 54.57 kWh and demonstrating greater accuracy than the other tested configurations.
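The grid search over depth and width can be sketched as follows. Here `train_and_score` is a hypothetical stand-in for training an LSTM with the given configuration and returning its validation MAE; it is replaced by a dummy scorer so the sketch runs without a deep learning framework, and the candidate values are illustrative.

```python
from itertools import product

# Candidate hyperparameter values (illustrative; the paper's grid includes,
# e.g., 5- and 20-layer networks and per-layer widths such as 25 and 100).
layer_options = [5, 10, 20]
neuron_options = [25, 50, 100]

def train_and_score(n_layers: int, n_neurons: int) -> float:
    """Hypothetical stand-in: train an LSTM with this depth/width and return
    its MAE on the 45-day validation horizon. Dummy rule for illustration,
    constructed so that (5 layers, 100 neurons) scores best."""
    return abs(n_layers - 5) * 2.0 + abs(n_neurons - 100) * 0.1

# Exhaustively evaluate every (layers, neurons) pair and keep the lowest MAE.
best = min(product(layer_options, neuron_options),
           key=lambda cfg: train_and_score(*cfg))
print(best)  # (5, 100) under the dummy scorer
```

In a real run, `train_and_score` would fit and evaluate one model per grid cell, so the cost grows with the product of the option counts.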
Analyzing the other results, we found that varying the number of hidden layers and neurons per layer had different impacts on the model's performance. Configurations with five hidden layers and 25 neurons obtained results close to those of the best configuration. Meanwhile, increasing to 20 hidden layers resulted in lower performance, likely due to overfitting during training. Adding more layers increases the model's learning capacity, which can lead to better adaptation to the training data. However, in some cases, this greater flexibility can cause the model to memorize the training data instead of learning more general patterns, impairing generalization to new data.
Based on these results, the configuration with five hidden layers and 100 neurons per layer stood out as the most suitable for the 45-day horizon with smoothed data. This configuration strikes a balance between the network's learning capacity and its ability to generalize to new data, resulting in good forecasting performance for the specific scenario. Figure 4 shows the results obtained for the plant.
5 Conclusion
During the modeling development, we explored different combinations of configurations, such as data transformation techniques, hyperparameter search, and the search for an optimal prediction horizon. From this exploration, we built a model using Deep Learning techniques that can predict solar plants' energy consumption. The results obtained after training on the database were satisfactory, with the model able to generalize and make predictions with good metrics.
The best model found achieved an MAE of 45.52 kWh and an RMSE of 54.57 kWh, using five hidden layers with 100 neurons each. The optimal prediction horizon found was 45 days, which provides good generalization over a satisfactory time window. With it, a user can obtain reliable forecasts of energy consumption up to 45 days ahead of the current day, a window that supports insights and planning decisions for companies and industries.
Therefore, industries in the field of solar energy can take advantage of this artifact for future planning based on the forecasts created by the model. We hope that the modeling developed in this study can provide the solar energy field with helpful information.
In addition, studies with different databases and the exploration of other solutions to the problem of predicting solar energy consumption are needed. For future work, we plan to test the model on other databases, explore other modeling techniques, such as classical regression, and apply other Deep Learning models in search of better results.
References
Abd El-Aziz, R.M.: Renewable power source energy consumption by hybrid machine learning model. Alex. Eng. J. 61(12), 9447–9455 (2022)
Amasyali, K., El-Gohary, N.M.: A review of data-driven building energy consumption prediction studies. Renew. Sustain. Energy Rev. 81, 1192–1205 (2018)
Bertolini, M., Mezzogori, D., Neroni, M., Zammori, F.: Machine learning for industrial applications: a comprehensive literature review. Expert Syst. Appl. 175, 114820 (2021)
Box, G.E., Pierce, D.A.: Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. J. Am. Stat. Assoc. 65(332), 1509–1526 (1970)
Chan, S.L., Lu, Y., Wang, Y.: Data-driven cost estimation for additive manufacturing in cybermanufacturing. J. Manuf. Syst. 46, 115–126 (2018). https://doi.org/10.1016/j.jmsy.2017.12.001, https://www.sciencedirect.com/science/article/pii/S0278612517301577
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016)
Dincer, I.: Environmental impacts of energy. Energy Policy 27(14), 845–854 (1999)
Du, S., Li, T., Yang, Y., Horng, S.J.: Multivariate time series forecasting via attention-based encoder-decoder framework. Neurocomputing 388, 269–279 (2020)
Ekonomou, L.: Greek long-term energy consumption prediction using artificial neural networks. Energy 35(2), 512–517 (2010)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Hu, H., Wang, L., Peng, L., Zeng, Y.R.: Effective energy consumption forecasting using enhanced bagged echo state network. Energy 193, 116778 (2020)
Iqbal, R., Maniak, T., Doctor, F., Karyotis, C.: Fault detection and isolation in industrial processes using deep learning approaches. IEEE Trans. Industr. Inf. 15(5), 3077–3084 (2019)
IRENA: Renewable capacity highlights. International Renewable Energy Agency (IRENA), pp. 1–8 (2020)
Jefferson, M.: World energy outlook to 2100. In: World Petroleum Congress, pp. WPC–26015. WPC (1994)
Jennings, C., Wu, D., Terpenny, J.: Forecasting obsolescence risk and product life cycle with machine learning. IEEE Trans. Compon. Packag. Manuf. Technol. 6(9), 1428–1439 (2016). https://doi.org/10.1109/TCPMT.2016.2589206
Ji, S., Wang, X., Zhao, W., Guo, D.: An application of a three-stage XGBoost-based model to sales forecasting of a cross-border e-commerce enterprise. Math. Probl. Eng. 2019 (2019)
Lima, M.A.F., Carvalho, P.C., Fernández-Ramírez, L.M., Braga, A.P.: Improving solar forecasting using deep learning and portfolio theory integration. Energy 195, 117016 (2020)
Maggipinto, M., Terzi, M., Masiero, C., Beghi, A., Susto, G.A.: A computer vision-inspired deep learning architecture for virtual metrology modeling with 2-dimensional data. IEEE Trans. Semicond. Manuf. 31(3), 376–384 (2018). https://doi.org/10.1109/TSM.2018.2849206
Mahjoub, S., Chrifi-Alaoui, L., Marhic, B., Delahoche, L.: Predicting energy consumption using LSTM, multi-layer GRU and drop-GRU neural networks. Sensors 22(11), 4062 (2022)
Mezzogori, D., Zammori, F.: An entity embeddings deep learning approach for demand forecast of highly differentiated products. Procedia Manuf. 39, 1793–1800 (2019). https://doi.org/10.1016/j.promfg.2020.01.260, https://www.sciencedirect.com/science/article/pii/S2351978920303243, 25th International Conference on Production Research Manufacturing Innovation: Cyber Physical Manufacturing, 9–14 August 2019|Chicago, Illinois (USA)
Mirzaliev, S., Sharipov, K.: A review of energy efficient fluid power systems: fluid power impact on energy, emissions and economics, vol. 30 (2020)
Oh, Y., Ransikarbum, K., Busogi, M., Kwon, D., Kim, N.: Adaptive SVM-based real-time quality assessment for primer-sealer dispensing process of sunroof assembly line. Reliab. Eng. Syst. Saf. 184, 202–212 (2019). https://doi.org/10.1016/j.ress.2018.03.020, https://www.sciencedirect.com/science/article/pii/S0951832017303861, Impact of Prognostics and Health Management in Systems Reliability and Maintenance Planning
Olu-Ajayi, R., Alaka, H., Sulaimon, I., Sunmola, F., Ajayi, S.: Building energy consumption prediction for residential buildings using deep learning and other machine learning techniques. J. Build. Eng. 45, 103406 (2022)
Rabaia, M.K.H., et al.: Environmental impacts of solar energy systems: a review. Sci. Total Environ. 754, 141989 (2021)
Scime, L., Beuth, J.: Using machine learning to identify in-situ melt pool signatures indicative of flaw formation in a laser powder bed fusion additive manufacturing process. Addit. Manuf. 25, 151–165 (2019). https://doi.org/10.1016/j.addma.2018.11.010, https://www.sciencedirect.com/science/article/pii/S2214860418306869
Shamshirband, S., Rabczuk, T., Chau, K.W.: A survey of deep learning techniques: application in wind and solar energy resources. IEEE Access 7, 164650–164666 (2019)
Taylor, S.J., Letham, B.: Forecasting at scale. Am. Stat. 72(1), 37–45 (2018)
Triebe, O., Hewamalage, H., Pilyugina, P., Laptev, N., Bergmeir, C., Rajagopal, R.: NeuralProphet: explainable forecasting at scale. arXiv preprint arXiv:2111.15397 (2021)
Wang, J.Q., Du, Y., Wang, J.: LSTM based long-term energy consumption prediction with periodicity. Energy 197, 117197 (2020)
Wibawa, A.P., Utama, A.B.P., Elmunsyah, H., Pujianto, U., Dwiyanto, F.A., Hernandez, L.: Time-series analysis with smoothed convolutional neural network. J. Big Data 9(1), 44 (2022)
Wilberforce, T., Baroutaji, A., El Hassan, Z., Thompson, J., Soudan, B., Olabi, A.G.: Prospects and challenges of concentrated solar photovoltaics and enhanced geothermal energy technologies. Sci. Total Environ. 659, 851–861 (2019)
Xiao, Z.: Impacts of data preprocessing and selection on energy consumption prediction model of HVAC systems based on deep learning. Energy Build. 258, 111832 (2022)
Yagli, G.M., Yang, D., Srinivasan, D.: Automatic hourly solar forecasting using machine learning models. Renew. Sustain. Energy Rev. 105, 487–498 (2019)
Acknowledgments
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. Also Pedro Pedrosa Rebouças Filho acknowledges the sponsorship from the Brazilian National Council for Research and Development (CNPq) via Grant 301455/2022-8.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Chaves, J.M. et al. (2025). Predicting Energy Consumption Data Using Deep Learning: An LSTM Approach. In: Paes, A., Verri, F.A.N. (eds) Intelligent Systems. BRACIS 2024. Lecture Notes in Computer Science, vol 15413. Springer, Cham. https://doi.org/10.1007/978-3-031-79032-4_21
Print ISBN: 978-3-031-79031-7
Online ISBN: 978-3-031-79032-4



