Machine Learning and Time Series Analysis to Forecast Hotel Room Prices

Oliveira, Francisco B.; Silva-Filho, Moesio W.; Barbosa, Gabriel A.; Freitas, João Paulo; Penna, Chris; Miranda, Péricles B. C.

doi:10.1007/978-3-031-79035-5_25

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 15414)

Included in the following conference series:

Brazilian Conference on Intelligent Systems

Abstract

The hospitality industry’s dynamic nature demands accurate forecasting of hotel room prices to optimize revenue management strategies. This paper presents an experimental study assessing machine learning techniques and time series analysis for forecasting hotel room prices. We enhance prediction accuracy by leveraging historical booking data, seasonal patterns, and hotel characteristics. We employ time series models, including AutoRegressors and Prophet, to capture underlying trends and seasonal variations. We also evaluate machine learning models such as random forest, gradient boosting machine, extra trees regressor, and neural networks. These models are trained on features like booking lead time, historical hotel occupancy, room nights, and number of adults to capture complex relationships influencing prices. Our methodology is demonstrated through a case study using over 40,000 reservations made at a Brazilian hotel over a decade. Experimental results show that tree-based models performed best, with Gradient Boosting achieving an NRMSE of 6.94% and an MAE of 31.26. Our findings contribute valuable insights for price estimation in the hospitality industry, offering robust methods to enhance revenue management strategies.

1 Introduction

The dynamic and competitive nature of the hospitality industry necessitates effective strategies for forecasting hotel room prices to optimize revenue management. Accurate predictions are crucial for hotel operators to make informed decisions, respond to market trends, and maximize profitability. In this context, machine learning and time series analysis offer promising approaches, capturing complex patterns in hotel pricing data.

Time series analysis has long been employed to model and understand temporal patterns in various fields, including hospitality [2, 5, 16]. Techniques such as the Autoregressive Integrated Moving Average (ARIMA) model and Seasonal-Trend decomposition using Loess (STL) have been instrumental in capturing inherent trends and seasonal variations in hotel room prices [7, 10].

In parallel, the advancement of machine learning techniques has provided opportunities to enhance forecasting accuracy by leveraging the power of algorithms that can capture intricate relationships in large and diverse datasets [4, 12]. Models such as Random Forests, Gradient Boosting, and Neural Networks have shown promise in capturing non-linear patterns and incorporating a wide range of features, including booking lead time, historical demand, and external factors like local events and economic indicators [2, 4, 7, 20].

This work presents an empirical investigation that examines the application of machine learning techniques and time series analysis to forecast hotel room prices. The proposed methodology exploits historical booking data, seasonal patterns, and external factors to augment the accuracy of price predictions. The investigation commences by scrutinizing the utilization of time series analysis to model and comprehend the temporal patterns intrinsic to hotel room pricing. Different time series models are deployed to capture underlying trends and seasonal fluctuations in pricing data. Additionally, machine learning models, namely random forests, gradient boosting, and neural networks, are subjected to evaluation. These models are trained on a comprehensive set of features, encompassing booking lead time, historical demand, local events, and economic indicators, encapsulating the intricate relationships influencing hotel room prices. The efficacy of the proposed methodology is exemplified through a case study employing authentic hotel pricing data. Experimental results show that tree-based models, like Gradient Boosting Machine, demonstrate the best results.

Besides, we present a discussion of the practical implications of the achieved results for hotel revenue managers, emphasizing their potential to optimize pricing strategies, enhance revenue streams, and improve overall profitability. Insights derived from the study can inform decision-making processes and help hotel operators stay competitive in a rapidly evolving market.

The paper is structured as follows. Section 2 introduces the problem of forecasting room prices in the hospitality domain and the main research related to the theme. Section 3 details the proposed methodology to assess forecasting and ML algorithms for the problem at hand. Section 4 discusses the achieved results and their implications for hotel revenue managers. Section 5 presents the conclusion and future work.

2 Background

Forecasting poses a significant challenge for hotels, particularly those with limited resources that prevent them from investing in advanced revenue management software [18]. According to [18], less than ten percent of hotels currently utilize a revenue management system, and even when implemented, these systems often lack the sophistication observed in other industries. Complicating matters further, the data about a guest’s spa bookings is typically stored in a separate database from information related to room bookings and/or email marketing efforts. This fragmentation hampers hotels’ capacity to assess customer lifetime value, a capability readily available in the airline industry [9].

The pivotal role of revenue managers in shaping hotel pricing is undeniably significant; however, it introduces the potential for customers to perceive the pricing strategies as inequitable [14]. When hotels establish prices perceived as unrealistic or unfair by consumers, there is a consequential shift in demand towards alternative accommodation options such as peer-to-peer arrangements [14]. Consumers may gravitate towards peer-to-peer options, perceiving them as more cost-effective alternatives [6]. Despite these challenges, many hotels employ conventional price forecasting models, including the naive (same-time/last year) method [15]. Furthermore, it is noteworthy that forecasted prices often undergo adjustments, commonly overridden through qualitative methods rather than quantitative pricing models [13].

In forecasting, several studies have employed moving average and exponential smoothing methods [11]. However, a significant drawback associated with moving average models is their sluggish responsiveness to rapid fluctuations in the data [11]. These models exhibit limited adaptability to dynamic changes in pricing, often neglecting the intricate relationships inherent in hotel data [2]. Moreover, the moving average method demonstrates sub-optimal performance in addressing price fluctuations and fails to adequately account for the seasonal nature of hotel operations, which inherently exhibit such patterns. In response to these limitations, recent hotel revenue management studies have increasingly turned to ARIMA-based models [21].

ARIMA models, while proficient in modeling trends, exhibit limitations in capturing the complex nonlinear nature of data, as highlighted by [22]. Their effectiveness diminishes when confronted with sudden changes in data values, resulting in notable increases in forecast error [22]. Additionally, ARIMA models are sensitive to outliers and lack robustness in predicting extreme values, as observed in [2]. Identifying an appropriate ARIMA model can be challenging and often computationally demanding. Furthermore, ARIMA-based models overlook the intricate nonlinear aspects of data, assuming data stability and disregarding external factors [22]. Given the vulnerability of the hotel industry to external influences such as economic changes and pandemics, this oversight is particularly relevant [22]. Traditional models like ARIMA prove ineffective in capturing turning points, leading to significant errors in forecasting [22].

Given the dynamic nature of the hospitality industry, pricing assumes a pivotal role in the operations and survival of hotels [4, 19]. Consequently, there is an imperative need to develop new forecasting methods (e.g., machine learning algorithms) to effectively navigate the continual changes and challenges inherent in the industry.

Machine learning algorithms are good at detecting the complex nature of data and any sudden changes [1, 4]. They can also be trained on rich features, including booking lead time, historical demand, local events, and economic indicators, to capture the complex relationships influencing hotel room prices [1, 4]. In addition, machine learning models are designed to operate efficiently with extensive datasets, although they can also handle smaller ones. Their capacity to manage data noise is also noteworthy, as noise can be mitigated through optimization strategies [21]. Researchers have demonstrated substantial improvements in the accuracy of hospitality accommodation price prediction over statistical models by employing machine learning models, ranging from traditional ones (e.g., support vector machines, random forest, artificial neural networks) to more sophisticated ones (e.g., deep learning, Long Short-Term Memory (LSTM) networks) [1, 4, 5, 22].

The contribution of this work is an experimental investigation of traditional forecasting methods (e.g., ARIMA, SARIMA) and machine learning algorithms applied to the hospitality accommodation price prediction problem. The study’s initial phase involves examining time series analysis to model and comprehend the inherent temporal patterns in hotel room pricing. Diverse time series models are employed to capture underlying trends and seasonal variations in pricing data. Furthermore, various machine learning models are trained on a comprehensive set of features to capture the intricate relationships influencing hotel room prices. The efficacy of the proposed methodology is substantiated through a case study employing authentic hotel pricing data from a real-world scenario.

3 Experimental Methodology

This section details the experimental methodology proposed to assess the machine learning and forecasting algorithms for price prediction. Thus, we present the datasets adopted, the algorithms involved in the experiments, how they were evaluated and the software and hardware setup used.

3.1 Dataset

The dataset used in this study is called HospBR. It originates from the Property Management System (PMS) of a Brazilian leisure hotel, anonymized to ensure the privacy of its guests and owners. This dataset contains 41,915 reservations made between 2010 and 2019. The data was split into training and testing periods, with the training set covering 2010 to 2018, consisting of 36,988 reservations, and the testing set covering 2019 with 4,927 reservations. We chose the dataset features that were consistently available throughout the investigated period; they are outlined in Table 1.

Table 1. Description of Features

Several preprocessing steps were conducted on the original data. First, only confirmed reservations were selected, excluding those marked as “canceled” or “pending.” Next, missing values in the feature “Id Type HU” were replaced with 0, indicating unrecorded data for those reservations. Subsequently, the “Occupation” information was included, defined as the ratio of occupied hotel units (HU) on the reservation creation date (Entry date - Lead time) to the hotel’s inventory (available rooms). Hotel inventory typically shows little variability, as modifying the number of HUs is not straightforward, although conversions between sophisticated and simpler units can occur.
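The "Occupation" feature described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the representation of a stay as an (entry date, length of stay) pair, and the assumption that each confirmed reservation occupies exactly one HU are all hypothetical.

```python
from datetime import date, timedelta


def creation_date(entry_date: date, lead_time_days: int) -> date:
    """Reservation creation date = entry date minus lead time."""
    return entry_date - timedelta(days=lead_time_days)


def occupation(stays, on_day: date, inventory: int) -> float:
    """Share of the hotel's inventory occupied on `on_day`.

    `stays` holds (entry_date, length_of_stay) pairs for confirmed
    reservations. Assumption: one reservation occupies one HU, from the
    entry date up to (but not including) the check-out date.
    """
    occupied = sum(
        1 for entry, los in stays
        if entry <= on_day < entry + timedelta(days=los)
    )
    return occupied / inventory


# Hypothetical data: three confirmed stays in a four-unit hotel.
stays = [(date(2019, 2, 1), 3), (date(2019, 2, 2), 1), (date(2019, 2, 5), 2)]
created = creation_date(date(2019, 2, 12), lead_time_days=10)
print(created, occupation(stays, created, inventory=4))
```

Under this definition, a ratio above 1 can only arise from overlapping or mis-dated reservations or from an understated inventory.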

Finally, data cleaning was performed on the daily rate values (Fee), which exhibited considerable noise due to manual adjustments by the hotel staff. A variable named “Mean Fee” was created, representing the daily rate (fee divided by Length of Stay, LOS). Extreme outliers were removed, defined as values exceeding the third quartile plus fifteen times the interquartile range (IQR). Thus, reservations with a Mean Fee above 853.57 BRL, totaling 128 reservations (0.3% of the dataset), were excluded. Similarly, reservations with a Mean Fee below the first quartile minus one IQR were also excluded, amounting to 22 reservations (0.05% of the dataset) below 16.69 BRL.
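A minimal sketch of the quartile-based filter described above, assuming the multipliers stated in the text (one IQR below the first quartile, fifteen IQRs above the third). The quartile convention (`method="inclusive"`) and the function names are assumptions; the paper does not say how the quartiles were computed, so the thresholds produced here will not necessarily match the reported 16.69 and 853.57 BRL.

```python
import statistics


def iqr_bounds(values, lower_k=1.0, upper_k=15.0):
    """Return (low, high) = (Q1 - lower_k * IQR, Q3 + upper_k * IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - lower_k * iqr, q3 + upper_k * iqr


def drop_outliers(values, lower_k=1.0, upper_k=15.0):
    """Keep only the values that lie inside the IQR-based bounds."""
    low, high = iqr_bounds(values, lower_k, upper_k)
    return [v for v in values if low <= v <= high]


# Hypothetical Mean Fee values (BRL) with one very low and one very high rate.
mean_fees = [5, 100, 120, 140, 160, 180, 200, 220, 2000]
print(iqr_bounds(mean_fees))
print(drop_outliers(mean_fees))
```

The asymmetric multipliers make the upper bound far more permissive than the lower one, so only the most extreme high rates are discarded.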

Table 2. Statistical Summary of Features

Table 2 displays the distribution of the available data within the reservations. Most fees ranged between 292 and 630 BRL, influenced by inflation over a decade. The dataset contains outliers, with some fees being exceptionally high, possibly due to extremely long LOS. Most reservations were made less than a month in advance, indicating last-minute booking behavior among guests. The “Id HU” information, representing room numbers, is a proxy for room quality and includes the floor level. The “Id Type HU” feature directly indicates the room type but includes many zeros where this information was not recorded. Typically, the hotel accommodates two adults without children; on average, it operates at 50% occupancy. The data shows some inconsistencies, with occupancy exceeding 100% on certain dates, suggesting incorrect reservation dates or inventory data errors.

We employed the Python library PyCaret [3] for feature engineering using the original dataset. This process involved (i) One-Hot Encoding the “Id Type HU” variable; (ii) decomposing the “Entry date” variable into numerical features representing the day of the month, day of the week, month, and year; (iii) generating polynomial features by squaring numerical features and multiplying pairs of numerical features, and (iv) removing features with low variance or high multicollinearity. The result is a set of 71 features, which the machine learning models utilized.
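Steps (i) to (iii) above can be illustrated with a hand-rolled sketch. It does not use PyCaret and is not the authors' pipeline; the helper names and feature keys are hypothetical, and step (iv), the variance and multicollinearity filter, is omitted.

```python
from datetime import date
from itertools import combinations_with_replacement


def one_hot(value, categories):
    """Step (i): one indicator column per known room-type id."""
    return {f"type_hu_{c}": int(value == c) for c in categories}


def date_parts(entry_date: date):
    """Step (ii): split the entry date into numerical components."""
    return {
        "day": entry_date.day,
        "weekday": entry_date.weekday(),  # Monday = 0, Sunday = 6
        "month": entry_date.month,
        "year": entry_date.year,
    }


def degree2_terms(numeric):
    """Step (iii): squares and pairwise products of numerical features."""
    return {
        f"{a}*{b}": numeric[a] * numeric[b]
        for a, b in combinations_with_replacement(sorted(numeric), 2)
    }


# Hypothetical reservation: room type 2, entry on 2019-02-03, 2 adults, 3 nights.
features = {}
features.update(one_hot(2, categories=[0, 1, 2]))
features.update(date_parts(date(2019, 2, 3)))
features.update(degree2_terms({"adults": 2, "los": 3}))
print(features)
```

With many numerical inputs the number of degree-two terms grows quadratically, which is one reason a pruning step such as (iv) is needed afterwards.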

3.2 ML and Forecasting Algorithms

In this study, we aimed to employ a wide variety of machine learning techniques to automatically predict the daily rate for a given reservation. These strategies encompass supervised regression, deep neural networks, and time series forecasting.

Regressor Models. For supervised regression techniques, we focused on implementations from the Python library scikit-learn, specifically the ExtraTrees Regressor (model name: et), Random Forest (model name: rf), Gradient Boosting Machine (model name: gbm), and linear regression (model name: lr) models. The processed dataset, prepared using the PyCaret library, was used in all cases. Hyperparameter optimization was conducted on the tree-based models using the Optuna library, which automates hyperparameter tuning through efficient search algorithms. The optimization process explored the designated hyperparameter space for each model (see Table 3), with 25 trials per case, to identify the model achieving the best performance on a validation subset comprising a random 30% of the training data. The optimal model was then used to predict the test dataset, comprising reservations from 2019, and its performance was evaluated using various metrics. This procedure was repeated ten times for each model to analyze the mean and standard deviation of the results.
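The data-splitting and repetition protocol described above can be sketched with the standard library alone. This is an illustration under assumptions: the `entry_year` field, the function names, and the seed handling are hypothetical, and the model fitting and Optuna search themselves are left out.

```python
import random
import statistics


def temporal_split(reservations, test_year=2019):
    """Hold-out split: earlier years for training, `test_year` for testing."""
    train = [r for r in reservations if r["entry_year"] < test_year]
    test = [r for r in reservations if r["entry_year"] == test_year]
    return train, test


def validation_split(train, fraction=0.3, seed=0):
    """Randomly set aside `fraction` of the training data for validation."""
    rng = random.Random(seed)
    shuffled = train[:]
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * fraction)
    return shuffled[n_val:], shuffled[:n_val]  # (fit subset, validation subset)


def summarise(scores):
    """Mean and sample standard deviation over repeated runs."""
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical data: ten reservations per year from 2010 to 2019.
reservations = [{"entry_year": y} for y in range(2010, 2020) for _ in range(10)]
train, test = temporal_split(reservations)
fit, val = validation_split(train, fraction=0.3, seed=42)
print(len(train), len(test), len(fit), len(val))
```

Changing the seed on each of the ten repetitions yields a different validation subset, which is what makes the reported standard deviation meaningful.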

Table 3. Hyperparameter Space for Regressor Models

Multi-layer Perceptron. We trained a neural network (model name: dnn) using a sequence of linear layers followed by ReLU activations to predict hotel room rates based on reservation data. The model was implemented with the PyTorch library in Python. It consisted of four hidden layers with dimensions 32, 64, 32, and 16, respectively, and was trained with a batch size of 128 for 64 epochs. The data was normalized using MinMax scaling based on the train set. This neural network was designed to be simple, aiming to assess the capability of such a model to identify patterns necessary for accurate predictions. The hyperparameters were configured based on the authors’ experience with similar problems.
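Two details of this setup can be made concrete without a deep learning framework: fitting the MinMax scaling on the training set only, and the size of the resulting network. The sketch below assumes the network takes the 71 engineered features as input and emits a single rate; both are assumptions, as the paper does not state the input and output dimensions of dnn.

```python
def fit_minmax(train_column):
    """Learn the scaling range from the training data only."""
    return min(train_column), max(train_column)


def apply_minmax(column, lo, hi):
    """Scale with the training range; unseen values may fall outside [0, 1]."""
    span = (hi - lo) or 1.0  # guard against a constant feature
    return [(x - lo) / span for x in column]


def count_parameters(layer_sizes):
    """Weights plus biases of a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Assumed input (71) and output (1) sizes around the four hidden layers.
LAYER_SIZES = [71, 32, 64, 32, 16, 1]

lo, hi = fit_minmax([10, 20, 30])
print(apply_minmax([10, 20, 40], lo, hi))
print(count_parameters(LAYER_SIZES))
```

Reusing the training range on the test set avoids leaking information from 2019 into the model, at the cost of scaled values that can exceed 1.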

Time-Series Forecasting. The prediction of hotel room rates in time series data has unique challenges due to the wide range of rates for each reservation date. For instance, guests booking well in advance typically pay lower rates than those booking closer to the date. Similarly, the price can vary based on the number of nights reserved or the number of adults. To isolate these effects, we calculated the average daily rate by dividing the total rate for a given date by the number of reservations and adults for that date. This provided the average rate per day per adult. These values formed a time series from 2010 to 2018, which we used to predict rates for 2019.

To predict the rate for a specific reservation, we summed the predicted daily rates for all days covered by the reservation and multiplied by the number of adults. For example, reservation 96384 had three adults, starting on 2019-02-03 and ending on 2019-02-05. One model predicted rates of 69.82 for 2019-02-03, 69.31 for 2019-02-04, and 68.68 for 2019-02-05. Thus, the final predicted cost was (69.82 + 69.31 + 68.68) * 3 = 705.96, while the actual cost was 705. We tested two time series models: the AutoRegressor from the Python statsmodels library with lags of 30, 60, 180, and 365 (model names: ar_30, ar_60, ar_180, ar_365, respectively) and the Prophet library (model name: prophet) [17].
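The two directions of this transformation, from reservations to a daily per-adult series and from a daily forecast back to a reservation total, can be sketched as follows. The dictionary keys, the function names, and the exact averaging rule are assumptions (the text leaves the aggregation details open), and the numbers are hypothetical rather than taken from the dataset.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_rate_per_adult(reservations):
    """Average rate per day per adult, keyed by calendar date.

    Assumption: each reservation contributes fee / LOS / adults to every
    date it covers, and dates are averaged over the covering reservations.
    """
    by_date = defaultdict(list)
    for r in reservations:
        rate = r["fee"] / r["los"] / r["adults"]
        for k in range(r["los"]):
            by_date[r["entry"] + timedelta(days=k)].append(rate)
    return {d: sum(v) / len(v) for d, v in by_date.items()}


def predict_reservation(daily_forecast, entry, los, adults):
    """Sum the forecast daily rates over the stay, times the number of adults."""
    nights = (entry + timedelta(days=k) for k in range(los))
    return adults * sum(daily_forecast[d] for d in nights)


history = [
    {"entry": date(2018, 1, 1), "los": 2, "adults": 2, "fee": 400.0},
    {"entry": date(2018, 1, 2), "los": 1, "adults": 1, "fee": 80.0},
]
series = daily_rate_per_adult(history)

forecast = {date(2019, 3, 1): 50.0, date(2019, 3, 2): 60.0, date(2019, 3, 3): 70.0}
print(predict_reservation(forecast, date(2019, 3, 1), los=3, adults=2))
```

Because the series is normalized per adult and per day, lead time and room type no longer enter the forecast, which limits what the time series models can learn.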

3.3 Evaluation Methodology

All algorithms, both machine learning and forecasting, were evaluated with a hold-out methodology, using the train and test splits, to predict the average daily price for a given booking. The experiments were executed on Vertex AI Workbench, on a machine with an Intel Xeon CPU (8 vCPUs) @ 2.80 GHz and 32 GB of RAM. The metrics adopted to assess the algorithms are as follows:

BIAS. Bias refers to the systematic tendency of a forecasting method to consistently overestimate or underestimate the actual values. It indicates the direction and magnitude of the forecast errors. A positive bias implies that the forecasts are consistently higher than the actual outcomes, while a negative bias indicates that the forecasts consistently fall below the actual values. The BIAS equation is as follows:

$$\begin{aligned} \text {BIAS} = \frac{1}{N} \sum _{t=1}^{N} (F_t - A_t), \end{aligned}$$

(1)

where N is the number of observations, $F_{t}$ is the forecasted value at time t, and $A_{t}$ is the actual observed value at time t. Note that a perfect forecast has a BIAS of zero, although a BIAS of zero only indicates that over- and underestimates cancel out on average.

NRMSE. The Normalized Root Mean Squared Error is a metric used to assess the accuracy of forecasting models. It is a normalized version of the Root Mean Squared Error (RMSE), which measures the square root of the average squared differences between forecasted and actual values. Each error is normalized by dividing it by the largest actual reservation value, $\max (A_t)$, making it a relative measure. The NRMSE formula is as follows:

$$\begin{aligned} \text {NRMSE} = \sqrt{\frac{1}{N}\sum _{t=1}^{N}\left( \frac{F_t - A_t}{\max (A_t)}\right) ^2} \times 100, \end{aligned}$$

(2)

A lower NRMSE indicates better model accuracy.

NRMAPE. The Normalized Root Mean Absolute Percentage Error is a metric used to evaluate the accuracy of predictive models, particularly in time series forecasting and regression analysis. It combines elements of both the Mean Absolute Percentage Error (MAPE) and the Root Mean Squared Error (RMSE), and it is normalized to provide a scale-independent measure of error. The NRMAPE formula is as follows:

$$\begin{aligned} \text {NRMAPE} = \sqrt{\frac{1}{N} \sum _{t=1}^{N} \left| \frac{F_t - A_t}{\max (A_t)}\right| } \times 100. \end{aligned}$$

(3)

A lower NRMAPE indicates better model accuracy.

$R^{2}$. The coefficient of determination is a metric used to assess the goodness-of-fit of a forecasting model. It represents the proportion of the variance in the actual values that is explained by the forecasted values. $R^{2}$ values typically range from 0 to 1, with 1 indicating a perfect fit where all variability in the actual values is explained by the model; on held-out data, the value can also be negative when the forecasts fit worse than the mean of the actual values. The $R^{2}$ formula is given by:

$$\begin{aligned} R^2 = 1 - \frac{\sum _{t=1}^{N}(A_t - F_t)^2}{\sum _{t=1}^{N}(A_t - \bar{A})^2}, \end{aligned}$$

(4)

where $\bar{A}$ is the mean of the actual values.

MAE. The Mean Absolute Error is a metric used to measure the average magnitude of errors in a set of predictions, without considering their direction. It is the average over the test sample of the absolute differences between predicted values and observed actual values. The MAE is a linear score, meaning that all individual differences are weighted equally in the average. We included this metric to offer a perspective on the same scale as the reservation fee values. The MAE formula is as follows:

$$\begin{aligned} \text {MAE} = \frac{1}{N} \sum _{t=1}^{N} |F_t - A_t|, \end{aligned}$$

(5)

A lower MAE indicates a model with better predictive accuracy.
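Equations (1) to (5) translate directly into code. The sketch below follows the formulas as written, with `forecast` and `actual` as equal-length sequences; the function names and the sample values are illustrative only.

```python
import math


def bias(forecast, actual):
    """Eq. (1): mean signed error; negative means underestimation."""
    return sum(f - a for f, a in zip(forecast, actual)) / len(actual)


def nrmse(forecast, actual):
    """Eq. (2): RMSE of errors normalized by the largest actual value, in %."""
    m = max(actual)
    return math.sqrt(sum(((f - a) / m) ** 2
                         for f, a in zip(forecast, actual)) / len(actual)) * 100


def nrmape(forecast, actual):
    """Eq. (3): root of the mean normalized absolute error, times 100."""
    m = max(actual)
    return math.sqrt(sum(abs(f - a) / m
                         for f, a in zip(forecast, actual)) / len(actual)) * 100


def r2(forecast, actual):
    """Eq. (4): one minus residual sum of squares over total sum of squares."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for f, a in zip(forecast, actual))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mae(forecast, actual):
    """Eq. (5): mean absolute error, on the same scale as the fees."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


actual = [100, 200, 300, 400]
forecast = [110, 190, 310, 390]
for metric in (bias, nrmse, nrmape, r2, mae):
    print(metric.__name__, metric(forecast, actual))
```

In this toy case the errors alternate in sign, so BIAS is zero while MAE is 10, which illustrates why BIAS alone does not measure accuracy.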

4 Results and Discussion

Table 4 presents the results achieved by each model in terms of BIAS, NRMSE, NRMAPE, $R^{2}$, and MAE. Bold values mark the best value achieved for each metric, in absolute terms. As can be seen, lr reached the best value for BIAS. For NRMSE, NRMAPE, $R^{2}$, and MAE, gb (the Gradient Boosting Machine, named gbm in Sect. 3.2) was the best algorithm, achieving 6.94, 3.9629, 0.4778, and 31.26, respectively.

Table 4. Mean of Accuracy rates of the predictive models.

Furthermore, a statistical analysis was conducted on the gathered data to ensure a fair assessment of the models’ performance. The null hypothesis posits that there is no statistical difference between the average metrics collected from the models. We employed the non-parametric Friedman test with a significance level of $\alpha = 0.05$ to verify the null hypothesis. The resulting p-values were less than $\alpha$ for every metric, so the null hypothesis was rejected in all tests. This rejection signifies that, for each metric, at least one model differs significantly from the others, forming distinct result groups.
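As an illustration of the Friedman test, the sketch below computes the test statistic from a table of scores with one row per repeated measurement and one column per model. What constitutes a row in the authors' analysis (for instance, one of the ten repetitions) is an assumption, as are the function names; the statistic is the textbook form without a tie correction, and the p-value, obtained by comparing it with a chi-square distribution with k - 1 degrees of freedom, is not computed here.

```python
def average_ranks(row):
    """Rank the values in a row (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(row)), key=row.__getitem__)
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Return (chi-square statistic, mean rank per column)."""
    n, k = len(scores), len(scores[0])
    ranked = [average_ranks(row) for row in scores]
    mean_rank = [sum(col) / n for col in zip(*ranked)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_rank) - k * (k + 1) ** 2 / 4
    )
    return chi2, mean_rank


# Hypothetical error scores (lower is better): 4 runs x 3 models.
scores = [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 3, 2]]
chi2, mean_rank = friedman_statistic(scores)
print(chi2, mean_rank)
```

For metrics where higher is better, such as $R^{2}$, the scores would be negated before ranking so that rank 1 still denotes the best model.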

Table 5. Rank of algorithms.

We utilized the Nemenyi post hoc test to identify statistically similar result groups, computed average rankings, and created critical difference graphs, depicted in Fig. 1. Each vertical line in these graphs represents a tested model, while horizontal bars touching these lines denote statistical similarity. Vertical lines intersected by horizontal bars within the calculated Critical Difference (CD) value denote models that achieved statistically equivalent performance.
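The critical difference used in such diagrams follows the standard Nemenyi formula, CD = q_alpha * sqrt(k(k + 1) / (6N)), where k is the number of models and N the number of repeated measurements. The sketch below is illustrative: the function names are hypothetical, the value N = 10 is an assumption, and q_alpha = 3.164 is the commonly tabulated critical value for ten models at alpha = 0.05 and should be checked against a statistical table before use.

```python
import math


def critical_difference(q_alpha, k, n):
    """Nemenyi critical difference for k models over n measurements."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


def equivalent_pairs(mean_ranks, cd):
    """Pairs of models whose mean ranks differ by less than the CD."""
    names = sorted(mean_ranks, key=mean_ranks.get)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(mean_ranks[a] - mean_ranks[b]) < cd
    ]


# Assumed setting: ten models, ten measurements, tabulated q for alpha = 0.05.
print(critical_difference(3.164, k=10, n=10))

# Hypothetical mean ranks for three models and a critical difference of 2.0.
print(equivalent_pairs({"a": 1.0, "b": 2.0, "c": 4.5}, cd=2.0))
```

Two models connected by a horizontal bar in a CD diagram correspond to a pair returned by `equivalent_pairs`.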

Figure 1 (top-left) shows the CD diagram for BIAS. As can be seen, lr, prophet, ar_365, dnn, and ar_180 occupy the first five positions, in that order, and are statistically equivalent. However, lr was statistically superior to all the remaining algorithms. Figure 1 (top-right) presents the CD diagram for NRMAPE. gb, rf, et, ar_30, and ar_60 tied statistically as the best models, with gb surpassing all the remaining approaches. gb also reached the best value of NRMSE and $R^{2}$ (Fig. 1, middle-left and middle-right), being statistically equivalent to et, rf, prophet, and ar_365, and it outperformed all the remaining models in both metrics. gb also achieved the best results for MAE, being statistically equivalent to rf, et, ar_30, and ar_60.

To complement the results presented in the CD diagrams, Table 5 presents the average rank of the algorithms considering all the metrics adopted. The average rank facilitates the understanding of the general performance of the algorithms. As an example, gb was the 8th best algorithm regarding BIAS and the 1st for NRMAPE, NRMSE, $R^{2}$, and MAE. Thus, gb has an average rank of 2.40 and is considered, on average, the best algorithm. Together with rf, et, and ar_30, it forms the group of the four best algorithms for the dataset considered in this work.
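The average rank reported for gb can be reproduced from the per-metric ranks given in the text. The function name and dictionary layout below are illustrative.

```python
def average_rank(ranks_by_metric):
    """Mean of a model's per-metric ranks (1 = best)."""
    return sum(ranks_by_metric.values()) / len(ranks_by_metric)


# Ranks of gb as stated in the text: 8th for BIAS, 1st for the other metrics.
gb_ranks = {"BIAS": 8, "NRMAPE": 1, "NRMSE": 1, "R2": 1, "MAE": 1}
print(average_rank(gb_ranks))  # 2.4
```

Since all five metrics carry equal weight, a single poor rank such as the one for BIAS is diluted by strong ranks elsewhere.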

The results indicate that tree-based models (gb, rf, et) achieved the best performance in metrics evaluating predictive model performance, namely NRMAPE, NRMSE, $R^2$, and MAE. Moreover, the tests conducted suggest that there is no statistically significant difference in the results of these models, although gb numerically outperformed the others in all cases. This suggests that models with this architecture are better suited to capture the nuances governing price configurations in hotel reservation systems. Even after normalizing values to account for the number of nights and adults in the reservation, time series prediction models did not perform comparably to tree-based models. The influence of booking window information and the type of room, which these models could not capture, likely affected the outcome negatively.

Additionally, it is noteworthy that tree-based models exhibited high BIAS values, all of which were negative. This may indicate that these models failed to identify a trend in daily rates, unlike models such as lr, prophet, and ar_365. This could be attributed to the long training period, during which the influence of Brazilian inflation, totaling 75.78% [8] between 2010 and 2018, may have been observed. Future research should consider normalizing data to account for this factor. Figure 2 depicts each model’s prediction residual distribution. It is evident that tree-based models are notably centered around zero, highlighting the quality of their predictions. However, they exhibit leftward skewness in all cases, indicating a negative BIAS in these models.

5 Conclusion and Future Works

This paper presents an experimental study that evaluates machine learning techniques and time series analysis for forecasting hotel room prices. The methodology utilizes a dataset (HospBR) comprising 41,915 reservations made at a Brazilian hotel from 2010 to 2019. Each reservation includes check-in and check-out dates, number of adults, booking window (days between reservation creation and check-in), room nights (days between check-out and check-in), room type, and hotel occupancy rate on the reservation date. Time series forecasting models, such as AutoRegressors and Prophet, are employed to understand temporal patterns in hotel pricing. Machine learning models are also assessed, including random forest, gradient boosting machine, extra trees regressor, and neural networks. The reservations from 2010 to 2018 are used for training, while those from 2019 are used for testing.

Our findings indicate that the Gradient Boosting Machine (GBM) demonstrated the highest predictive power, closely followed by the other tree-based models (ExtraTrees and Random Forest) and the Prophet library. This suggests that tree-based models effectively capture the relationships between variables affecting price variations, such as room nights and booking windows. The GBM achieved an NRMSE of 6.94%, NRMAPE of 3.96%, $R^2$ of 0.48, and MAE of 31.26. However, tree-based models also showed high negative BIAS values, indicating a tendency to underestimate prices, likely due to the impact of Brazilian inflation over the training period.

In conclusion, this research contributes to hotel revenue management by providing a robust approach to forecasting room prices. By integrating time series analysis with machine learning, the methodology offers a comprehensive solution for anticipating price fluctuations, enabling proactive and data-driven decision-making in the hospitality industry. Future work could explore the impact of inflation on prices, further investigate tree-based models like LightGBM or XGBoost, and combine regressor and time series models such as Prophet. Another research avenue could be the study of dynamic pricing strategies to optimize hotel revenue.

References

1. Al Shehhi, M., Karathanasopoulos, A.: Forecasting hotel prices in selected middle east and North Africa region (MENA) cities with new forecasting tools. Theor. Econ. Lett. 8(9), 1623–1638 (2018)
2. Al Shehhi, M., Karathanasopoulos, A.: Forecasting hotel room prices in selected GCC cities using deep learning. J. Hosp. Tour. Manag. 42, 40–50 (2020)
3. Ali, M.: Pycaret: an open source, low-code machine learning library in python (2020). https://github.com/pycaret/pycaret. Accessed 23 June 2024
4. Alotaibi, E.: Application of machine learning in the hotel industry: a critical review. J. Assoc. Arab Univ. Tourism Hosp. 18(3), 78–96 (2020)
5. Binesh, F., Belarmino, A.M., van der Rest, J.P., Singh, A.K., Raab, C.: Forecasting hotel room prices when entering turbulent times: a game-theoretic artificial neural network model. Int. J. Contemp. Hosp. Manag. (2023)
6. Chi, M., Wang, J., Luo, X., Li, H.: Why travelers switch to the sharing accommodation platforms? A push-pull-mooring framework. Int. J. Contemp. Hosp. Manag. 33(12), 4286–4310 (2021)
7. Cleveland, R.B., Cleveland, W.S., McRae, J.E., Terpenning, I.: STL: a seasonal-trend decomposition. J. Off. Stat. 6(1), 3–73 (1990)
8. Instituto Brasileiro de Geografia e Estatística (IBGE): Índice nacional de preços ao consumidor amplo (IPCA) (2023). https://www.ibge.gov.br/estatisticas/economicas/precos-e-custos/9256-indice-nacional-de-precos-ao-consumidor-amplo.html. Accessed 23 June 2024
9. Giousmpasoglou, C., Marinakou, E., Zopiatis, A.: Hospitality managers in turbulent times: the covid-19 crisis. Int. J. Contemp. Hosp. Manag. 33(4), 1297–1318 (2021)
10. Hyndman, R.J., Athanasopoulos, G.: Forecasting: principles and practice. OTexts (2018)
11. Kimes, S.E.: Revenue management: a retrospective. Cornell Hotel Restaurant Adm. Q. 44(5–6), 131–138 (2003)
12. Knani, M., Echchakoui, S., Ladhari, R.: Artificial intelligence in tourism and hospitality: bibliometric analysis and research agenda. Int. J. Hosp. Manag. 107, 103317 (2022)
13. Koupriouchina, L., Van der Rest, J.P., Schwartz, Z.: Judgmental adjustments of algorithmic hotel occupancy forecasts: does user override frequency impact accuracy at different time horizons? Tour. Econ. 29(8), 2143–2164 (2023)
14. Meatchi, S., Camus, S., Lecointre-Erickson, D.: Perceived unfairness of revenue management pricing: developing a measurement scale in the context of hospitality. Int. J. Contemp. Hosp. Manag. 33(10), 3157–3176 (2021)
15. Pereira, L.N.: An introduction to helpful forecasting methods for hotel revenue management. Int. J. Hosp. Manag. 58, 13–23 (2016)
16. Salah, A., Bekhit, M., Eldesouky, E., Ali, A., Fathalla, A.: Price prediction of seasonal items using time series analysis. Comput. Syst. Sci. Eng. (2023)
17. Taylor, S.J., Letham, B.: Prophet: forecasting at scale (2017). https://github.com/facebook/prophet. Accessed 23 June 2024
18. Webb, T., Schwartz, Z., Xiang, Z., Singal, M.: Revenue management forecasting: the resiliency of advanced booking methods given dynamic booking windows. Int. J. Hosp. Manag. 89, 102590 (2020)
19. Zaki, K.: Implementing dynamic revenue management in hotels during covid-19: value stream and wavelet coherence perspectives. Int. J. Contemp. Hosp. Manag. 34(5), 1768–1795 (2022)
20. Zhang, B., Huang, X., Li, N., Law, R.: A novel hybrid model for tourist volume forecasting incorporating search engine data. Asia Pacific J. Tourism Res. 22(3), 245–254 (2017)
21. Zheng, T.: What caused the decrease in revpar during the recession? An arima with intervention analysis of room supply and market demand. Int. J. Contemp. Hosp. Manag. 26(8), 1225–1242 (2014)
22. Zheng, T., Liu, S., Chen, Z., Qiao, Y., Law, R.: Forecasting daily room rates on the basis of an LSTM model in difficult times of Hong Kong: evidence from online distribution channels on the hotel industry. Sustainability 12(18), 7334 (2020)

Author information

Authors and Affiliations

CESAR, Recife, Pernambuco, Brazil
Francisco B. Oliveira, João Paulo Freitas & Chris Penna
Universidade Federal Rural de Pernambuco, Recife, Brazil
Moesio W. Silva-Filho, Gabriel A. Barbosa & Péricles B. C. Miranda

Corresponding author

Correspondence to Péricles B. C. Miranda.

Editor information

Editors and Affiliations

Universidade Federal Fluminense, Niterói, Brazil
Aline Paes
Instituto Tecnológico de Aeronáutica, São José dos Campos, Brazil
Filipe A. N. Verri

Cite this paper

Oliveira, F.B., Silva-Filho, M.W., Barbosa, G.A., Freitas, J.P., Penna, C., Miranda, P.B.C. (2025). Machine Learning and Time Series Analysis to Forecast Hotel Room Prices. In: Paes, A., Verri, F.A.N. (eds) Intelligent Systems. BRACIS 2024. Lecture Notes in Computer Science(), vol 15414. Springer, Cham. https://doi.org/10.1007/978-3-031-79035-5_25

DOI: https://doi.org/10.1007/978-3-031-79035-5_25
Published: 30 January 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-79034-8
Online ISBN: 978-3-031-79035-5
eBook Packages: Computer Science, Computer Science (R0)
