key: cord-0510942-sgbdepbh authors: Madushani, L. S.; Talagala, Thiyanga S. title: Hierarchical Forecasting of Dengue Incidence in Sri Lanka date: 2021-12-02 journal: nan DOI: nan sha: ed6e0df2373ee9c3e2178443bcb9144f7700336c doc_id: 510942 cord_uid: sgbdepbh The recurrent thread of dengue incidence in Sri Lanka is still abundant and it creates a huge burden to the country. Hence, the National Dengue Control Unit of Sri Lanka propose a national action plan to prevent and control the dengue incidence. To implement the necessary actions for short and long terms the proposed plan operates under three levels:country-level, province-level, and district levels. In order to optimize resource allocation, the health officers require the forecasts for country, province and district levels, which preserves the aggregate consistency associated with the district, province, and country levels as well as time correlations. Hence, the objective of this study is to forecast the dengue incidence in Sri Lanka using a hierarchical time series forecasting approach based on spatial and temporal hierarchical structures. Hierarchical forecasting involves two steps such as generating base forecasts and reconciliation of these base forecasts. In this study, Exponential smoothing(ETS) and Autoregressive Integrated Moving Average (ARIMA), NAIVE, Seasonal NAIVEapproaches and average method are used to generate base forecasts. Accuracy of forecasts is evaluated using Mean Absolute Scale Error (MASE). We compare the accuracy of hierarchical time series forecasts with other benchmark approaches. The forecast accuracy reveals that the best forecasting approaches for country, provinces and districts are not limited to a single approach. Hence, we investigate reasons for the variations of performance in different forecasting approaches based on a time series feature-based visualization approach. Dengue is the most rapidly growing viral infection in the world (WHO 2014) . The causative agent of dengue infection is classified into four serotypes: DENV-1, DENV-2, DENV-3, and DENV-4 whereas infection of one serotype does not result in lifelong immunity against other serotypes (Sanyaolu et al. 2017) . According to the World Health Organization, 50 to 100 million new dengue cases occur annually including 20,000 deaths in more than 100 countries of the world (WHO 2012) . A large number of dengue cases were reported in Central and South America, South-East Asia and Western Pacific regions (Teixeira and Barreto 2009) . Asia represents approximately 70% of the global burden of the disease (Bhatt et al. 2013) . Dengue incidence of Sri Lanka shows a considerable rise during the last two decades, causing economic and social burdens. The first serological confirmed dengue case was reported in 1962 and the first largest dengue epidemic outbreak occurred in 2009 with 35,008 suspected cases and 346 deaths in Sri Lanka (Withanage et al. 2018 ). The largest dengue outbreak was reported in 2017, comprising 186,101 suspected cases and over 320 deaths according to the Epidemiological Unit of the Ministry of Health, Sri Lanka (Chandrakantha 2019) . This unprecedented outbreak is due to the variation of the usual circulating DENV-2 (Ali et al. 2018; Tissera et al. 2020 ). More than 30,000 cases have been reported each year since 2012 and the highest number of dengue cases is reported in the Western province of Sri Lanka (Withanage et al. 2018) . At present, the spread of dengue incidence in districts outside the Western province has indicated a dramatic change. To date, there is no specific treatment methodology to cure the infection (Wijegunawardana et al. 2019) . However, simple fluid replacement and case management assist to reduce the fatality rate from 20% to 1% (Chandrakantha 2019) . Therefore, forecasting dengue incidence is important to implement curative and control programs. In recent, several researchers have made attempts to forecast dengue incidence in Sri Lanka. For example, time series regression models were developed using different climate variables for various districts (Withanage et al. 2018; Goto et al. 2013) . Distributed lag nonlinear modeling approach was used to identify the relationship between climate factors and dengue incidence in Colombo district of Sri Lanka (Talagala 2015) . Talagala and Lokupitiya (2015) performed a wavelet analysis to identify the pattern of dengue incidence in 25 districts of Sri Lanka . More recently, Chandrakantha (2019) developed a model to predict the likelihood of having dengue incidence in Sri Lanka based on climate factors. Furthermore, artificial neural network approach was applied to forecast dengue incidence in Kandy district of Sri Lanka for the period 2010 to 2012 by considering rainfall, humidity, temperature factors, and previous dengue cases (Nishanthi et al. 2014) . Attanayake et al. (2020) used exponential smoothing approach to forecast dengue counts in Colombo. All these studies have been conducted to forecast dengue incidence of the whole country, particular provinces, or districts independently. None of these approaches has focused to cater the needs that are proposed by the national action plan to implement dengue prevention and control programs (MOH 2019) . This plan has assigned officers for national, provincial, and district levels for achieving the proposed objectives of dengue control and prevention activities. Hence, this study mainly focuses on dengue incidence forecasting for districts, provinces, and the whole country to assist the decision-making process of assigned officers and plan and implement the programs at the district, province, and country level. The hierarchical time series forecasting has the ability to deal with time series that can be aggregated at different levels based on geography. This approach provides forecasts that are coherent across the levels of hierarchy. That is, forecasts at the aggregate level are equal to the sum of the relevant forecast at the desegregated level (Wickramasuriya et al. 2019) . Moreover, it allows to capture of the correlation and interaction between the series at each level of hierarchy and it has the ability to capture different information based on each series. In addition, the most significant aspect of this approach is that it ensures aligned decision-making across the entire hierarchical structure (Athanasopoulos et al. 2020 ). Hence, this study considers hierarchical time series forecasting approach to obtain the forecasts of districts, provinces, and country levels of Sri Lanka. Furthermore, forecasts are needed for short-term and long-term periods. Hence, this study gives attention towards the time correlation as well. A simple hierarchical structure is illustrated in Figure 1 . In Figure 1 , each series are observed at time t = 1, 2, . . . , T and forecasts are generated for each series at time T + 1, T + 2, . . . , T + h. The series at the bottom level of the hierarchy, level 2, are denoted as AX, AY, BX, BY, CX and CY . Series at level 1, A, B and C can be obtained by aggregating the corresponding bottom level series. These series are further aggregated to obtain the total series at level 0, the "Total". Furthermore, y i,t denotes the t th series corresponds to node i and m i denotes number of series at level i. The total number of series in hierarchical structure is denoted by m. Hierarchical structure of Figure 1 can be denoted by matrix notation as In general, hierarchical time series can be denoted by the matrix notation y t = Sb t , where y t denotes all series in the hierarchy and b t is the vector of all bottom level series in the hierarchy at time t. S is the summing matrix of order m × m k where m k is the number of series in the most disaggregate level of the hierarchical structure. Several methods have been proposed to reconcile the time series forecasts in a way that respects the hierarchy. However, all existing hierarchical time series forecasting approaches can be illustrated asỹ whereỹ T (h) is h-step-ahead reconciled forecasts,ŷ T (h) is h-step-ahead base forecasts, and P is an m k × m matrix. The role of P is different according to the hierarchical forecasting approach. Top-down and bottom-up approaches are the common approaches used in hierarchical time series forecasting. The bottom-up approach involves forecasting each of the bottom level series and sum these forecasts to obtain other levels of the hierarchy (Hyndman et al. 2011) . However, bottom level data are noisy and therefore it is difficult to model bottom level series (Zotteri et al. 2014 ). In the top-down approach, forecasts are generated for the most aggregated series and then forecasts of the total series are disaggregated based on the proportions. Here, the proportions can be obtained based on average or historical proportions (Athanasopoulos et al. 2009; Hyndman et al. 2011 ). Another approach named as optimal combination approach was proposed by (Hyndman et al. 2011 ) which performs better than top-down and bottom-up approaches. This approach obtains the forecasts of all series of the hierarchical structure and then optimally reconcile based on a regression model. It requires an estimator for the covariance matrix of errors that occur due to incoherence. However, it is impossible to estimate. Hence, minimum trace optimal reconciliation approach (MinT) was introduced to incorporate the information from a full covariance of forecast errors by minimizing the mean squared error of the coherence forecasts under the assumption of unbiasedness and the resulting reconciled forecasts are in the form In practice, estimation of W h is difficult where k h > 0 and I is an identity matrix. The drawback of this substitution is that the scale differences between the levels of the structure are not considered. Another estimation is where k h > 0 andŴ 1 = 1 T T t=1 e t e t . In this case, e t is in-sample residuals of the base forecasts stacked in the same order as the data. Here, the base forecasts are scaled using the variance of residuals. Therefore, this approach is referred to as MinT approach based on a weighted least square (WLS) estimator using variance scaling. Another criterion considers that each of the bottom-level base forecasts are uncorrelated between nodes and they have errors with equal variance k h . Furthermore, each element of the diagonal matrix Λ denotes the number of forecast error variances that contribute to that aggregation level. This estimator focuses only on the structure of the hierarchy as where k h > 0 and Λ = diag(S1) with 1 being a unit vector of dimension m k . Next, where k h > 0 is considered as the estimation. In this estimation, W 1 is full one-step covariance matrix. However, this estimator is not an appropriate estimator when the number of bottom level series is larger than the length of the series T . Hence, shrinkage estimator is provided based on the sample covariance to a diagonal matrix as where k h > 0. In this method,Ŵ * 1,D is a shrinkage estimator defined aŝ whereŴ 1,D is a diagonal matrix that consists of diagonal entries ofŴ 1 . Therefore, this estimator allows us to capture strong interrelations between time series in the hierarchy. Schäfer and Strimmer (2005) proposed the shrinkage intensity parameter asλ = i =j V ar(r ij ) i =jr 2 ij , wherer ij is the ij th element ofR 1 which is the 1-step-ahead sample correlation matrix. In hierarchical forecasting, forecasts are generated for each series at every node of the hierarchy individually using different forecasting modelling approaches. These individual forecasts are referred to as base forecasts. In this study we consider Naive method, Seasonal naive method (SNAIVE), average method, AutoRegressive Integrated Moving Average (ARIMA) approach, and Exponential Smoothing method (ETS) to generate base forecasts. In Naive approach, forecasts are equal to the last observed value in the series. y T +h|T = y T SNAIVE approach considers that the forecasts are equal to the last observed value from the same season of the year. That means, forecasts for time T + h iŝ where m is the seasonal period, k is the integer part of (h−1) m . In the average method, the forecasts of all future values are equal to the average (or mean) of the available data. If y 1 , y 2 , . . . , y T denotes the available data, thenŷ T +h|T =ȳ = (y 1 + y 2 + · · · + y T )/T Auto-correlations in the data are described by ARIMA models. ARIMA model is a collection of auto-regressive and moving average components which captures the effect of past observations and error terms. ARIMA model comprises with 3 terms: p, d, q where p is the order of AR term, q is the order of MA term, and d is the number of differences required to make the time series into stationary. AR term considers the linear regression model which depends only on lagged term as predictors y t = c + φ 1 y t−1 + φ 2 y t−2 + · · · + φ p y t−p + t MA term in ARIMA model describes the effect of lagged error terms y t = c + t + θ 1 t−1 + θ 2 t−2 + · · · + θ q t−q Exponential Smoothing models are constructed by considering the weighted averages of past observations in which high weights are assigned to the most recent observation. It was proposed by Holt (2004) and Winters (1960) . Statistical frame work of exponential smoothing methods was proposed by Hyndman et al. (2002) are illustrated by Table 1 . We evaluate the performance of base forecasting approaches against hierarchical forecasting approaches. Many accuracy measures have been proposed to check the forecast accuracy (Hyndman and Koehler 2006) . However, many of these proposed accuracy measures are not applicable due to the reasons such as scale dependency, producing undefined or infinite value and giving misleading results. Therefore, in this study we use Mean Absolute Scale Error (MASE) to obtain the accuracy of forecasting methods. For h-step-ahead forecasts, is the mean absolute error of the forecast method a, y j andŷ j are the actual and forecast values at period j respectively. Then, where Q = 1 T −m Σ T t=1 |y t − y t−m | is the scaling factor where m is the sampling frequency per year (Athanasopoulos et al. 2017 ). districts. In addition, level 0 is the most aggregate level or dengue counts of the whole country. The geographic structure is illustrated as a map in Figure 2 and the corresponding spatial hierarchical structure is illustrated by In addition, dengue curative and preventive programs require different strategies for different time period. Hence, this study focuses on the temporal structure of data as demonstrated by Figure 5 . Figure 6 represents the time series dynamic of dengue incidence for different aggregated levels in Sri Lanka. According to Figure 6 , the weekly time series depicts the highest seasonal variation. Before fitting the model, we separate the data set into training set and test sets. This study considers four different training sets such as 2006-2019, 2016-2019, 2006-2018, 2016-2018 and test sets are 2020, 2019 respectively. Furthermore, weekly, monthly, quarterly, and yearly frequencies are taken into the account in all these training and test sets. All statistical analysis related to spa- In the case of average, NAIVE, SNAIVE approaches, adjusting base forecasts are not necessary to obtain reconciled forecasts due to the fact that forecasts already follow the aggregate constraints associated with spatial hierarchical structure. In temporal hierarchical structure, forecasts of average and SNAIVE approach only adhere to constraints associated with temporal generalities. Hence, forecast reconciliation is necessary for ARIMA, ETS, and NAIVE methods in temporal hierarchical forecasting whereas forecast reconciliation is needed for ARIMA and ETS method in spatial hierarchical forecasting. We compare the forecasts accuracy for all series in the spatial and temporal hierarchical structure. We compare the accuracy of forecasts generated based of hierarchical forecasting approach against Average method (AVG), NAIVE, SNAIVE, ETS and ARIMA. The computed MASE values over the test sets of all series in the spatial hierarchical structure are summarized from Table 3 to Table 16 . We then identified the best forecasting method for all series at each level of the of the hierarchy. give the best forecast. However, when generating forecasts for the year 2020, the average method gives the best forecasts for Colombo district, Kalutara district (level 2) and Western province (level 1). This could be due to ongoing COVID-19 pandemic. Sometimes healthcare providers faced challenges to distinguish COVID-19 from dengue. Hence, there were some unusual pattern in dengue cases in 2020 specially for population density area. For the country-level series, the lowest MASE was given by the forecasts generated from average method based on the training Similarly, we generated weekly, monthly and quarterly forecasts based on temporal hierarchical structure and compared the accuracy of results with other benchmark approaches. The results are are summarized in Table 2 . Furthermore, Figure 9 and 10 show the best forecasting method for weekly, monthly and quarterly for districts and provinces respectively. For most of the geographical regions forecasts generated based on the hierarchical approach gives the best forecasts. Based on the above results the best forecasting approaches in each series of the spatial hierarchical structure and temporal hierarchical structure are not limited to single approach. According to No-Free-Lunch theorem, there is no single method that fits all situations (Kang et al. 2017 ). Hence, this study attempts to identify the reasons for the best forecasting approach variations based on time series features. In Section 3.1, we identified the best forecasting approaches corresponding to weekly, monthly, Multiplier test, optimal λ value of Box-cox transformation, seasonal peak year, seasonal trough year, spikiness, linearity, curvature, first and tenth autocorrelation coefficient of the remainder component, first and tenth autocorrelation coefficient from the original data, longest flat spot length, average interval between non-zero observations, squared coefficient of variation of nonzero observations, proportion of data which starts with zero and ends with zero, autocorrelation function at lag 1 and lag 10 from the first differenced data, autocorrelation function at lag 1 and lag 10 from the second differenced data, autocorrelation function at lag 1 from first seasonal lag data, hurst coefficient, spectral entropy, Box-Pierce statistic, p-value from Box-Pierce statistic, Ljung-Box statistic, p-value from Ljung-Box statistic, partial autocorrelation function at lag 5, partial autocorrelation function at lag 5 from first differenced data, partial autocorrelation function at lag 5 from second differenced data, partial autocorrelation at the first seasonal lag, KPSS statistic, p-value of KPSS test, Phillips-Perron statistic, p-value from Phillips-Perron test, required number of differences, required number of seasonal differences, stability, lumpiness, the largest mean shift between two consecutive sliding windows, index at which the largest mean shift occurs, the largest Figure 11 , only for Nothern province the best forecasts were given by SNAIVE and average (AVG) method based on the training sets TS4 and TS2 respectively. According to the feature visualization we can see the strength of trend corresponds to the Nothern province is very low compared to other series. Furthermore, the stability corresponds to the Nothern province is considerably higher. According to Figure 12 , Eastern province and Sabaragamuwa province are located far away from the other provinces due to differences in the features. The best forecasting method for Eastern province is NAIVE approach. This could be due to differences in trend, seasonality and lumpiness compared to other series. In Figure 11 to 16, for all series that generates best forecasts with NAIVE or SNAIVE have low value for lumpiness, high value for stability, strength of trend and strength of seasonality. In addition to that in we see outlying stay points that are isolated from other points tends to give best forecasts based on NAIVE, SNAIVE or Average method. . In this study, we have applied hierarchical time series forecasting approaches to forecast dengue incidence in Sri Lanka. Hierarchical time series forecasting approaches are extremely important to obtain optimal decisions which nicely match with the decisions across the levels of hierarchical structure. In hierarchical forecasting, first, we need to obtain forecasts of each series without considering the hierarchical structure. This study obtain the forecasts by ARIMA, ETS, NAIVE, SNAIVE, and average approaches which are referred to as base forecasts. The unprecedented magnitude of the 2017 dengue outbreak in Sri Lanka provides lessons for future mosquito-borne infection control and prevention Hierarchical forecasts for Australian domestic tourism Hierarchical forecasting Forecasting with temporal hierarchies Exponential smoothing on forecasting dengue cases in Colombo, Sri Lanka The global distribution and burden of dengue Risk prediction model for dengue transmission based on climate data: logistic regression approach Analysis of effects of meteorological factors on dengue incidence in Sri Lanka using time series data Forecasting seasonals and trends by exponentially weighted moving averages Optimal combination forecasts for hierarchical time series Forecasting: principles and practice Another look at measures of forecast accuracy A state space framework for automatic forecasting using exponential smoothing methods Thief: temporal hierarchical forecasting Visualising forecasting algorithm performance using time series instance spaces National Action Plan on Prevention and Control of Dengue Prediction of dengue outbreaks in Sri Lanka using artificial neural networks feasts: Feature Extraction and Statistics for Time Series Global epidemiology of dengue hemorrhagic fever: an update A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics Distributed lag nonlinear modelling approach to identify relationship between climatic factors and dengue incidence in Colombo District Wavelet analysis of dengue transmission pattern in Sri Lanka Meta-learning how to forecast time series Diagnosis and management of dengue Severe dengue epidemic Dengue: A Global Threat-Global Answers Dengue and severe dengue Optimal forecast reconciliation for hierarchical and grouped time series through trace minimization Evaluation of the effects of Aedes vector indices and climatic factors on dengue incidence in Gampaha District Forecasting sales by exponentially weighted moving averages A forecasting model for dengue incidence in the District of Gampaha