title: Doubly Multiplicative Error Models with Long- and Short-run Components
authors: Amendola, Alessandra; Candila, Vincenzo; Cipollini, Fabrizio; Gallo, Giampiero M.
date: 2020-06-05

We suggest the Doubly Multiplicative Error class of models (DMEM) for modeling and forecasting realized volatility, which combines two components accommodating low- and high-frequency features in the data, respectively. We derive the theoretical properties of the Maximum Likelihood and Generalized Method of Moments estimators. Two such models are then proposed: the Component-MEM, which uses daily data for both components, and the MEM-MIDAS, which exploits the logic of MIxed-DAta Sampling (MIDAS). The empirical application involves the S&P 500, NASDAQ, FTSE 100 and Hang Seng indices: irrespective of the market, both DMEMs outperform the HAR and other relevant GARCH-type models.

Almost forty years have passed since Engle's pioneering work (Engle, 1982) on modeling the conditional variance as an autoregressive process of observable variables, and GARCH-type models (Bollerslev, 1986) still play a significant role in the financial econometrics literature. This is mainly due to the fact that this class of models allows one to reproduce several stylized facts, such as the persistence in the conditional second moments (volatility clustering) and, in its extensions, the possibility of taking into account a slow-moving or state-dependent average volatility level. This empirical regularity can be suitably accommodated by assuming that the dynamic evolution of volatility is driven by two components, a high- and a low-frequency one, which combine additively or multiplicatively (Amado et al., 2019, offer a comprehensive survey of the contributions in this field). As a matter of fact, several suggestions exist in the GARCH literature for modeling the low-frequency component. For instance, Hamilton and Susmel (1994) and Dueker (1997) consider a Markov Switching framework, Amado and Teräsvirta (2008) a Smooth Transition context, while Mazur and Pipień (2012) and Engle and Rangel (2008) introduce deterministic functions in order to make the unconditional variance time-varying with high persistence. This latter contribution points to a relationship between a time-varying average level of volatility and macroeconomic events related to the business cycle: since the macrovariables are observed at a lower frequency than that of the asset returns, the MIxed-DAta Sampling (MIDAS) approach suggested by Ghysels et al. (2007) was extended to allow the real economy to influence financial volatility (the GARCH-MIDAS model; Engle et al., 2013; Conrad and Loch, 2015). Some extensions are available, such as the Double Asymmetric GARCH-MIDAS (DAGM) introduced by Amendola et al. (2019), where a variable available at a low frequency drives the slow-moving level of volatility and is allowed to have differentiated effects according to its sign, determining a local time-varying trend around which a GJR-GARCH (Glosten et al., 1993, GJR) describes the short-run dynamics.
Volatility modeling has received a tremendous boost from the availability of ultra-high frequency data, and the ensuing stream of literature related to estimating volatility using tick-by-tick data, conveniently aggregated: following the path-breaking paper by Andersen and Bollerslev (1998), realized volatility measures have become an ideal target for evaluating volatility forecasting performances. Such forecasts may be generated by GARCH models (for the conditional variances of asset returns) or by models of realized variances themselves (conditional expectations of variances or volatility, or, yet, log-variances), the latter being able to exploit intra-daily information about market movements. For the latter class of models a wide choice exists: the variants of the Multiplicative Error Model (MEM, Engle, 2002; Engle and Gallo, 2006), the Heterogeneous Autoregressive model (HAR) by Corsi (2009), the Realized GARCH (RGARCH, Hansen et al., 2012), among others, have proven effective in translating the refinement of volatility measurement achieved by the realized variance estimators (for a survey on these estimators in reference to forecasting, cf. Andersen et al., 2006) into good out-of-sample model performances relative to the GARCH results (notoriously based just on squared close-to-close returns). This paper discusses the presence of a long-run and a short-run component of volatility, combining multiplicatively with one another within a unified general framework in the MEM class, which we label DMEM (Doubly Multiplicative Error Model): in it, the short-run component is seen as fluctuating around one and as a function of past volatility or some predetermined variables, all observed at the same frequency. As for the long-run component (which provides the time-varying average level of volatility), it can be assumed to be: a constant (giving back the base MEM); a smooth function of time (giving rise to a Spline-MEM in the case of a spline); a specification based on daily data which mirrors the structure of the short-run component with a higher persistence (a novel model, which we label Component-MEM); or the extension of the MIDAS approach to the MEM world, providing a tool in which weekly or monthly data for the long-run can be combined with daily data for the short-run (another novel model, the MEM-MIDAS). From an empirical point of view, we are motivated to compare the performance of these models against a few representative models in the GARCH class, in particular those based on a MIDAS approach, on the one side, and models for realized volatility, keeping a base asymmetric MEM (AMEM) as a reference, together with (an asymmetric version of) the HAR and the RGARCH, all characterized by the absence of such a low-frequency component, on the other. The theoretical discussion shows that both new models have desirable statistical properties for their estimators (both within a Maximum Likelihood and a Generalized Method of Moments framework). From an empirical point of view, we estimate all the competing models for the realized volatility series of four major indices (the S&P 500, NASDAQ, FTSE 100 and Hang Seng). To summarize the results, to a question like Is a long-run component advisable?, the answer is yes: the models that do not use it are dominated by the ones that do, within the classes of models for realized volatility on the one hand and models for conditional variances of returns on the other.
To a question like Does modeling realized volatility perform better than a GARCH, even when the latter contains a long-term component?, our answer is still yes, pointing to the richness of intra-daily information over the consideration of just returns. Moreover, our results favor the DMEM approach over the HAR, in spite of the latter's capability of mimicking long memory features in the data. Our contribution parallels a number of papers where the issue of a low-frequency component was taken into account. Within the MEM context, such a component has been estimated in several ways: through regime switching and smooth transition functions (Gallo and Otranto, 2015), by deterministic splines (Brownlees and Gallo, 2010), or by a semi-non-parametric vector MEM, where the low-frequency term affecting several assets is obtained non-parametrically (Barigozzi et al., 2014). A comparison with those models goes beyond the scope of this paper.

The rest of the paper is organized as follows. In Section 2 we suggest the rationale and the notation for the DMEM, introducing the two new models (Component-MEM and MEM-MIDAS). Section 3 presents the theoretical results on the estimators' properties and statistical inference. Section 4 introduces the market indices used in the empirical estimation, presents the results in terms of in-sample estimation and performs the main forecasting comparison across the competing models. Section 5 contains some concluding remarks.

Let {x_{i,t}} be a time series coming from a non-negative discrete time process for the i-th day (i = 1, ..., N_t) of the period t (for example, a week, a month or a quarter; t = 1, ..., T): this comprises most financial activity-related variables, such as realized volatility, high-low range, number of trades, volumes, durations, and so on. Let F_{i,t} be the information set available at day i of period t. In its standard version (Engle, 2002), the MEM assumes that

x_{i,t} = μ_{i,t} ε_{i,t} = τ ξ_{i,t} ε_{i,t},   (1)

where: τ is a constant; ξ_{i,t} is a quantity that, conditionally on F_{i−1,t} and by means of a parameter vector θ, evolves deterministically; ε_{i,t} is an error term such that

ε_{i,t} | F_{i−1,t} ∼ D⁺(1, σ²),   (2)

meaning that it has a unit mean, unknown variance σ² and a probability density function defined over a non-negative support. Therefore, independently of the chosen distribution D⁺ and the function used to build the evolution of μ_{i,t}, we have that

E(x_{i,t} | F_{i−1,t}) = μ_{i,t} = τ ξ_{i,t}.   (3)

Evaluating expression (3) unconditionally, we can interpret τ as the unconditional expectation of x_{i,t} if we assume that E(ξ_{i,t}) = 1, so that x_{i,t} moves around the constant term τ. Correspondingly, the conditional variance can be expressed as

V(x_{i,t} | F_{i−1,t}) = σ² μ²_{i,t}.   (4)

In this paper, we extend the specification for the conditional mean to have a multiplicative component structure, in which both factors of the conditional expectation are time-varying. We have

μ_{i,t} = τ_{i,t} ξ_{i,t}.   (5)

Here τ_{i,t} can be seen as a slow-moving component determining the average level of the conditional mean at any given time, or, which is the same, a long-run component. By the same token, since ξ_{i,t} is a factor centered around one, it plays the role of dampening or amplifying τ_{i,t} depending on whether it is smaller or larger than 1; for this reason, we label it a short-run or fast-moving component. Equation (5) with innovation (2) defines a Doubly Multiplicative Error Model, or DMEM. Let us start by expressing the short-run component in general terms as the GARCH-type expression typical of a MEM, augmented by the contribution of a predetermined de-meaned (vector) variable z (a DMEM-X, to parallel the GARCH-X, cf. Han and Kristensen, 2014):

ξ_{i,t} = (1 − α_1 − γ_1/2 − β_1) + α_1 x_{i−1,t}/τ_{i−1,t} + γ_1 x⁻_{i−1,t}/τ_{i−1,t} + β_1 ξ_{i−1,t} + δ′ z_{i−1,t},   (6)

where x⁻_{i,t} = x_{i,t} 1_(r_{i,t}<0) is a term
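To fix ideas on how the recursion in (6) is filtered from the data, the following is a minimal Python sketch under our own simplifying assumptions: a single flattened day index (rather than the double (i, t) index), given long-run values, no predetermined variables z, and initialization at the unconditional mean ξ = 1; the function and variable names are illustrative choices, not part of the paper.

```python
import numpy as np

def short_run_filter(x, tau, r, alpha1, gamma1, beta1):
    """Filter the short-run component xi of a DMEM, eq. (6) without z.

    x   : non-negative observations (e.g., realized volatility)
    tau : long-run component values, same length as x
    r   : daily returns, used only through the indicator 1(r < 0)
    The intercept (1 - alpha1 - gamma1/2 - beta1) enforces E(xi) = 1.
    """
    omega = 1.0 - alpha1 - gamma1 / 2.0 - beta1
    xi = np.ones_like(x)                    # start at the unconditional mean
    for s in range(1, len(x)):
        neg = 1.0 if r[s - 1] < 0 else 0.0  # asymmetric term x * 1(r < 0)
        xi[s] = (omega
                 + (alpha1 + gamma1 * neg) * x[s - 1] / tau[s - 1]
                 + beta1 * xi[s - 1])
    return xi
```

In practice such a filter would be called repeatedly inside a likelihood or GMM objective, jointly with the chosen long-run specification.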
which takes a non-zero value only if it corresponds to a negative return (for asymmetric effects). Starting from E(ξ_{i,t}) = 1, we have

E(ξ_{i,t} | F_{i−2,t}) − 1 = β*_1 (ξ_{i−1,t} − 1),

where β*_1 = α_1 + γ_1/2 + β_1 denotes the persistence. To simplify matters, here we removed the contribution of the predetermined variables: their explicit inclusion would require assumptions on the correlation between the variables x and z. As far as the long-run component is concerned, we consider here different alternatives, apart from it being constant (the resulting model would be the standard MEM).

• [Spline-MEM] We can specify τ_{i,t} by means of a spline function (for example, a linear or a cubic spline), as a smoothing spline or a regression spline with a relatively low number of knots, so as to guarantee the slow-moving feature. The resulting model is the so-called Spline-MEM (the P-Spline MEM of Brownlees and Gallo, 2010, corresponds to a specific choice of spline functions). The Spline-MEM is trend-stationary (stationary around the trend component represented by τ_{i,t}).

• [Component-MEM] Another possibility is to structure τ_{i,t} in a way similar to ξ_{i,t}, namely

τ_{i,t} = α_0 + α_2 x_{i−1,t}/ξ_{i−1,t} + γ_2 x⁻_{i−1,t}/ξ_{i−1,t} + β_2 τ_{i−1,t}.   (7)

The consideration of two multiplicative components in the univariate GARCH case is discussed by Conrad and Kleen (2020). The essential difference in comparison with ξ_{i,t} is that τ_{i,t} is not constrained to move around a unit mean, although the persistence features of the components relative to one another characterize the fact that τ moves differently from ξ. The model resulting from this specification of τ_{i,t}, which we name Component-MEM, is similar to the model introduced by Brownlees et al. (2012), who use, however, an additive (namely, μ = τ + ξ) specification not examined here. Another specification which makes use of different multiplicative components is the Composite-MEM, proposed by Brownlees et al. (2011) to model intra-daily volumes. If all parameters are non-negative, this implies that E(τ_{i,t}) ≤ μ. Such a characteristic comes from the fact that the drivers of the ξ and τ equations, namely x_{i,t}, are positively correlated, since they both depend on ε_{i,t}. In case of mean-stationarity we then have

E(τ_{i,t}) = α_0 / (1 − α_2 − γ_2/2 − β_2).

• [MEM-MIDAS] Yet another option is to allow τ_{i,t} to have a MIDAS-like structure, adapting the use of mixed frequency data models (Engle et al., 2013; Conrad and Kleen, 2020) to the multiplicative error model context. In its simplest form, for all days i of the same period t, τ_{i,t} can be expressed over a window of K periods as

τ_{i,t} = τ_t = exp( m + θ Σ_{k=1}^{K} δ_k(ω) X_{t−k} ),   (8)

where X_t indicates a variable available only at times t and

δ_k(ω) = (k/K)^{ω_1−1} (1 − k/K)^{ω_2−1} / Σ_{j=1}^{K} (j/K)^{ω_1−1} (1 − j/K)^{ω_2−1}   (9)

is a Beta weighting scheme (a small code sketch of this scheme is provided right after this section). Assuming ω_1 = 1 and ω_2 ≥ 1 in (9) identifies cases in which more emphasis is given to the most recent observations. A further refinement is inspired by the DAGM (Pan and Liu, 2018; Amendola et al., 2019). Regarding the choice of the MIDAS driver X, one could favor a variable X ⊥ ε as in Conrad and Kleen (2020), as this simplifies the analysis, although it may be difficult to meet this condition in practice (as acknowledged by Conrad and Kleen, 2020, p. 4).

Inference on the model defined in Section 2 can be obtained by extending the framework suggested by Brownlees et al. (2012, Section 9.2.2). Assuming that the conditional mean is correctly specified, and indicating with θ the vector of parameters entering it, two estimation strategies are illustrated in what follows: Maximum Likelihood (ML) and Generalized Method of Moments (GMM). The DMEM Maximum Likelihood estimator θ̂_ML is defined as the value of θ maximizing the average log-likelihood function

ℓ̄(θ) = N⁻¹ Σ_{t=1}^{T} Σ_{i=1}^{N_t} [ ln f_ε(ε_{i,t} | F_{i−1,t}) − ln μ_{i,t} ],   (10)

where

ε_{i,t} = x_{i,t} / (τ_{i,t} ξ_{i,t})   (11)

and N = Σ_{t=1}^{T} N_t is the number of observations.
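As anticipated in the MEM-MIDAS bullet above, here is a minimal sketch of the weighting scheme in (9) and the resulting long-run component in (8); the k/(K+1) normalization (which avoids a zero weight at the last lag) is an implementation choice on our part, and all names are illustrative.

```python
import numpy as np

def beta_weights(K, omega1, omega2):
    """Beta lag weights delta_k, k = 1..K, summing to one (eq. (9)).

    With omega1 = 1 and omega2 >= 1, the weights decay monotonically,
    putting more emphasis on the most recent observations.
    """
    k = np.arange(1, K + 1) / (K + 1)   # normalized lag positions in (0, 1)
    w = k ** (omega1 - 1.0) * (1.0 - k) ** (omega2 - 1.0)
    return w / w.sum()

def midas_tau(X_lags, m, theta, omega2, K):
    """Long-run component tau_t = exp(m + theta * sum_k delta_k * X_{t-k}),
    eq. (8), with omega1 fixed at 1; X_lags holds X_{t-1}, ..., X_{t-K},
    ordered from the most recent lag backwards."""
    w = beta_weights(K, 1.0, omega2)
    return np.exp(m + theta * np.dot(w, X_lags))
```

The exponential link keeps τ_t strictly positive for any value of the MIDAS driver, which is what the multiplicative structure of (5) requires.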
The portion relative to θ of the average score function can be expressed as

s̄(θ) = −N⁻¹ Σ_t Σ_i a_{i,t} ( ε_{i,t} ∂ ln f_ε(ε_{i,t}|F_{i−1,t})/∂ε + 1 ),  where a_{i,t} = ∇_θ μ_{i,t} / μ_{i,t};

the condition

E( ε_{i,t} ∂ ln f_ε(ε_{i,t}|F_{i−1,t})/∂ε + 1 | F_{i−1,t} ) = 0   (12)

implies a zero expected score and, thus, consistency of θ̂_ML. This condition is obtained in case of correct specification of the error distribution but, as discussed in what follows, there are choices of f_ε(ε_{i,t}|F_{i−1,t}) able to guarantee (12) even when they are wrongly specified: in this case, θ̂_ML is called a QML estimator. In what follows we assume that (12) is satisfied by the distribution chosen for ε_{i,t}. The portions relative to θ of the asymptotic OPG (I_∞) and Hessian (H_∞) matrices are given by the limits, as N → ∞, of, respectively,

N⁻¹ Σ_t Σ_i E[ a_{i,t} a′_{i,t} ( ε_{i,t} ∂ ln f_ε(ε_{i,t}|F_{i−1,t})/∂ε + 1 )² ]   (13)

and

N⁻¹ Σ_t Σ_i E[ ∇²_θ ℓ_{i,t} ] = N⁻¹ Σ_t Σ_i E[ a_{i,t} a′_{i,t} ( ε_{i,t} ∂ ln f_ε/∂ε + ε²_{i,t} ∂² ln f_ε/∂ε² ) ],   (14)

where the last equality is implied by (12). Expressions (13) and (14) are sufficient to derive Avar(θ̂_ML) (the asymptotic variance matrix of θ̂_ML), but only when the possible free shape parameter in f_ε(ε_{i,t}|F_{i−1,t}), say λ, is "orthogonal" to θ in the sense that it satisfies

lim_{N→∞} N⁻¹ Σ_t Σ_i E( ∇_θ ∂_λ ℓ_{i,t} ) = 0;   (15)

if this does not happen, the variance matrix of θ̂_ML also depends on the asymptotic variance of λ̂. Expressing the full parameter vector as (θ; λ), the corresponding OPG and Hessian matrices are structured in (i, j)-blocks (i, j = 1, 2) corresponding to the two parameters in that order. Since Avar(θ̂_ML) is related to the (1,1)-block of some inverse matrix (be it the asymptotic OPG, Hessian or Sandwich matrix), in general it may depend on the asymptotic variance of λ̂, as a consequence of block matrix algebra. Note that this "orthogonality" condition is trivially implied when E(∇_θ ∂_λ ℓ_{i,t} | F_{i−1,t}) = 0 holds conditionally. In the following section we discuss two among the possible specifications of the error distribution.

A sensible specification for the conditional distribution of ε_{i,t} is the Gamma(φ, φ), which guarantees the constraint E(ε_{i,t}|F_{i−1,t}) = 1 and implies V(ε_{i,t}|F_{i−1,t}) = 1/φ. This can be seen as the generalization, introduced by Engle and Gallo (2006), of the choice of the exponential distribution (where φ = 1) within the Autoregressive Conditional Duration (ACD) model by Engle and Russell (1998), and of the χ²(1) distribution (where φ = 1/2) suggested by Engle (2002). In such a case,

ε_{i,t} ∂ ln f_ε(ε_{i,t}|F_{i−1,t})/∂ε + 1 = φ (1 − ε_{i,t}).   (16)

It is important to remark that this choice guarantees that condition (12) is satisfied even when the Gamma is not the true distribution of the error term (QML property), and irrespective of the value of φ: this makes the results based on assuming the exponential or the χ²(1) distributions much more general, upon an appropriate choice of the standard errors. Plugging Equation (16) into (10) provides the θ-portion of the average score,

s̄(θ) = φ N⁻¹ Σ_t Σ_i a_{i,t} ( ε_{i,t} − 1 ),   (17)

which, in turn, implies the first order condition

N⁻¹ Σ_t Σ_i â_{i,t} ( ε̂_{i,t} − 1 ) = 0.   (18)

Equation (16) also guarantees the important implication that the shape parameter φ is "orthogonal" to θ in the sense of Equation (15):

E( ∇_θ ∂_φ ℓ_{i,t} | F_{i−1,t} ) = E( a_{i,t} (ε_{i,t} − 1) | F_{i−1,t} ) = 0,

as a consequence of the unit mean assumption for the error term. This, in turn, implies that the asymptotic variance of θ̂_ML is uniquely determined by the OPG and Hessian matrices

I_∞ = φ² σ² A_∞  and  H_∞ = −φ A_∞,  with A_∞ = lim_{N→∞} N⁻¹ Σ_t Σ_i E( a_{i,t} a′_{i,t} ).

Correspondingly, the OPG, Hessian and Sandwich versions of the asymptotic variance matrix are, respectively,

(φ² σ²)⁻¹ A_∞⁻¹,  φ⁻¹ A_∞⁻¹,  σ² A_∞⁻¹.   (19)

Equivalence among the three expressions is ensured by taking φ = σ⁻² (instead of fixing it, as for instance in the exponential and χ²(1) cases); hence, a consistent estimator is

Avar̂(θ̂_ML) = σ̂² Â⁻¹,  with Â = N⁻¹ Σ_t Σ_i â_{i,t} â′_{i,t},   (20)

where σ̂² is a consistent estimator of σ² and â_{i,t} means a_{i,t} evaluated at θ̂_ML. The ML estimator of φ solves

ln φ̂ + 1 − ψ(φ̂) = N⁻¹ Σ_t Σ_i ( ε̂_{i,t} − ln ε̂_{i,t} ),   (21)

where ψ(·) denotes the digamma function and ε̂_{i,t} indicates the RHS of (11) with the denominator evaluated at θ̂_ML.
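Since (21) has no closed form, a scalar root finder is enough in practice. Below is a minimal numerical sketch, assuming strictly positive residuals and using SciPy's digamma and Brent bracketing; the function name and the bracket endpoints are our own choices.

```python
import numpy as np
from scipy.special import digamma
from scipy.optimize import brentq

def gamma_shape_mle(eps_hat):
    """Solve ln(phi) + 1 - psi(phi) = mean(eps - ln eps) for phi > 0, eq. (21).

    eps_hat must be strictly positive (the condition is unfeasible if zeros
    are present in the data). The left-hand side decreases from +inf to 1 as
    phi grows, while the right-hand side is >= 1 (equality only when every
    residual equals one), so a root exists for non-degenerate residuals.
    """
    m = np.mean(eps_hat - np.log(eps_hat))
    f = lambda phi: np.log(phi) + 1.0 - digamma(phi) - m
    return brentq(f, 1e-8, 1e8)
```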
Of course, this estimator is efficient if the true distribution is Gamma, but it is unfeasible if zeros are present in the data, given that ln ε̂_{i,t} = ln x_{i,t} − ln τ̂_{i,t} − ln ξ̂_{i,t} (considering the unit expectation constraint on ε_{i,t}, we likely have a negative average of the ln ε̂_{i,t}'s, since E(ln ε_{i,t}) ≤ ln E(ε_{i,t}) = 0 by Jensen's inequality). An alternative, which does not suffer from this drawback, is provided by a GMM estimator of σ² (discussed below).

Another possible specification for the conditional distribution of ε_{i,t} is the Lognormal(−V/2, V), which guarantees the constraint E(ε_{i,t}|F_{i−1,t}) = 1 and implies V(ε_{i,t}|F_{i−1,t}) = exp(V) − 1, assuming no zeros are present in the data. In such a case,

ε_{i,t} ∂ ln f_ε(ε_{i,t}|F_{i−1,t})/∂ε + 1 = −( ln ε_{i,t} + V/2 ) / V.   (22)

As noted before, if the Log-normal is the true distribution of ε_{i,t}, then condition (12) is satisfied. The resulting θ-portion of the average score is then given by

s̄(θ) = (NV)⁻¹ Σ_t Σ_i a_{i,t} ( ln ε_{i,t} + V/2 ),

leading to the first order condition

N⁻¹ Σ_t Σ_i â_{i,t} ( ln ε̂_{i,t} + V/2 ) = 0.   (23)

Notice that, differently from the Gamma case (cf. Equation (18)), Equation (23) depends on the shape parameter V. This implies that, during estimation, one should alternate between the estimation of θ and of V. Another important difference with the Gamma case is that the shape parameter V is not "orthogonal" to θ, given that the LHS of Equation (15) is now

lim_{N→∞} N⁻¹ Σ_t Σ_i E( ∇_θ ∂_V ℓ_{i,t} ) = lim_{N→∞} (2V)⁻¹ N⁻¹ Σ_t Σ_i E( a_{i,t} ) ≠ 0;   (24)

this implies that Avar(θ̂_ML) depends both on V and on the asymptotic variance of an estimator V̂ (more on this below). Focusing now on the shape parameter, the ML estimator of V solves

V̂ + V̂²/4 = N⁻¹ Σ_t Σ_i ( ln ε̂_{i,t} )²,   (25)

which implies V̂_ML = 2 [ ( 1 + N⁻¹ Σ_t Σ_i (ln ε̂_{i,t})² )^{1/2} − 1 ]. Because of (24), the asymptotic variance matrix of θ̂_ML and V̂_ML depends on their joint behavior. Assuming the correct specification of f_ε(ε_{i,t}|F_{i−1,t}), the joint Hessian matrix of (θ; V) has blocks which are functions of V and of the expectations of a_{i,t} a′_{i,t} and of a_{i,t}; Avar(θ̂_ML) then follows from the (1,1)-block of its inverse, and the expectations involved can be estimated by sample averages based on the â_{i,t}'s. The availability of several different closed-form estimators of V (depending on the ε̂_{i,t}'s) allows for the possibility of building a concentrated log-likelihood by replacing V with the desired V̂ formula: since the concentrated log-likelihood depends only on θ, this bypasses the need to alternate between θ and V estimation (e.g. expression (27), as in Cattivelli and Gallo, 2020). A simpler alternative is perhaps to resort to the Method of Moments (MM) estimator (26), which is also in line with the zero expected score requirement in (12). Alternative estimators are indeed possible. For example, the zero expected score condition E(ln ε_{i,t}|F_{i−1,t}) = −V/2 justifies the Method of Moments (MM) estimator

V̂_MM = −2 N⁻¹ Σ_t Σ_i ln ε̂_{i,t},   (26)

which is non-negative because of Jensen's inequality. Another possibility is to refer again to the first order condition (25), but replacing the V̂²/4 addend with the squared average of the ln ε̂_{i,t}'s (justified, again, by the zero expected score condition): this leads to estimating V by the sample variance of the ln ε̂_{i,t}'s,

V̂ = N⁻¹ Σ_t Σ_i ( ln ε̂_{i,t} − l̄ )²,  with l̄ = N⁻¹ Σ_t Σ_i ln ε̂_{i,t}.   (27)

A different way to estimate the model, which does not require an explicit choice of the error term distribution, is to resort to the Generalized Method of Moments (GMM). Let

u_{i,t} = ε_{i,t} − 1 = x_{i,t} / (τ_{i,t} ξ_{i,t}) − 1.   (28)

Under model assumptions, u_{i,t} is a conditionally homoskedastic martingale difference, with conditional expectation zero and conditional variance σ². Following Brownlees et al. (2012, Section 9.2.2.2), we get that the efficient GMM estimator of θ, say θ̂_GMM, solves the criterion equation (18) and has the asymptotic variance matrix given in (19), i.e., the same properties as θ̂_ML under Gamma distributed errors. In the spirit of a semiparametric approach, a straightforward estimator for σ² is

σ̂² = N⁻¹ Σ_t Σ_i ( ε̂_{i,t} − 1 )²,   (29)

where ε̂_{i,t} represents here ε_{i,t} evaluated at θ̂_GMM. Note that this estimator does not suffer from the presence of zeros in the data.
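For reference, the two closed-form moment-based estimators just discussed, σ̂² in (29) and V̂_MM in (26), amount to simple sample averages; a minimal sketch follows (function names are ours).

```python
import numpy as np

def gmm_sigma2(x, mu_hat):
    """Semiparametric estimator of sigma^2 = V(eps), eq. (29): the sample
    mean of (eps_hat - 1)^2, with eps_hat = x / mu_hat evaluated at the
    GMM estimate. Works even when some observations x are exactly zero."""
    eps_hat = np.asarray(x) / np.asarray(mu_hat)
    return np.mean((eps_hat - 1.0) ** 2)

def lognormal_V_mm(eps_hat):
    """Method-of-Moments estimator of the Log-normal shape V, eq. (26),
    from E(ln eps) = -V/2; non-negative by Jensen's inequality, but it
    requires strictly positive residuals."""
    return -2.0 * np.mean(np.log(np.asarray(eps_hat)))
```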
Volatility, our main object of interest, is expressed as the square root of the realized kernel variance (Barndorff-Nielsen et al., 2008, 2009), converted into annualized percentage terms: for the sake of comparison, given that the realized volatility refers to the open-to-close period, we estimate the GARCH models also in reference to that period. Data on the S&P 500, FTSE 100, NASDAQ and Hang Seng indices have been collected from the realized library of the Oxford-Man Institute (Heber et al., 2009), which allows us to derive open-to-close returns and their sign. The MIDAS-related macroeconomic variable is the US Industrial Production (IP_t), observed monthly and taken from the Federal Reserve Economic Data database. The variable IPc_t is used as the month-to-month percentage change (as in Conrad and Loch, 2015). The period under consideration for all the variables is from 2 January 2001 to 15 May 2020. For reference purposes, some summary statistics (minimum, maximum, mean, standard deviation, skewness and kurtosis) for all the variables considered are in Table 1.

[Table 1 about here.]

Figure 1 depicts the open-to-close log-returns (top panels, black lines) and realized kernel volatilities (bottom panels, blue lines) for the four indices considered over the full sample. We superimposed the US recession periods dated by the NBER in 2001 and then in 2008-09, as a reference for periods of slowdown in economic activity (and hence a downturn in industrial production). Although the scales are different, there are features in the dynamics of the series which are common to all four indices, notably the explosion of volatility around the Lehman Brothers demise in September 2008, and other episodes which are more idiosyncratic, although the surge in volatility at the end of 2002 is common to the US and UK indices, and the one in 2015 seems to have affected more the US markets and Hong Kong.

[Figure 1 about here.]

We include in the set of competing models those having the realized volatility as the dependent variable, namely the multiplicative class (the AMEM plus the two proposed specifications, MEM-MIDAS and Component-MEM) and the asymmetric version of the HAR model (AHAR), on the one side; and the GARCH class for the conditional variance of open-to-close returns, namely the GJR, GM and DAGM, on the other. To the latter, we add the RGARCH, which is still specified as a GARCH, but makes use of the realized variance in its specification. All the functional forms are described in Table 2.

[Table 2 about here.]

The testing ground for the models includes two different robust loss functions (LFs, Patton, 2011): QLIKE and MSE. All LFs have the realized kernel volatility as their target, and the GARCH models' variance forecasts are modified to match that target. The evaluation makes use of the Model Confidence Set (MCS, Hansen et al., 2011), and the test statistic used in the MCS procedure is the semi-quadratic T_SQ, as recently done by Cipollini et al. (2020), for instance. The first in-sample period spans from January 2001 to December 2012. Tables 3 to 6 report the estimated coefficients for each model, some residual diagnostics and the MCS inclusion according to the two LFs. In terms of diagnostics, we consider the Ljung-Box test (Ljung and Box, 1978), applied on standardized residuals (squared standardized residuals for the GARCH-based models) at different lags.
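Before turning to the results, and for concreteness, the two robust loss functions entering the evaluation can be sketched as follows; this is a common parameterization of QLIKE (Patton, 2011), written for a generic proxy and forecast pair (here the realized kernel volatility and the model forecasts brought to the same scale), with function names of our own choosing.

```python
import numpy as np

def qlike(proxy, forecast):
    """QLIKE loss, robust to noise in the volatility proxy (Patton, 2011):
    L = proxy/forecast - ln(proxy/forecast) - 1, averaged over the sample;
    both inputs must be strictly positive and on the same scale."""
    ratio = np.asarray(proxy) / np.asarray(forecast)
    return np.mean(ratio - np.log(ratio) - 1.0)

def mse(proxy, forecast):
    """Mean squared error between the volatility proxy and the forecast."""
    return np.mean((np.asarray(proxy) - np.asarray(forecast)) ** 2)
```

These per-model average losses are the inputs to the MCS procedure, which then retains the set of models whose performance is not significantly dominated.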
Overall, considering higher lags, the tests for the two proposed specifications signal an absence of clustering in the residuals (except for the NASDAQ index), contrary to what happens for many of the other competing specifications. As regards inclusion in the MCS, we can notice that the MEM-based specifications perform better than all the other models. Interestingly, the proposed Component-MEM model is always included in the set of superior models, independently of the LF adopted. In the case of the FTSE 100, the Component-MEM is the only specification belonging to the MCS.

[Tables 3 to 6 about here.]

The two DMEM models produce an estimate of the long-run which is at a daily frequency for the Component-MEM and at a monthly frequency for the MEM-MIDAS: in order for them to be compared, we choose to aggregate the former at the monthly level by averaging, bringing both to the same scale, with an obvious change of notation for the objects involved (dropping the subscript i). In Figure 2 we report the four τ_t components (one for each index) estimated with the Component-MEM (top plot) and with the MEM-MIDAS (bottom plot). It seems that the τ_t components have a similar pattern across all the indices, within the same specification (more on this later). To investigate this aspect, in Table 7 we report the correlations (numbers in regular text) among the τ_t terms of the Component-MEM and among those of the MEM-MIDAS (numbers in italics); on the main diagonal, we reproduce the correlation coefficient between the τ_t's estimated by the two different methods: they are all above 0.5, pointing both to the similarity of the two outcomes and, by the same token, to the difference in information and approach used to derive them. As far as the correlations across markets are concerned, neither method delivers consistently higher values than the other. By and large, the commonality in the τ_t's for the different indices is confirmed and, as expected, the values are higher for the two US markets and the UK market.

In the out-of-sample exercise, each model is estimated using a rolling window of twelve years (approximately 3000 daily observations). Subsequently, the one-step-ahead forecasts are generated for the following two months, conditionally on the parameter estimates previously obtained. Then, the estimation window shifts forward by two months, new out-of-sample forecasts are produced as in the previous step for the following two months, and so forth until the end of the series. The first estimation period coincides with the in-sample period 2001-2012. The out-of-sample performances of the models, for each index under consideration, are depicted in Tables 8 to 11. It can easily be noted that the largest gray area (indicating inclusion in the MCS) across all the tables, LFs and out-of-sample periods is for the MEM-based models, followed by a more scattered presence of the AHAR. The consistent presence of these models is reassuring in terms of modeling realized volatility directly, on the one hand, and, within that class, in terms of the convenience of treating innovation terms as entering multiplicatively. Modeling conditional volatility through the conditional second moments of returns seems to be dominated according to either metric in the loss functions. Somewhat disappointingly, the RGARCH seldom enters the MCS.
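The rolling scheme just described lends itself to a compact sketch; the generic estimator interface (fit, predict_one) and the 42-day step (roughly two months of trading days, at about 21 trading days per month) are illustrative assumptions on our part, not specifications from the paper.

```python
import numpy as np

def rolling_forecasts(y, fit, predict_one, window=3000, step=42):
    """Rolling out-of-sample exercise: estimate on `window` observations,
    produce one-step-ahead forecasts for the next `step` days conditionally
    on the estimated parameters, then shift the window by `step` and repeat.

    fit(sample) -> params; predict_one(params, history) -> forecast of the
    observation immediately following `history`.
    """
    forecasts = []
    start = 0
    while start + window < len(y):
        params = fit(y[start:start + window])          # re-estimate
        for h in range(step):
            idx = start + window + h
            if idx >= len(y):
                break
            # one-step-ahead: all data up to idx-1 is available
            forecasts.append(predict_one(params, y[:idx]))
        start += step
    return np.array(forecasts)
```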
To gain some further insights as to the behavior of each model in relation to the observed volatility pattern, we suggest a graphical comparison (Figure 3) between the two DMEM models introduced in this paper. To that end, we reproduce, for the last period of our sample (from 2 January 2020 to 15 May 2020), the out-of-sample forecasts next to the realized kernel volatility.

[Figure 3 about here.]

Two different general approaches can be followed when forecasting asset return volatility: one is the GARCH approach, where the conditional variance is estimated from return data; the other is modeling the conditional expectation of volatility using ultra-high frequency measures of realized volatility. In the first approach, therefore, measurement and modeling are comprised within the same framework, while, in the second, the two aspects are decoupled. The merits of the GARCH model are testified by the thousands of theoretical and empirical contributions since the seminal paper by Engle (1982). This type of approach has been enriched over the years by successive refinements, with the goal of capturing some empirical regularities in the pattern of the observed time series. This is the case for the consideration of a time-varying local average in the conditional variance, a feature addressed by Engle and Rangel (2008), also in reference to its economic interpretation in terms of macroeconomic fluctuations. As a parallel approach, the direct modeling of realized measures of volatility has the advantage of exploiting the better theoretical properties of these ultra-high frequency measures (less noisy than squared returns). For either approach, the consideration of how complicated it is to collect the data and to fine-tune a model to derive the forecast has to be weighed against the actual reward in an improved forecasting performance. The availability of freely downloadable price data still keeps the GARCH approach popular (especially among practitioners), but it is also true that the number of high-frequency data vendors is expanding and that do-it-yourself processing and storage of tick-by-tick data is not a prohibitive task. A comparison across models can be interpreted as an exercise that aims at assessing the capability of each model to reproduce empirical regularities in the data, but also at establishing how important those stylized facts are when taken to an out-of-sample terrain. In this context, our paper has two clear outcomes: one is to suggest that modeling realized volatility delivers better results than going through a GARCH-type approach; the second is to show that incorporating the feature that average volatility by subperiod is time-varying provides an advantage in forecasting. For the first outcome, there are clear merits in using a model in which the errors enter multiplicatively, as in the MEM: this mitigates the attenuation bias in realized volatility models documented by Cipollini et al. (2020), because it takes into explicit consideration the heteroskedastic nature of volatility measurement errors. For the second outcome, we suggest that doubling the multiplicative components, incorporating a slow-moving and a short-run component of volatility dynamics, delivers better results, at least for our four stock market indices.
We contributed two such models, differentiated by the type of information entering the low-frequency component: in the Component-MEM, we use the same daily data, but we allow for a more persistent dynamics; in the MEM-MIDAS, we use a monthly macrovariable (the US industrial production), the variations of which combine in a smooth component which exploits the mixed sampling results by Ghysels et al. (2006) and by Engle et al. (2013). While our MEM-MIDAS performs better than the corresponding GM or DAGM in a GARCH context, its delivering a τ_t which lags behind the bursts of volatility makes it, at times, dominated by another member of the DMEM family, namely the Component-MEM. We can see a convenience in using the MEM-MIDAS within a scenario-type approach depicting prolonged periods of downturn in economic activity (not necessarily limited to our choice of US industrial production): the impact and aftermath of the COVID-19 health emergency on financial volatility may thus be studied by projecting to the medium term this channel of transmission originating in the real economy. While refinements are still possible (e.g. the use of a second lag of observed volatility values, or a DAGM extension within the MEM-MIDAS), one indication that emerges from the empirical results is that the components estimated by our models have some commonality that should be exploited (in a common factor sense) by a joint modeling of the series.

Notes to Table 1: the daily variables are the open-to-close log-returns and realized kernel volatility, both expressed in annualized percentage terms. The monthly variable is the US Industrial Production (IP_t), expressed as the annualized month-to-month percentage change (IPc_t), that is 12^{0.5} · 100 · ((IP_t / IP_{t−1}) − 1).

Notes to Tables 3 to 6: each table reports the estimated coefficients of the models in column. *, ** and *** represent significance at the 10%, 5% and 1% levels, respectively, associated with QML standard errors. The reported constant for the AMEM model refers to the α_0 parameter in Table 2. For ease of notation, the parameter α_1 referred to the RGARCH corresponds to the parameter labelled as γ in Hansen et al. (2012); moreover, the estimated parameters of the measurement equation of this latter model are not reported for space constraints. LB_l represents the p-value of the Ljung-Box (Ljung and Box, 1978) test at lag l, applied on standardized residuals (squared for GARCH models). The last two rows report the averages of the QLIKE and MSE loss functions. The chosen volatility proxy is the realized kernel. Shades of gray denote inclusion in the MCS at significance level α = 0.25.
References

Amado, C., Silvennoinen, A. and Teräsvirta, T. (2019). Models with multiplicative decomposition of conditional variances and correlations.
Amado, C. and Teräsvirta, T. (2008). Modelling conditional and unconditional heteroskedasticity with smoothly time-varying structure.
Amendola, A., Candila, V. and Gallo, G.M. (2019). On the asymmetric impact of macro-variables on volatility.
Andersen, T.G. and Bollerslev, T. (1998). Answering the skeptics: Yes, standard volatility models do provide accurate forecasts.
Andersen, T.G., Bollerslev, T., Christoffersen, P.F. and Diebold, F.X. (2006). Volatility and correlation forecasting.
Barigozzi, M., Brownlees, C., Gallo, G.M. and Veredas, D. (2014). Disentangling systematic and idiosyncratic dynamics in panels of volatility measures.
Barndorff-Nielsen, O.E., Hansen, P.R., Lunde, A. and Shephard, N. (2008). Designing realised kernels to measure the ex-post variation of equity prices in the presence of noise.
Barndorff-Nielsen, O.E., Hansen, P.R., Lunde, A. and Shephard, N. (2009). Realised kernels in practice: trades and quotes.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity.
Brownlees, C.T., Cipollini, F. and Gallo, G.M. (2011). Intra-daily volume modeling and prediction for algorithmic trading.
Brownlees, C.T., Cipollini, F. and Gallo, G.M. (2012). Multiplicative error models.
Brownlees, C.T. and Gallo, G.M. (2010). Comparison of volatility measures: a risk management perspective.
Cattivelli, L. and Gallo, G.M. (2020). Adaptive lasso for vector multiplicative error models.
Cipollini, F., Gallo, G.M. and Otranto, E. (2020). Realized volatility forecasting: Robustness to measurement errors.
Conrad, C. and Kleen, O. (2020). Two are better than one: Volatility forecasting using multiplicative component GARCH-MIDAS models.
Conrad, C. and Loch, K. (2015). Anticipating long-term stock market volatility.
Corsi, F. (2009). A simple approximate long-memory model of realized volatility.
Dueker, M.J. (1997). Markov switching in GARCH processes and mean-reverting stock-market volatility.
Engle, R.F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation.
Engle, R.F. (2002). New frontiers for ARCH models.
Engle, R.F. and Gallo, G.M. (2006). A multiple indicators model for volatility using intra-daily data.
Engle, R.F., Ghysels, E. and Sohn, B. (2013). Stock market volatility and macroeconomic fundamentals.
Engle, R.F. and Rangel, J.G. (2008). The spline-GARCH model for low frequency volatility and its global macroeconomic causes.
Engle, R.F. and Russell, J.R. (1998). Autoregressive conditional duration: A new model for irregularly spaced transaction data.
Gallo, G.M. and Otranto, E. (2015). Forecasting realized volatility with changing average levels.
Ghysels, E., Santa-Clara, P. and Valkanov, R. (2006). Predicting volatility: getting the most out of return data sampled at different frequencies.
Ghysels, E., Sinko, A. and Valkanov, R. (2007). MIDAS regressions: Further results and new directions.
Glosten, L.R., Jagannathan, R. and Runkle, D.E. (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks.
Hamilton, J.D. and Susmel, R. (1994). Autoregressive conditional heteroskedasticity and changes in regime.
Han, H. and Kristensen, D. (2014). Asymptotic theory for the QMLE in GARCH-X models with stationary and nonstationary covariates.
Hansen, P.R., Huang, Z. and Shek, H.H. (2012). Realized GARCH: a joint model for returns and realized measures of volatility.
Hansen, P.R., Lunde, A. and Nason, J.M. (2011). The Model Confidence Set.
Heber, G., Lunde, A., Shephard, N. and Sheppard, K. (2009). OMI's realised library, version 0.1. Technical report, Oxford-Man Institute.
Ljung, G.M. and Box, G.E.P. (1978). On a measure of lack of fit in time series models.
Mazur, B. and Pipień, M. (2012). On the empirical importance of periodicity in the volatility of financial returns: time-varying GARCH as a second order APC(2) process.
Newey, W.K. and McFadden, D. (1994). Large sample estimation and hypothesis testing.
Pan, Z. and Liu, L. (2018). Forecasting stock return volatility: A comparison between the roles of short-term and long-term leverage effects.
Patton, A.J. (2011). Volatility forecast comparison using imperfect volatility proxies.

Figure 2: Monthly τ_t term, comparison between the Component-MEM (top plot) and the MEM-MIDAS (bottom plot) τ_t terms.