Capturing GDP nowcast uncertainty in real time

Paul Labonne

2020-12-04

Abstract: This paper shows that modelling comovement in the asymmetry of the predictive distributions of GDP growth and a timely related series improves nowcasting uncertainty when it matters most: in times of severe economic downturn. Rather than using many predictors to nowcast GDP, I show that it is possible to extract more information than we currently do from series closely related to economic growth such as employment data. The proposed methodology relies on score driven techniques and provides an alternative approach for nowcasting besides dynamic factor models and MIDAS regression, where dynamic asymmetry (or skewness) parameters have not yet been explored.

GDP is the most comprehensive measure of economic activity. As such it is a critical variable helping policy makers (and economic agents in general) make decisions. But the quality of GDP comes at the cost of timeliness; it is published with a significant delay. Thus economic agents typically make use of forecasting methods to get an idea of the GDP number before its publication by the national statistics office. When the number of interest refers to the current quarter or recent past, forecasting is referred to as nowcasting. Nowcasting methods rely on series related to economic growth but released earlier and possibly more frequently than GDP, such as employment data. Recent developments in nowcasting have focused on two main modelling frameworks: dynamic factor models and MIDAS regression. Dynamic factor models, discussed in more detail in the next section, use a small number of unobserved common factors to capture the relationship between related series and GDP. New observations on the related series lead to updates in the common factors, which in turn lead to updates in GDP nowcasts.
To handle the mixed-frequency nature of the model, the low frequency variables are modelled endogenously at the highest frequency of observation in the data using unobserved components methods. The unobserved high frequency components are then linked to low frequency observations through temporal aggregation constraints. On the other hand, MIDAS regression (see Ghysels et al. (2007)) uses related series as independent variables which are aggregated to the frequency of the dependent variable using arbitrary lag polynomials. Nowcasts are most useful when they are conveyed with a measure of uncertainty attached to them. Both methods discussed above provide distinct ways to use related series to improve not only GDP point nowcasts but their associated uncertainty as well. In dynamic factor models this is done by introducing time-variation in the variance of the common factors, yielding stochastic volatility models (see notably Marcellino et al. (2016) and Antolin-Diaz et al. (2017)). In MIDAS models, Pettenuzzo et al. (2016) show how stochastic volatility can be introduced by using a parallel MIDAS regression for the scale parameter of the dependent variable in addition to the one typically used for the location parameter. They use the same high frequency regressors and lag polynomial function for both the conditional mean and conditional volatility. While stochastic volatility models yield timely changes in the dispersion of density nowcasts, they ignore a third dimension of uncertainty: its occasional asymmetry. This asymmetric behaviour of uncertainty has recently been uncovered and explored mainly in forecasting applications. Using a quantile regression with regressors reflecting financial conditions, Adrian et al. (2019) find increased negative skewness accompanying a rise in volatility in the predictive distribution of US GDP growth in times of recessions. In normal times, however, GDP growth is close to being conditionally normally distributed.
While quantile regression is the most popular approach for introducing non-Gaussian features in the conditional distribution of macroeconomic data, Delle-Monache et al. (2020) arrive at similar results using a score driven model relying on Bayesian methods for estimation. Overall, the sharp downturn brought by the coronavirus pandemic and the drastic increase in forecasting uncertainty linked to it make this issue of asymmetry particularly salient. This paper shows how the asymmetric feature of conditional GDP growth discovered in the forecasting literature can be exploited in a nowcasting setting. The approach relies on modelling a common factor directly in the shape parameters, which control the asymmetry of the predictive distributions, of both GDP and a timely related series. A common factor also features in scale parameters to introduce dynamic volatility, and in location parameters (a standard feature). Dynamic asymmetry (or shape) implies that the location of the density forecast diverges from its conditional mean, which remains the measure of the point forecast. Figure 1 shows how this approach can be used to capture the uncertainty during the onset of the coronavirus pandemic. It was clear in April 2020 that first-quarter GDP growth in the US would be negative or very close to zero, but while the magnitude of the drop was uncertain, a positive surprise was clearly improbable. This forecasting environment is reflected by a predictive distribution skewed towards negative values, shown in the left panel. This contrasts with the right panel, where only the dispersion and location of the predictive distribution can vary.

Figure 1: Density nowcast of US GDP growth for the first quarter of 2020, produced on the 5th of March 2020. The first estimate of GDP was released by the BEA on the 29th of April. The density in the left panel is generated by a skew Student's t model with dynamic volatility and shape.
The right panel is generated by a Student's t model with dynamic volatility only.

The only other attempt to introduce dynamic shape in density nowcasts comes from Ferrara et al. (2021), who use a MIDAS approach to exploit high frequency data in quantile regression. Quantile regression models make use of indicator variables, typically the growth rates of some indicators reflecting financial conditions, to model the quantiles of GDP growth. In contrast, here it is the conditional shape of the indicator variable, as opposed to its growth rate, which is used to model the movements in the conditional shape of GDP growth. In other words, and this is the most distinctive and innovative feature of the proposed methodology, I capture a cross-sectional relationship between the conditional central moments of the series. To capture cross-sectional dependencies in location, scale and shape parameters, I use score driven techniques (Harvey (2013) and Creal et al. (2013)). Thus I offer an alternative nowcasting route to dynamic factor models and MIDAS regression, where common dynamics in skewness have not yet been explored. Score driven models provide a flexibility close to non-Gaussian state space models while remaining easy to estimate and implement. They have been applied successfully to economic forecasting problems, notably by Delle-Monache and Petrella (2017), Creal et al. (2014), Gorgi et al. (2019) and Delle-Monache et al. (2020). The use of score driven models for nowcasting applications has so far been limited. Indeed, score driven models are predictive filters where dynamic parameters are perfectly predictable conditional on past information. This means that contemporaneous information on the related series (i.e. data on employment for March when nowcasting first-quarter GDP) cannot in theory be used to improve nowcasts. However, Buccheri et al.
(2021) propose to use a filtering equation analogous to that of the Kalman filter to update current parameter estimates following the release of contemporaneous data. To this end they show that score driven models can be seen as approximate filters rather than purely predictive models. This is the strategy adopted here. Although GDP growth and the related series used to improve nowcasting exhibit a high degree of comovement, they can have heterogeneous features. It is important to model these heterogeneous features explicitly, especially when modelling dynamic shapes, where constraining both series to have an identical shape dynamic is unrealistic. Accordingly, Creal et al. (2014) present a score driven dynamic factor model where each series can have a distinct conditional distribution. However, their approach relies on independent prediction errors across series. This assumption is problematic when using correlated macroeconomic series subject to common shocks such as the sharp fall in activity induced by the pandemic. To overcome this problem I use a copula to capture cross-sectional dependencies in the prediction errors. 1 Separately, modelling common components in series aggregated and sampled at different frequencies raises temporal aggregation issues that differ somewhat across location, scale and shape parameters. While Mariano and Murasawa (2003) popularised a precise approximation for conditional means (or location parameters), similar solutions for scale and shape parameters have not been discussed in the literature. Carriero et al. (2016, 2018, 2020) and Huber (2016) model common scale components but not in a mixed-frequency setting. While Gorgi et al. (2019) model a common scale component in a score driven bivariate model, they do so by constraining the scale to be identical in both series and do not discuss the issue of temporal aggregation.
I explore two different strategies to tackle the temporal aggregation problem of scale and shape parameters. First, I show how scale parameters may be aggregated temporally in Gaussian models using a convenient approximation, thus providing an approach for modelling volatility common factors in a wide class of mixed-frequency models. The second approach consists of aggregating monthly series into rolling quarterly figures. The mismatch in the frequency of aggregation is thus alleviated, and scale and shape common factors can be modelled in a non-Gaussian setting. The model presented in this paper is a bivariate model exploiting two US time series: GDP growth and the index of total aggregate weekly hours. GDP is quarterly while the index of total aggregate weekly hours, which provides a very timely indicator of economic activity, is used at a monthly frequency. Both series are extracted from real-time vintages made available by the Federal Reserve Bank of Philadelphia. I apply different specifications of the model in a real-time setting to study the effect of modelling common factors in scale and shape parameters on nowcasting performance, with a particular interest in density nowcasts. The study is centred on the coronavirus pandemic, which is notoriously difficult to nowcast. The results show that modelling common factors in scale and shape parameters improves nowcasting. The shape common factor proves to be particularly important to capture the forecasting uncertainty generated by the coronavirus pandemic. Finally, modelling fat tails in the related series used as a fast indicator complicates the identification of turning points in activity when the shape is not allowed to adjust.

2 A Mixed-Measurement (Quasi) Score Driven Model for Nowcasting with Location, Scale and Shape Common Factors

I use common factors to exploit the cross-sectional relationships in location, scale and shape parameters between GDP and a timely related series.
Capturing underlying relationships in macroeconomic data with a few latent components was first proposed by Rhodes (1937), who suggests replacing the Business Activity indicator of The Economist by the first principal component of the series used for its calculation. Stock and Watson (1989, 2002), Forni et al. (2000) and Doz et al. (2012) contributed to the revival and refinement of this approach, while Giannone et al. (2008) formalise its use for nowcasting. Using different estimation techniques, I extend the general idea behind dynamic factor models to conditional dispersion and asymmetry. A popular method for estimating unobserved components such as common factors consists of writing the model in state space form and using the Kalman filter to evaluate the model's log likelihood. The Kalman filter provides an efficient approach for estimating nowcasting models for two reasons. First, it easily handles the missing values that arise when modelling jointly series aggregated at different frequencies and released asynchronously. Secondly, predictions can be decomposed into latent components, which notably can be used to capture secular changes in addition to common factors (see Harvey (1989) and Durbin and Koopman (2012)). A comprehensive presentation of dynamic factor and state space methods for nowcasting is given by Banbura et al. (2013). A limitation of the Kalman filter, however, is that it relies on the data being conditionally normally distributed (prediction errors from the model must be normally distributed). Non-Gaussian features can be introduced using importance sampling methods, but these can be computationally intensive. Alternatively, Creal et al. (2013) and Harvey (2013) derive a new class of filters relying on the score of the predictive log likelihood function, which can arise from a wide range of families.
Score driven models provide a general framework for introducing time-variation and latent states in any parameter of the predictive distribution, not only the location. The remainder of this section extends the score driven dynamic factor model of Creal et al. (2014) in three directions. First, I introduce dynamic factor structures in the scale and shape parameters and discuss their implications in a mixed-frequency setting. Second, I adapt the model to a nowcasting setting by adding a filtering equation following Buccheri et al. (2021). Third, I relax the assumption of independent prediction errors by making use of a copula. To accommodate this feature I deliberately disconnect the dynamics of the model from the observation density: the model becomes quasi score driven. Overall the resulting model is able to use correlated related series for rapidly updating nowcasting uncertainty. As in Creal et al. (2014) and Gorgi et al. (2019), each element of the observation vector y_t = (y_{1,t}, y_{2,t})', where y_{1,t} is GDP growth and y_{2,t} the related series, can have a distinct conditional (or predictive) density,

y_{i,t} | Y_{t-1} ~ f_i(y_{i,t} | Y_{t-1}), i = 1, 2, t = 1, ..., N. (1)

The set Y_{t-1} = {z_t, X_{t-1}, Θ} includes the information available at time t-1. Both variables are modelled at a monthly frequency. The GDP series, which is quarterly, shows a quarterly figure in the last month of each quarter with missing values in other months. The vector z_t includes the dynamic states related to location, scale and shape parameters. Θ is a set of fixed parameters such as autoregressive coefficients and factor loadings. The most straightforward approach to derive the contemporaneous joint log density of the data consists of assuming cross-sectional independence conditional on past observations; this is the strategy followed by Creal et al. (2014) and Gorgi et al. (2019).
In this particular case the joint log density is simply the sum of the log marginal densities, such that

log f(y_t | Y_{t-1}) = Σ_{i=1}^{2} δ_{i,t} log f_i(y_{i,t} | Y_{t-1}), t = 1, ..., N, (2)

where δ_{i,t} is zero if observation y_{i,t} is missing and one otherwise. The related series used to improve nowcasts is chosen on the basis of its high comovement with GDP growth. While most of this comovement is captured through the common factors, some residual comovement is likely to remain in the prediction errors, especially around economic downturns like the 2007 financial crisis and the Covid-19 pandemic. Dependence across series in the prediction errors violates the conditional independence assumption on which equation (2) relies. To allow for residual dependencies across series I use a copula, thus reducing the risk of misspecification. A bivariate copula C(F_1(y_{1,t}), F_2(y_{2,t})) is a joint distribution function where F_1(.) and F_2(.) are the marginal distribution functions. The corresponding joint density function is

∂²C(F_1(y_{1,t}), F_2(y_{2,t})) / (∂y_{1,t} ∂y_{2,t}) = [∂²C(F_1(y_{1,t}), F_2(y_{2,t})) / (∂F_1(y_{1,t}) ∂F_2(y_{2,t}))] × [∂F_1(y_{1,t}) / ∂y_{1,t}] × [∂F_2(y_{2,t}) / ∂y_{2,t}] = c(F_1(y_{1,t}), F_2(y_{2,t})) f_1(y_{1,t}) f_2(y_{2,t}),

where c(F_1(.), F_2(.)) is the copula density and f_1(.) and f_2(.) are the marginal density functions. Patton (2006) gives a conditional copula theory suited for modelling the joint distribution of y_t = (y_{1,t}, y_{2,t})' conditional on the past observations y_{t-1}, ..., y_1. Using a copula, the log density of the observations at time t becomes

log f(y_t | Y_{t-1}) = Σ_{i=1}^{2} δ_{i,t} log f_i(y_{i,t} | Y_{t-1}) + δ_{1,t} δ_{2,t} log c(F_1(y_{1,t}), F_2(y_{2,t})), t = 1, ..., N. (4)

I consider two copulae: the Student's t copula and the Gaussian copula. The bivariate Student's t copula is

C(u_1, u_2) = t_{2,ν}(t_ν^{-1}(u_1), t_ν^{-1}(u_2); R),

where t_{2,ν} is the cumulative distribution function of a bivariate Student's t with degrees of freedom set to ν, mean zero and covariance matrix (or standardised dispersion matrix) equal to R ∈ [-1, 1]^{2×2} with ones on the diagonal, and where t_ν^{-1} is the quantile function of a standard Student's t with ν degrees of freedom.
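To make the copula step concrete, the Student's t copula density can be evaluated numerically as the joint t density over the product of the marginal t densities. The sketch below is illustrative only and is not code from the paper; the function name and the use of scipy are my own choices.

```python
import numpy as np
from scipy.stats import t, multivariate_t

def t_copula_logdensity(u1, u2, rho, nu):
    """Log density of a bivariate Student's t copula at (u1, u2) in (0, 1)^2,
    with dependence parameter rho and degrees of freedom nu."""
    # Map the uniforms back to the t scale via the quantile function t_nu^{-1}
    x = np.array([t.ppf(u1, df=nu), t.ppf(u2, df=nu)])
    R = np.array([[1.0, rho], [rho, 1.0]])
    # Copula log density = joint t log density minus the two marginal t log densities
    joint = multivariate_t(loc=np.zeros(2), shape=R, df=nu).logpdf(x)
    return joint - t.logpdf(x[0], df=nu) - t.logpdf(x[1], df=nu)
```

With rho = 0 and a large nu the log density is close to zero everywhere, recovering the independence copula; with positive rho, concordant pairs such as (0.9, 0.9) receive more density than discordant pairs such as (0.9, 0.1).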
When the degrees of freedom of the copula tend to infinity the Student's t copula reduces to the Gaussian copula

C(u_1, u_2) = Φ_2(Φ^{-1}(u_1), Φ^{-1}(u_2); R),

where Φ_2 is the cumulative distribution function of a bivariate normal with mean zero and covariance matrix equal to R ∈ [-1, 1]^{2×2} with ones on the diagonal, and where Φ^{-1} is the quantile function of a standard normal distribution. Finally, the independence copula is retrieved when the dependence matrix in the Gaussian copula is equal to the identity matrix. In this case the copula log density is equal to zero and the total log likelihood is simply equal to the sum of each series' log likelihood (equation (2)). Following Creal et al. (2013) and Harvey (2013), the dynamics in the vector of time-varying parameters z_t come from the conditional score. However, the score here is not derived from the observation density, but from a constrained version of the latter, thus generating a quasi score driven model (see Blasques et al. (2020)). Specifically, the density used for deriving the score is given by equation (2), which assumes cross-sectional independence (the independence copula). The copula, therefore, does not affect the dynamics of the model, which improves estimation. The transition equation is

z_{t+1} = B z_t + A s_t, (7)

where s_t denotes the scaled first derivative of the log density with respect to the vector z_t:

s_t = S_t ∇_t, ∇_t = ∂ log f(y_t | Y_{t-1}) / ∂z_t.

The unknown parameters in the matrix A are estimated via maximum likelihood alongside the unknown elements of the matrix B and the initial vector z_1. The matrix A pre-multiplies the scaled score and determines the degree of variation in the unobserved components (their sensitivity to the dynamics introduced by the score). The matrix S_t is a scaling matrix set to the Moore-Penrose inverse of the expected information matrix 2 given by

S_t = (E_{t-1}[∇_t ∇_t'])⁺.

Following Delle-Monache et al. (2020) and Lucas and Zhang (2016), I only use the diagonal elements of the information matrix, which improves estimation.
Since the constrained log density (2) relies on conditional independence, the score and the expected information matrix can be expressed conveniently series by series, each marginal density contributing its own score and information terms.

Unlike state space models, score driven models are fully deterministic conditional on past information. In other words, latent states at time t are fully determined by information at time t-1 (they are perfectly predictable at t-1). This means that there is no room for improvement when new data on the related series for time t are released; one-step-ahead forecasts are not subject to any uncertainty. In a purely forecasting exercise that would not be a problem; Koopman et al. (2016) show that score driven models and non-Gaussian state space models have similar predictive accuracy for a wide range of model specifications. In a nowcasting exercise, however, updating latent states, and thus nowcasts as well, following the release of contemporaneous information is a critical feature; indeed it is the essence of nowcasting models. There are essentially two strategies for updating nowcasts following the release of contemporaneous information when using a score driven model. The first approach, adopted by Gorgi et al. (2019), consists of leading the related series by one period. While this enables using the related series at t to affect the nowcast of the target series at time t, it is no longer clear what the common factors between the series capture. The other approach, put forward by Buccheri et al. (2021), which I adopt here, consists of using an updating step similar to the updating step of the Kalman filter. When using this strategy score driven models are seen as approximate filters rather than purely predictive models. The filtering equation (7) can be split into an updating step and a prediction step as

z_{t|t} = z_t + D s_t, z_{t+1} = B z_{t|t},

where D is the new diagonal matrix of gains. Nowcasts are then retrieved by using the filtered vector z_{t|t} rather than one-step-ahead predictions z_t. The next section presents the components included in z_t.
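To illustrate the predict/update recursion, the sketch below implements a univariate quasi score-driven filter for a time-varying location under Student's t prediction errors. It is a minimal stylised example of mine, not the paper's bivariate model: the function name, the fixed scale and the gain value are hypothetical choices, and the transition is a simple random walk.

```python
import numpy as np

def t_location_filter(y, nu=5.0, sigma=1.0, d=0.3, mu1=0.0):
    """Score-driven filter for a time-varying location under Student's t errors.
    Returns one-step-ahead predictions mu_pred and filtered estimates mu_filt."""
    n = len(y)
    mu_pred = np.empty(n)  # mu_t, known from information up to t-1
    mu_filt = np.empty(n)  # mu_{t|t}, updated once y_t is released
    mu = mu1
    for s in range(n):
        mu_pred[s] = mu
        u = (y[s] - mu) / sigma
        # Scaled score of the t log density w.r.t. the location; large
        # standardised errors are automatically downweighted (robustness)
        score = (1.0 + 1.0 / nu) * u / (1.0 + u**2 / nu)
        mu_filt[s] = mu + d * sigma * score  # updating step (gain d)
        mu = mu_filt[s]                      # prediction step (random-walk B = 1)
    return mu_pred, mu_filt
```

Filtering a series with a level shift shows the updated estimates mu_filt reacting to the current observation while the predictions mu_pred lag one period behind, which is the distinction the nowcasting application exploits.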
This section shows how the dynamic factor structure typically employed for location parameters (or conditional means) can be extended to scale and shape parameters. Intuitively, if large prediction errors occur in a series related to GDP, and if these are associated with large prediction errors in GDP, then the dispersion attached to the GDP nowcast should be adjusted accordingly. This is possible directly through the common factor in the scales. On the other hand, modelling dependencies in the shape parameters is useful to capture the asymmetry in prediction errors typically observed at the onset of recessions: while there is an increase in the dispersion of prediction errors, they are likely to be skewed towards negative values. The onset of the Covid-19 pandemic is a good illustration of this fact. Location parameters are decomposed into a time-varying trend, specified as a random walk, representing idiosyncratic secular changes, and a component common to all series, for all periods t = 1, ..., T. Antolin-Diaz et al. (2017) demonstrate that random-walk specifications for time-varying parameters are robust to discrete breaks. They also stress the importance of capturing secular changes in economic growth, a finding also discussed by Doz et al. (2020). Common components are usually set as autoregressive processes of order one or two; however, when it comes to modelling extraordinary events like the coronavirus pandemic, suppressing the persistence in the common component of the location helps. Scale parameters follow a relatively similar model for all periods t = 1, ..., T: the common factor in the scales captures common volatility shocks, like the one generated by the Covid-19 pandemic, and is specified as a stationary AR(1). Finally, shape parameters follow the same structure for all periods t = 1, ..., T. Unlike for location and scale parameters, the idiosyncratic component in the shape is not a random walk but an AR(1) process. This choice tends to improve estimation.
The common factor in the shapes captures simultaneous shifts in the asymmetry of prediction errors, like those typically arising at the beginning of recessions, and is also modelled as a stationary AR(1). The next section discusses how these trends and common components are related to location, scale and shape parameters. Each series follows the predictive model

y_{i,t} = μ_{i,t} + v_{i,t}, i = 1, 2, t = 1, ..., N,

where v_{i,t} = σ_{i,t} ε_{i,t} is a prediction error (and ε_{i,t} its standardised counterpart) following a distinct distribution (discussed in the next section) with time-varying location, scale and shape parameters (μ_{i,t}, σ_{i,t} and α_{i,t}). These are decomposed into latent states whose dynamics come from the scaled score as outlined in the previous section. These latent states include a component common to all variables. However, the data used for estimation are of a different nature; while GDP is a quarterly variable, the related series is a monthly variable. Since both are flow variables, it is important to account explicitly for this mismatch in the frequency of aggregation when relating the latent states to the parameters. To address this temporal aggregation mismatch it is useful to start from the accounting relationship between monthly and quarterly variables: a quarterly variable in levels Y_{i,t} must be equal to the three-month sum of its monthly sub-components Ỹ_{i,t-j}, j = 0, 1, 2:

Y_{i,t} = Ỹ_{i,t} + Ỹ_{i,t-1} + Ỹ_{i,t-2}.

To account for heteroskedasticity and multiplicative components, however, the data are generally taken in logarithms, such that the variables modelled are y_{i,t} = log Y_{i,t} and ỹ_{i,t} = log Ỹ_{i,t}. Since the sum of the logarithms is not equal to the logarithm of the sum, the accounting constraint takes a nonlinear form:

y_{i,t} = log(exp(ỹ_{i,t}) + exp(ỹ_{i,t-1}) + exp(ỹ_{i,t-2})). (18)

This type of non-linearity becomes difficult to handle once the first differences in logs are modelled. It is possible, however, to work from a linear approximation. Salazar et al. (1997) and Mitchell et al. (2005) rely on the approximation

h((x_t + x_{t-1} + x_{t-2})/3) ≈ (1/3)[h(x_t) + h(x_{t-1}) + h(x_{t-2})], (19)

where h(.) is a non-linear transformation and x_t a smooth variable.
The approximation (19) is a second order approximation because the first order errors sum to zero. When monthly values in a quarter are close to the monthly average over the quarter, which is usually the case with seasonally adjusted figures, the approximation error introduced is negligible. Using this approximation, equation (18) can be written as

y_{i,t} ≈ log 3 + (1/3)(ỹ_{i,t} + ỹ_{i,t-1} + ỹ_{i,t-2}). (20)

Using approximation (20) it is possible to write the quarter-on-quarter log difference as

y_{i,t} - y_{i,t-3} ≈ (1/3)Δỹ_{i,t} + (2/3)Δỹ_{i,t-1} + Δỹ_{i,t-2} + (2/3)Δỹ_{i,t-3} + (1/3)Δỹ_{i,t-4}, (21)

where Δỹ_{i,t} = ỹ_{i,t} - ỹ_{i,t-1}. Equation (21) is discussed by Mariano and Murasawa (2003), who have popularised its use in mixed-frequency dynamic factor models. There is a linear relationship between a variable and its location parameter, which makes it possible to split the location of a quarterly variable into monthly components and relate them to the location with approximation (21). Importantly, this step does not require any assumption about the distribution of the unobserved monthly variable. Monthly locations are modelled as

μ̃_{i,t} = λ^μ_{i,t} + Λ^μ_i φ^μ_t,

where the factor loading is constrained to be one for GDP. Location parameters in quarterly series are related to the monthly model using approximation (21) as

μ_{1,t} = (1/3)μ̃_{1,t} + (2/3)μ̃_{1,t-1} + μ̃_{1,t-2} + (2/3)μ̃_{1,t-3} + (1/3)μ̃_{1,t-4},

while for monthly series it is simply μ_{2,t} = μ̃_{2,t}. Diverging from the normal distribution is useful to capture the features of the data in a flexible way, but it introduces challenges in a mixed-frequency framework. Notably, while location parameters can be disaggregated temporally without making any statistical assumptions on the monthly sub-components, that is not possible for scale and shape parameters. This is problematic because the sum of two non-normally distributed variables with known distributions has a distribution which is difficult to evaluate and in many cases unknown. Linking the monthly model to the quarterly model is therefore difficult. When working in a non-Gaussian framework it is preferable to model the dependencies across series in scale and shape parameters using a rolling quarterly model.
To do so the related series is aggregated into rolling quarterly observations (quarter-on-quarter growth rates observed monthly). This introduces serial correlation in the series, which is addressed by modelling monthly location components, as outlined above. The temporal aggregation strategies for modelling jointly quarterly and monthly observations and for modelling rolling quarterly observations are essentially the same; in both cases it is necessary to go back to the underlying monthly model. Labonne and Weale (2020) show that when rolling quarterly series are not subject to important measurement errors it is possible to interpolate the monthly path very precisely with (20). Hence there should be little loss of information when using rolling quarterly observations instead of monthly observations. Separately, while locations can take any real value, scale parameters are positive and shape parameters can take values only in (0, 1) in the distribution used below. To incorporate these constraints during estimation, trends and common factors are related to scale and shape parameters through link functions ensuring positivity and the (0, 1) range respectively (an exponential link for the scale, σ_{i,t} = exp(λ^σ_{i,t} + Λ^σ_i φ^σ_t), and a logistic-type link for the shape), with the factor loadings constrained to one for the GDP series (i = 1). In a Gaussian setting it is possible to link monthly variances to quarterly variances using an approximation close to (21), as shown below. If y_{i,t} is conditionally normally distributed, its conditional variance is equal to the variance of the prediction error, such that Var(y_{i,t} | Y_{t-1}) = E[v_{i,t}²] = σ_{i,t}². Assuming that series i is observable quarterly, the quarterly prediction error can be split into its monthly components using approximation (21) as

v_{i,t} ≈ (1/3)ṽ_{i,t} + (2/3)ṽ_{i,t-1} + ṽ_{i,t-2} + (2/3)ṽ_{i,t-3} + (1/3)ṽ_{i,t-4},

where ṽ_{i,t} is a monthly prediction error. Using this approximation, the variance of the prediction error can be decomposed into monthly variances as

σ_{i,t}² ≈ (1/9)σ̃_{i,t}² + (4/9)σ̃_{i,t-1}² + σ̃_{i,t-2}² + (4/9)σ̃_{i,t-3}² + (1/9)σ̃_{i,t-4}²,

since, assuming that the monthly errors ṽ_{i,t} are independent, the covariance terms are zero. This step is possible only if the prediction errors are normally distributed.
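Both aggregation results can be checked numerically. The snippet below is a self-contained sanity check of mine, not code from the paper: it verifies the (1/3, 2/3, 1, 2/3, 1/3) weights on a smooth artificial monthly series, and the squared-weight variance formula by Monte Carlo under independent monthly errors with unit variance.

```python
import numpy as np

w = np.array([1/3, 2/3, 1.0, 2/3, 1/3])  # Mariano-Murasawa weights

# --- Locations: rolling quarterly log growth vs weighted monthly log differences
m = np.arange(72)
level = np.exp(0.003 * m + 0.01 * np.sin(2 * np.pi * m / 24))  # smooth monthly levels
Q = level[2:] + level[1:-1] + level[:-2]        # three-month rolling sums
exact = np.log(Q[3:]) - np.log(Q[:-3])          # quarter-on-quarter log growth
delta = np.diff(np.log(level))                  # month-on-month log growth
approx = np.array([w @ delta[k:k + 5][::-1] for k in range(len(delta) - 4)])
max_err = np.max(np.abs(exact - approx))        # tiny for a smooth series

# --- Scales: variance of the weighted sum of independent monthly errors
rng = np.random.default_rng(0)
v_quarterly = rng.standard_normal((200_000, 5)) @ w
implied_var = (w**2).sum()  # = 1/9 + 4/9 + 1 + 4/9 + 1/9 = 19/9
```

The first check confirms that the weighted sum of monthly growth rates reproduces the rolling quarterly growth rate up to a second-order error; the second confirms that the quarterly variance equals the squared-weight combination of monthly variances when the covariance terms vanish.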
The rolling quarterly scale parameter σ_{i,t} can thus be written as a function of the monthly scale parameter σ̃_{i,t} as

σ_{i,t} = ((1/9)σ̃_{i,t}² + (4/9)σ̃_{i,t-1}² + σ̃_{i,t-2}² + (4/9)σ̃_{i,t-3}² + (1/9)σ̃_{i,t-4}²)^{1/2}. (28)

Hence, it is possible to specify a monthly model for both scales where the monthly components are related to the quarterly scale parameter of GDP using approximation (28) with σ̃_{1,t} = exp(λ^σ_{1,t} + Λ^σ_1 φ^σ_t). Each series' conditional density comes from the family of asymmetric Student-t (AST) densities of Zhu and Galbraith (2010). The distinctive features of the AST are its shape parameter, or skewness parameter, which controls the asymmetry around the central part of the distribution, and its tail parameters, which control tail behaviour independently on each side. The log AST density of observation t of series i takes the form

log f_i(y_{i,t} | Y_{t-1}) = -log σ_{i,t} - 1(y_{i,t} ≤ μ_{i,t}) [(ν_{1,i} + 1)/2] log[1 + (1/ν_{1,i})((y_{i,t} - μ_{i,t})/(2 α_{i,t} σ_{i,t} K(ν_{1,i})))²] - 1(y_{i,t} > μ_{i,t}) [(ν_{2,i} + 1)/2] log[1 + (1/ν_{2,i})((y_{i,t} - μ_{i,t})/(2 (1 - α_{i,t}) σ_{i,t} K(ν_{2,i})))²],

where σ_{i,t} is the scale parameter, α_{i,t} is the shape parameter which can take values in (0, 1), and ν_{1,i} and ν_{2,i} are respectively the left and right tail parameters, which take positive values. K(ν) = Γ((ν + 1)/2)/(√(νπ) Γ(ν/2)) (Γ(.) is the Gamma function) and 1(x) is an indicator variable equal to one if statement x is true and zero otherwise. The distribution is skewed towards positive values if α_{i,t} < 0.5 and towards negative values if α_{i,t} > 0.5. When the tail parameters are constrained to be very large and the skewness parameter to 0.5, the AST is equivalent to a (scaled) normal distribution. In the empirical application the tail parameters are constrained to be equal on both sides. When both tail parameters are equal and the shape parameter is fixed to 0.5, the AST reduces to a Student's t-distribution. If the shape parameter is not fixed to 0.5, the AST becomes a skew t-distribution. Finally, fixing the tail parameters to their Gaussian values but estimating a shape parameter yields a skew normal distribution. Figure 2 illustrates the effects that scale, shape and tail parameters have on the density function.
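As a concrete illustration, the AST density in this parametrisation can be coded in a few lines. The sketch below is my own, written from the formula above; its key analytic properties (unit mass, P(y ≤ μ) = α, and a density at the location of 1/σ regardless of shape and tail parameters) can be verified numerically.

```python
import numpy as np
from scipy.special import gammaln

def K(nu):
    """Normalising constant of the Student's t density."""
    return np.exp(gammaln((nu + 1) / 2) - gammaln(nu / 2)) / np.sqrt(nu * np.pi)

def ast_pdf(y, mu=0.0, sigma=1.0, alpha=0.5, nu1=5.0, nu2=5.0):
    """AST density: alpha in (0, 1) controls skewness (alpha > 0.5 skews the
    density towards negative values); nu1 and nu2 are the left and right
    tail parameters."""
    u = (y - mu) / sigma
    if y <= mu:
        z = u / (2 * alpha * K(nu1))
        return (1 + z * z / nu1) ** (-(nu1 + 1) / 2) / sigma
    z = u / (2 * (1 - alpha) * K(nu2))
    return (1 + z * z / nu2) ** (-(nu2 + 1) / 2) / sigma
```

Setting alpha = 0.5 with equal tail parameters recovers a (scaled) Student's t density, and letting the tail parameters grow large recovers a (scaled) normal, matching the nesting described in the text.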
The solid blue line shows the AST density when σ = 1 while other parameters are constrained to be Gaussian (α = 0.5, ν_1 = ν_2 = ∞). The dotted red line shows the AST density when either the scale, tail or shape parameters are varied. The density at the location changes when the scale parameter is altered but is independent of variations in the tail and shape parameters. This is a particular feature of this version of the AST distribution (given in equation (5) of Zhu and Galbraith (2010)) where the random variable is scaled with B. Graph (a) shows the effect of increasing the scale parameter, specifically to σ = 2. The dispersion increases symmetrically on both sides and can be interpreted as a general (or symmetric) increase in forecasting uncertainty when the model is applied to conditional GDP growth. Graph (b) shows the effect of lowering both tail parameters to three. The probability in the tails increases, which can be used to account for extreme events. Graph (c) shows the effect of negative skewness (α = 0.8) on the density function. The central part of the density becomes heavily skewed toward negative values. This statistical behaviour should be especially useful to capture the onset of recessions when coupled with an increase in dispersion: while general economic uncertainty increases, the likelihood of a positive outcome decreases. The real-time analysis of US GDP in the first quarter of 2020 shown in section 7 validates this intuition. Finally, graph (d) shows the opposite behaviour, that is positive skewness (α = 0.2), which should be useful in the early part of the economic recovery following the Covid-19 recession. Figure 3 illustrates the response function of the scaled score for location, scale and shape parameters with the prediction error as input. 3 These are useful for illustrating two important properties of the scaled score. First, introducing a skewness parameter yields a discontinuity around zero.
For instance, a distribution skewed towards positive values downweights the effect of positive prediction errors and amplifies the effect of negative ones. Secondly, and most importantly, the scaled score downweights the effect of large prediction errors when the tail parameters are small (low rates of decay in the tails). The Gaussian and skewed models, on the other hand, yield linear or increasing responses as prediction errors increase. Downweighting the effect of large prediction errors is usually a desirable feature of score driven models because it leads to more robust estimation. However, economic crises, and the Covid-19 period in particular, are examples of cases where outliers most likely have important and long-lasting effects on the means and variances of time series. If the related series' distributions are allowed to have fat tails, the effect of large swings in economic activity on the time-varying parameters is likely to be downweighted. Consequently the model might not capture turning points in economic activity and the concurrent increase in the dispersion of prediction errors in a timely way. Introducing dynamic scale and shape parameters, however, partially alleviates this problem. To investigate thoroughly the effects of fat tails on estimation and forecasting performance in recessionary episodes, I compare models with unconstrained tail parameters to models in which the related series is constrained to have Gaussian tail parameters. The estimation strategy relies on the weighted maximum likelihood method of Blasques et al. (2016), notably applied in a score driven framework by Gorgi et al. (2019). It accounts explicitly for the fact that, while data related to economic growth are used for estimation, the primary objective is to nowcast the target series (GDP growth), not the related series. In a misspecified setting, the parameters maximising the total log likelihood (4) are not necessarily those maximising the log likelihood of GDP.
This issue is even more prominent in a mixed-frequency framework where GDP is observed once every three months whereas the related series is observed monthly. Indeed, when an observation is missing, the series it relates to has no impact on the model's log likelihood. The log likelihood associated with GDP consequently carries less weight in the total log likelihood than that of the related series. Using the weighted maximum likelihood method of Blasques et al. (2016), the vector of unknown parameters Θ, which includes the marginal distributions' parameters, the copula parameters, the initial values of the time-varying parameters, the autoregressive coefficients and the gains, is estimated as

Θ̂ = arg max_Θ Σ_{t=1}^{T} log f_w(y_t | Y_{t−1}),

where log f_w(y_t | Y_{t−1}) is the weighted log likelihood defined as

log f_w(y_t | Y_{t−1}) = log f_1(y_{1,t} | Y_{t−1}) + W log f_2(y_{2,t} | Y_{t−1}) + log c_t,

with series 1 denoting GDP, series 2 the related series and c_t the copula density. The weight W is applied to the related series' marginal log likelihood and decreases its contribution to the total log likelihood. It cannot be estimated alongside the other parameters, and Blasques et al. (2016) suggest selecting it via cross-validation techniques. Alternatively, Gorgi et al. (2019) set the weight to zero. This complicates the identification of the scale parameters in the related series; a problem they overcome by modelling a unique scale parameter for both series. This would be problematic here since the proposed modelling approach relies on a flexible specification for the scale parameters. Furthermore, in their out-of-sample nowcasting exercise Gorgi et al. (2019) do not find that the score driven approach yields a clear benefit when the contribution of the indicator series is null. Here I use the weighted maximum likelihood approach to offset the implicit downweighting of the GDP series' contribution to the total log likelihood coming from its lower frequency of observation compared to the related series. For each observation of GDP, three observations of the related series are available. This requires setting the weighting factor to one third.
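To illustrate the weighting mechanically, here is a toy sketch (Gaussian marginals for simplicity, not the paper's AST model) where the target series is observed every third month and the related series every month; with W = 1/3 the two series contribute equally to the objective whenever their likelihood values coincide:

```python
import math

def norm_logpdf(y, mu=0.0, sigma=1.0):
    """Standard normal log density, used here as a stand-in marginal."""
    return -math.log(sigma * math.sqrt(2 * math.pi)) - 0.5 * ((y - mu) / sigma) ** 2

def weighted_loglik(gdp, related, W=1/3):
    """Weighted log likelihood: GDP terms enter with weight one, the related
    series' terms with weight W; None marks months without a GDP release."""
    total = 0.0
    for y1, y2 in zip(gdp, related):
        if y1 is not None:              # GDP observed this month
            total += norm_logpdf(y1)
        total += W * norm_logpdf(y2)    # related series observed every month
    return total
```

Over a quarter with one GDP observation and three monthly observations of the related series, W = 1/3 makes the related series' total contribution equal to a single unweighted GDP term.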
The log likelihood contribution of the related series is not decreased further because that might excessively deteriorate the model's capacity to predict the related series in real time. This could lead to prediction errors losing their economic meaning and with it their ability to indicate periods of economic depression and uncertainty, which are the focus of this paper. This study is centred on quarterly GDP, which is the leading measure of economic growth, and the data are taken from the United States. The estimate of GDP analysed is the Advance Estimate, the most rapid estimate of GDP in the US. The monthly series related to economic growth used to improve GDP nowcasts is the index of total aggregate weekly hours. Both series are occasionally subject to benchmark changes which affect the entire series, such as changes in indexation. To deal with these changes they are taken in first differences of logs. The sample includes data from January 1973 up to June 2021. The series are illustrated in figure 4 using the vintage available at the end of July 2021. There is a clear comovement in the data, which is confirmed by the correlation coefficients shown in table 1. Since recessions are of particular interest to forecasters, correlation coefficients are shown separately for recessionary episodes, as classified by the NBER, and normal times. The series are highly correlated during recessions, which suggests that employment data should be helpful in capturing turning points in economic activity. In normal times both series remain closely correlated, albeit at a lower magnitude. The second half of table 1 shows correlation coefficients of absolute deviations, which are a measure of short-term volatility. It is better to analyse absolute deviations rather than squared deviations because the latter would give too much weight to outlying observations.
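The regime-split correlations of levels and of absolute deviations reported in table 1 can be computed along the following lines (a plain-Python sketch; `recession` stands for a hypothetical NBER-style indicator aligned with the growth rates):

```python
import math

def corr(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def regime_corrs(gdp, hours, recession):
    """Correlation of growth rates, and of absolute deviations from the
    regime mean (a proxy for short-term volatility), split by regime."""
    out = {}
    for label, flag in (("recession", True), ("normal", False)):
        g = [a for a, r in zip(gdp, recession) if r == flag]
        h = [b for b, r in zip(hours, recession) if r == flag]
        mg, mh = sum(g) / len(g), sum(h) / len(h)
        out[label] = {
            "levels": corr(g, h),
            "abs_dev": corr([abs(a - mg) for a in g],
                            [abs(b - mh) for b in h]),
        }
    return out
```

Applied to the first-differenced log series, this reproduces the structure of table 1: one levels correlation and one absolute-deviation correlation per regime.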
The picture is similar to that for the levels; short-term volatilities are highly correlated during recessions while remaining significantly correlated in normal times. This suggests that modelling dependencies in scale parameters should be helpful.

Note: The data are taken in first differences of logs.

To filter out short-term fluctuations and get a better idea of the underlying trends in the mean and volatility of each series, figure 5 shows six-year moving averages of the levels and squares of the series. While there is a strong similarity in the underlying trends of both levels and squares, there are some periods of deviation. These deviations can be captured by the idiosyncratic trend featuring in each parameter. GDP and the index of total aggregate weekly hours are systematically revised over time because the data used for their compilation accrue gradually. Therefore, to investigate the forecasting performance of the model in real time it is necessary to estimate the model recursively using successive vintages of the data. Such vintages are produced by the Federal Reserve Bank of Philadelphia. The index of total aggregate weekly hours is the most timely series. It is released at the beginning of each month and relates to the previous month. GDP is released at the end of the month following the quarter it relates to. This yields four rounds of revisions in between GDP releases, that is, four nowcasting steps. Table 2 illustrates the vintages of the data available at different steps from early February to late April 2020. × indicates a figure relating to the month specified in the first column; for GDP it is a quarterly figure relating to the past three months. The first nowcasting step is at the end of the first month in the quarter, when the previous-quarter GDP figure is released. The second step is at the beginning of the second month in the quarter, when the monthly figure for the index of total aggregate weekly hours is released.
The next two steps are approximately one month apart, when figures for the index of total aggregate weekly hours are released. I compare six bivariate models to investigate the potential gains from modelling common factors in scale and shape parameters for nowcasting performance. All models feature a dynamic factor structure in their locations, but the specifications for the conditional distributions and the scale and shape parameters vary. The most flexible specification is model (a), which features dynamic volatility and shape (DVS). The related series is thus used in a timely way to adapt both the dispersion and the shape of density nowcasts. Each series follows a distinct Skew t-distribution. This model is labelled DVS t. As discussed in section 2.4, modelling a common factor in the shape parameters requires aggregating the related series into rolling quarterly figures. Model (b) is a constrained version of model (a) where each series is modelled as Skew normal instead of Skew t. This puts more weight on sharp movements in the related series. Model (b) is thus labelled simply DVS. Model (c) is a dynamic volatility (DV) model with no skewness. The employment series is used monthly and follows a Student's t-distribution. GDP growth, on the other hand, is constrained to follow a normal distribution so that its scale parameter can be disaggregated temporally using approximation (28). Model (c) is labelled DV t. Model (d) is a constrained version of model (c) where the employment series is constrained to be Gaussian; it is thus labelled DV. The relative performance of models (a) and (b) compared to models (c) and (d) thus provides an indication of the benefit stemming from modelling a common factor in the shape parameters. Finally, models (e) and (f) are constant volatility versions of models (c) and (d) respectively and are referred to as the benchmark and t models. Table 4 summarises the different model specifications.
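As a compact summary of the descriptions above, the six specifications can be encoded as a small configuration table (labels as in the text; an illustrative sketch, not the paper's code):

```python
# Each entry: conditional distributions of (GDP, related series), whether the
# scale and shape parameters are dynamic, and the sampling frequency at which
# the related employment series enters the model.
MODELS = {
    "DVS t":     {"dist": ("Skew t", "Skew t"),           "dyn_scale": True,  "dyn_shape": True,  "related": "rolling quarterly"},
    "DVS":       {"dist": ("Skew normal", "Skew normal"), "dyn_scale": True,  "dyn_shape": True,  "related": "rolling quarterly"},
    "DV t":      {"dist": ("normal", "Student's t"),      "dyn_scale": True,  "dyn_shape": False, "related": "monthly"},
    "DV":        {"dist": ("normal", "normal"),           "dyn_scale": True,  "dyn_shape": False, "related": "monthly"},
    "t":         {"dist": ("normal", "Student's t"),      "dyn_scale": False, "dyn_shape": False, "related": "monthly"},
    "benchmark": {"dist": ("normal", "normal"),           "dyn_scale": False, "dyn_shape": False, "related": "monthly"},
}
```

Only the two DVS models use a rolling quarterly related series, which is why their total log likelihoods are not directly comparable to those of the monthly models.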
In this section I estimate all six models using the latest vintage available in this study (July 2021). While this in-sample analysis cannot be used to infer the models' relative performances in real time, it serves two purposes. First, it is useful for illustrating the benefit of using a copula, which captures cross-sectional dependencies in the prediction errors. Second, it is appropriate for studying the profiles that the dynamic parameters take over time, especially scales and shapes. Table 3 shows estimation results and information criteria for each model. The total log likelihood of the model when using a copula is also compared to the total log likelihood when assuming conditional cross-sectional independence (independence copula). When comparing the total log likelihoods and the information criteria across models, it is important to bear in mind that the rolling quarterly models (a) and (b) cannot be compared to the other models because they use different data: the related series is rolling quarterly whereas it is monthly in the other models. The log likelihood attached to GDP, however, is comparable across all models. Introducing dynamic volatility and fat tails separately generates very large gains in the model log likelihood, which can be observed by comparing the benchmark model with the Student's t and dynamic volatility models. Unsurprisingly, the model with both features is favoured by all information criteria. However, while the Skew t rolling quarterly model with dynamic volatility and shape yields a significant improvement in the total log likelihood, the log likelihood attached to GDP growth is lower than with the Skew normal model. Using a copula yields a very large increase in the total log likelihood for each model specification, suggesting important cross-sectional dependencies in the prediction errors.
The copula comes at the cost of introducing only two additional parameters: the copula dependence parameter θ and the number of degrees of freedom attached to the Student's t copula, ν_cop. The dependence across prediction errors is greater in the rolling quarterly model, with a dependence parameter of about 0.4 compared to around 0.2 for the models featuring a monthly related series.

The common scale component provides a consistent historical picture of forecasting uncertainty

The left panels of figure 6 show the estimated dynamic scale of GDP growth alongside the scale common factor. The estimates are retrieved from model (a), which features Skew t series. The scale parameter shows an overall decrease in volatility starting in the second half of the eighties, which has been documented extensively in the macroeconomics literature since McConnell and Perez-Quiros (2000). Separately, large spikes occur during the recessions triggered by the 2007 financial crisis and lately by the coronavirus pandemic. These largely come from the common component. While the common component at the time of the financial crisis reaches levels not seen since the seventies and early eighties, its level during the Covid-19 induced recession is unprecedented. Overall the common volatility factor carries economic meaning and gives a consistent historical picture of forecasting uncertainty.

Downturns are associated with rising negative skewness and recoveries with positive skewness

The right panels of figure 6 show the GDP growth shape parameter and the shape common component, again resulting from model (a). The increase in forecasting uncertainty when entering recessions, discussed above, is generally associated with a sharp increase in the skewness parameter: the probability of negative prediction errors increases while positive prediction errors become less likely. This movement in the skewness parameter is then reversed during recoveries.
The picture of the shape parameter over time is consistent with the findings of Carriero et al. (2020) and Delle-Monache et al. (2020), who show that, while statistical evidence for skewness in GDP growth is generally weak, this masks an erratic behaviour with periods of positive and negative skewness.

7 Real-Time Exercise Covering the Pandemic

I estimate the models recursively using real-time data since the beginning of the pandemic. The first vintage used for estimation is the March 2020 vintage, in which employment data for February 2020 have just been published. This is effectively the last month before the effects of the pandemic started to emerge in the data. The model is re-estimated each month following the release of new data. It is thus possible to investigate whether capturing dependencies across series in shape parameters is useful for producing accurate density nowcasts of US GDP growth. The GDP density nowcast in the last step of the nowcasting window is directly given by the one-step ahead prediction error density of GDP at the predicted location. But for earlier steps it is important to account for the uncertainty induced by missing values in the related series. This is done by drawing vectors of observations for each period with missing observations and using the score driven recursion to retrieve the time-varying parameters in the next period, which are then used to generate new observations. Specifically, if the target is the GDP nowcast in period t, but the related series is missing from t − 1, then the one-step ahead joint density of the related series at t − 1 is used to draw prediction errors centred around the location estimate in t − 1. The score driven recursion is then applied separately to each prediction to retrieve sets of scale, shape and location parameters for period t. These new parameters yield one-step ahead densities which are used to draw prediction errors around the locations in t.
Eventually, each vector of prediction errors in t yields a GDP nowcast for t through the filtering step of the score driven recursion. The density nowcast is given by the empirical density attached to these GDP nowcasts. Figure 7 shows the real-time nowcasted conditional mean and its associated 90% confidence interval for each model. There are four nowcasting steps per quarter. The grey line shows the first release of GDP. First, the models with dynamic volatility and with both dynamic volatility and shape do significantly better than the models with constant scale and shape. This is most striking during the recession and the first quarter of the recovery. The model with constant scale and shape and Student's t series does especially poorly; the conditional mean and associated uncertainty barely adjust during the pandemic. In a score driven setting, fat tails can hinder the ability of a model to adjust quickly to the data because the score downweights the effect of outlying observations, as illustrated in figure 3. However, this is less of an issue when scale and shape parameters are allowed to vary. While the models with dynamic volatility, (c) and (d), yield better forecasts during the second quarter of last year, the models with dynamic shape are the only models which capture the onset of the pandemic. This is more clearly visible in figure 8, which shows the density nowcasts derived at the last step of the nowcasting window, that is approximately four weeks before the release of the first estimate of GDP growth. The models with dynamic shape show a clear negative skewness, indicating an increase in the probability of negative prediction errors. Finally, the density nowcasts are compared across models using the average log score, which is given by the nowcasted log density evaluated at the observed value (the first estimate of GDP). The better the probabilistic forecast, the greater the log score.
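The two evaluation metrics used below can be sketched in a few lines (plain Python; the log densities would come from evaluating each model's nowcast density at the released figure):

```python
def average_log_score(log_densities):
    """Average of the nowcasted log densities evaluated at the observed
    first-release GDP figures; higher values mean better density nowcasts."""
    return sum(log_densities) / len(log_densities)

def mean_absolute_error(nowcasts, outturns):
    """Mean absolute error of the point nowcasts (predicted conditional
    means) against the first-release figures; lower is better."""
    return sum(abs(f - y) for f, y in zip(nowcasts, outturns)) / len(nowcasts)
```

Computing both metrics separately per nowcasting step and per phase (recession versus recovery) yields the breakdown reported in figure 9.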
The point forecasts, given by the predicted conditional means, are evaluated using the mean absolute error. Figure 9 shows the average log score and mean absolute error at each nowcasting step during the quarter. Step four follows the release of the previous-quarter GDP while step one (the last nowcasting step) follows the release of employment data for the last month of the quarter. The results are split between the recession phase (the first two quarters of 2020) and the recovery phase (the following four quarters). As more data accrue during the quarter, the average log score should increase monotonically while the mean absolute error should decrease. However, while predictions are generally better towards the end of the nowcasting window, they do not exhibit a monotonic improvement. This can be explained by the relatively low number of series modelled and the resulting low number of release dates during the quarter. Nevertheless, the average log score and the mean absolute error remain useful for comparing the models' predictive capacities. In line with the previous results, the models with dynamic shape tend to yield better probabilistic predictions during the recession. This is especially apparent towards the end of the nowcasting window. This result is driven by the singular ability of the models with dynamic shape to capture the beginning of the pandemic. The models featuring only dynamic volatility, however, do better during the recovery phase. Here using monthly employment data as opposed to rolling quarterly figures seems to matter most. It is therefore not possible to find a model which ranks best during each period and nowcasting step, suggesting that combining models would be beneficial.

This paper derives a novel means of capturing changes in the dispersion and asymmetry of GDP nowcast uncertainty in a timely way.
The approach relies on modelling cross-sectional relationships in scale and shape parameters, which control the dispersion and asymmetry of density forecasts, in series sampled at different frequencies and released asynchronously. A timely signal on GDP growth nowcast uncertainty is extracted from employment data using a bivariate model where GDP and employment can have distinct conditional distributions. Their location, scale and shape parameters are decomposed into secular changes and common variations using score driven techniques. Residual dependencies across the two series, which inevitably occur when unforeseen adverse events hit the economy, are modelled through a copula for estimation. A pseudo real-time exercise shows that the model is able to extract from employment data a timely indication of the asymmetry attached to US GDP nowcast uncertainty during the onset of the coronavirus pandemic.

The score with respect to the scale parameters is

∆^σ_{i,t} = ∂ log f_i(y_{i,t} | Y_{t−1}) / ∂σ_{i,t} = [(ν_{i,1} + 1)(1 + ν_{i,1}(2α_{i,t}σ_{i,t}K(ν_{i,1})/(y_{i,t} − μ_{i,t}))²)^{−1} − 1]/σ_{i,t} 1(y_{i,t} < μ_{i,t}) + [(ν_{i,2} + 1)(1 + ν_{i,2}(2(1 − α_{i,t})σ_{i,t}K(ν_{i,2})/(y_{i,t} − μ_{i,t}))²)^{−1} − 1]/σ_{i,t} 1(y_{i,t} > μ_{i,t}).

The element of the information matrix corresponding to the scale parameters is given by

I^σ_{i,t} = E[∆^σ_{i,t}∆^σ_{i,t} | Y_{t−1}] = [2α_{i,t}ν_{i,1}/(ν_{i,1} + 3) + 2(1 − α_{i,t})ν_{i,2}/(ν_{i,2} + 3)]/σ²_{i,t}.

The score with respect to the shape parameters is

∆^α_{i,t} = [(ν_{i,1} + 1)/α_{i,t}](1 + ν_{i,1}(2α_{i,t}σ_{i,t}K(ν_{i,1})/(y_{i,t} − μ_{i,t}))²)^{−1} 1(y_{i,t} < μ_{i,t}) − [(ν_{i,2} + 1)/(1 − α_{i,t})](1 + ν_{i,2}(2(1 − α_{i,t})σ_{i,t}K(ν_{i,2})/(y_{i,t} − μ_{i,t}))²)^{−1} 1(y_{i,t} > μ_{i,t}).

The element of the information matrix corresponding to the shape parameters is given by

I^α_{i,t} = E[∆^α_{i,t}∆^α_{i,t} | Y_{t−1}] = 3[(ν_{i,1} + 1)/(α_{i,t}(ν_{i,1} + 3)) + (ν_{i,2} + 1)/((1 − α_{i,t})(ν_{i,2} + 3))].

The formulae for the information matrix can be found in Zhu and Galbraith (2010). Finally, defining the vector a_{i,t} = (μ_{i,t}, σ_{i,t}, α_{i,t})′, the score with respect to the time-varying parameters of series i collects the three elements ∇_{i,t} = (∆^μ_{i,t}, ∆^σ_{i,t}, ∆^α_{i,t})′, while the scaling matrix is the inverse of the corresponding information matrix.

Quarterly GDP; monthly IWH. Note: The Skew normal distribution here is derived by constraining the AST distribution to have Gaussian tail parameters (ν_1 = ν_2 = ∞). The Skew Student's t is derived by constraining the AST distribution to have a unique tail parameter (ν_1 = ν_2).
The Student's t is derived by constraining the AST distribution to have Gaussian shape and tail parameters (α = 0.5 and ν_1 = ν_2 = ∞). IWH = Index of Total Aggregate Weekly Hours.

References

Vulnerable growth
Tracking the slowdown in long-run GDP growth
Advances in Nowcasting Economic Activity: Secular Trends, Large Shocks and New Data
Now-Casting and the Real-Time Data Flow
A new class of robust observation-driven models
Weighted maximum likelihood for dynamic factor analysis and forecasting with mixed frequency data
Filtering and smoothing with score-driven models
Common drifting volatility in large Bayesian VARs
Measuring uncertainty and its impact on the economy
Assessing international commonality in macroeconomic uncertainty and its effects
Generalized autoregressive score models with applications
Observation-driven mixed-measurement dynamic factor models with an application to credit risk
Modelling and Forecasting Macroeconomic Downside Risk
Adaptive models and heavy tails with an application to inflation forecasting
Business cycle dynamics after the great recession: An extended Markov-switching dynamic factor model
A quasi-maximum likelihood approach for large, approximate dynamic factor models
Time series analysis by state space methods
High-frequency monitoring of growth at risk
The generalized dynamic-factor model: Identification and estimation
MIDAS regressions: Further results and new directions
Nowcasting: The real-time informational content of macroeconomic data
Forecasting economic time series using score-driven dynamic models with mixed-data sampling
Forecasting, structural time series models and the Kalman filter
Dynamic Models for Volatility and Heavy Tails: With Applications to Financial and Economic Time Series
Density forecasting using Bayesian global vector autoregressions with stochastic volatility
Predicting time-varying parameters with parameter-driven and observation-driven models
Temporal disaggregation of overlapping noisy quarterly data: estimation of monthly output from UK value-added tax data
Score-driven exponentially weighted moving averages and value-at-risk forecasting
Short-term GDP forecasting with a mixed-frequency dynamic factor model with stochastic volatility
A new coincident index of business cycles based on monthly and quarterly series
Output fluctuations in the United States: What has changed since the early 1980's?
An indicator of monthly GDP and an early estimate of quarterly GDP growth
Modelling asymmetric exchange rate dependence
A MIDAS approach to modeling first and second moment dynamics
The construction of an index of business activity
A monthly indicator of GDP
New indexes of coincident and leading economic indicators
Forecasting using principal components from a large number of predictors
A generalized asymmetric Student-t distribution with application to financial econometrics

The score with respect to the location parameters and the corresponding elements of the information matrix take analogous forms and can be found in Zhu and Galbraith (2010).

Appendix B Model specifications