key: cord-0679601-qww3ptxo
authors: Tian, Weidong
title: Long Run Law and Entropy
date: 2021-11-11
journal: nan
DOI: nan
sha: e208242a10431f0616497b087d40ccc2f69d1ff4
doc_id: 679601
cord_uid: qww3ptxo

This paper demonstrates the additive and multiplicative versions of a long-run law of unexpected shocks for any economic variable. We derive these long-run laws from martingale theory without relying on stationarity and ergodicity conditions. We apply these long-run laws to the asset return, risk-adjusted asset return, and pricing kernel processes and derive new asset pricing implications. Moreover, we introduce several dynamic long-term measures on the pricing kernel process, which rely on the sample data of asset returns. Finally, we use these long-term measures to diagnose leading asset pricing models.

One of the central assumptions in many leading economics and finance theories is the stationary and ergodic condition for the underlying economic variable(s). This paper presents long-run (asymptotic) properties for a general economic variable by the martingale theory without relying on the stationary and ergodic conditions. 1 We develop a long-run theory of unexpected shocks and derive its novel implications from a long-term perspective.

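Specifically, given an economic variable Y that is represented by a process (Y t ), we investigate the following two processes,

$U_n(Y) = \frac{1}{n}\sum_{t=1}^{n}\left(Y_t - E_{t-1}[Y_t]\right)$, $V_n(Y) = \left(\prod_{t=1}^{n}\frac{Y_t}{E_{t-1}[Y_t]}\right)^{1/n}$, n ≥ 1.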

Here Y t − E t−1 [Y t ] is the unexpected shock between time t − 1 and t, U n (Y ) is the arithmetic average of the unexpected shocks of a variable Y , and V n (Y ) is its multiplicative variation (geometric average). In the additive long-run law, we show that U n (Y ) converges to zero under the condition that "the unconditional variances of the unexpected shocks are bounded from above by a finite positive number". We also show that this condition is both sufficient and necessary for a meaningful long-run property once the stationary and ergodic condition is relaxed.

This long-run law about U n (Y ) is motivated by the additive Doob-Meyer decomposition of a general stochastic process (Y t ) as follows,

$Y_n - Y_0 = \sum_{t=1}^{n}\left(Y_t - E_{t-1}[Y_t]\right) + \sum_{t=1}^{n}\left(E_{t-1}[Y_t] - Y_{t-1}\right)$.

1 Despite the great success of ergodic conditions in the literature, several nonergodic and nonstationary models have derived important implications for economics and finance. See, for instance, Durlauf (1993), who investigates a nonergodic economy, and Weitzman (2007), who studies asset pricing implications in a nonstationary model.

Since the arithmetic average of the martingale (the first term) component in this decomposition converges to zero, the predictable component (the second term) describes the long-run property of the process (Y t ).

This long-run law differs from the long-run (additive) theory in Beveridge and Nelson (1981), Hansen and Scheinkman (2009), and Hansen (2012) in several respects. First, the construction of a permanent (martingale) component in the previous literature relies on specific technical conditions, such as an underlying ergodic factor, even though the variable Y itself is not ergodic.

Second, the long-run law for U n (Y ) concerns conditional expectations and forecasting, while the previous literature is mainly about the unconditional element. 2 Third, the long-run law offers a concrete convergence rate, and finally, the long-run law implies new long-term measures.

Similarly, V n (Y ) is derived from the following multiplicative Doob-Meyer decomposition,

Third, for the stochastic discount factor process m = (m t ), the long-run law motivates a long-term measure, π(m; s), defined as a limit as n → ∞. This paper shows that this dynamic measure is bounded (from above or below) by samples of asset returns. In contrast with the widely studied one-period (or conditional) measures of the stochastic discount factor in the literature (see, e.g., Hansen and Jagannathan (1991), Snow (1991), Bansal and Lehmann (1997), Alvarez and Jermann (2005), and Liu (2021)), π(m; s) is defined for the entire stochastic discount factor process.

Fourth, for the risk-adjusted asset return process, the long-run law implies that $(m_1 R_1 \cdots m_n R_n)^{1/n}$ converges to $e^{-z_\infty(mR)}$, whereas Martin (2012) shows that $m_1 R_1 \cdots m_n R_n$ converges to zero in a generic sense. Moreover, we show that the Cesaro sum, $\frac{m_1 R_1 + \cdots + m_n R_n}{n}$, converges to one almost surely, even though $m_n R_n$ diverges in general.

Fifth, the paper characterizes the long-term entropy of the pricing kernel process in terms of the long-term sample excess return (continuously compounded). Therefore, the long-term entropy z ∞ (m) is independent of the specification of the pricing kernel process; instead, it depends only on the sample excess return of assets. Moreover, we demonstrate the relationship between the long-term entropy and other established long-term measures, such as those in Hansen (2012), Backus, Chernov, and Zin (2014), and Dybvig, Ingersoll, and Ross (1996).

Finally, under certain conditions on the pricing kernel process, we show the existence of the long-term short rate, without stationary and ergodic assumptions on the interest rate process.

Sixth, we apply these new long-term measures to several leading asset pricing models. For the first long-term measure, π(m; s), we find that the long-run risk model (Bansal and Yaron (2004)) performs better than the disaster model (Backus, Chernov, and Martin (2011)). However, with the second long-term measure z ∞ (m), the disaster model performs better than the long-run risk model. Moreover, the internal habit model (Campbell and Cochrane (1991)) is comparable to the disaster model. 4 Overall, our empirical results are consistent with several recent key observations that the conditional variance of the stochastic discount factor should contain some non-stationary and non-linear factors.

The remainder of the paper is structured as follows. We present an additive version of the long-run law of unexpected shocks in Section 2, together with several variations of this long-run law. Section 3 presents applications of the long-run law to asset returns. Section 4 shows the applications to the stochastic discount factor and risk-adjusted asset returns. In Section 5 we present a multiplicative version of the long-run law and characterize the long-term entropy. Section 6 concludes, and technical developments are in the Appendix. More technical details are given in the Online Appendix.

This paper considers a discrete-time economy with an infinite time horizon, t = 1, 2, · · · . The state of nature is represented by (Ω, F , (F t ), P ), where F t denotes the set of all available information up to time t, F = (F t ) is a filtration of sigma-algebras F t , and P is a probability measure. E t [·] denotes the expectation conditional on information available at time t when no misunderstanding may arise.

An economic variable Y is represented by an F -adapted process (Y t ). In this framework, an economic variable can be observable, such as an asset price, asset price return, consumption level or growth rate, interest rate, inflation, weather, or mortality data; it can also be unobservable, such as a stochastic discount factor, risk-adjusted asset return, or pricing kernel. If Y is observable, we call each Y t an observation at time t. If Y is not observable, Y t is the realization of the variable Y under certain model assumptions. For consistency, we call Y t the realized value at time t. Throughout this paper, the process (Y t ) satisfies the following assumption.

is not complete to judge these leading asset pricing models. Instead, our comparison exercise justifies to some extent the extensions of these models as in recent literature. 

is the best forecast of Y t from the perspective of time t − 1, the forecast is finite by Assumption I. The difference Y t − E t−1 [Y t ] between the realized value and its forecast is the one-step-ahead forecasting error, representing the unexpected shock between time t − 1 and time t. In this paper, we use the terms unexpected shock, shock, and martingale difference interchangeably.

Define a sequence of random variables,

$U_n(Y) = \frac{1}{n}\sum_{t=1}^{n}\left(Y_t - E_{t-1}[Y_t]\right)$, n ≥ 1,

as the arithmetic average of all one-step-ahead forecasting errors up to time n. The main result of this section is an asymptotic property of U n (Y ) when n → ∞, a long-run law of the unexpected shocks.
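As a minimal numerical sketch of this definition (not part of the paper), the arithmetic average of forecasting errors can be computed from a list of realized values and a list of the forecasts made one period earlier. The function name `average_shock` and the toy numbers are hypothetical.

```python
from typing import Sequence


def average_shock(realized: Sequence[float], forecast: Sequence[float]) -> float:
    """Return U_n(Y) = (1/n) * sum_t (Y_t - E_{t-1}[Y_t])."""
    if len(realized) != len(forecast) or len(realized) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    # Each shock is the realized value minus the forecast made one period earlier.
    shocks = [y - f for y, f in zip(realized, forecast)]
    return sum(shocks) / len(shocks)


# Toy data (hypothetical): four realized values and their one-step-ahead forecasts.
realized = [1.0, 3.0, 2.0, 4.0]
forecast = [2.0, 2.0, 2.0, 2.0]
u4 = average_shock(realized, forecast)  # the shocks are -1, 1, 0, 2
```

In this toy sample the shocks sum to 2, so the average over four periods is 0.5.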

To guarantee the existence of the long-run law, the following assumption is imposed.

Proposition 1 Under Assumptions I and II for (P, Y ), for any ε > 0,

$\lim_{n\to\infty} n^{\frac{1}{2}-\varepsilon}\, U_n(Y) = 0$, P-a.s.

In particular, $U_n(Y) \to 0$, P-a.s.

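Assumption II in Proposition 1 is not only sufficient but also necessary to guarantee the long-run law in general. To demonstrate this, let Y n = nζ n , where the ζ n are IID N (0, 1). Since $\frac{\zeta_1 + 2\zeta_2 + \cdots + n\zeta_n}{n} \sim N\left(0, \frac{(n+1)(2n+1)}{6n}\right)$ and $\frac{(n+1)(2n+1)}{6n} \to \infty$, the sequence of normal random variables $\frac{\zeta_1 + 2\zeta_2 + \cdots + n\zeta_n}{n}$ does not converge to zero almost surely.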

Actually, this sequence does not converge to any random variable almost surely since a limit of normal random variables is a normal random variable. Moreover, the central limit theorem implies that the number $\frac{1}{2}$ in Proposition 1 is the best possible exponent.
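The counterexample can be checked numerically. The sketch below is an illustration rather than the paper's code: it compares the variance of the averaged shocks for Y n = nζ n with a Monte Carlo estimate. The function names, the seed, and the number of draws are arbitrary choices.

```python
import random
import statistics


def theoretical_variance(n: int) -> float:
    """Variance of (1*z_1 + 2*z_2 + ... + n*z_n) / n for IID N(0, 1) z_t."""
    return (n + 1) * (2 * n + 1) / (6 * n)


def simulate_average_shock(n: int, rng: random.Random) -> float:
    """One draw of U_n(Y) for Y_t = t * z_t, whose forecast E_{t-1}[Y_t] is 0."""
    return sum(t * rng.gauss(0.0, 1.0) for t in range(1, n + 1)) / n


rng = random.Random(0)  # fixed seed so that runs are reproducible
for n in (10, 100, 1000):
    draws = [simulate_average_shock(n, rng) for _ in range(500)]
    # Theoretical and simulated variances of U_n side by side; both grow with n.
    print(n, theoretical_variance(n), statistics.pvariance(draws))
```

Because the variance of the average grows roughly like n/3, the averaged shocks spread out instead of settling near zero, which is the failure that Assumption II rules out.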

Assumption II is used to relax the stationary and ergodic assumptions in ergodic theory or numerous technical conditions in the strong law of large numbers. We use two examples to illustrate the long-run law in a nonstationary and nonergodic setting.

Assumption II holds if and only if |β| < 1. There are several ways to extend this standard GARCH(1,1) model to a nonergodic setting (see Kristensen (2009) for a characterization of the GARCH(p,q) model). For example, let $\epsilon_t = \sigma_t z_t$ with $z_t \sim N(0, \nu_t^2)$ and $z_t$ independent of $F_{t-1}$. In this case, the conditional variance of the shock is $Var_{t-1}(\epsilon_t) = \sigma_t^2 \nu_t^2$. Assumption II holds as long as the variance of $z_t$ is uniformly bounded from above and |β| < 1. As another example, let $\epsilon_t = \sigma_t z_t \omega_t$, where the $z_t$ are IID with $E[z_t] = 0$ and $E[z_t^2] = 1$, while $\omega_t$ is independent of the sigma-algebra generated by $\{F_{t-1}, z_t\}$ and $E_{t-1}[\omega_t^2] \le L$, ∀t. In each situation, $(\epsilon_t)$ is nonergodic but Assumption II holds, yielding the long-run law in Proposition 1 for the shocks.

Example 2.2 Consider a Bayesian learning model for (Y t ) with a predictive distribution p(Y t |θ t ) for a stochastic and unknown variable θ t . Assuming p(θ t |Y t ) is known, and F t is generated by {Y 1 , . . . , Y t }, we obtain the posterior distribution Y t+1 |F t = f (F t ) + ε t+1 . In some nonergodic settings with hidden, unknown parameters, the variance of the shock does not necessarily converge to zero but moves inside a finite range (Weitzman (2007), Bakshi and Skoulakis (2010)). In this case, Assumption II holds, and the arithmetic average of shocks converges to zero.

It should be emphasized that the uniform upper bound on the unconditional variance in Assumption II says nothing about the convergence of the conditional variance of the shocks. Clearly, Assumption II does not imply the convergence of the conditional variance.

Moreover, Assumption II could fail even though the conditional variance converges to zero almost surely. For example, let $\epsilon_t = \sigma_t z_t$ and $\sigma_t^2 = 3\sigma_{t-1}^2 z_{t-1}^2$. We assume that $z_t \sim N(0, 1)$. In this case, $Var_{t-1}(\epsilon_t) = \sigma_t^2$ and $E[\sigma_t^2] = 3^t \to \infty$; however, $Var_{t-1}(\epsilon_t) \to 0$, P-a.s. (Nelson (1990)).
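The two claims in this example can be illustrated with a short sketch. It assumes, beyond what the text states, that the z t are independent draws and that σ 0 2 = 1, and it uses the standard argument that the almost-sure behaviour of σ t 2 is governed by the sign of E[log(3z 2 )]. The names `mean_log_growth` and `expected_variance` are hypothetical.

```python
import math
import random

rng = random.Random(1)  # fixed seed so that runs are reproducible

# log(sigma_t^2) is a random walk with increments log(3 * z^2).
# Its drift E[log(3 z^2)] is estimated here by Monte Carlo; the tiny floor
# only guards against the (practically impossible) draw z == 0.
samples = [
    math.log(3.0 * max(rng.gauss(0.0, 1.0) ** 2, 1e-300))
    for _ in range(200_000)
]
mean_log_growth = sum(samples) / len(samples)


def expected_variance(t: int, sigma0_sq: float = 1.0) -> float:
    """E[sigma_t^2] = 3^t * sigma_0^2, because E[z^2] = 1 for each factor."""
    return sigma0_sq * 3.0 ** t
```

A negative `mean_log_growth` means that log σ t 2 drifts to minus infinity along almost every path, while `expected_variance(t)` still grows without bound, so the unconditional variances cannot be uniformly bounded.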

This subsection presents several alternative versions of the long-run law of shocks.

Proposition 2 Let 0 < a ≤ 1, under Assumption I and Assumption II, the following equation holds.

Proposition 2 states that the weighted average of unexpected shocks converges to zero when a higher weight is placed on later (more recent) sample data. Nagel and Xu (2021) demonstrate the implications of forming beliefs with higher weights on more recent observations (see Section 3 for the application of the long-run law to expectation formation).

The next one is a long-run law of unexpected shock under higher moments.

Then, for any ε > 0,

in $L^4(\Omega, P)$.

in which the predictable component (A n ) determines the long-term rate of (Y t ).

5 Specifically, in addition to certain Lindeberg condition for MCLT, the series

≤ L, ∀t for two positive numbers l and L, this condition for MCLT and Assumption II are satisfied. But the condition for MCLT rules out the case that E t−1 [V ar(Y t )] decays fast in the long run, and a fast decay rate of the unconditional variances of shocks leads to a better convergence rate of the long-run law.

Given a general non-stationary process (Y n ), besides the Doob-Meyer decomposition, there are a number of ways to identify shocks with a permanent martingale component. See, for instance, Beveridge and Nelson (1981) and Hansen (2012). In a Markov environment with state variable (X t ) and under certain conditions, Hansen (2012, Theorem 3.1) shows

where (M t ) is a martingale permanent component and the second component, g(X n )−g(X 0 ), is stationary. It is shown that M n dominates the fluctuation of Y n over long time horizons.

The number ν represents the trend of the time series data Y n . By Proposition 4,

Equation (5) and (7) demonstrate the difference between these two martingale decomposi- n , thus the long-term rate of Y . In contrast, the martingale component in the Doob-Meyer decomposition enables us to characterize the unexpected shocks,

Hence, by Proposition 4, the long-term rate of Y exists if lim n→∞

To proceed, we use a number of conventions to keep the notation consistent in applications. (i) R t,t+1 or R t+1 denotes any risky asset's gross return over the period t to t + 1, and R f,t the risk-free gross return over the same time period. The risky asset can be an equity, an equity index, or a portfolio. In general, R t,t+n is the gross return over the n-period horizon t to t + n. (ii) (M t ) denotes a pricing kernel (or state price density) process with M 0 = 1, and the financial market might be incomplete. Similarly, $m_t = \frac{M_t}{M_{t-1}}$, t ≥ 1, denotes the stochastic discount factor over the period t − 1 to t.

The continuously compounded short rate at time t is written as r f,t = − log D(t, t + 1). (iv) Finally, the continuously compounded return of a risky asset over the period t to t + 1 is written as r t+1 = log R t+1 .

This section presents several implications, in the form of "corollaries", of the long-run law of unexpected shocks to asset returns in a financial market from an asymptotic perspective.

We start with a reformulation of Proposition 1 as follows.

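Corollary 3.1 Under Assumptions I and II for a return process (R t ) and a probability measure P ,

$\lim_{n\to\infty}\left(\frac{R_1 + \cdots + R_n}{n} - \frac{1}{n}\sum_{t=1}^{n} E_{t-1}[R_t]\right) = 0$, P-a.s.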

Despite being an innocuous restatement of Proposition 1, Corollary 3.1 has interesting implications for asset pricing. In the above expression, the first term $\frac{R_1 + \cdots + R_n}{n}$ is the average of the realized sample data available to a long-lived agent, so it is termed the sample mean. Its limit (if it exists) is the long-term sample mean. On the other hand, the second term $\frac{1}{n}\sum_{t=1}^{n} E_{t-1}[R_t]$ depends on the probability measure (belief) P and the distribution (model) of the asset return. To distinguish it, we name its limit the long-term expected return under belief P and the assumption on the asset return. Corollary 3.1 states that the long-term expected return under any belief and model assumption equals the long-term sample mean.
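The statement can be illustrated on simulated data. The sketch below uses a hypothetical nonstationary return process (a slowly moving drift, dependence on the last return, and bounded time-varying volatility); none of these specifications come from the paper. It compares the sample mean with the average of the one-step-ahead conditional expectations.

```python
import math
import random

rng = random.Random(2)  # fixed seed so that runs are reproducible
n = 20_000
sum_realized = 0.0
sum_expected = 0.0
prev_return = 0.0
for t in range(1, n + 1):
    # Hypothetical conditional mean: a deterministic drift plus dependence on the last return.
    cond_mean = 0.01 * math.sin(t / 500.0) + 0.3 * prev_return
    # Hypothetical volatility, time-varying but bounded by 0.05, so Assumption II holds.
    cond_vol = 0.02 + 0.03 * abs(math.cos(t / 300.0))
    realized = cond_mean + cond_vol * rng.gauss(0.0, 1.0)
    sum_realized += realized
    sum_expected += cond_mean
    prev_return = realized

sample_mean = sum_realized / n        # (R_1 + ... + R_n) / n
average_expected = sum_expected / n   # (1/n) * sum_t E_{t-1}[R_t]
gap = sample_mean - average_expected  # equals U_n(R), the average unexpected shock
```

The gap is the average of the shocks, so under the bounded-variance assumption it shrinks toward zero as n grows, even though neither average needs to come from a stationary process.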

Corollary 3.2 Under Assumption I and II for a return process (R t ) and a probability measure P , then (if at least one limit exists)

Corollary 3.2 states that the long-term expected excess return equals the long-term sample excess return.

From an empirical perspective, the sample (arithmetic) average process is more stable than the asset return process (R t ) itself. For instance, the standard deviation of the sample (arithmetic) average of excess returns is 0.32% for daily returns (from 1962 to 2020) and 1.11% for monthly returns (from 1926 to 2020), which supports the existence of a long-term sample excess return (see the details in the Online Appendix).

A long-lived agent is able to compute the long-term sample excess return; she can then use Corollary 3.2 to check whether a model is sensible with respect to the expected return.

To estimate a model-free long-term expected return, we follow Martin (2017) and use a model-free lower bound on the expected return computed from available derivative (S&P 500 index option) data. Specifically, under Martin (2017)'s negative correlation condition (NCC), and letting Q be a risk-neutral probability measure, Martin (2017) shows 6

Hence, under Assumptions I and II for P and (R t ), but without any model assumption about the asset return, NCC and Corollary 3.2 imply

While Equation (8) can be verified by a long-lived agent, a short-lived agent is only able to approximate it by a large sample of available data. Still, Equation (8) is useful with available sample data. As an illustration, Figure 

where S t denotes the index price at t, F t,t+1 is the index's future value at time t with maturity t + 1, and Call or Put represent the index call or put option. We follow the same method in Martin (2017) to compute the integrals on the right hand using market available index options.

So far, we have not discussed the role of the probability measure P in the long-run law. A decision-maker can form (subjective) probabilistic expectations by using historical data (empirical probability) or survey respondent probabilities (Manski (2001)). Understanding belief and subjective expectation formation from data has attracted considerable interest recently in asset pricing. In this subsection, we study the difference between the subjective and objective expectations in the long run.

Let P̃ and Ẽ represent the subjective probability and the corresponding expectation. By contrast, the objective probability is denoted by P . The next result builds a link between the subjective expectation and the objective expectation in the long run as follows.

This corollary follows directly from Proposition 1 by comparing each term with the realized return R t . It states that the long-run difference between any subjective expectation and the objective expectation is zero, as long as these expectations do not move too significantly (Assumption II holds for both P and P̃ ). Malmendier and Nagel (2011, 2016) demonstrate the difference in inflation between the subjective and objective perspectives. Empirically, Nagel and Xu (2021) also demonstrate the difference between subjective and objective asset returns. 7 However, the difference between the long-term subjective expectation and the long-term objective expectation of an economic variable should be small and disappear in the long run. Furthermore, the long-term expected excess return is independent of the subjective probability (belief).

Since the long run law of unexpected shocks in Proposition 1 holds for any probability measure, it is natural to consider the risk-neutral probability measure in Proposition 1, assuming the existence of a risk-neutral probability measure Q in a financial market.

Corollary 3.4 Let Q be a risk-neutral probability measure. If, for one risky asset with return process (R t ), there exists a positive number L such that

then the long-term sample excess return of this asset is zero.

Under the risk-neutral probability measure Q, Assumption I is evident as

Since Assumption II holds for (Q, (R t )) by Equation (9), Corollary 3.4 follows directly from Proposition 1 and Corollary 3.2 (for the risk-neutral measure). However, Corollary 3.4 seems counterintuitive for the following reason. Let us consider the market (index) return R as an example. On the one hand, the bottom panel of Figure 1 plots the time series of $Var^Q_t(R_{t+1})$ from 1996 to 2020. On average, the level of the risk-neutral variance is about 1.55 percent, and it takes significant values only in certain time periods. Therefore, it is reasonable to argue that Equation (9) holds for the market return. On the other hand, it is also empirically well established that the long-term sample excess return of the market index is positive (positive equity premium). Granted, Corollary 3.4 then implies that there exists a free lunch in the market, since there is no risk-neutral probability measure!

Example 3.1 Consider a financial market with one risky asset (index) and its return process

and {α t } is a deterministic sequence of real numbers in (0, 1). Assume that R 0 = 0 and that the risk-free interest rate is always zero. It is straightforward to see that Q is a unique martingale measure if and only if

In this example of Schachermayer (1994), Assumption II holds for (Q, (R t )) if and only if the series $\sum_{n=1}^{\infty}(1 - \alpha_n^2)$ is finite. By Kakutani's theorem (see Williams (1991, 12.7)) 8 , this infinite series is finite if and only if Q is equivalent to P . Put differently, if $\sum_{n=1}^{\infty}(1 - \alpha_n^2) = +\infty$, then Q and P are mutually singular; therefore, there is no equivalent martingale measure in this financial market. As a consequence, there is a free lunch with bounded risk. Hence, for this example, Assumption II holds if and only if there is no arbitrage opportunity.

is sufficiently large, for the market return process (R t ). According to Corollary 3.5, any long-lived agent must see either arbitrage opportunities in the equity market or significant equity market turmoil persistently.

This subsection presents an application to survey expectations: whether the survey return reflects a risk-neutral expectation or a pessimistic expected return. Adam, Matveev, and Nagel (2021) demonstrate empirically and robustly that both hypotheses are wrong. As an application of the long-run law, we provide an alternative theoretical argument for why these hypotheses are invalid, given that the long-term sample excess return of the market (or any risky asset) is positive.

For any agent i with a subjective probability measure P i , Q i represents her martingale measure. Here, we only use the martingale measure, not the stronger risk-neutral measure condition, and we do not need equivalence between the subjective probability measures. Actually, these subjective probability measures can be mutually singular. 9 Ẽ i [·] denotes the subjective expectation under P i .

Following Adam, Matveev, and Nagel (2021), write the following risk-neutral hypothesis and pessimistic hypothesis, respectively 10 ,

where the measurement error ε i t captures the fact that the agent empirically measures expectations with noise. Assume each noise ε i t ∈ F t and Ẽ i t−1 [ε i t ] = 0, and that the variances of ε i t are uniformly bounded from above for all t.

Corollary 3.6 Under Assumptions I and II for (P i , R t ), if the long-term excess return for (R t ) is positive for agent i, then the Risk-Neutral and Pessimistic Hypotheses both fail for agent i.

This section discusses the application of the long-run law to the pricing kernel process and the risk-adjusted asset return processes. The financial market is arbitrage-free, and there exists a pricing kernel or state-price density process (M t ) with a given probability measure P .

We start with the pricing kernel process. Since E t [m t+1 ] = D(t, t + 1), the next result follows from Proposition 1. 

In this subsection, we derive a long-run property of higher moments of pricing kernels, (2021)). Building on this property, we next introduce a long-term measure of the stochastic discount factor to compare several leading asset pricing models and discuss its applications.

where R = (R t ) runs through all asset return processes. If 0 < s < 1, then

Corollary 4.2 is closely related to Snow (1991) for s > 1 and Liu (2021) for s < 1. It states that the time series of higher moments of stochastic discount factors is bounded (from above or below) by the arithmetic average of the higher-order moments of asset returns.

11 It is possible to construct a non-ergodic term structure model. For example, in Ingersoll, Skelton, and Weil (1978), r f,t = r 0 + δN t , where N t is a Poisson process with intensity λ and jump size 1. A sequence (a n ) might diverge while its arithmetic average (Cesaro sum) converges. For example, let a n = 1 for even n and a n = 0 for odd n. Then $\lim_{n\to\infty}\frac{a_1 + \cdots + a_n}{n} = \frac{1}{2}$.
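The alternating sequence in the footnote is easy to check directly. The following sketch (the helper name `cesaro_means` is hypothetical) computes the running arithmetic averages of a sequence that itself never settles.

```python
def cesaro_means(sequence):
    """Yield the running arithmetic averages (a_1 + ... + a_n) / n."""
    total = 0.0
    for n, value in enumerate(sequence, start=1):
        total += value
        yield total / n


# a_n = 1 for even n and a_n = 0 for odd n, for n = 1, ..., 10000.
alternating = [1.0 if n % 2 == 0 else 0.0 for n in range(1, 10_001)]
means = list(cesaro_means(alternating))
```

The terms keep oscillating between 0 and 1, but the running averages equal exactly one half at every even n and approach one half from below at odd n.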

By Corollary 4.2, it is tempting to introduce a measure (assuming the limit exists), $\lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n} m_t^s$, to diagnose asset pricing models. To make good use of this measure, we need to estimate both the stochastic discount factors and the higher-order moments of asset returns. For the former, a specification of the pricing kernel is often derived from a representative agent's preferences and macroeconomic data such as consumption growth or market return data.

However, calculating the higher moments of asset returns is challenging since it depends on distributional assumptions on asset returns.

To avoid assumptions about the asset return distribution, we derive a duality result of

for each pricing kernel process (M t ) and any real number s ≠ 0, s ≠ 1, and name it the long-term higher moment of the SDF.

To illustrate, we make use of two leading asset pricing models in (2004)).

where g t+1 is the consumption growth rate, ε t+1 ∼ N (µ, σ 2 ), η t+1 |(J = j) ∼ N (jθ, jν 2 ), and J is a Poisson random variable with jump intensity parameter ω. ε t+1 and η t+1 are independent.

In this disaster model, for any s ≠ 0 (see Liu (2021), equation (C5)),

Since E t [m s t+1 ] is constant over time, the left side in Corollary 4.3 is easily calculated from the last equation.

where $\theta = \frac{1-\gamma}{1-\frac{1}{\psi}}$, z t = A 0 + A 1 x t + A 2 σ 2 t , and the state variable (x t , g t , σ t ) satisfies

and IID w t+1 , e t+1 , η t+1 ∼ N (0, 1). κ 0 , κ 1 , A 0 , A 1 and A 2 are calculated explicitly by model parameters.

In this long run risk model, following Bansal and Yaron (2004) ,

for three deterministic functions φ(s), α(s) and β(s). Since (σ 2 t , x t ) is stationary and ergodic, E t [m s t+1 ] is also a stationary and ergodic process. Therefore, it is straightforward to obtain π(m; s) = exp φ(s)

Figure 2 displays the long-term higher moments of a stochastic discount factor in the disaster and long-run risk models. For the disaster model, we use the values of parameters {β, γ, µ, σ 2 , ω, θ, ν 2 } calibrated in (Liu (2021,

Then, a larger value of the long-term higher moments of m t+1 leads to a better asset pricing model. As shown, the long-run risk model performs better than the disaster model in using

though E t [m s t+1 ] is still stationary and ergodic, its conditional variance is determined by the randomness of the process x t and the stochastic variance process σ 2 t . Consistent with Backus, Chernov and Zin (2014), and Liu (2021), Figure 2 suggests the importance of the conditional variance of the stochastic discount factor or higher moments in building asset pricing models.

In this subsection, we study the long-run law of the risk-adjusted asset return process.

Since E t−1 [m t R t ] = 1, the next result follows directly from Proposition 1.

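Corollary 4.4 Assume that V ar(m t R t ) ≤ L, ∀t for a positive number L, then

$\lim_{n\to\infty} n^{\frac{1}{2}-\varepsilon}\left(\frac{m_1 R_1 + \cdots + m_n R_n}{n} - 1\right) = 0$, P-a.s.,

for any positive number ε.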

Assuming a negative correlation between the pricing kernel and the asset return, the uniform upper bound on the unconditional variance of m t R t follows from uniform upper bounds on the variance of the pricing kernel and the second moment of the asset return. If so, the Cesaro sum of the sequence m n R n converges to 1, almost surely. As a consequence, if m n R n converges, its limit must be one as well.
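A small simulation can make the Cesaro convergence concrete. The sketch below assumes, purely for illustration, that the risk-adjusted returns m t R t are independent lognormal draws with conditional mean one and a bounded, time-varying volatility; this specification is not taken from the paper.

```python
import math
import random

rng = random.Random(3)  # fixed seed so that runs are reproducible
n = 50_000
running_sum = 0.0
for t in range(1, n + 1):
    # Hypothetical volatility of log(m_t R_t), bounded between 0.1 and 0.5.
    vol = 0.1 + 0.4 * abs(math.sin(t / 1000.0))
    # Lognormal draw with mean one, so that E_{t-1}[m_t R_t] = 1 holds by construction.
    risk_adjusted = math.exp(vol * rng.gauss(0.0, 1.0) - 0.5 * vol * vol)
    running_sum += risk_adjusted

cesaro_mean = running_sum / n  # (m_1 R_1 + ... + m_n R_n) / n
```

Individual draws keep fluctuating around one without converging, while the running average stays close to one, in line with the bounded-variance condition of Corollary 4.4.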

Martin (2012) studies the properties of the positive martingale (m 1 R 1 · · · m n R n ) when n goes to infinity (long-dated asset pricing). Specifically, assuming independent risk-adjusted asset returns m n R n , then m n R n converges if and only if $\sum_{n=1}^{\infty} Var(\sqrt{m_n R_n}) < \infty$, and if so, m n R n → 1. On the other hand, if $\sum_{n=1}^{\infty} Var(\sqrt{m_n R_n}) = \infty$, it is shown that lim n→∞ (m 1 R 1 · · · m n R n ) = 0. 12 Clearly, a uniform bound on the variance of the risk-adjusted asset return (Assumption II) is weak compared with a convergent series of the variances in Martin (2012). Moreover, even though m n R n diverges in general, its Cesaro sum converges to 1 with a convergence rate $\frac{1}{2} - \varepsilon$.

In contrast to Martin's probabilistic approach, we next present an analytical approach to show that m n R n diverges in a generic sense, and m n R n converges to one in certain special cases.

Corollary 4.5 Assume that V ar(m t R t ) ≤ L, ∀t for a positive number L.

1. If m n R n slowly decreases in the sense that lim inf(m k R k − m n R n ) ≥ 0, a.s., when k/n → 1, k > n → ∞, then lim n→∞ m n R n = 1, a.s. (non-generic case)

2. If lim inf(m k R k − m n R n ) < 0, a.s., for a certain sequence {n, k} such that k/n → 1, k > n → ∞, then m n R n diverges, a.s. (generic case)

12 By a non-generic case in Martin (2012) it means that m n R n → 1, a.s. If there exists a positive number δ such that the series

s., then it is shown that m n R n → 1, a.s. On the other hand, if the product series E t−1 [ √ m t R t ] diverges almost everywhere (generic), it can be shown that m 1 R 1 · · · m n R n → 0, a.s.

By Corollary 4.4, the infinite series a n is Cesaro summable 13 , where a n = m n R n − m n−1 R n−1 . In analysis, showing that a Cesaro summable series a n is summable under certain conditions is the subject of classical Tauberian theory (Korevaar, 2004). For instance, in Corollary 4.5, if the risk-adjusted asset short-term return m n R n in the time period [n, n+1] slowly decreases, then m n R n converges to its Cesaro limit, 1.

Nevertheless, due to the high degree of uncertainty of the risk-adjusted return, the risk-adjusted short-term return m n R n does not slowly decrease in general. To illustrate, the condition that the risk-adjusted return in the period [k − 1, k] is strictly smaller than the risk-adjusted return in the period [n−1, n] means that there is a reversal of the risk-adjusted return from the period [k − 1, k] to the period [n − 1, n]. For a long-lived asset, there should be infinitely many reversals of the risk-adjusted return; otherwise, the risk-adjusted return would increase eventually, a contradiction to the risky nature of the financial asset return.

Therefore, the sequence of the risk-adjusted asset returns m n R n should diverge in the generic case.

This section develops a theory of the multiplicative version of the long-run property of economic variables and presents its implications to asset returns and stochastic discount factor. Given a non-negative process Y t with E t−1 [Y t ] > 0, a.s., the multiplicative version of

For any F -adapted process (x t ), the conditional entropy

This conditional entropy measures the risk of the F t+1 -variable x t+1 . The entropy process (J t (x t+1 )) measures the dynamic risk of the process (x t ). Moreover,

13 A series $\sum x_n$ of real numbers x n is classically summable if the partial sum s n = x 1 + · · · + x n has a finite limit when n goes to infinity. It is Cesaro summable if the sequence $\frac{s_1 + \cdots + s_n}{n}$ has a finite limit when n goes to infinity.

For a non-negative process (Y t ), assuming that the following limit exists (we allow +∞ as a possible limit), define

z ∞ (Y ) is the long-term entropy of (Y t ).

Proposition 5 For a general positive process (Y t ) with Assumption I,

If Assumption II holds for the process (log(Y t )), and z ∞ (Y ) exists, then

The first part of Proposition 5 is non-trivial, and it essentially implies the classical long forward rate theorem of Dybvig, Ingersoll and Ross (1996) (see its proof in Appendix A).

By the multiplicative Doob-Meyer decomposition as follows (Williams, 1991) ,

where (L n ) is a martingale and (B n ) is a predictable process. Therefore, the long-term growth rate of a general process (Y t ) is bounded above by the long-term growth rate of its predictable component, that is,

More importantly, the second part of Proposition 5 determines the long-term growth rate of a general positive martingale precisely. If the conditional variance of log(Y t ) has a bounded expectation, or alternatively, the conditional variance between log(Y t ) and its one-step-ahead forecast E t−1 [log(Y t )] is bounded, then the long-term growth rate of a martingale is its long-term entropy.

Proposition 6 Assuming (Y t = X 1 · · · X t ) is a positive multiplicative martingale, that is,

satisfies Assumption II, then there exists a subsequence t 1 < t 2 < · · · and a positive random variable ζ such that

Proposition 6 shows the existence of the long-term growth rate in a weaker sense. Under further technical assumptions, Proposition 6 implies the existence of the long-term entropy.

Therefore, in the subsequent discussions, we do not document these technical conditions but simply assume the existence of the long-term entropy. 

We consider a risk-adjusted asset return process (M t R 0,t ) for a return process (R t ) and a pricing kernel process (M t ). In the following discussions, the asset's gross return is always strictly positive, so log(R t ) and the relevant long-term entropy are well defined.

In particular, if E[Var t−1 (log(m t ))] ≤ L, ∀t, and z ∞ (m) exists, then for R f 0,n−1 = R f,0 · · · R f,n−1 ,

Moreover, if the long-term growth rate of the pricing kernel process, $\lim_{n\to\infty} \log(M_n)/n$, exists, then there exists a limit,

As stated in Section 3.3, in a non-generic case, the sequence m n R n → 1, a.s., and thus $(m_1 R_1 \cdots m_n R_n)^{1/n} \to 1$. In this case, the asset is asymptotically the optimal growth portfolio, and the pricing kernel is the reciprocal of the optimal growth portfolio. However, in the generic case, as shown in Martin (2012), (m 1 R 1 · · · m n R n ) → 0, a.s. Corollary 5.1 is stronger in that the geometric average, V n (mR), converges to $e^{-z_\infty(mR)}$, almost surely.
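The convergence of the geometric average can be illustrated with a small simulation. This is only a sketch under assumptions that are not part of the paper: the one-period risk-adjusted returns m t R t are taken to be i.i.d. lognormal with mean one, and the entropy of a positive variable x is measured as log E[x] − E[log x], as in Backus, Chernov, and Zin (2014), so that the entropy equals σ²/2 here. The names `simulate_increments` and `geometric_average` are hypothetical.

```python
import math
import random

def simulate_increments(n, sigma, rng):
    """Draw n lognormal variables with mean one (a stand-in for m_t * R_t)."""
    return [math.exp(sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma ** 2)
            for _ in range(n)]

def geometric_average(xs):
    """Return (x_1 * ... * x_n)^(1/n), computed through logs for stability."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

rng = random.Random(0)
sigma = 0.4                       # assumed volatility of log(m_t * R_t)
xs = simulate_increments(200_000, sigma, rng)

entropy = 0.5 * sigma ** 2        # log E[x] - E[log x] for this lognormal
v_n = geometric_average(xs)
log_product = sum(math.log(x) for x in xs)

print(v_n, math.exp(-entropy))    # the two numbers should be close
print(log_product)                # large and negative: the product itself tends to 0
```

In this setting the product of risk-adjusted returns collapses to zero even though each factor has mean one, while the geometric average settles near exp(−σ²/2), in line with the statement of Corollary 5.1.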

Again, if the short rate process (r f,t ) is stationary and ergodic, then r ∞ = lim n r f,n exists and r f,∞ = r ∞ . Interestingly, under certain conditions on the stochastic discount factor, we obtain the existence of r f,∞ . For simplicity, r f,∞ is called the long-term short rate.

The next result characterizes the long-term entropy of the pricing kernel using the excess asset return.

In a no-arbitrage financial market with a stochastic discount factor process (m t ), for any asset return R t such that E[Var t−1 (log R t )] is uniformly bounded above by a positive constant,

n .

Moreover, if there exists a positive number L such that E[Var t−1 (1/m t )] ≤ L, ∀t, then

where R runs through all asset return processes, that is, those for which (m 1 R 1 · · · m t R t ) is a martingale.

This result states that the long-term entropy of the stochastic discount factor must be bounded from below by any risky asset's long-term excess return (continuously compounded). It is worthwhile to compare this characterization of the long-term entropy with the duality theorem in Hansen and Jagannathan (1991).
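The lower bound can be checked in a simple lognormal economy. The sketch below rests on assumptions that are not taken from the paper: a constant short rate r, a single i.i.d. standard normal shock ε t , a discount factor m t = exp(−r − λ²/2 − λ ε t ) whose entropy log E[m] − E[log m] equals λ²/2, and assets R t = exp(μ + σ ε t ) with μ chosen so that E[m t R t ] = 1. All names and parameter values are hypothetical.

```python
import random

def mean_log_excess_return(sigma, lam, r, n, rng):
    """Sample mean of log R_t - r for an asset priced by the assumed m_t."""
    # E[m R] = 1 pins down the drift: mu = r + lam * sigma - sigma^2 / 2.
    mu = r + lam * sigma - 0.5 * sigma ** 2
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        log_return = mu + sigma * eps
        total += log_return - r
    return total / n

lam, r = 0.5, 0.01                 # assumed price of risk and short rate
entropy_m = 0.5 * lam ** 2         # entropy of the assumed discount factor

rng = random.Random(1)
estimates = {
    sigma: mean_log_excess_return(sigma, lam, r, 50_000, rng)
    for sigma in (0.1, 0.3, 0.5, 0.8)
}

for sigma, value in estimates.items():
    print(sigma, round(value, 4), "bound:", entropy_m)
```

In this example the average log excess return equals λσ − σ²/2 in population, which never exceeds λ²/2 and reaches it at σ = λ, where the asset return is the reciprocal of the discount factor.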

In this subsection, we discuss the relation between the long-term entropy and other long-term measures in earlier literature (Hansen (2012) , and Backus, Chernov, and Zin (2014)).

Following Backus, Chernov, and Zin (2014), define

In particular, for n = 1, since logE t [m t+1 ] = −r f,t , the shortest-horizon entropy is

Moreover, if the limit exists,

Our notation differs from that of Backus, Chernov, and Zin (2014): we use the subscript "t" to indicate that the entropy is computed conditional on time t before its unconditional mean is taken. Define the long-term yield at time t by

Notice that y ∞ t = −log lim T →∞ D(t, T )

is the long zero-coupon rate introduced in Dybvig, Ingersoll and Ross (1996) . Lastly, we define the long-term growth (or decay) rate

which is a conditional version of the long-term rate in Hansen (2012).

and (r f,t ) are stationary and ergodic, then z ∞ (m) = I t (1), ∀t. In general, under regularity conditions,

Moreover, I t (∞) and ρ t (M) are decreasing with respect to t. is not discussed in Backus, Chernov and Zin (2014). Since the long forward rate never falls in an arbitrage-free market (Dybvig, Ingersoll and Ross, 1996), the sequence y ∞ t never falls with time. Therefore, I t (∞) is non-increasing with time t.

Under Assumption II for the asset return processes (logR t ) and (logR f t,∞ ), then

Moreover, if there exists a positive number L such that E[Var t−1 (1/m t )] ≤ L, then

where R runs through all asset return processes such that M t R 0,t is a martingale, and r f t,∞ = logR f t,∞ .

According to Corollary 5.3, the long-term entropy of the permanent pricing kernel is

Empirically, bond market data indicate that the number δ − r f,∞ is very small, so z ∞ (M P ) is very close to the long-term entropy. Indeed, Equation (38) states that the long-term entropy is the maximum long-run excess asset return over the infinite-maturity (consol) bond return. Finally, by its definition, the number δ is the sample average of the continuously compounded return r f t,∞ = logR f t,∞ . Compared with the long-term short rate r f,∞ , the number δ concerns the return of the long-term bond.

This paper develops the additive and multiplicative versions of the long-run law of unexpected shocks for economic variables. These long-run laws rely only upon a uniform upper bound on the unconditional variance of the shocks, and this condition is also necessary to derive meaningful asymptotic results. The asset pricing implications of the long-run laws are related to essential insights of the following theories: (1) long-dated asset valuation and tail event analysis in the long term (Martin, Weitzman, Nordhaus); (2) the long-run theory of the stochastic discount factor and risk-adjusted asset returns (Hansen and Scheinkman). The long-run analysis implies several long-term measures such as π(m, s) and z ∞ (m). We characterize these measures in terms of sample data of asset returns and interest rates only.

Moreover, we apply these new characterizations to several leading asset pricing models. These results suggest the importance of these long-run laws for non-ergodic and non-stationary economies.

1991), the sequence $\frac{1}{b_n}\sum_{k=1}^{n} (b_k c_k) Z_k$ converges to zero almost surely. If each $Y_n \in L^2(\Omega)$, then $\tilde{U}_n \in L^2(\Omega)$. Then, by Doob's martingale convergence theorem again, the sequence $\frac{1}{b_n}\sum_{k=1}^{n} (b_k c_k) Z_k$ converges to zero in $L^2(\Omega)$. Our result then follows from an $L^2$-type Kronecker lemma, whose proof can easily be adapted from Williams (1991).
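The additive long-run law can be illustrated numerically: the arithmetic average of unexpected shocks shrinks toward zero even when the shocks are not stationary, provided that their variances are uniformly bounded. The sketch below is an assumption-laden example rather than the paper's construction. The shocks are independent Gaussian draws whose volatility switches between 0.5 and 2 over blocks of doubling length, so the average variance never settles, and `average_shocks` is a hypothetical helper.

```python
import random

def average_shocks(n, checkpoints, seed=0):
    """Return U_t = (Z_1 + ... + Z_t) / t at the requested checkpoints."""
    rng = random.Random(seed)
    total = 0.0
    averages = {}
    for t in range(1, n + 1):
        # Deterministic volatility regime with blocks of doubling length;
        # it is known one step ahead and bounded above by 2.
        vol = 2.0 if t.bit_length() % 2 == 0 else 0.5
        total += vol * rng.gauss(0.0, 1.0)   # mean-zero unexpected shock
        if t in checkpoints:
            averages[t] = total / t
    return averages

checkpoints = {100, 10_000, 200_000}
u = average_shocks(200_000, checkpoints)

for n in sorted(u):
    print(n, u[n])   # the averages get closer to zero as n grows
```

Since the variance of each shock is at most 4, the standard deviation of the average after n steps is at most 2/√n, which is why the last checkpoint is close to zero.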

To prove the first part of Proposition 5 in a general situation, we need the following lemma, which is due to Hubalek, Klein, and Teichmann (2002).

Lemma 6.1 Let $(X_n)$ be a sequence of non-negative random variables with $X_n \to X_\infty$, a.s. If $\liminf_{n\to\infty} \left(E[X_n^n]\right)^{1/n} = C < \infty$, then $X_\infty \le C$, a.s.

Let $X_n = (m_1 R_1 \cdots m_n R_n)^{1/n}$; then $X_n^n = m_1 R_1 \cdots m_n R_n$, so $C = \liminf_{n\to\infty} \left(E[X_n^n]\right)^{1/n} = 1$. Hence, any convergent subsequence of $(X_n)$ has a limit $X \le 1$. It implies that lim sup n X n ≤

For the first part, let X n = D n = n t=1 Therefore, by Lemma 6.1, we have shown that lim sup n→∞ D n ≤ 1. The second part follows from Proposition 1 and the definition of z ∞ (Y ).
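The role of Lemma 6.1 can be checked in a two-state example in which the expectation is available in closed form. The setup is an assumption made for illustration only: each risk-adjusted return m t R t equals u = 1.5 or d = 0.5 with equal probability, so that its mean is one. The names `expected_product` and `simulate_x_n` are hypothetical.

```python
import math
import random

p, u, d = 0.5, 1.5, 0.5            # assumed two-state law with p*u + (1-p)*d = 1

def expected_product(n):
    """Exact E[X_n^n] = E[product of n factors], via the binomial distribution."""
    return sum(math.comb(n, k) * (p * u) ** k * ((1 - p) * d) ** (n - k)
               for k in range(n + 1))

def simulate_x_n(n, seed=0):
    """One simulated value of X_n = (product of n factors)^(1/n)."""
    rng = random.Random(seed)
    log_sum = 0.0
    for _ in range(n):
        log_sum += math.log(u if rng.random() < p else d)
    return math.exp(log_sum / n)

c = expected_product(50) ** (1 / 50)   # (E[X_n^n])^(1/n), the constant C in the lemma
x_n = simulate_x_n(100_000)
limit = u ** p * d ** (1 - p)          # almost sure limit by the law of large numbers

print(c)             # equal to 1 up to rounding
print(x_n, limit)    # both are below C = 1
```

Here the expectation of the product is one for every n, so C = 1, while the simulated geometric average stays near u^p d^(1−p) < 1, which is consistent with the bound stated in the lemma.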

Proof of Proposition 6.

The Doob-Meyer decomposition of log

By the Weizsacker-Komlós theorem (Weizsacker, 2004) for non-positive random variables T u , there exists a subsequence t 1 < t 2 < · · · such that $(T_{t_1} + \cdots + T_{t_n})/n \to \zeta$, a.s. Moreover, by the same proof as that of Proposition 1, we can show that

The top panel displays the daily time series of the risk premium (annualized) of the S&P 500 index over the risk-free rate of return from Jan 4, 1996 to Dec 31, 2020, and the time-series average of the risk-neutral variance, $\frac{1}{n}\sum_{t=1}^{n} \frac{1}{R_{f,t-1}} \mathrm{Var}^{Q}_{t-1}(R_t)$, over the same period. Under the Negative Correlation condition (between the asset return and its risk-adjusted asset return), Martin (2017) shows that the expected risk premium is bounded from below by $\frac{1}{R_{f,t-1}} \mathrm{Var}^{Q}_{t-1}(R_t)$. The bottom panel displays the time series of VIX (in percent) over the same period. The VIX is divided by 10 to allow a better comparison in levels with the risk-neutral variance. The correlation between the risk-neutral variance and VIX is 0.84, so the risk-neutral variance is also a reasonable measure of financial market turmoil.

This figure displays the long-term entropy of the pricing kernels in a disaster model and a long-run risk asset pricing model. I use the same specification and model parameters of the pricing kernels as in Figure 2. As shown, the long-term entropy z ∞ (m s ) in the long-run risk model is smaller than that in the disaster model. Indeed, z ∞ (m) = 0.015 in the long-run risk model. By Corollary 5.2, the long-run excess mean of the monthly asset return (continuously compounded) is bounded by 1.5%, which is clearly too small. In the disaster model, the long-term entropy is z ∞ (m) = 0.0885. Equivalently, the long-run excess mean of the monthly asset return (continuously compounded) is bounded by 8.85%. The small long-term entropy in the long-run risk model is due to an excessively high long-term short rate, r f,∞ = 2.3166 (annually). In contrast, the long-term short rate is 2% in the disaster model.
Therefore, from a long-run perspective, a better asset pricing model should have a small long-term short rate but a large long-run excess mean of asset return.

Do Survey Expectations of Stock Returns Reflect Risk Adjustments?

Robust Economic Implications of Nonlinear Pricing Kernels

Using Asset Prices to Measure the Persistence of the Marginal Utility of Wealth

Disasters Implied by Equity Index Options

Sources of Entropy in Representative Agent Models

A New Formula for the Expected Excess Return of the Market

Do Subjective Expectations Explain Asset Pricing Puzzles?

Growth-Optimal Portfolio Restrictions on Asset Pricing Models

An Empirical Evaluation of the Long-Run Risks Model for Asset Prices

Risks for the Long Run: A Potential Resolution for Asset Pricing Puzzles

Rare Disasters and Asset Markets in the Twentieth Century

Conditioning Information and Variance Bounds on Pricing Kernels

A New Approach to Decomposition of Economic Time Series into Permanent and Transitory Components with Particular Attention to Measurement of the 'Business Cycle'

By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior

Catching Up With the Joneses: Heterogeneous Preferences and the Dynamics of Asset Prices

Nonparametric Stochastic Discount Factor Decomposition

Parameter Learning in General Equilibrium: The Asset Pricing Implications

Asset Pricing when 'This Time is Different'

Beyond Arbitrage: Good-Deal Asset Price Bounds in Incomplete Markets

Nonergodic Economic Growth

Long Forward and Zero Coupon Rates can Never Fall

What is the Consumption-CAPM Missing? An Information-Theoretic Framework for the Analysis of Asset Pricing Models

Dynamic Value Decomposition in Stochastic Economies

Consumption Strikes Back? Measuring Long-Run Risk

Implications of Security Market Data for Models of Dynamic Economics

Long Term Risk: An Operator Approach

Central Limit Theorems for Martingales with Discrete or Continuous Time

A General Proof of the Dybvig-Ingersoll-Ross Theorem: Long Forward Rates Can Never Fall

Duration Forty Years Later

Long-Run Risk through Consumption Smoothing

Tauberian Theory, a Century of Developments

On Stationarity and Ergodicity of the Model with Applications to GARCH Models

Index Option Returns and Generalized Bounds

Measuring Expectations

On the Valuation of Long-Dated Assets

What is the Expected Return on the Market?

Asset Pricing with Fading Memory

Stationarity and Persistence in the GARCH(1,1) Model

The Economics of Tail Events with Application to Climate Change

Higher-Order Effects in Asset-Pricing Models with Long-Run Risks

Martingale Measures for Discrete-Time Processes with Infinite Horizon

Diagnosing Asset Pricing Models using the Distribution of Asset Returns

A Bayesian Approach to Diagnosis of Asset Pricing Models

The Representative Agent of an Economy with External Habit-Formation and Heterogeneous Risk-Aversion

Subjective Expectations and Asset-Return Puzzles

On Modeling and Interpreting the Economics of Catastrophic Climate Change

Can One Drop L 1 -Boundedness in Komlós's Subsequence Theorem?

Probability with Martingales

Figure: Time-series Average of Risk-Neutral Variance and Risk Premium (series shown: risk-neutral variance and risk premium).

In this Appendix, I present the proofs of major results. In the Online Appendix I provide the proofs of other propositions and all corollaries.

Claim: For any monotone sequences of positive real numbers $b_n \uparrow +\infty$ and $c_n \downarrow 0$ with $\sum_{n=1}^{\infty} c_n^2 < \infty$, we have

On the one hand, choosing $b_n = n^{\alpha}$, $c_n = 1/n^{\alpha}$, $\alpha > 1/2$, we obtain Proposition 1. On the other hand, let $c_k = 1/k$ and $b_k = k a^{-k}$; then $\sum_n c_n^2 < \infty$ and $b_n \uparrow \infty$ since $0 < a \le 1$. Then,

It remains to prove the "Claim". By its definition, $\tilde{U}_n$ is a martingale. Moreover,

Then, by Doob's martingale convergence theorem (Williams, 1991), $\tilde{U}_n$ converges almost surely to a finite random variable with finite moment. By the Kronecker lemma (Williams,