key: cord-0446894-zp7132e9
authors: Wang, Haixu
title: Large and moderate deviations for a discrete-time marked Hawkes process
date: 2020-03-11
journal: nan
DOI: nan
sha: 95d061fa680200c15e1c52f793f861a283d98778
doc_id: 446894
cord_uid: zp7132e9

The Hawkes process is a simple point process with wide applications in finance, social networks, criminology, seismology, and many other fields. The Hawkes process is defined in a continuous-time setting. In practice, however, data are often recorded in a discrete-time scheme. In this paper, we study a discrete-time marked Hawkes process first proposed in (Xu et al., 2020), which is appealing for applications. In particular, we study large and moderate deviations for the model.

The Hawkes process is a self-exciting simple point process named after Hawkes [Haw71]. Hawkes processes originated in the statistical literature as models for the occurrences of earthquakes and their aftershocks, see [VJ75]. In contrast to a standard Poisson process, the intensity of a Hawkes process depends on its entire history, which allows it to capture self-exciting or clustering effects. In finance, most applications of Hawkes processes concern high-frequency trading [BH09, CDM12]. Furthermore, Hawkes processes have been used to model credit default and the arrival of company defaults in a bond portfolio [EKS13, GGD11]. Recently, Hawkes models have been applied to social networks. For example, [FSS+16] modelled the rate at which each officer at the West Point Military Academy sends email. Further applications of Hawkes processes can be found in seismology, neuroscience, cosmology, ecology, and epidemiology. For a list of references for these applications, see [BT07, Zhu13b, Lin09].

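Next, let us introduce the Hawkes process. Let $N$ be a simple point process on $\mathbb{R}$ with history $(\mathcal{F}_t)$, and let its $\mathcal{F}_t$-intensity be

$$\lambda_t := \lambda\left(\int_{(-\infty,t)\times\mathbb{X}} h(t-s,\ell)\,N(ds,d\ell)\right), \qquad (1)$$

where $\lambda(\cdot):\mathbb{R}^+\to\mathbb{R}^+$ is locally integrable and left continuous, $h(\cdot,\cdot):\mathbb{R}^+\times\mathbb{X}\to\mathbb{R}^+$ is integrable, $\ell$ denotes the mark variable, and $\|h\|_{L^1}=\int_0^\infty\int_{\mathbb{X}}h(t,\ell)\,q(d\ell)\,dt<\infty$. Here $\mathbb{X}$ is a measurable space on which the marks take values with common law $q(d\ell)$. The functions $h(\cdot,\cdot)$ and $\lambda(\cdot)$ are referred to as the exciting function and the rate function, respectively. The local integrability of $\lambda(\cdot)$ ensures that the process is non-explosive, and its left continuity ensures that $\lambda_t$ is $\mathcal{F}_t$-predictable. The integral in equation (1) stands for

$$\int_{(-\infty,t)\times\mathbb{X}}h(t-s,\ell)\,N(ds,d\ell)=\sum_{\tau_i<t}h(t-\tau_i,\ell_i),$$

where $(\tau_i)_{i\ge1}$ are the occurrences of the points before time $t$, and $(\ell_i)_{i\ge1}$ are i.i.d. random marks, $\ell_i$ being independent of the previous arrival times $\tau_j$, $j\le i$.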

When λ(·) is linear, the process is called a linear Hawkes process. There have been extensive studies of its stability, law of large numbers, central limit theorems, large deviations, Bartlett spectrum, etc. In particular, [BDHM13] proved the functional law of large numbers and the functional central limit theorem. [BT07] derived large deviations of the Hawkes process. For a survey on Hawkes processes and related self-exciting processes, Poisson cluster processes, affine point processes, etc., see [DVJ03].

When λ(·) is nonlinear, it is known as a nonlinear Hawkes process. Because of the lack of immigration-birth representation and computational tractability, nonlinear Hawkes processes are much less studied. However, there were some efforts in this direction. A nonlinear Hawkes process was first introduced by [BM96] .

Central limit theorems and large deviation principles for nonlinear Hawkes processes can be found in [Zhu15, Zhu13a, Zhu13b, Zhu14].

Hawkes process can also be extended to the multivariate setting. For a survey of multivariate processes and a short history of Hawkes process, we refer to [Lin09] .

In contrast to the continuous-time setting, in practice the arrivals of events are often recorded in a discrete-time scheme. For example, the data may be collected at fixed intervals, or may only show aggregate results. Continuous-time Hawkes processes can model arrivals of events that are unevenly spaced in time, while modeling events that are evenly spaced in time requires a discrete-time model. Therefore, discrete-time Hawkes processes are appealing for certain applications. However, there are few works on discrete-time Hawkes-type models.

[XZW20] proposed for the first time a discrete-time self-exciting and mutually-exciting model analogous to the Hawkes process. More recently, a discrete-time self-exciting model was also applied to study the infections and deaths of COVID-19 in [BSM+21]. [Wan20] extended the model of [XZW20] in the univariate case and studied its limit theorems. Following the model in [Wan20], let $\alpha(t):\mathbb{N}\to\mathbb{R}^+$ be a positive function on $\mathbb{N}$ and define $X_0=N_0=0$. We define $\|\alpha\|_1:=\sum_{t=1}^\infty\alpha(t)$ as the $\ell^1$ norm of $\alpha$. Conditional on $X_{t-1},X_{t-2},\ldots,X_1$, we define $Z_t$ as a Poisson random variable with mean

$$\lambda_t:=\nu+\sum_{s=1}^{t-1}\alpha(s)X_{t-s}, \qquad (2)$$

where $\nu>0$, and

$$X_t:=\sum_{j=1}^{Z_t}\ell_{t,j}, \qquad (3)$$

where $\ell_{t,j}$ are positive random variables that are i.i.d. in both $t$ and $j$. Finally, we define $N_t:=\sum_{s=1}^tZ_s$ and $L_t:=\sum_{s=1}^tX_s$. Throughout the paper, we assume that $\|\alpha\|_1E[\ell_{1,1}]<1$. It can be derived that the law of large numbers holds:

$$\frac{N_t}{t}\to\mu:=\frac{\nu}{1-\|\alpha\|_1E[\ell_{1,1}]},\qquad\frac{L_t}{t}\to\tilde\mu:=\frac{\nu E[\ell_{1,1}]}{1-\|\alpha\|_1E[\ell_{1,1}]}, \qquad (4)$$

in probability as t → ∞, and the central limit theorem also holds, see [Wan20] :

in distribution as t → ∞ under the assumptions that

and the first four moments of ℓ are finite.
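The law of large numbers above can be checked by simulation. The following is a minimal sketch, not the author's code: it assumes that the conditional mean of $Z_t$ has the form $\nu+\sum_{s=1}^{t-1}\alpha(s)X_{t-s}$ with $X_t$ the sum of $Z_t$ i.i.d. marks (the form suggested by the recursions in the proofs), and it further assumes exponential marks and a kernel truncated to finitely many lags. The function names `sample_poisson`, `simulate` and `lln_limits` are hypothetical.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate(t_max, nu, alpha, mark_mean, seed=0):
    """Simulate the discrete-time marked model and return (N_t, L_t) at t = t_max.

    Assumptions of this sketch: lambda_t = nu + sum_s alpha[s-1] * X_{t-s},
    the kernel `alpha` is a finite list (truncated), and the marks are
    exponential with mean `mark_mean`.
    """
    rng = random.Random(seed)
    xs = []                      # xs[k] holds X_{k+1}
    n_total, l_total = 0, 0.0
    for t in range(1, t_max + 1):
        lam = nu
        for s in range(1, min(t - 1, len(alpha)) + 1):
            lam += alpha[s - 1] * xs[t - 1 - s]   # alpha(s) * X_{t-s}
        z = sample_poisson(rng, lam)
        x = sum(rng.expovariate(1.0 / mark_mean) for _ in range(z))
        xs.append(x)
        n_total += z
        l_total += x
    return n_total, l_total


def lln_limits(nu, alpha, mark_mean):
    """Candidate limits of N_t/t and L_t/t when sum(alpha) * mark_mean < 1."""
    m = sum(alpha) * mark_mean
    return nu / (1.0 - m), nu * mark_mean / (1.0 - m)


if __name__ == "__main__":
    kernel = [0.5 * 2.0 ** (-s) for s in range(1, 31)]   # sums to about 0.5
    n_t, l_t = simulate(20000, 1.0, kernel, 1.0, seed=1)
    print(n_t / 20000, l_t / 20000, lln_limits(1.0, kernel, 1.0))
```

With these illustrative parameters the empirical averages should land near the values returned by `lln_limits`; the agreement is only statistical, so the exact printed numbers depend on the seed.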

In this paper, we are interested in studying the large and moderate deviations for the above discrete-time marked Hawkes process. Before we proceed, we will briefly review the large deviation principle, the moderate deviation principle, and the existing results for Hawkes models.

Other related literature. A discrete-time Hawkes-type model with 0-1 arrivals was proposed by [Seo15], who also studied its limit theorems. Let $(X_n)_{n=1}^\infty$ be a sequence taking values in $\{0,1\}$ defined as follows. Let $\bar{\mathbb{N}}=\mathbb{N}\cup\{0\}$ and assume that $(\alpha_i)_{i\in\bar{\mathbb{N}}}$ is a given sequence of positive numbers with $\sum_{i=0}^\infty\alpha_i<1$. (i) $X_1=1$ with probability $\alpha_0$ and $X_1=0$ otherwise. (ii) Conditional on $X_1,X_2,\ldots,X_{n-1}$, we have $X_n=1$ with probability $\alpha_0+\sum_{i=1}^{n-1}\alpha_{n-i}X_i$, and $X_n=0$ otherwise. Define $S_n:=\sum_{i=1}^nX_i$. [Seo15] showed a law of large numbers, i.e.

$$\frac{S_n}{n}\to\mu:=\frac{\alpha_0}{1-\sum_{i=1}^\infty\alpha_i}$$

in probability as $n\to\infty$. In addition, under the assumptions $\sqrt{n}\sum_{i=n}^\infty\alpha_i\to0$ and $\frac{1}{\sqrt{n}}\sum_{i=1}^ni\alpha_i\to0$ as $n\to\infty$, the central limit theorem follows:

in distribution as n → ∞.
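The 0-1 model is easy to simulate directly from rules (i) and (ii). The sketch below is an illustration under stated assumptions rather than code from [Seo15]: the sequence $(\alpha_i)_{i\ge1}$ is replaced by a finite list, the parameter values are arbitrary, and the function name `simulate_binary_hawkes` is hypothetical. Taking expectations in rule (ii) suggests that a stationary mean $\mu$ solves $\mu=\alpha_0+\mu\sum_{i\ge1}\alpha_i$, which is the value the empirical frequency is compared with.

```python
import random


def simulate_binary_hawkes(n, alpha0, alpha, seed=0):
    """Return S_n / n for the 0-1 model.

    P(X_k = 1 | past) = alpha0 + sum_i alpha[i-1] * X_{k-i}, where the finite
    list `alpha` stands in for (alpha_i)_{i>=1} (an assumption of this sketch).
    Requires alpha0 + sum(alpha) < 1 so that the probability stays below 1.
    """
    rng = random.Random(seed)
    xs = []                                  # xs[k] holds X_{k+1}
    for k in range(1, n + 1):
        p = alpha0
        for i in range(1, min(k - 1, len(alpha)) + 1):
            p += alpha[i - 1] * xs[k - 1 - i]    # alpha_i * X_{k-i}
        xs.append(1 if rng.random() < p else 0)
    return sum(xs) / n


if __name__ == "__main__":
    alpha0, alpha = 0.2, [0.3, 0.2, 0.1]
    print(simulate_binary_hawkes(50000, alpha0, alpha, seed=1),
          alpha0 / (1.0 - sum(alpha)))
```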

Following [DZ98], we introduce the definition of the large deviation principle. A family of probability measures $\{P_n\}_{n\in\mathbb{N}}$ on a topological space $(X,\mathcal{T})$ satisfies the large deviation principle with rate function $I(\cdot):X\to[0,\infty]$ and speed $a_n$ if $I$ is a lower semi-continuous function, $a_n:[0,\infty)\to[0,\infty)$ is a measurable function which increases to infinity, and the following inequalities hold for every Borel set $A$:

$$-\inf_{x\in A^o}I(x)\le\liminf_{n\to\infty}\frac{1}{a_n}\log P_n(A)\le\limsup_{n\to\infty}\frac{1}{a_n}\log P_n(A)\le-\inf_{x\in\bar{A}}I(x),$$

where $A^o$ is the interior of $A$ and $\bar{A}$ is the closure of $A$. We say that the rate function $I$ is good if for any $m\ge0$, the level set $\{x\in X:I(x)\le m\}$ is compact. In addition to [DZ98], we also refer to [Var84] for a survey on large deviations.

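We first review some large deviations results for Hawkes processes in the literature. We recall that the intensity of an unmarked linear Hawkes process with empty past history, i.e. $N(-\infty,0]=0$, is given by

$$\lambda_t:=\nu+\int_{(0,t)}h(t-s)\,N(ds), \qquad (7)$$

where $\nu>0$. The integral in equation (7) stands for $\int_{(0,t)}h(t-s)\,N(ds)=\sum_{0<\tau_i<t}h(t-\tau_i)$, where $(\tau_i)_{i\ge1}$ are the occurrences of the points before time $t$. If $\|h\|_{L^1}=\int_0^\infty h(t)\,dt<1$, the linear Hawkes process has an immigration-birth representation, and by ergodic theory, the law of large numbers for the linear Hawkes process (see, for instance, [DVJ03]) is derived as

$$\frac{N_t}{t}\to\frac{\nu}{1-\|h\|_{L^1}}\quad\text{as }t\to\infty.$$

[BT07] showed that, if $0<\|h\|_{L^1}<1$ and $\int_0^\infty th(t)\,dt<\infty$, then $P(N_t/t\in\cdot)$ satisfies the large deviation principle on $\mathbb{R}$ with the good rate function

$$I(x)=\begin{cases}x\theta_x+\nu-\dfrac{\nu x}{\nu+x\|h\|_{L^1}}, & x>0,\\ \nu, & x=0,\\ +\infty, & x<0,\end{cases}$$

where $\theta=\theta_x$ is the unique solution in $(-\infty,\|h\|_{L^1}-1-\log\|h\|_{L^1}]$ of $E[e^{\theta S}]=\frac{x}{\nu+x\|h\|_{L^1}}$, and $S$ denotes the total number of descendants of an immigrant, including the immigrant itself.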

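The large deviation principle for a marked linear Hawkes process with empty history can be found in [KZ15]. Recall the notation for a general marked Hawkes process introduced in section 1. The intensity of a marked univariate linear Hawkes process is given by

$$\lambda_t:=\nu+\int_{(-\infty,t)\times\mathbb{X}}h(t-s,\ell)\,N(ds,d\ell), \qquad (9)$$

where $\nu>0$ and, writing $H(\ell):=\int_0^\infty h(t,\ell)\,dt$, we assume $\int_{\mathbb{X}}H(\ell)\,q(d\ell)<1$. Under the above assumption, there exists a unique stationary version of the linear marked Hawkes process defined by equation (9), and by the ergodic theorem, a law of large numbers is derived as

$$\frac{N_t}{t}\to\frac{\nu}{1-E[H(\ell)]}\quad\text{as }t\to\infty.$$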

If there exists some $\theta>0$ such that $\int_{\mathbb{X}}e^{\theta H(\ell)}\,q(d\ell)<\infty$, then [KZ15] proved that $P(N_t/t\in\cdot)$ satisfies a large deviation principle with rate function:

where θ ⋆ and x ⋆ satisfy the following equations

For nonlinear Hawkes processes, [Zhu14] first established the level-3 large deviation principle and then used the contraction principle to obtain the large deviation principle for $P(N_t/t\in\cdot)$. [Zhu15] proved large deviations for Markovian Hawkes processes, starting with the case of an exponential h(·) and then generalizing the proof to the case when h(·) is a sum of exponentials.

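For any $\sqrt{n}\ll c_n\ll n$, a family of probability measures $\{P_n\}_{n\in\mathbb{N}}$ on a topological space $(X,\mathcal{T})$ satisfies a moderate deviation principle with rate function $J(\cdot):X\to[0,\infty]$ if $J$ is a lower semi-continuous function and for any Borel set $A$,

$$-\inf_{x\in A^o}J(x)\le\liminf_{n\to\infty}\frac{n}{c_n^2}\log P_n(A)\le\limsup_{n\to\infty}\frac{n}{c_n^2}\log P_n(A)\le-\inf_{x\in\bar{A}}J(x).$$

That is, $P_n$ satisfies a large deviation principle with speed $c_n^2/n$. For example, let $X_1,\ldots,X_n$ be a sequence of i.i.d. random variables distributed as $X$ and assume $E[e^{\theta X}]<\infty$ for $\theta$ in some ball around the origin. Then $P_n:=P\left(\frac{1}{c_n}\sum_{i=1}^n(X_i-E[X])\in\cdot\right)$ satisfies a moderate deviation principle with rate function $J(x)=\frac{x^2}{2\mathrm{Var}(X)}$.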

[Zhu13b] proved the moderate deviation principle for a univariate linear Hawkes process, defined by formula (7) in section 1.2. Under the assumption $\sup_{t>0}t^{3/2}h(t)=C<\infty$, the moderate deviation principle holds with the rate function

$$J(x)=\frac{x^2\left(1-\|h\|_{L^1}\right)^3}{2\nu}.$$

The moderate deviation principle for a marked linear Hawkes process was studied in [Seo17]. Recall the definition of a marked linear Hawkes process in section 1.2. [Seo17] showed that the moderate deviation rate function is

Other related literature. The large deviations of the Cox-Ingersoll-Ross process with Hawkes jumps can be found in [Zhu13a]. [ZBGG15] studied limit theorems of affine jump diffusion processes with Hawkes jumps. Gao and Zhu [GZ18] studied large deviations of the Hawkes process with large initial intensity and also discussed applications of the model to insurance and queueing systems. [Yao18] studied the moderate deviation principle for multivariate unmarked linear Hawkes processes. Moderate deviation principles have also been studied for mixing processes, Markov processes, martingales, etc. (see [Gao96, Che01, Dem96]).

Organization of this paper. The rest of the paper is organized as follows. In section 2, we state our main results. The proof of the main results can be found in section 3.

Recall the discrete-time Hawkes model introduced in section 1, where $Z_t$ is a Poisson random variable with mean given by equation (2) and $X_t$ is a compound Poisson random variable defined by equation (3). This section states the large and moderate deviations results for the discrete-time marked Hawkes process.

The formal definition of the large deviation principle was introduced in section 1.1. For the discrete-time Hawkes process, we prove the following large deviation principles.

Theorem 2.1. P(N t /t ∈ ·) satisfies a large deviation principle with the rate function

Theorem 2.2. P(L t /t ∈ ·) satisfies a large deviation principle with the rate function

Regarding the moderate deviations of the discrete-time Hawkes process, we assume $\sup_{t>0}t^{3/2}\alpha(t)=C<\infty$. Recall equation (4), where $\mu$ and $\tilde\mu$ denote the limits in the law of large numbers for $N_t$ and $L_t$, respectively. We obtain the following moderate deviation principles for the discrete-time Hawkes process.

Theorem 2.3. For any Borel set A and time sequence c(t) such that √ t ≪ c(t) ≪ t, we have the following moderate deviation principle.

Theorem 2.4. For any Borel set A and time sequence c(t) such that √ t ≪ c(t) ≪ t, we have the following moderate deviation principle.

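This section contains the proofs of our main results. Before we proceed, let us recall a version of the Gärtner-Ellis theorem, which will be used in our proofs; see [DZ98]. Let $(\mu_n)_{n\in\mathbb{N}}$ be a sequence of probability measures on $\mathbb{R}$ and let $(a_n)$ increase to infinity. Suppose that for every $\theta\in\mathbb{R}$ the limit

$$\Lambda(\theta):=\lim_{n\to\infty}\frac{1}{a_n}\log\int_{\mathbb{R}}e^{a_n\theta x}\,\mu_n(dx)$$

exists as an extended real number, that the origin belongs to the interior of $\{\theta:\Lambda(\theta)<\infty\}$, and that $\Lambda$ is essentially smooth and lower semi-continuous.

Then $(\mu_n)_{n\in\mathbb{N}}$ satisfies the LDP with speed $a_n$ and rate function $I$, which is the Fenchel-Legendre transform of $\Lambda$,

$$I(x)=\sup_{\theta\in\mathbb{R}}\{\theta x-\Lambda(\theta)\}.$$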
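The Fenchel-Legendre transform can also be evaluated numerically, which is a convenient sanity check when the limiting cumulant function has no closed-form transform. The sketch below uses an example that is not from the paper: for $S_n\sim\mathrm{Poisson}(n\nu)$ one has $\Lambda(\theta)=\nu(e^\theta-1)$, whose transform is the classical $x\log(x/\nu)-x+\nu$ for $x>0$. The grid, the value of $\nu$ and the function names are assumptions chosen for illustration.

```python
import math


def legendre_transform(cumulant, x, theta_grid):
    """Approximate sup_theta {theta * x - cumulant(theta)} over a finite grid."""
    return max(theta * x - cumulant(theta) for theta in theta_grid)


def poisson_cumulant(theta, nu=2.0):
    """Limiting log-moment generating function for S_n ~ Poisson(n * nu)."""
    return nu * (math.exp(theta) - 1.0)


def poisson_rate(x, nu=2.0):
    """Closed-form Fenchel-Legendre transform of poisson_cumulant, valid for x > 0."""
    return x * math.log(x / nu) - x + nu


if __name__ == "__main__":
    grid = [-5.0 + 0.001 * k for k in range(10001)]   # theta in [-5, 5]
    for x in (0.5, 2.0, 5.0):
        print(x, legendre_transform(poisson_cumulant, x, grid), poisson_rate(x))
```

A grid search only gives a lower bound on the supremum, so the grid must contain the maximiser $\theta=\log(x/\nu)$ for the two columns to agree; a finer grid or a one-dimensional optimiser would be needed for more extreme values of $x$.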

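Proof of Theorem 2.1. For any $\theta\in\mathbb{R}$, we can compute that

$$E\left[e^{\theta N_t}\right]=E\left[e^{\theta N_{t-1}}E\left[e^{\theta Z_t}\mid\mathcal{F}_{t-1}\right]\right]=E\left[e^{\theta N_{t-1}+(e^\theta-1)\lambda_t}\right],$$

where we used the fact that $Z_t$ is Poisson with parameter $\lambda_t$ conditional on $\mathcal{F}_{t-1}$, the natural filtration up to time $t-1$. By the definition of $\lambda_t$, we have

$$E\left[e^{\theta N_t}\right]=e^{(e^\theta-1)\nu}E\left[e^{\theta N_{t-1}+(e^\theta-1)\sum_{s=1}^{t-1}\alpha(s)X_{t-s}}\right].$$

By the definition of $\lambda_{t-1}$, we get

$$E\left[e^{\theta N_t}\right]=e^{(e^{f_0(\theta)}-1)\nu+(e^{f_1(\theta)}-1)\nu}E\left[e^{\theta N_{t-2}+(e^{f_0(\theta)}-1)\sum_{s=2}^{t-1}\alpha(s)X_{t-s}+(e^{f_1(\theta)}-1)\sum_{s=1}^{t-2}\alpha(s)X_{t-1-s}}\right].$$

By induction on $t$, we get

$$E\left[e^{\theta N_t}\right]=\exp\left(\nu\sum_{s=0}^{t-1}\left(e^{f_s(\theta)}-1\right)\right),$$

where $f_0(\theta)=\theta$, $f_1(\theta)=\theta+\log E\left[e^{(e^\theta-1)\alpha(1)\ell_{1,1}}\right]$, $f_2(\theta)=\theta+\log E\left[e^{((e^{f_1(\theta)}-1)\alpha(1)+(e^\theta-1)\alpha(2))\ell_{1,1}}\right]$, and more generally, for every $s\ge1$,

$$f_s(\theta)=\theta+\log E\left[e^{\left((e^{f_{s-1}(\theta)}-1)\alpha(1)+(e^{f_{s-2}(\theta)}-1)\alpha(2)+\cdots+(e^{f_0(\theta)}-1)\alpha(s)\right)\ell_{1,1}}\right].$$

This implies that

$$\lim_{t\to\infty}\frac{1}{t}\log E\left[e^{\theta N_t}\right]=\nu\left(e^{f_\infty(\theta)}-1\right),$$

where $f_\infty(\theta):=\lim_{s\to\infty}f_s(\theta)$, whenever the limit exists, satisfies

$$f_\infty(\theta)=\theta+\log E\left[e^{(e^{f_\infty(\theta)}-1)\|\alpha\|_1\ell_{1,1}}\right]. \qquad (25)$$

Let $x=e^{f_\infty(\theta)}$. Thus, equation (25) can be rewritten as

$$x=E\left[e^{\theta+(x-1)\|\alpha\|_1\ell_{1,1}}\right]. \qquad (26)$$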

It remains to show that a solution of equation (26) exists. First, when $\theta\le0$, it is not hard to see that $e^{f_t(\theta)}$ is decreasing in $t$ and $0<e^{f_t(\theta)}\le1$. Thus, $e^{f_t(\theta)}$ converges to a finite limit $x^\star$ as $t\to\infty$, which satisfies equation (26).

Next, when $\theta>0$, $f_t(\theta)$ is increasing in $t$. We need to determine for what values of $\theta$ a solution of equation (26) exists. Let

$$G(x):=E\left[e^{\theta+(x-1)\|\alpha\|_1\ell_{1,1}}\right]-x.$$

It is easy to see that $G(x)$ is increasing in $\theta$ and

$$G'(x)=E\left[\|\alpha\|_1\ell_{1,1}e^{\theta+(x-1)\|\alpha\|_1\ell_{1,1}}\right]-1.$$

When $\theta=0$, we have $G(1)=0$ and, by the assumption $E[\|\alpha\|_1\ell_{1,1}]<1$, $G'(1)<0$. It implies $\min_{x>1}G(x)<0$. Since $G$ is increasing in $\theta$, there exists some critical $\theta_c>0$ such that $\min_{x>1}G(x)=0$. In other words, with $\theta_c$, we can find a critical value $x_c$ such that $G(x_c)=G'(x_c)=0$. Thus, we can find

$$\theta_c=-\log E\left[\|\alpha\|_1\ell_{1,1}e^{(x_c-1)\|\alpha\|_1\ell_{1,1}}\right], \qquad (28)$$

where $x_c>1$ satisfies $xE\left[\|\alpha\|_1\ell_{1,1}e^{(x-1)\|\alpha\|_1\ell_{1,1}}\right]=E\left[e^{(x-1)\|\alpha\|_1\ell_{1,1}}\right]$. Therefore, equation (26) has a finite solution if and only if $\theta\le\theta_c$. Moreover, $G(x)$ is strictly convex in $x$. Hence, there are at most two solutions of equation (26). When $0<\theta<\theta_c$, equation (26) has two solutions. It is not hard to check that $G(1)=e^\theta-1>0$ and $G'(1)=E[\|\alpha\|_1\ell_{1,1}]e^\theta-1<0$. Moreover, $f_t(\theta)$ is increasing in $t$, and for $t=0$, $e^{f_0(\theta)}=e^\theta>1$ and

$$G(e^\theta)=e^\theta\left(E\left[e^{(e^\theta-1)\|\alpha\|_1\ell_{1,1}}\right]-1\right)>0. \qquad (29)$$

Thus, as $t\to\infty$, $e^{f_t(\theta)}$ converges to a finite limit. It must converge to $x^\star$, which is the smaller solution of equation (26). Similarly, when $\theta<0$, we can check that $G(1)<0$ and $G'(1)<0$. Here $f_t(\theta)$ is decreasing in $t$, and at $t=0$, $e^{f_0(\theta)}=e^\theta<1$ with $G(e^\theta)<0$. Thus, as $t\to\infty$, $e^{f_t(\theta)}$ converges to a finite limit. It must converge to $x^\star$, which is again the smaller solution of equation (26).
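The convergence of $e^{f_t(\theta)}$ to the smaller solution of equation (26) can be observed numerically. The sketch below is an illustration under simplifying assumptions that are not in the paper: the marks are deterministic, $\ell_{1,1}\equiv1$, so that $E[e^{c\ell_{1,1}}]=e^c$; the kernel is a finite geometric list; and the names `recursion_limit`, `smaller_root` and `critical_theta` are hypothetical. Under the unit-mark assumption the conditions $G(x_c)=G'(x_c)=0$ give $x_c=1/\|\alpha\|_1$ and $\theta_c=\|\alpha\|_1-1-\log\|\alpha\|_1$.

```python
import math


def recursion_limit(theta, alpha, steps):
    """Iterate f_s = theta + sum_{i=1}^{s} (exp(f_{s-i}) - 1) * alpha(i), f_0 = theta.

    This is the recursion for f_s specialised to unit marks (assumption), with
    the finite list `alpha` standing in for the kernel. Returns exp(f_steps).
    """
    f = [theta]
    for s in range(1, steps + 1):
        total = 0.0
        for i in range(1, min(s, len(alpha)) + 1):
            total += (math.exp(f[s - i]) - 1.0) * alpha[i - 1]
        f.append(theta + total)
    return math.exp(f[-1])


def smaller_root(theta, a, iterations=10_000):
    """Fixed-point iteration x <- exp(theta + (x - 1) * a), started at x = 1.

    For theta <= critical_theta(a) the iterates are monotone and converge to
    the smaller solution of x = exp(theta + (x - 1) * a).
    """
    x = 1.0
    for _ in range(iterations):
        x = math.exp(theta + (x - 1.0) * a)
    return x


def critical_theta(a):
    """theta_c = a - 1 - log(a) for unit marks, from G(x_c) = G'(x_c) = 0."""
    return a - 1.0 - math.log(a)


if __name__ == "__main__":
    kernel = [0.5 * 2.0 ** (-i) for i in range(1, 61)]   # sums to about 0.5
    a = sum(kernel)
    for theta in (-0.3, 0.1):
        print(theta, recursion_limit(theta, kernel, 400), smaller_root(theta, a))
    print("theta_c:", critical_theta(a))
```

For values of $\theta$ above `critical_theta(a)` the fixed-point iteration has no finite limit and `math.exp` will eventually overflow, which mirrors the statement that the limiting cumulant is infinite there.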

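If $\theta>\theta_c$, then $\lim_{t\to\infty}\frac{1}{t}\log E\left[e^{\theta N_t}\right]=\infty$. Finally, we need to check the essential smoothness condition for $\nu\left(e^{f_\infty(\theta)}-1\right)$. Differentiating equation (25) with respect to $\theta$ gives

$$f_\infty'(\theta)=\frac{E\left[e^{(e^{f_\infty(\theta)}-1)\|\alpha\|_1\ell_{1,1}}\right]}{E\left[e^{(e^{f_\infty(\theta)}-1)\|\alpha\|_1\ell_{1,1}}\right]-e^{f_\infty(\theta)}E\left[\|\alpha\|_1\ell_{1,1}e^{(e^{f_\infty(\theta)}-1)\|\alpha\|_1\ell_{1,1}}\right]}.$$

By equation (28) and the equation satisfied by $x_c$, the denominator vanishes when $e^{f_\infty(\theta)}=x_c$, so it is not hard to find that $|f_\infty'(\theta)|\to\infty$ as $\theta\to\theta_c$. The conclusion then follows from the Gärtner-Ellis theorem.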
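As a worked example of the resulting variational formula, consider the special case of deterministic marks $\ell_{1,1}\equiv1$ and write $a:=\|\alpha\|_1<1$. This specialisation is an assumption made here for illustration and is not a case treated separately in the paper. The fixed-point equation (26) can then be inverted explicitly, which turns the supremum over $\theta$ into a supremum over $y=e^{f_\infty(\theta)}$:

```latex
\begin{align*}
y &= e^{\theta + a(y-1)}
  \quad\Longleftrightarrow\quad
  \theta = \log y - a(y-1), \qquad 0 < y \le x_c = \tfrac{1}{a}, \\
\theta_c &= \log x_c - a(x_c - 1) = a - 1 - \log a, \\
I(x) &= \sup_{\theta \le \theta_c}\bigl\{\theta x - \nu\bigl(e^{f_\infty(\theta)} - 1\bigr)\bigr\}
      = \sup_{0 < y \le 1/a}\bigl\{x \log y - (a x + \nu)(y - 1)\bigr\}, \\
0 &= \frac{x}{y} - (a x + \nu)
  \quad\Longrightarrow\quad
  y = \frac{x}{\nu + a x} < \frac{1}{a}, \\
I(x) &= x \log\frac{x}{\nu + a x} - x + a x + \nu, \qquad x > 0 .
\end{align*}
```

Two quick checks are consistent with the rest of the argument: at $x=\nu/(1-a)$, which is the law-of-large-numbers limit for unit marks, one has $\nu+ax=x$ and hence $I(x)=0$; and letting $x\downarrow0$ gives $I(0)=\nu$, matching $P(N_t=0)=e^{-\nu t}$.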

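Proof of Theorem 2.2. For any $\theta\in\mathbb{R}$, we can compute that

$$E\left[e^{\theta L_t}\right]=E\left[e^{\theta L_{t-1}}E\left[e^{\theta X_t}\mid\mathcal{F}_{t-1}\right]\right]=E\left[e^{\theta L_{t-1}+(g_0(\theta)-1)\lambda_t}\right],$$

where we used the fact that $X_t$ is compound Poisson with intensity $\lambda_t$ conditional on $\mathcal{F}_{t-1}$, the natural filtration up to time $t-1$. By the definition of $\lambda_t$, we have

$$E\left[e^{\theta L_t}\right]=e^{(g_0(\theta)-1)\nu}E\left[e^{\theta L_{t-1}+(g_0(\theta)-1)\sum_{s=1}^{t-1}\alpha(s)X_{t-s}}\right].$$

By the definition of $\lambda_{t-1}$, we get

$$E\left[e^{\theta L_t}\right]=e^{(g_0(\theta)-1)\nu+(g_1(\theta)-1)\nu}E\left[e^{\theta L_{t-2}+(g_0(\theta)-1)\sum_{s=2}^{t-1}\alpha(s)X_{t-s}+(g_1(\theta)-1)\sum_{s=1}^{t-2}\alpha(s)X_{t-1-s}}\right].$$

By induction on $t$, we get

$$E\left[e^{\theta L_t}\right]=\exp\left(\nu\sum_{s=0}^{t-1}\left(g_s(\theta)-1\right)\right),$$

where $g_0(\theta)=E\left[e^{\theta\ell_{1,1}}\right]$, $g_1(\theta)=E\left[e^{(\theta+(g_0(\theta)-1)\alpha(1))\ell_{1,1}}\right]$, and more generally, for every $s\ge1$,

$$g_s(\theta)=E\left[e^{\left(\theta+(g_{s-1}(\theta)-1)\alpha(1)+(g_{s-2}(\theta)-1)\alpha(2)+\cdots+(g_0(\theta)-1)\alpha(s)\right)\ell_{1,1}}\right].$$

This implies that

$$\frac{1}{t}\log E\left[e^{\theta L_t}\right]=\frac{\nu}{t}\sum_{s=0}^{t-1}\left(g_s(\theta)-1\right).$$

As before, we have $\lim_{t\to\infty}\frac{1}{t}\log E\left[e^{\theta L_t}\right]=\nu\left(g_\infty(\theta)-1\right)$, where $g_\infty(\theta)$ is the minimal solution of the equation $x=E\left[e^{\theta\ell_{1,1}+\|\alpha\|_1\ell_{1,1}(x-1)}\right]$ for any $\theta\le\theta_c$. Here $\theta_c$ satisfies the equation $E\left[\|\alpha\|_1\ell_{1,1}e^{\theta_c\ell_{1,1}+\|\alpha\|_1\ell_{1,1}(x_c-1)}\right]=1$, and $x_c$ satisfies the equation $x_c=E\left[e^{\theta_c\ell_{1,1}+\|\alpha\|_1\ell_{1,1}(x_c-1)}\right]$. If $\theta>\theta_c$, then $\lim_{t\to\infty}\frac{1}{t}\log E\left[e^{\theta L_t}\right]=\infty$. We can check the essential smoothness condition as before. Differentiating the fixed-point equation for $g_\infty(\theta)$ gives

$$g_\infty'(\theta)=\frac{E\left[\ell_{1,1}e^{\theta\ell_{1,1}+\|\alpha\|_1\ell_{1,1}(g_\infty(\theta)-1)}\right]}{1-E\left[\|\alpha\|_1\ell_{1,1}e^{\theta\ell_{1,1}+\|\alpha\|_1\ell_{1,1}(g_\infty(\theta)-1)}\right]}.$$

By the equation satisfied by $\theta_c$, the denominator vanishes at $\theta_c$, so it is not hard to find that $|g_\infty'(\theta)|\to\infty$ as $\theta\to\theta_c$. The conclusion then follows from the Gärtner-Ellis theorem.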

Proof of Theorem 2.3. First, for any θ ∈ R, we prove that

where µ is defined by equation (4). By the proof of Theorem 2.1, we get

where $f_0(\theta_t)=\theta_t:=\frac{c(t)}{t}\theta$, $f_1(\theta_t)=\theta_t+\log E\left[e^{(e^{\theta_t}-1)\alpha(1)\ell_{1,1}}\right]$, and $f_2(\theta_t)=\theta_t+\log E\left[e^{((e^{f_1(\theta_t)}-1)\alpha(1)+(e^{f_0(\theta_t)}-1)\alpha(2))\ell_{1,1}}\right]$. More generally, for every $s\ge1$,

$$f_s(\theta_t)=\theta_t+\log E\left[e^{\left((e^{f_{s-1}(\theta_t)}-1)\alpha(1)+(e^{f_{s-2}(\theta_t)}-1)\alpha(2)+\cdots+(e^{f_0(\theta_t)}-1)\alpha(s)\right)\ell_{1,1}}\right].$$

Then we can rewrite the above equation as

$$e^{f_s(\theta_t)}=e^{\theta_t}E\left[e^{\left((e^{f_{s-1}(\theta_t)}-1)\alpha(1)+(e^{f_{s-2}(\theta_t)}-1)\alpha(2)+\cdots+(e^{f_0(\theta_t)}-1)\alpha(s)\right)\ell_{1,1}}\right].$$

We write $G_t(s)$ instead of $G(s)$ to indicate its dependence on $t$ through the term $\frac{c(t)}{t}$. As the proof of Theorem 2.1 shows, since $\frac{c(t)}{t}$ is sufficiently small for large $t$, we have $\frac{c(t)}{t}\theta\le\theta_c$, where $\theta_c=-\log E\left[\|\alpha\|_1\ell_{1,1}e^{\|\alpha\|_1\ell_{1,1}(x_c-1)}\right]$, and as $s\to\infty$, we get that $G_t(\infty)$ is the minimal solution to the equation

where G 1 (s) satisfies

and G 1 (0) = 1, and G 2 (s) satisfies

and $G_2(0)=1/2$. Then we can substitute $G_t(s)$ in terms of equation (39). By equation (36),

Then we can rewrite the above equation in terms of equation (38),

Now let us compute $\sum_{s=0}^{t-1}G_1(s)$. By equation (39),

After rewriting the above equation, we get

For the first term in equation (41), we can compute νθ c(t)

According to the assumption $\sup_{t>0}t^{3/2}\alpha(t)=C<\infty$, we can find $\sum_{i=t}^\infty\alpha(i)\le\sum_{i=t}^\infty Ci^{-3/2}<2C(t-1)^{-1/2}$.

By Lemma 3.1, G 1 (t) is uniformly bounded. Then for the second term in equation (41),

And we can compute

Furthermore, we also get

According to Lemma 3.2, G 2 (t) is uniformly bounded in t. Then we can compute lim t→∞ 1 t t−1 s=0 G 2 (s) = 1 2

Now we can have

Thus, we can prove

Applying the Gärtner-Ellis theorem, we conclude that, for any Borel set A,

where

Proof of Theorem 2.4. First, let's prove

whereμ is defined in equation (4).

As the proof of Theorem 2.2 shows, we can find 

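where $\theta_t=\frac{c(t)}{t}\theta$, $g_0(\theta_t)=E\left[e^{\theta_t\ell_{1,1}}\right]$, $g_1(\theta_t)=E\left[e^{(\theta_t+(g_0(\theta_t)-1)\alpha(1))\ell_{1,1}}\right]$, and in general for every $s\ge1$,

$$g_s(\theta_t)=E\left[e^{\left(\theta_t+(g_{s-1}(\theta_t)-1)\alpha(1)+(g_{s-2}(\theta_t)-1)\alpha(2)+\cdots+(g_0(\theta_t)-1)\alpha(s)\right)\ell_{1,1}}\right].$$

Because of the term $\frac{c(t)}{t}$, we write $\tilde{G}_t(s)$ instead of $\tilde{G}(s)$ to indicate its dependence on $t$. According to the proof of Theorem 2.2, since $\frac{c(t)}{t}$ is sufficiently small for large $t$, we have $\frac{c(t)}{t}\theta\le\theta_c$, where $\theta_c$ satisfies $E\left[\|\alpha\|_1\ell_{1,1}e^{\theta_c\ell_{1,1}+\|\alpha\|_1\ell_{1,1}(x_c-1)}\right]=1$, and as $s\to\infty$, we get that $\tilde{G}_t(\infty)$ is the minimal solution to the equation $\tilde{x}_t=E\left[e^{\frac{c(t)}{t}\theta\ell_{1,1}+\ell_{1,1}\sum_{i=1}^\infty\alpha(i)\tilde{x}_t}\right]-1$. Because $E[\ell_{1,1}]\|\alpha\|_1<1$, it is easy to see that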

, we haveG t (s) = O((c(t)/t)) uniformly in s. By Taylor's expansion,

We can let

where $\tilde{G}_1(s)$ satisfies

Then by equation (50), we can rewrite the above equation,

Now let us compute $\sum_{s=0}^{t-1}\tilde{G}_1(s)$. By equation (51),

After rewriting the above equation, we get

.

And we can compute νθ c(t)

For the first term on the right hand side of the equation (54), we can compute

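According to the assumption $\sup_{t>0}t^{3/2}\alpha(t)=C<\infty$ and the integral comparison $\sum_{i=t}^\infty i^{-3/2}<\int_{t-1}^\infty x^{-3/2}\,dx$, we get $\sum_{i=t}^\infty\alpha(i)\le\sum_{i=t}^\infty Ci^{-3/2}<2C(t-1)^{-1/2}$ for $t\ge2$. Therefore,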

(1 − α 1 E[ℓ 1,1 ]) 2 → 0, as t → ∞.

Next, by Lemma 3.1,G 2 (t) is uniformly bounded in t, then for the second term on the right hand side of the equation (54), we have lim sup t→∞ νθ c(t)

Then we can get lim sup t→∞ ν |θ| c(t)

Therefore, we can compute 

Thus, we can derive lim t→∞ Applying the Gärtner-Ellis theorem see [DZ98] , we conclude that, for any Borel set A, Hence, we proved that for every s ∈ N, G 1 (s) ≤ Now, let's assume that G 2 (s) ≤ 1 + Var(ℓ 1,1 ) α 2

Scaling limits for Hawkes processes and application to financial statistics

Modelling financial high frequency data using point processes. Handbook of Financial Time Series

Stability of nonlinear Hawkes processes

Simple discrete-time self-exciting models can describe complex dynamic processes: A case study of COVID-19

Large deviations of Poisson cluster processes

High-frequency financial data modeling using Hawkes processes

Moderate deviations for markovian occupation times

Moderate deviations for martingales with bounded jumps

An Introduction to the Theory of Point Processes, Vols. I and II

Large Deviations Techniques and Applications

An analysis of CDS market liquidity by the Hawkes process. SSRN eLibrary

Modeling e-mail networks and inferring leadership using self-exciting point processes

Moderate deviations for martingales and mixing random processes

A top-down approach to multi-name credit

Large deviations and applications for Markovian Hawkes processes with a large initial intensity

Spectra of some self-exciting and mutually exciting point processes

Limit theorems for marked Hawkes processes with application to a risk model

Multivariate Hawkes processes

Limit theorems for discrete Hawkes processes

Moderate deviations for marked Hawkes processes

Large Deviations and Applications. SIAM, Philadelphia

Stochastic Models for Earthquake Sequences

Limit theorems for a discrete-time marked Hawkes process

Deposit and withdrawal dynamics: A data-based mutually-exciting stochastic model

Moderate deviations for multivariate Hawkes processes

Affine point processes: approximation and efficient simulation

Central limit theorem for nonlinear Hawkes processes

Moderate deviations for Hawkes processes

Limit theorems for a Cox-Ingersoll-Ross process with Hawkes jumps

Nonlinear Hawkes processes

Process-level large deviations for nonlinear Hawkes point processes. Annales de l'Institut Henri Poincaré-Probabilités et Statistiques

Large deviations for Markovian nonlinear Hawkes processes