Generalized Multivariate Hawkes Processes

Tomasz R. Bielecki, Jacek Jakubowski, Mariusz Nieweglowski

April 28, 2020

Abstract. This work contributes to the theory and applications of Hawkes processes. We introduce and examine a new class of Hawkes processes that we call generalized Hawkes processes, and their special subclass -- the generalized multivariate Hawkes processes (GMHPs). GMHPs are multivariate marked point processes that add an important feature to the family of the (classical) multivariate Hawkes processes: they allow for explicit modelling of simultaneous occurrence of excitation events coming from different sources, i.e. caused by different coordinates of the multivariate process. We study the issue of existence of a generalized Hawkes process, and we provide a construction of a specific generalized multivariate Hawkes process. We investigate Markovian aspects of GMHPs, and we indicate some plausible important applications of GMHPs.

Let F^N be the natural filtration of N, so F^N := (F^N_t, t ≥ 0), where F^N_t is the P-completed σ-field σ(N((s, r] × A) : 0 ≤ s < r ≤ t, A ∈ X), t ≥ 0. In view of Theorem 2.2.4 in [22] the filtration F^N satisfies the usual conditions. Moreover, N is F^N-optional, so, using Proposition 4.1.1 in [22], we conclude that the T_n's are F^N-stopping times and the X_n are F_{T_n}-measurable. In what follows we denote by P the F^N-predictable σ-field. We recall that, for a given filtration F, a stochastic process X : Ω × [0, ∞) → R is said to be F-predictable if it is measurable with respect to the predictable σ-field P^F on Ω × [0, ∞), which is generated by F-adapted processes whose paths are continuous (equivalently, left-continuous, with the left limit at t = 0 defined as the value of the path at t = 0) functions of the time variable. More generally, a function X : Ω × [0, ∞) × X → R is said to be an F-predictable function if it is measurable with respect to the σ-field P^F(X) := P^F ⊗ X on Ω × [0, ∞) × X. The σ-field P^F(X) is generated by the sets A × {0} × X, where A ∈ F_0, and the sets of the form B × (s, t] × D, where 0 < s ≤ t, B ∈ F_s and D ∈ X.

We now consider a random measure ν on (R_+ × X, B(R_+) ⊗ X) defined as

ν(ω, dt, dy) := 1_{]]0, T_∞(ω)[[}(t) κ(ω, t, dy) dt, (2.3)

where, for A ∈ X,

κ(t, A) = η(t, A) + ∫_{(0,t)×X} f(t, s, x, A) N(ds, dx), (2.4)

η is a finite kernel from (Ω × [0, ∞), P) to (X, X), and f is a kernel from (Ω × R_+ × R_+ × X, F ⊗ B(R_+) ⊗ B(R_+) ⊗ X) to (X, X). We assume also that f is a kernel satisfying:

1. f(t, s, x, A) = 0 for s ≥ t,

2. θ defined as θ(t, A) := ∫_{(0,t)×X} f(t, s, x, A) N(ds, dx), t ≥ 0, A ∈ X, is a kernel from (Ω × [0, ∞), P) to (X, X), which is finite for t < T_∞. Clearly, we have θ(t, A) = Σ_{n : T_n < t} f(t, T_n, X_n, A).

We assume moreover that κ(t, X) > 0 for all t ≥ 0 and that the integral ∫_{[0,t]} κ(s, A) ds is finite for any A ∈ X and any t < T_∞. This last assumption is satisfied under mild boundedness conditions imposed on η and f. The process κ(·, A) is P-measurable for every A ∈ X and thus it is F^N-predictable. Consequently, ν is an F^N-predictable random measure.

Before we proceed we recall that, for a given filtration F, the random measure ν is said to be an F-compensator of a random measure N if it is an F-predictable random measure such that

E ∫_{[0,∞)×X} F(t, y) N(dt, dy) = E ∫_{[0,∞)×X} F(t, y) ν(dt, dy)

holds for every non-negative F-predictable function F : Ω × [0, ∞) × X → R. We are ready to state the underlying definition in this paper.

Definition 2.1. A marked point process N is called a generalized Hawkes process with Hawkes kernel κ if the random measure ν given by (2.3)-(2.4) is the F^N-compensator of N.

Remark 2.2.
We note that in our definition of the generalized Hawkes process the integral in (2.4) is taken over the interval (0, t). In the definition of the classic Hawkes process, the corresponding integral is taken over the interval (−∞, t); see eg. [9] . The "(0, t)" convention is used by several authors, though, in many applications of classical Hawkes processes (such as in Example 3.8) that do not regard stationarity and spectral properties of these processes. We use this convention here since we are not considering stationarity and spectral properties of the generalized Hawkes processes. (ii) With a slight abuse of terminology we refer to κ as to the Hawkes intensity kernel of N . Accordingly, we refer to the quantity κ(t, A) as to the intensity at time t of the event regarding process N and amounting to the marks of N taking values in the set A, or, for short, as to the intensity at time t of marks of N taking values in A. Remark 2.4. Since F N 0 is a completed trivial σ-field, then it is a consequence of Theorem 3.6 in [17] that the compensator ν determines the law of N under P, and, consequently, the Hawkes kernel κ determines the law of N under P. We will now demonstrate that for an arbitrary measure ν of the form (2.3) there exists a Hawkes process having ν as F N -compensator. Towards this end we will consider the underlying canonical space. Specifically, we take (Ω, F) to be the canonical space of multivariate marked point processes with marks taking values in X ∂ . That is, Ω consists of elements ω = ((t n , x n )) n≥1 , satisfying (t n , x n ) ∈ (0, ∞] × X ∂ and t n ≤ t n+1 ; if t n < ∞, then t n < t n+1 ; t n = ∞ iff x n = ∂. The σ-field F is defined to be the smallest σ-field on Ω such that the mappings T n : Ω → ([0, ∞], B[0, ∞]), X n : Ω → (X ∂ , X ∂ ) defined by T n (ω) := t n , X n (ω) := x n are measurable for every n. Note that the canonical space introduced above agrees with the definition of canonical space considered in [22] (see Remark 2.2.5 therein). On this space we denote by N a sequence of measurable mappings N = ((T n , X n )) n≥1 , (2.7) Clearly, these mappings satisfy 1. T n ≤ T n+1 , and if T n < +∞ then T n < T n+1 , 2. X n = ∂ iff T n = ∞. We call such N a canonical mapping. The following result provides the existence of a probability measure P ν on (Ω, F) such that the canonical mapping N becomes a generalized Hawkes process with a given Hawkes kernel κ, which in a unique way determines the compensator ν. Theorem 2.5. Consider the canonical space (Ω, F) and the canonical mapping N given by (2.7) . Let measures N and ν be associated with this canonical mapping through (2.2) and (2.3)-(2.4), respectively. Then, there exists a unique probability measure P ν on (Ω, F), such that the measure ν is an (F N , P ν )-compensator of N . So, N is a generalized multivariate Hawkes process on (Ω, F, P ν ). Proof. We will use Theorem 8.2.1 in [22] with X = X , ϕ = ω, and with α(ω, dt) := ν(ω, dt, X ) = 1 ]]0,T∞(ω)[[ (t)κ(ω, t, X )dt, (2.8) from which we will conclude the assertion of theorem. Towards this end, we will verify that all assumptions of the said theorem are satisfied in the present case. As already observed, the random measure ν is F N -predictable. Next, let us fix ω ∈ Ω. Given (2.8) we see thatᾱ satisfies the following equalities α(ω, {0}) = 0,ᾱ(ω, {t}) = 0 ≤ 1, t ≥ 0, which correspond to conditions (4.2.6) and (4.2.7) in [22] , respectively. It remains to show that condition (4.2.8) holds as well, that is where π ∞ (ω) := inf {t ≥ 0 :ᾱ(ω, (0, t]) = ∞}. 
To see this, we first note that (2.8) implies Thus it suffices to show that π ∞ (ω) ≥ T ∞ (ω). By definition ofᾱ we can writē If T ∞ (ω) = ∞, then we clearly have π ∞ (ω) = ∞ = T ∞ (ω). Next, if T ∞ (ω) < ∞, then lim t↑T∞(ω)ᾱ (ω, (0, t]) = a. We need to consider two cases now: a = ∞ and a < ∞. If a = ∞, thenᾱ(ω, (0, t]) = ∞ for t ≥ T ∞ (ω), and,ᾱ(ω, (0, t]) < ∞ for t < T ∞ (ω) in view of our assumptions imposed on κ in the beginning of this section. This implies that , which implies that (2.9) holds. Since ω was arbitrary, we conclude that for all ω ∈ Ω conditions (4.2.6)-(4.2.8) in [22] are satisfied. So, applying Theorem 8.2.1 in [22] with β = ν, we obtain that there exists a unique probability measure P ν such that ν is a F N -compensator of N under P ν . The classical Hawkes processes are conveniently interpreted, or represented, in terms of so called clusters. This kind of representation is sometimes called immigration and birth representation. We refer to [14] and [23] . Generalized Hawkes processes also admit cluster representation. The dynamics of cluster centers, or the immigrants, is directed by η. Specifically, η(t, A) is the time-t intensity of arrivals of immigrants with marks belonging to set A. The dynamics of the off-springs is directed by f . Specifically, f (t, s, x, A) represents the time-t intensity of births of offsprings with marks in set A of either an immigrant with mark x who arrived at time s, or of an offspring with mark x who was born at time s. The cluster interpretation will be exploited in a follow-up work for asymptotic analysis of generalized Hawkes processes. We now introduce the concept of a generalized multivariate Hawkes process, which is a particular case of the concept of a generalized Hawkes process. We first construct an appropriate mark space. Specifically, we fix an integer d ≥ 1 and we let (E i , E i ), i = 1, . . . , d, be some non-empty Borel spaces, and ∆ be a dummy mark, the meaning of which will be explained below. Very often, in practical modelling, spaces E i are discrete. The instrumental rationale for considering a discrete mark space is that in most of the applications of the Hawkes processes that we are familiar with and/or we can imagine, a discrete mark space is sufficient to account for the intended features of the modeled phenomenon. We set E ∆ i := E i ∪ ∆, and we denote by E ∆ i the sigma algebra on E ∆ i generated by E i . Then, we define a mark space, say E ∆ , as Moreover, denoting by ∂ i the point which is external to E ∆ i , we define E ∂ i := E ∆ i ∪ {∂ i }, and we denote by E ∂ i the sigma algebra generated by E i and {∂ i }. Analogously we define and by E ∂ we denote the sigma field generated by E ∆ and {∂}. Definition 3.1. A generalized Hawkes process N = ((T n , X n )) n≥1 with the mark space X = E ∆ given by (3.1) , and with X ∂ = E ∂ , is called a generalized multivariate Hawkes process (of dimension d). Note that a necessary condition for generalized Hawkes processes to feature the selfexcitation and mutual-excitation is that f = 0. We refer to Example 3.9 for interpretation of the components η and f of the kernel κ in case of a generalized multivariate Hawkes process. We interpret T n ∈ (0, ∞) and X n ∈ E ∆ as the event times of N and as the corresponding mark values, respectively. Thus, if T n < ∞ we have 2 Also, we interpret X i as the marks associated with i-th coordinate of N (cf. Definition 3.3). 
With this interpretation, the equality X i n (ω) = ∆ means that there is no event taking place with regard to the i-th coordinate of N at the (general) event time T n (ω). In other words, no event occurs with respect to the i-th coordinate of N at time T n (ω). Definition 3.2. We say that T n (ω) is a common event time for a multivariate Hawkes process N if there exist i and j, i = j, such that X i n (ω) ∈ E i and X j n (ω) ∈ E j . We say that process N admits common event times if P ω ∈ Ω : ∃n such that T n (ω) is a common event time > 0 Otherwise we say that process N admits no common event times. Definition 3.2 generalizes that in Bremaud and Massouli [6] and Liniger [24] . In particular, with regard to the concepts of multivariate Hawkes processes studied in Liniger [24] , the genuine multivariate Hawkes processes [24] admits no common event times, whereas in the case of pseudo-multivariate Hawkes process [24] all event times are common. We start with Definition 3.3. We define the i − th coordinate N i of N as for A ∈ E i and t ≥ 0, where Clearly, N i is a MPP and Indeed, the i-th coordinate process N i can be represented as a sequence N i = (T i k , Y i k ) k≥1 , which is related to the sequence (T n , X i n ) n≥1 as follows where k i = max{n : m i n < ∞}, with m i defined as In particular this means that for the i-th coordinate N i the times T n (ω) such that X i n (ω) = ∆ are disregarded as event times for this coordinate since the events occurring with regard to the entire N at these times do not affect the i-th coordinate. We define the completed filtration [22] the filtration F N i satisfies the usual conditions. We define the explosion time Clearly, T i ∞ ≤ T ∞ . We conclude this section with providing some more insight into the properties of the measure N i . Towards this end, we first observe that the measure N i is both F N -optional and F N i -optional. Subsequently, we will derive the compensator of N i with respect to F N and the compensator of N i with respect to F N i . The following Proposition 3.4 and Proposition 3.7 come handy in this regard. Proposition 3.4. Let N be a generalized multivariate Hawkes process with Hawkes kernel κ. Then the (F N , P)-compensator, say ν i , of measure N i defined in (3.2) is given as Proof. According to Theorems 4.1.11 and 4.1.7 in [22] the i-th coordinate N i admits a unique F N -compensator, say ν i , with a property that For every n and A ∈ E i the processes M i,n,A and M i,n,A given as are (F N , P)-martingales. Hence the process is an F N -predictable martingale. Since it is of integrable variation and null at t = 0 it is null for all t ≥ 0 (see e.g. Theorem VI.6.3 in [16] ). From the above and the fact that This proves the proposition. Remark 3.5. Note that for each i, the function κ i defined in (3.7) is a measurable kernel from (Ω × R + , P ⊗ B(R + )) to (E i , E i ). It is important to observe that, in general, there is no one-to-one correspondence between the Hawkes kernel κ and all the marginal kernels κ i , i = 1, . . . , d. We mean by this that may exist another Hawkes kernel, say κ, such that κ = κ and The following important result gives the F N i -compensator of measure N i . Proposition 3.7. Let N be a generalized multivariate Hawkes process with Hawkes kernel κ. Then the F N i -compensator of measure N i , say ν i , is given as Proof. 
Using Theorems 4.1.9 and 3.4.6 in [22] , as well as the uniqueness of the compensator, it is enough to show that for any A ∈ E i and any n ≥ 1 the process We will provide now some examples of generalized multivariate Hawkes processes. For ω = (t n , x n ) n≥1 , t ≥ 0 and A ∈ E ∆ we set (3.10) In all examples below we define the kernel κ of the form (2.4) with η and f properly chosen, so that we may apply Theorem 2.5 to the effect that there exists a probability measure P ν on (Ω, F) such that process N given by (3.10) is a Hawkes process with the Hawkes kernel equal to κ. In other words, there exists a probability measure P ν on (Ω, F) such that ν given in For a Hawkes process N with a mark space E ∆ we introduce the following notation Likewise, we denote for i = 1, . . . , d, We take d = 1 and E 1 = {1}, so that E ∆ = E 1 = {1}. As usual, and in accordance with (2.2), we identify N with a point process (N t ) t≥0 . Now we take where λ is positive, locally integrable function, and, for 0 ≤ s ≤ t, we take for some non-negative function w defined on R + (recall that f (t, s, 1, {1}) = 0 for s ≥ t). Using these objects we define κ by In case of the classical univariate Hawkes process sufficient conditions under which the explosion time is almost surely infinite, that is are available in terms of the Hawkes kernel. Specifically, sufficient conditions for no-explosion are given in [1] : λ is locally bounded, and ∞ 0 w(u)du < ∞. In the case of a generalized bivariate Hawkes process N we have d = 2 and the mark space is given as Here, in order to define kernel κ, we take kernel η in the form where δ ∆ is a Dirac measure, η i for i = 1, 2 are probability kernels, from (R + , B(R + )) to Kernel f is given, for 0 ≤ s ≤ t and x = (x 1 , x 2 ), by The decay functions w i,j and the impact functions g i,j , i, j = 1, 2, c, are appropriately regular and deterministic. Moreover, the decay functions are positive and the impact functions are non-negative. In particular, this implies that the kernel f is deterministic and non-negative. In what follows we will need the concept of idiosyncratic group of I coordinates of a generalized bivariate Hawkes process N . For I = {1} we define Clearly, N idio,I is a MPP. For example, N idio,i is a MPP which records idiosyncratic events occurring with regard to X i ; that is, events that only regard to X i , so that X j n = ∆ for j = i at times T n at which these events take place. Likewise, N idio,{1,2} is a MPP which records idiosyncratic events occurring with regard to X 1 and X 2 simultaneously. 
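To make the notions of coordinates, common event times and idiosyncratic events concrete, we give a minimal Python sketch (Python is also the language we use for the simulations in Section 4). The event encoding below, with tuples of marks and None playing the role of the dummy mark ∆, is our own illustrative convention, not part of the formal setup.

```python
# Events of a bivariate GMHP: pairs (t_n, x_n), where x_n = (x_n^1, x_n^2)
# and None plays the role of the dummy mark Delta ("no event in this coordinate").
events = [(0.7, ("a", None)), (1.3, (None, "b")), (2.1, ("c", "d")), (3.0, ("e", None))]

def coordinate(events, i):
    """Events of the i-th coordinate N^i: keep (t_n, x_n^i) with x_n^i != Delta."""
    return [(t, x[i]) for (t, x) in events if x[i] is not None]

def common_event_times(events):
    """Times T_n at which at least two coordinates carry a non-dummy mark."""
    return [t for (t, x) in events if sum(xi is not None for xi in x) >= 2]

def idiosyncratic(events, I):
    """Events occurring exactly for the coordinates in the group I (Delta elsewhere)."""
    I = set(I)
    return [(t, x) for (t, x) in events
            if {i for i, xi in enumerate(x) if xi is not None} == I]

print(coordinate(events, 0))        # [(0.7, 'a'), (2.1, 'c'), (3.0, 'e')]
print(common_event_times(events))   # [2.1]
print(idiosyncratic(events, {0}))   # [(0.7, ('a', None)), (3.0, ('e', None))]
```

Note how the event at time 2.1 is a common event time in the sense of Definition 3.2, while the events at times 0.7 and 3.0 are idiosyncratic to the first coordinate.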
Let us note that We will now interpret various terms that appear in the expressions for η and f above: represents autonomous portion of the intensity, at time t, of marks of the coordinate N 1 taking values in the set dy 1 ⊂ E 1 and no marks occurring for N 2 ; η c (t, dy 1 , dy 2 ) represents autonomous portion of the intensity, at time t, of an event amounting to the marks of both coordinates N 1 and N 2 taking values in the set represents idiosyncratic impact of the coordinate N 1 alone on the intensity, at time t, of marks of the coordinate N 1 taking values in the set dy 1 ⊂ E 1 and no marks occurring for N 2 ; (0,t)×E 2 represents idiosyncratic impact of the coordinate N 2 alone on the intensity, at time t, of an event amounting to the marks of coordinate N 1 taking value in the set dy 1 ⊂ E 1 and no marks occurring for N 2 ; represents joint impact of the coordinates N 1 and N 2 on the intensity, at time t, of an event amounting to the marks of coordinate N 1 taking value in the set dy 1 ⊂ E 1 and no marks occurring for N 2 ; represents idiosyncratic impact of the coordinate N 1 alone on the intensity, at time t, of an event amounting to the marks of both coordinates N 1 and N 2 taking values in the set represents joint impact of the coordinates N 1 and N 2 on the intensity, at time t, of an event amounting to the marks of both coordinates N 1 and N 2 taking values in the set In particular, the terms contributing to occurrence of common events are η c (t, dy 1 , dy 2 ) and To complete this example we note that upon setting η c = 0 and φ c = 0 we produce a generalized bivariate Hawkes process with no common event times. Fix an arbitrary T > 0. In this section we first provide a construction of restriction to [0, T ] × E ∆ of a generalized multivariate Hawkes process, with deterministic kernels η and f , via Poisson thinning, that is motivated by a similar construction given in [5] . Then, based on our construction, we present a computational pseudo-algorithm for simulation of a generalized multivariate Hawkes process restricted to [0, T ] × E ∆ . We are concerned here with a generalized multivariate Hawkes process admitting the Hawkes kernel of the form . We may, and we do, represent kernels η, f as Note that Q 1 and Q 2 are deterministic probability kernels. Since we are concerned with a restricted Hawkes process we consider a Hawkes kernel κ T which is a restriction to [0, T ] of κ that is For simplicity of notation we suppress T in the notation below. So, for example, we will write f rather than We make the following standing assumption: for some measurable mapping f : Now we describe a construction of Hawkes process with Hawkes kernel given by (4.2) . This construction leads immediately to a pseudo-algorithm, presented in the next section, for simulation of such Hawkes process. In what follows we will define recursively a sequence of random measures (N k ) k≥0 that provide building blocks for our Hawkes process. Towards this end we first let β be the Borel isomorphism between the space E ∂ and a Borel subset of R d ∪ ∂, with the convention that β(∂) = ∂. Our construction will proceed in several steps. Step 1). Let us consider an array where we use the convention that 0 0 = 1. Therefore, for a random variable U uniformly distributed on (0, 1] the random variable D(λ, U ) has Poisson distribution with parameter λ ≥ 0, where we extend the concept of Poisson distribution by allowing λ = 0. Moreover let Existence of such functions G 1 and G 2 is asserted by Lemma 3.22 in [20] . 
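The inverse-transform mechanism behind D(λ, U) can be made explicit. Below is a minimal sketch, using the convention 0^0 = 1 so that D(0, u) = 0: given u ∈ (0, 1], D(λ, u) is the smallest n at which the Poisson(λ) distribution function reaches u. The function G plays, in the special case of a finite mark space, the role that Lemma 3.22 in [20] guarantees for G_1 and G_2 on general Borel spaces; its name and the sample values are ours.

```python
import math, random

def D(lam, u):
    """Inverse-transform sample of a Poisson(lam) variable from u in (0, 1].
    With the convention 0^0 = 1 this returns 0 when lam = 0."""
    n, cdf, pmf = 0, math.exp(-lam), math.exp(-lam)  # pmf at n = 0 is e^{-lam}
    while cdf < u:
        n += 1
        pmf *= lam / n          # recursive update of e^{-lam} lam^n / n!
        cdf += pmf
    return n

def G(probs, atoms, u):
    """Inverse-transform sample of a discrete law sum_k probs[k] * delta_{atoms[k]}.
    Stands in for the measurable functions G_1, G_2 when the mark space is finite."""
    cdf = 0.0
    for p, x in zip(probs, atoms):
        cdf += p
        if u <= cdf:
            return x
    return atoms[-1]  # guard against floating-point round-off

u = random.random() or 1e-12     # uniform draw, pushed into (0, 1]
print(D(2.5, u), G([0.3, 0.7], ["mark1", "mark2"], u))
```

This is exactly the mechanism used in Step 1) above: a single array of independent uniforms, transformed through D, G_1 and G_2, drives all the randomness of the construction.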
We use the left open intervals of integration above so to be consistent with the the rest of the construction. The reason that we work with left open intervals in the rest of the construction is that the births of the offsprings occur after the appearance of their parents (e.g., after arrivals of the immigrants), see Section 2.2. This feature is explicitly accounted for in the construction presented here. Step 2). Using where Then, we consider a sequence j constructed as follows: Observe that S 0 j < S 0 j+1 on {S 0 j < ∞}, and that the measure N 0 may be identified with the sequence The representation (4.5) is more convenient for our needs than the representation (4.4). This is because the sequence (S 0 j , Y 0 j ) ∞ j=1 is ordered with respect to the first component, so that this sequence is a MPP and thus measure N 0 may also be considered as a MPP. Step 3). Now, we proceed by recurrence. So, for k ∈ N suppose that we have constructed a random sequence in the following way: Note that the random variable P k+1 otherwise, if S k j ≥ T , these elements are all constant and equal to (∞, 0, ∂). Moreover, they are σ(N 0 , . . . , N k )-conditionally independent random elements, and the σ(N 0 , . . . , Similarly as above we observe that the random measure N k+1 can be identified with the random sequence (S k+1 n , Y k+1 Indeed, we have Since the sequence S k+1 n , Y k+1 n ∞ n=1 forms a MPP, then N k+1 may be considered as a MPP. Step 4). Define a sequence of random measures H k , k ≥ 1, on B(R + ) ⊗ E ∆ in terms of the previously constructed marked point processes (N j ) j≥0 by , can be associated with H k in a way analogous to how the sequence S k+1 n , Y k+1 n ∞ n=1 has been associated with N k+1 . Consequently, H k may be considered as an MPP. Step 5). Repeat Step 3 and Step 4 infinitely many times to obtain limiting random measure H ∞ on B(R + ) ⊗ E ∆ given by (4.11) Remark 4.1. It is important to note that all random measures introduced in the construction above do not charge any set Now we will justify that the construction given in Steps 1-5 above delivers a generalized multivariate Hawkes process with the Hawkes kernel given in (4.1). Towards this end let us introduce the following filtrations: Our first aim is to compute H ∞ -compensator of the limiting random measure H ∞ given in (4.11) . We begin with following key result, Proposition 4.2. i) The marked point process N 0 is an H 0 -doubly stochastic marked Poisson process. The random measure ν 0 given by is the H 0 0 -intensity kernel of N 0 . Moreover, ν 0 is the H 0 -compensator of N 0 . ii) For each j the marked point process N k+1 j is an H k+1 -doubly stochastic marked Poisson process. The random measure ν k+1 j given by Proof. i). Note that from Lemma 7.2, by taking it follows that N 0 is H 0 0 -conditional Poisson random measure with intensity measure ν 0 given by (4.12) . Now, the assertion follows from the point i) of Proposition 7.3. ii). We first note that from Lemma 7.2, by taking it follows that N k+1 j defined by (4.6) is H k ∞ -conditionally Poisson random measure with intensity measure ν k+1 j given by (4.13) . To complete the proof, in view of assertion ii) of Proposition 7.3, it suffices show that the marked point processes (N k+1 are conditionally independent given σ(N 0 , . . . , N k ). For this we first note that for each j the random measure N k+1 j is defined by (4.6), so it is constructed from the pair (S k j , Y k j ), which is σ(N 0 , . . . , N k )-measurable and from the family . 
Now, using the fact that I 1 , I 2 , . . . are independent between themselves and also independent from σ(N 0 , . . . , N k ), we conclude that N k+1 1 , N k+1 2 , . . . are (N 0 , . . . , N k )-conditionally independent. So we see that N k+1 j is a H k+1 0 -conditional Poisson random measure for any j ≥ 1, and that (N k+1 j ) j≥1 are H k+1 0 -conditionally independent random measures. Thus we may use Proposition 7.3 to conclude that N k+1 j is an H k+1 -doubly stochastic marked Poisson process whose H k+1 0 -intensity kernel is ν k+1 j given by (4.13) From Proposition 4.2 and from its proof we conclude that the random measure N k+1 given by (4.9) is a sum of H k ∞ -conditionally independent H k+1 -doubly stochastic marked Poisson processes. We will prove now that N k+1 is also an H k+1 -doubly stochastic marked Poisson process whose intensity kernel is simply the sum of intensity kernels of N k+1 j , j ≥ 0. for 0 ≤ s ≤ t, D ∈ E ∆ . Moreover, the intensity kernel ν k+1 of N k+1 is the H k+1 -compensator of N k+1 . Proof. To prove the first assertion, in view of Proposition 6.1.4 in [22] , it suffices to show that ν k+1 is the H k+1 -compensator of N k+1 . Indeed this compensating property implies that where the last equality follows from H k+1 0 -measurability of ν k+1 . So, if ν k+1 is H k+1compensator of N k+1 then it is H k+1 0 -intensity kernel and, thus, Theorem 6.1.4 in [22] implies the first assertion. Therefore it remains to show that ν k+1 is H k+1 -compensator of N k+1 . Towards this end we first note that from Proposition 4.2 it follows that N k+1 j is an H k+1doubly stochastic marked Poisson process with H k+1 -compensator ν k+1 j given by (4.13) . So, for an arbitrary non-negative H k+1 -predictable function F : This implies that (ω, dt, dy)P(dω), µ(dt, dy, dω) = N k+1 (ω, dt, dy)P(dω) and once again for µ j (dt, dy, dω) = ν k+1 j (ω, dt, dy)P(dω), µ(dt, dy, dω) = ν k+1 (ω, dt, dy)P(dω) we see that and since we obtain that (4.14) holds. This concludes the proof of the first assertion. Now we will prove that the H k+1 0 -intensity kernel ν k+1 of N k+1 is the H k+1 -compensator of N k+1 . For this, we first observe that from Theorem 6.1.4 in [22] it follows that the intensity kernel of N k+1 is the H k+1 -compensator of N k+1 . So, for an arbitrary non-negative In order to proceed we will need the following auxiliary result. Proof. The necessity follows from Proposition 5.9.1.1 in [19] . To prove sufficiency it is enough to show, again by Proposition 5.9.1.1 in [19] , that for every t ≥ 0 and every bounded F ∞ -measurable random variable ξ it holds that Fix ξ, we need to show that for every A ∈ F t ∨ G t . Towards this end let us consider a family U of sets defined as Note that U is a π-system of sets which generates F t ∨ G t . Observe that family of all sets for which (4.18) holds is a λ-system. Thus, by the Sierpinski's Monotone Class Theorem (cf. Theorem 1.1 in [20] ), which is also known as the Dynkin's π − λ Theorem, it suffices to prove (4.18) for the sets from U, which we will do now so to complete the proof. For A ∈ U, we have where the fourth equality follows from (4.17) for η = 1 C . We are now ready to demonstrate the following proposition. for every u ≥ 0 and every A ∈ U, where . . , D n are disjoint sets, and Indeed, if (4.19) holds for A ∈ U, then since U is a π-system which generates F N k+1 u the monotone class theorem implies that (4.17) holds. It remains to show (4.19) for A ∈ U. 
Using Proposition 4.3 and invoking (7.11) we have (4.20) Since H k+1 0 = H k ∞ and ν k+1 ((s i , t i ] × D) is H k t i measurable we infer that the right hand side of (4.20) is H k tn -measurable and hence also H k u -measurable for arbitry u ≥ t n . Consequently by taking conditional expectations with respect to H k u for u ≥ t n we conclude that (4.19) holds for A ∈ U. The proof is complete. We will determine now the compensators for H 0 := N 0 and for H k given by (4.10) for k ≥ 1. Proposition 4.6. The H 0 -compensator of H 0 , is given by where the kernel η appears in (4.1). The H k -compensator of H k , for k ≥ 1, is given by Proof. The proof goes by induction. Since H 0 = N 0 , then the form of H 0 -compensator of H 0 follows from assertion i) of Proposition 4.2 and from Proposition 6.1.4 [22] . Suppose now that H k -compensator of H k is given by (4.21) . This means that for every D ∈ E ∆ the process is an H k -local martingale. Proposition 4.5 implies that M k (D) is an H k+1 -local martingale. We know from Proposition 4.3 that is an H k+1 -local martingale. Thus M k (D) + L k+1 (D) is an H k+1 -local martingale. This H k+1 -local martingale can be written in the form where the second equality follows from Note that the random measure ϑ k + ν k+1 is H k+1 -predictable so it is the H k+1 -compensator of H k+1 . To complete the proof it suffices to show that ϑ k + ν k+1 = ϑ k+1 . By the induction hypothesis on ϑ k and by (4.14) we have The proof is complete. Before we conclude our construction of a generalized multivariate Hawkes process, we derive the following result. Proof. Proposition 4.5 and Proposition 4.6 imply that for every k ≥ 1, the H ∞ -compensator of H k is given by (4.21) . Thus, we see that for any k ≥ 1 and for an arbitrary non-negative This completes the proof. We are now ready to conclude our construction of a generalized multivariate Hawkes process. Let T ∞ be the first accumulation time of H ∞ . 4 Then we have the following Proof. Let us define a sequence (T n , X n ) n≥1 by and the random measure where ϑ ∞ is given in (4.23 Towards this end note that for arbitrary 0 ≤ s < t ≤ T and D ∈ E ∆ we have The second term above can be written as This and (4.25) imply that ϑ ∞ | ]]0,T∞[[×E ∆ is an F N -predictable random measure such that for arbitrary non negative F N -predictable function F : Thus N is a F N -Hawkes process (restricted to [0, T ] × E ∆ ) with the Hawkes kernel κ. In the description of the pseudo-algorithm below we use the objects η, f , η, f , G 1 and G 2 that underly the construction of our Hawkes process given in Section 4.1. The steps of the pseudo-algorithm are based on the steps presented in our construction of a generalized multivariate Hawkes process with deterministic kernels η and f , and they are: Step 0. Choose a positive integer K, set C 0 = ∅. Step 1. Generate a realization, say p, of a Poisson random variable with parameter T η. Step 2. If p = 0, then go to Step 3. Else, if p > 0, then for i = 1, . . . , p : -Generate realizations u and v of independent random variables uniformly distributed on [0, 1]. Set t = T u, a = η. -If a ≤ η(t, E ∆ ), then generate a realization w of random variable uniformly distributed on [0, 1], compute x = G 1 (t, w) and include (t, x) into the cluster C 0 . Step 3. Set N = C 0 , C prev = C 0 , k = 0. Step 4. While C prev = ∅ and k ≤ K : -For every (s, y) ∈ C prev : * generate a realization p of Poisson random variable with parameter (T − s) f (s, y). * for i = 1, . . . 
, p: Generate realizations u and v of independent random variables uniformly If a ≤ f (t, s, y, E ∆ ), then generate a realization w of random variable uniformly distributed on [0, 1], compute x = G 2 (t, s, y, w) and include (t, x) into the cluster C new . -Set k = k + 1. Step 5. Return N . The pseudo-algorithm presented above is implemented here in two cases. In the first case, presented in Example 4.9, we implemented the algorithm for a generalized bivariate Hawkes process with E 1 = E 2 = {1}. In the second case, presented in Example 4.10, we set We used Python to run the simulations and to plot graphs. Here we implement our pseudo-algorithm for a bivariate point Hawkes process, that is the generalized bivariate Hawkes process with E 1 = E 2 = {1}, and hence with Moreover, we let and α i , η i (0), β i are non-negative constants. We assume that, for 0 ≤ s ≤ t, the kernel f is given as in (3.11) with the decay functions w i,j in the exponential form: with constant non-negative impact functions: g 1,1 (x 1 ) = ϑ 1,1 , g 1,2 (x 2 ) = ϑ 1,2 , g 1,c (x) = ϑ 1,c , g 2,1 (x 1 ) = ϑ 2,1 , g 2,2 (x 2 ) = ϑ 2,2 , g 2,c (x) = ϑ 2,c , g c,1 (x 1 ) = ϑ c,1 , g c,2 (x 2 ) = ϑ c,2 , g c,c (x) = ϑ c,c , and with Dirac kernels: Thus, the kernel f is of the form The coordinates of N (cf. (3.2)) reduce here to counting (point) processes, so that . Simulated sample paths of N corresponding to the above setting are presented in Figure 1 and Figure 2 . Here we apply our pseudo-algorithm to Example 3.9 with d = 2 and E 1 = E 2 = R. We let: η 1 (t, dy 1 ) = α 1 ϕ µ 1 ,σ 1 (y 1 )dy 1 , η 2 (t, dy 2 ) = α 2 ϕ µ 2 ,σ 2 (y 2 )dy 2 , η 1 (t, dy 1 ) = α c ϕ µc,σc (y 1 )ϕ µc,σc (y 2 )dy 1 dy 2 where α i ≥ 0, i ∈ {1, 2, c}, ϕ µ,σ is the one dimensional Gaussian density function with mean µ and variance σ 2 , and: Moreover, we set: and we take An important class of Hawkes processes considered in the literature is the one of Hawkes processes for which the Hawkes kernel is given in terms of exponential decay functions. See, e.g., [7] , [25] , [33] . One interesting and useful aspect of such processes is that they can be extended to Markov processes, a feature that we term the Markovian aspects of a generalized bivariate Hawkes process . To simplify the presentation, we will discuss Markovian aspects of generalized bivariate Hawkes processes specified in Example 4.9. Using this specification we end up with the Hawkes kernel κ of the form: where, for i = 1, 2, c, we have λ i 0 := η i (0) and We now refer to canonical space as in Section 2.1, and to the random measure ν corresponding to κ as in (2.3). So, using Theorem 2.5 we see that there exists a unique probability P ν such that the canonical process N given as in (3.10) is a generalized multivariate Hawkes process with Hawkes kernel κ. It is straightforward to verify (upon appropriate integration of the kernel κ i.e. over {1} × {1, ∆} for N 1 and {1, ∆} × {1} for N 2 ) that the F N -intensity of process N i , say λ i , is given as is the square bracket of N 1 , N 2 . Then, for i = 1, 2, c, the equality (5.2) can be written as for t ≥ 0. This follows from the fact that [N 1 , N 2 ] counts common jumps of N 1 and N 2 , so for i = 1, 2 the processN i is counts the idiosyncratic jumps of N i , that is the jumps that do not occur simultaneously with the jumps of N j , j = i. 
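Before interpreting the parameters, a small numerical sketch of these intensity dynamics may help. A caveat: the display (5.4) is not reproduced here, so the code below assumes the standard exponential-Hawkes form, in which λ^i decays at rate β_i towards a baseline a_i and jumps by ϑ_{i,j} at each jump of the process of type j ∈ {1, 2, c}; all names and parameter values are our illustrative assumptions.

```python
import math

# Assumed exponential form of (5.4):
#   lambda^i_t = a_i + (lambda^i_0 - a_i) e^{-beta_i t}
#              + sum over earlier jump times s of type j: theta[(i, j)] e^{-beta_i (t - s)}
beta = {"1": 1.5, "2": 1.0, "c": 2.0}
a = {"1": 0.4, "2": 0.3, "c": 0.1}          # baselines (assumption)
lam0 = {"1": 0.4, "2": 0.3, "c": 0.1}       # initial intensities
theta = {("1", "1"): 0.5, ("1", "2"): 0.2, ("1", "c"): 0.3,
         ("2", "1"): 0.1, ("2", "2"): 0.6, ("2", "c"): 0.2,
         ("c", "1"): 0.05, ("c", "2"): 0.05, ("c", "c"): 0.4}

def intensity(i, t, history):
    """Evaluate lambda^i_t from a history of (time, type) jumps, type in {'1','2','c'}."""
    val = a[i] + (lam0[i] - a[i]) * math.exp(-beta[i] * t)
    for s, j in history:
        if s < t:  # only strictly earlier jumps excite, cf. the (0, t) convention
            val += theta[(i, j)] * math.exp(-beta[i] * (t - s))
    return val

history = [(0.8, "1"), (1.1, "c"), (2.0, "2")]  # type "c" is a common jump of N^1 and N^2
for i in ("1", "2", "c"):
    print(i, round(intensity(i, 2.5, history), 4))
```

In this sketch a common jump (type "c") excites all three intensity components at once, which is precisely the mechanism that distinguishes GMHPs from classical multivariate Hawkes processes.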
In particular, expression (5.4) allows us to give the interpretation of the parameters ϑ i,j , i, j ∈ {1, 2, c}, namely the parameter ϑ i,j describes the impact of the jump of the processN j on the intensity ofN i . Now, let us consider a bivariate counting process N := (N 1 , N 2 ). Note that we may, and we do, identify process N with our bivariate generalized Hawkes process N : T 0 = 0, T n = inf {t > T n−1 : ∆ N t = (0, 0)}, and for i = 1, 2 Also, note that we may, and we do, identify the process N with a random measure µ N on R + × E, where E = {(1, 0), (0, 1), (1, 1)}, given by Thus, we may slightly abuse terminology and call N a generalized bivariate Hawkes process. Theorem 5.1. Let N be a Hawkes process defined as above. Then i) The process N = (N 1 t , N 2 t ) t≥0 is not a Markov process. ii) The process Z = (λ 1 t , λ 2 t , λ c t , N 1 t , N 2 t ) t≥0 is a Markov process with the strong generator A acting on C ∞ c (R 5 + ) given by Av(λ 1 , λ 2 , λ c , n 1 , n 2 ) (5.7) λ 2 , λ c , n 1 , n 2 ) + (v(λ 1 + ϑ 1,1 , λ 2 + ϑ 2,1 , λ c + ϑ c,1 , n 1 + 1, n 2 ) − v(λ 1 , λ 2 , λ c , n 1 , n 2 ))λ 1 + (v(λ 1 + ϑ 1,2 , λ 2 + ϑ 2,2 , λ c + ϑ c,2 , n 1 , n 2 + 1) − v(λ 1 , λ 2 , λ c , n 1 , n 2 ))λ 2 + (v(λ 1 + ϑ 1,c , λ 2 + ϑ 2,c , λ c + ϑ c,c , n 1 + 1, n 2 + 1) − v(λ 1 , λ 2 , λ c , n 1 , n 2 ))λ c . Proof. i) From (5.4) and (5.6) we see that for any t > 0 the quantity ν(dt, dy) given in (5.5) depends on the entire path of N until time t. Thus, by Theorem 4 in [15] , the process N is not a Markov process. ii) First note that (5.4) can be written as Hence using stochastic integration by parts one can show that λ i can be represented as This and (5.6) implies that the process Z is an F Z -semimartingale with characteristics (with respect to cut-off function h(x) = x1 |x|<1 ) , 0, 0 and ν t (dy 1 , dy 2 , dy c , dz 1 , dz 2 ) (5.8) := λ 1 u− δ (ϑ 1,1 ,ϑ 2,1 ,ϑ c,1 ,1,0) (dy 1 , dy 2 , dy c , dz 1 , dz 2 ) + λ 2 u− δ (ϑ 1,2 ,ϑ 2,2 ,ϑ c,2 ,0,1) (dy 1 , dy 2 , dy c , dz 1 , dz 2 ) + λ c u− δ (ϑ 1,c ,ϑ 2,c ,ϑc,c,1,1) (dy 1 , dy 2 , dy c , dz 1 , dz 2 ). This, by Theorem II.2.42 in [18] , implies that for any function v ∈ C 2 b (R 5 ) the process M v given as is an F Z -local martingale. Hence, for any v ∈ C ∞ c (R 5 ) the process defined above is a martingale under P, since v and Av are bounded, which follows from the fact that v ∈ C ∞ c (R 5 ) has compact support, and thus the local martingale M v is a martingale for such v. Consequently, the process Z solves martingale problem for (A, ρ) , where ρ is the deterministic initial distribution of Z, that is ρ(dz) = δ Z 0 (dz). We will now verify that Z is a Markov process with generator A given in (5.7) using Theorem 4.4.1 in [10] . For this, we first observe that parameters determining A, i.e. are admissible in the sense of Definition 2.6 in [8] . Thus, invoking Theorem 2.7 in [8] we conclude that there exists a unique regular affine semigroup (P t ) t≥0 with infinitesimal generator A given by (5.7). Hence, there exists a unique regular affine process with generator A and with transition function P defined by (P t ) t≥0 . Since A is a generator of regular affine process it satisfies the Hille-Yosida conditions (cf. Theorem 1.2.6 in [10] ) relative to the Banach space B(R 5 ) of real valued, bounded and measurable functions on R 5 . Moreover, from Corollary 1.1.6 in [10] it follows that A is a closed operator. Now, using Theorem 4.4.1 in [10] we obtain that Z is a Markov process with generator A. Moreover, P is the transition function of Z. 
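The Markov structure established in Theorem 5.1 makes exact simulation of Z straightforward: between jumps the state evolves deterministically and the intensities decay, so Ogata-style thinning with the current total intensity as a (decreasing) upper bound is exact. The following self-contained sketch uses the same assumed exponential specification as before; baselines a_i, decay rates β_i and jump sizes ϑ_{i,j} are illustrative.

```python
import math, random

beta = {"1": 1.5, "2": 1.0, "c": 2.0}
a = {"1": 0.4, "2": 0.3, "c": 0.1}
theta = {("1", "1"): 0.5, ("1", "2"): 0.2, ("1", "c"): 0.3,
         ("2", "1"): 0.1, ("2", "2"): 0.6, ("2", "c"): 0.2,
         ("c", "1"): 0.05, ("c", "2"): 0.05, ("c", "c"): 0.4}

def simulate_Z(T, seed=0):
    """Simulate Z = (lambda^1, lambda^2, lambda^c, N^1, N^2) on [0, T] by thinning.
    Starting from the baselines, each lambda^i stays >= a_i and decays between
    jumps, so the current total intensity dominates all future intensities until
    the next jump; the thinning below is therefore exact."""
    rng = random.Random(seed)
    lam = dict(a)                # start from the baselines (assumption)
    t, N1, N2, path = 0.0, 0, 0, []
    while True:
        bound = sum(lam.values())
        dt = rng.expovariate(bound)
        # decay the components over the candidate waiting time
        cand = {i: a[i] + (lam[i] - a[i]) * math.exp(-beta[i] * dt) for i in lam}
        t += dt
        if t > T:
            return path
        if rng.random() <= sum(cand.values()) / bound:          # accept candidate jump
            j = rng.choices(["1", "2", "c"],
                            weights=[cand["1"], cand["2"], cand["c"]])[0]
            lam = {i: cand[i] + theta[(i, j)] for i in cand}    # self/mutual excitation
            N1 += 1 if j in ("1", "c") else 0                   # a common jump hits both
            N2 += 1 if j in ("2", "c") else 0
            path.append((t, j, N1, N2))
        else:
            lam = cand                                          # rejected: just decay

print(simulate_Z(5.0)[:5])
```

Note that an event of type "c" increments both N^1 and N^2, producing the common event times of Definition 3.2.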
Let us note that, using an analogous argument as in the proof of Theorem 5.1, we can prove that the process Y^1 := (λ^1_t + λ^c_t, N^1_t)_{t≥0} is a Markov process in the filtration F^Z provided that the parameters of λ^k, k ∈ {1, c}, satisfy an appropriate compatibility condition. An analogous statement is valid for Y^2 := (λ^2 + λ^c, N^2).

There are numerous potential applications of the generalized multivariate Hawkes processes. Here we present a brief description of possible applications in seismology, in epidemiology and in finance.

In the Introduction to [27] the author writes: "Lists of earthquakes are published regularly by the seismological services of most countries in which earthquakes occur with frequency. These lists supply at least the epicenter of each shock, focal depth, origin time and instrumental magnitude. Such records from a self-contained seismic region reveal time series of extremely complex structure. Large fluctuations in the numbers of shocks per time unit, complicated sequences of shocks related to each other, dependence on activity in other seismic regions, fluctuations of seismicity on a larger time scale, and changes in the detection level of shocks, all appear to be characteristic features of such records. In this manuscript the origin times are mainly considered to be modeled by point processes, with other elements being largely ignored, except that the ETAS model and its extensions use data of magnitudes and epicenters."

In particular, the dependence on (simultaneous) seismic activity in other seismic regions has been ignored in the classical univariate ETAS model, and in all other models that we are aware of. The ETAS model is a univariate self-exciting point process, in which the shock intensity at time t, corresponding to a specific seismic location, is designed as (cf. Equation (17) in [27])

λ(t | H_t) = µ + Σ_{t_m < t} g(t − t_m, M_m).

In the above formula, H_t stands for the history of after-shocks at the given location, µ represents the background occurrence rate of seismic activity at this location, the t_m are the times of occurrences of all after-shocks that took place prior to time t at the specific seismic location, M_m is the magnitude of the shock occurring at time t_m, and M_0 is the cut-off magnitude of the data set; we refer to [27] for details. As said above, dependence between (simultaneous) seismic activity in different seismic regions has been ignored in the classical univariate ETAS model. Below we suggest a possible method to construct a generalized multivariate Hawkes process that may offer a good way of modeling joint seismic activity at various locations, accounting for dependencies between seismic activities at different locations and for consistency with local data. We will now briefly describe this construction, which leads to a plausible model that we name the multivariate generalized ETAS model. Towards this end we consider a GMHP N (cf. Definition 3.1), where the index i = 1, . . . , d represents the i-th seismic location, and where the set E_i = M_i := {m_1, m_2, . . . , m_{n_i}} of marks is a discrete set whose elements represent possible magnitudes of seismic shocks with epicenter at location i. In the corresponding Hawkes kernel κ the measure η(t, dy) represents the time-t background distribution of shocks across all seismic regions, and the measure f(t, s, x, dy) represents the feedback effect.
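As a point of reference for this construction, here is a minimal sketch of the classical univariate ETAS intensity displayed above. The function g is not fully specified here, so we assume, purely for illustration, the standard ETAS choice g(t, M) = K e^{α(M − M_0)} (t + c)^{−p} (the magnitude-scaled modified Omori law); all parameter values are illustrative.

```python
import math

def etas_intensity(t, catalog, mu=0.2, K=0.05, alpha=1.2, c=0.01, p=1.1, M0=3.0):
    """Classical univariate ETAS intensity at time t (cf. [27]):
    lambda(t | H_t) = mu + sum over past shocks (t_m, M_m) of
                      K * exp(alpha * (M_m - M0)) * (t - t_m + c) ** (-p).
    `catalog` is a list of (t_m, M_m) pairs; the parametrization of g is assumed."""
    rate = mu
    for t_m, M_m in catalog:
        if t_m < t:
            rate += K * math.exp(alpha * (M_m - M0)) * (t - t_m + c) ** (-p)
    return rate

catalog = [(1.0, 4.5), (2.5, 3.8), (4.0, 5.1)]   # (origin time, magnitude)
print(round(etas_intensity(5.0, catalog), 4))
```

In the multivariate generalized ETAS model described next, each location carries such an intensity, and the kernel f additionally lets shocks at one location, or simultaneous shocks at several locations, excite the others.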
For the purpose of illustration, let d = 2. Suppose that local seismic data are collected for each location to the effect of producing local kernels κ^i as in (3.7). In particular, the quantity λ^i(t) := κ^i(t, E_i) = Σ_{y_i ∈ M_i} κ^i(t, y_i) represents the time-t intensity of seismic activity at the i-th location. In order to produce an ETAS-type model, we postulate ETAS-type forms for the terms of the Hawkes kernel, for j = 1, 2 and for the common component. In the resulting expressions, the t_{j,m} are the times of occurrences of after-shocks that took place prior to time t only at the j-th seismic location, and X_{j,t_{j,m}} is the magnitude of the after-shock at location j that took place at time t_{j,m}; the t_{c,m} are the times of occurrences of after-shocks that took place prior to time t at both seismic locations, and X_{j,t_{c,m}} is the magnitude of the after-shock at location j that took place at time t_{c,m}.

The classical univariate ETAS model has been extended in [26] to the (classical) univariate space-time ETAS model (see also Section 5 in [27]). It is important to note that our generalized multivariate Hawkes process may also be used to produce a useful multivariate generalization of this space-time extension. In order to see this, let us consider the model (2.1) in [26] with g as in Section 2.1 in [26] (in the original notation of [26], which should not be confused with our notation). Then, coming back to our generalized multivariate Hawkes process, let the seismic location i = 1, 2 be identified with a point in the plane with coordinates (a_i, b_i) ∈ R^2. Next, let the set of marks E_i be enlarged so as to record both the magnitude and the spatial coordinates of a shock. This will lead to a space-time generalized multivariate Hawkes process that will be studied elsewhere.

It was already observed by Hawkes in [12] that Hawkes processes may find applications in epidemiology for modeling the spread of epidemic diseases, accounting for various types of cases, such as children or adults, which can be taken as marks. This insight has been validated over the years in numerous studies. We refer for example to [30, 28, 21] and the references therein. It is important to account for the temporal and spatial aspects in the modeling of spread and intensity of epidemic and pandemic diseases, such as COVID-19. We believe that the variant of the generalized multivariate Hawkes process that we described at the end of Section 6.1 may offer a valuable tool in this regard. This will be investigated in a follow-up work.

Hawkes processes have found important applications in finance over the past two decades. We refer to [13] for a relevant survey. Here, we briefly discuss a possible application in finance of the generalized multivariate Hawkes processes. In a series of papers [2], [3], [1], a multidimensional model for stock prices driven by (multivariate) Hawkes processes was introduced. The model for stock prices is formulated in [2] via a marked point process N = (T_n, Z_n)_{n≥1}, where Z_n is a random variable taking values in {1, . . . , 2d}, and the compensator ν of N has the form given in [2], built from constants µ_i ∈ R_+ and functions φ_{i,j} from R_+ to R_+. Let us define the processes N^i, i = 1, . . . , 2d, by N^i_t := Σ_{n≥1} 1_{{T_n ≤ t, Z_n = i}}. Note that the above implies that N^1, . . . , N^{2d} have no common jumps and that the F^N-intensity of N^i is given by λ^i, which can be written in the form

λ^i_t = µ_i + Σ_{j=1}^{2d} ∫_{(0,t)} φ_{i,j}(t − s) dN^j_s.

In [2] it is assumed that a d-dimensional vector of asset prices S = (S^1, . . . , S^d) is based on N via the representation

S^i_t = S^i_0 + N^{2i−1}_t − N^{2i}_t.

The obvious interpretation is that N^{2i−1} corresponds to an upward jump of the i-th asset whereas N^{2i} corresponds to a downward jump of the i-th asset. A sketch of price paths built from such event streams, allowing also for common jumps, is given below.
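The following minimal sketch illustrates this price mechanism, extended in the GMHP spirit so that a single event may move several assets at once; the event encoding, tick size and numbers are our illustrative assumptions.

```python
# Each event is (t_n, moves), where moves maps asset index -> +1 (up) or -1 (down).
# Events moving several assets at once play the role of GMHP common event times.
events = [(0.4, {0: +1}), (0.9, {1: -1}), (1.7, {0: -1, 1: -1}), (2.2, {0: +1})]

def price_paths(events, s0=(100.0, 100.0), tick=0.5):
    """Piecewise-constant prices S^i_t = S^i_0 + tick * (N^{up,i}_t - N^{down,i}_t)."""
    s = list(s0)
    path = [(0.0, tuple(s))]
    for t, moves in sorted(events):
        for i, direction in moves.items():
            s[i] += tick * direction
        path.append((t, tuple(s)))
    return path

for t, s in price_paths(events):
    print(f"t = {t:.1f}: S = {s}")
```

The co-jump at t = 1.7 is exactly what the common excitation mechanism of a GMHP can generate, and what the model of [2] excludes by construction.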
Bacry et al. [2] showed that within such a framework some stylised facts about high-frequency data, such as microstructure noise and the Epps effect, are reproduced. Using GMHPs we can easily generalize their model in several directions. In particular, a model of stock price movements driven by a generalized multivariate Hawkes process N allows for common jumps in the upward and/or downward direction. This can be done by a suitable choice of the multivariate mark space of N and of the corresponding Hawkes kernel, built from constants µ_e ∈ R_+ and functions φ_{e,x} from R_+ to R_+ indexed by the enlarged set of event types. Including the possibility of embedding co-jumps of the prices of various stocks in the common excitation mechanism may turn out to be important in modeling the order book evolution in general, and in pricing basket options in particular.

In this appendix we provide some auxiliary concepts and results that are needed in the rest of the paper. Let (Ω, F, P) be a probability space and (X, X) be a Borel space. For a given sigma field G ⊆ F, we define a G-conditionally Poisson random measure on (R_+ × X, B(R_+) ⊗ X) as follows:

Definition 7.1. Let ν be a σ-finite random measure on (R_+ × X, B(R_+) ⊗ X). A random measure N on (R_+ × X, B(R_+) ⊗ X) is a G-conditionally Poisson random measure with intensity measure ν if the following two properties are satisfied:
1. For every C ∈ B(R_+) ⊗ X such that ν(C) < ∞, the random variable N(C) has, conditionally on G, the Poisson distribution with mean ν(C).
2. For arbitrary n = 1, 2, . . . , and arbitrary disjoint sets C_1, . . . , C_n from B(R_+) ⊗ X, such that ν(C_m) < ∞, m = 1, . . . , n, the random variables N(C_1), . . . , N(C_n) are G-conditionally independent.

Clearly, ν is G-measurable. Note that if G is the trivial σ-field (or if N is independent of G), then N is a Poisson random measure (see Chapter 4.19 in [29]), which is sometimes referred to as the Poisson process on R_+ × X (see e.g. [20]). In this case ν is a deterministic σ-finite measure. For G = σ(ν), the σ(ν)-conditional Poisson random measure is also known in the literature as a Cox process directed by ν (see [20]).

Now we will provide a construction of a G-conditional Poisson random measure with the intensity measure given in terms of a specific kernel g. In fact, the measure constructed below is supported on sets from B((0, T]) ⊗ X, in the sense that for any set C that has an empty intersection with (0, T] × X the value of the measure is 0 almost surely. We begin by letting g(t, y, dx) be a finite kernel from (R_+ × Y, B(R_+) ⊗ Y) to (X, X), where (Y, Y) and (X, X) are Borel spaces, satisfying

g(t, y, X) = 0 for t > T. (7.1)

Next, let ∂ be an element external to X, and define a kernel g^∂ from (R_+ × Y, B(R_+) ⊗ Y) to (X^∂, X^∂) as g^∂(t, y, dx) = λ(t, y)γ(t, y, dx), where λ(t, y) = g(t, y, X), and γ(t, y, dx) := g(t, y, dx)/g(t, y, X) whenever g(t, y, X) > 0, with γ(t, y, ·) := δ_∂ otherwise. Suppose, moreover, that ℓ : Y → [0, T] is a measurable function, which will play the role of the time after which points may appear, and set λ̄(y) := sup_{t ≤ T} λ(t, y), which we assume to be finite. Next, take Y to be a (Y, Y)-valued random element which is G-measurable, and let Z and (U_m, V_m, W_m)_{m=1}^∞ be independent random variables uniformly distributed on (0, 1] and independent of G. We now define a random measure N on (R_+ × X, B(R_+) ⊗ X) as

N(dt, dx) := Σ_{m≥1} A_m 1_{{X_m ≠ ∂}} δ_{(T_m, X_m)}(dt, dx), (7.2)

where P, (T_m, A_m, X_m)_{m=1}^∞ are random variables defined by a transformation of the sequence Z, (U_m, V_m, W_m)_{m=1}^∞ and the random element Y in the following way: P is a Poisson variable with (conditional) mean (T − ℓ(Y))λ̄(Y) produced from Z by inverse transform; for m ≤ P, T_m is drawn uniformly from (ℓ(Y), T] via U_m, A_m := 1_{{V_m λ̄(Y) ≤ λ(T_m, Y)}} is a thinning indicator, and X_m is a γ(T_m, Y, ·)-distributed mark produced from W_m; for m > P we set (T_m, A_m, X_m) := (∞, 0, ∂). Using the above set-up we see that, for each m = 1, 2, . . . , the G-conditional law of (T_m, A_m, X_m) can be computed explicitly; here δ_{(∞,0,∂)} is the Dirac measure describing this law on the event {m > P}. Note that even though the random elements X_m, m = 1, 2, . . . , may take the value ∂, the measure N given in (7.2) is a random measure on (R_+ × X, B(R_+) ⊗ X) having support belonging to B([0, T]) ⊗ X. A computational sketch of this transformation is given next.
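The following is a minimal computational sketch of the thinning transformation just described. We stress the assumptions: the mark space is taken finite enough that γ can be sampled by a user-supplied function, the inputs ell = ℓ(Y), lam_bar = λ̄(Y), lam(t) = λ(t, Y) and sample_mark are illustrative stand-ins, and D is the inverse-transform Poisson sampler from Section 4.1.

```python
import math, random

def D(lam, u):
    """Inverse-transform Poisson sampler (see Section 4.1); D(0, u) = 0."""
    n, cdf, pmf = 0, math.exp(-lam), math.exp(-lam)
    while cdf < u:
        n += 1
        pmf *= lam / n
        cdf += pmf
    return n

def conditional_poisson_points(ell, lam_bar, lam, sample_mark, T, rng):
    """Generate the accepted points of the measure N in (7.2), with Y entering
    implicitly through ell, lam_bar, lam and sample_mark (which inverts gamma)."""
    z = rng.random() or 1e-12                      # Z, uniform on (0, 1]
    p = D(max(T - ell, 0.0) * lam_bar, z)          # P: number of candidate points
    points = []
    for _ in range(p):
        u, v, w = rng.random(), rng.random(), rng.random()
        t_m = ell + (T - ell) * u                  # T_m uniform on (ell, T]
        a_m = 1 if v * lam_bar <= lam(t_m) else 0  # thinning indicator A_m
        if a_m:                                    # rejected candidates (A_m = 0,
            points.append((t_m, sample_mark(t_m, w)))  # mark "del") are discarded
    return points

rng = random.Random(1)
pts = conditional_poisson_points(
    ell=0.0, lam_bar=2.0, lam=lambda t: 2.0 * math.exp(-t), T=3.0, rng=rng,
    sample_mark=lambda t, w: "mark1" if w < 0.5 else "mark2")
print(pts)
```

Here lam(t) = 2 e^{-t} is dominated by lam_bar = 2 on [0, T], so the acceptance probability lam(t_m)/lam_bar is well defined, mirroring the role of λ̄(Y) above.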
Given the above, we now have the following result.

Lemma 7.2. The random measure N defined by (7.2) is a G-conditionally Poisson random measure with intensity measure ν given by

ν(dt, dx) = 1_{(ℓ(Y), T]}(t) g(t, Y, dx) dt.

Proof. First we will prove that, conditionally on G, the random variable N((s, t] × B) has the Poisson distribution with mean ν((s, t] × B). Towards this end we observe that P has, conditionally on G, the Poisson distribution with mean (T − ℓ(Y)) λ̄(Y) 1_{{ℓ(Y) < T}}.