key: cord-0837337-g3yapolq authors: Zhang, Qihuang; Yi, Grace Y. title: Sensitivity Analysis of Error-Contaminated Time Series Data under Autoregressive Models with Application of COVID-19 Data date: 2020-08-13 journal: Journal of Applied Statistics DOI: 10.1080/02664763.2022.2034760 sha: 4145f564536a9e4fd050f2e3f0963dd4dd5e8da6 doc_id: 837337 cord_uid: g3yapolq Autoregressive (AR) models are useful tools in time series analysis. Inferences under such models are distorted in the presence of measurement error, which is very common in practice. In this article, we establish analytical results for quantifying the biases of the parameter estimation in AR models if the measurement error effects are neglected. We propose two measurement error models to describe different processes of data contamination. An estimating equation approach is proposed for the estimation of the model parameters with measurement error effects accounted for. We further discuss forecasting using the proposed method. Our work is inspired by COVID-19 data, which are error-contaminated due to multiple reasons including the asymptomatic cases and varying incubation periods. We implement our proposed method by conducting sensitivity analyses and forecasting of the mortality rate of COVID-19 over time for the four most populated provinces in Canada. The results suggest that incorporating or not incorporating measurement error effects yields rather different results for parameter estimation and forecasting. Time series data are common in the fields of epidemiology, economics, and engineering. Various models and methods have been developed for analyzing such data. The validity of these methods, however, hinges on the condition that time series data are precisely collected. This condition is restrictive in applications. Measurement error is often inevitable. 
In the study of air pollution, for example, it is difficult or even impossible to precisely obtain the true measurement of the daily air pollution level. Some work on time series subject to measurement error is available in the literature. Tanaka (2002) proposed a Lagrange multiplier test to assess the presence of measurement error in time series data. Staudenmayer and Buonaccorsi (2005) explored the classical measurement error model for the autoregressive process. Tripodis and Buonaccorsi (2009) studied measurement error in forecasting using the Kalman filter. Dedecker et al. (2014) considered a non-linear AR(1) model with measurement error. Despite available discussions of measurement error in time series, several limitations restrict the application scope of the existing work. Most available methods consider only autoregressive models without drift and assume the simplest additive measurement error model. Furthermore, most work involves a complex formulation to adjust for the measurement error effects, which is not straightforward for practitioners to implement. In addition, to our knowledge, there is no available work addressing measurement error effects on prediction under autoregressive models. In this article, we systematically explore the analysis of error-prone time series data under autoregressive models. We propose two types of models to delineate measurement error processes: additive regression models and multiplicative models. These modeling schemes offer us great flexibility in facilitating different applications. We investigate the impact of the naive analysis which ignores measurement error in the inferential procedures, and we obtain analytical results for characterizing the biases incurred in the naive analysis. We develop an estimating equation approach to adjust for the measurement error effects on time series analysis.
We establish asymptotic results for the proposed estimators, and develop the theoretical results for the forecasting of time series in the presence of measurement error. Finally, we describe a block bootstrap algorithm for computing standard errors of the proposed estimators. Our work is partially motivated by the data of COVID-19, a widespread disease that has become a global health challenge and has caused over ten million infections and half a million deaths as of August, 2020. Because of the special features of the disease, the COVID-19 data introduce a number of new challenges: 1) due to the asymptomatic infected cases and the patients with light symptoms who do not go to hospitals, the number of reported cases of COVID-19 is typically smaller than the true number of infected cases; 2) due to the limited test resources, many infected cases cannot be identified instantly; and 3) the varying incubation periods lead to delays in identifying infections. Consequently, the discrepancy between the reported case number and the true case number can be substantial, and ignoring these features and applying traditional time series analysis methods would no longer produce valid results. In this paper, we apply the developed methods to analyze the COVID-19 data. We are interested in studying how the mortality rate in a region may change over time and describing the trajectory of the death rate. While the mortality rate of a disease is defined as the death number divided by the case number, the determination of the mortality rate of COVID-19 is challenging. In contrast to the standard definition, Baud et al. (2020) estimated mortality rates by dividing the number of deaths on a given day by the number of patients with confirmed COVID-19 infections 14 days earlier, driven by the consideration of the maximum incubation time to be 14 days.
Due to the unique features of COVID-19, there does not seem to be a precise way to define the mortality rate of COVID-19. In this paper, we conduct sensitivity analyses to assess the severity of the pandemic by using different definitions of the mortality rate and considering different ways of modeling measurement error in the data. The remainder of the article is organized as follows. The notation and the setup for autoregressive time series models and the proposed measurement error models are introduced in Section 2. In Section 3, we present the theoretical results for characterizing the impact of measurement error on the analysis of time series data. In Section 4, we develop an estimating equation approach to adjust for the biases due to measurement error. In Section 5, we implement the proposed method to analyze the COVID-19 data in four Canadian provinces. The article is concluded with a discussion presented in Section 6.

2 Model Setup and Framework

Consider a T × 1 vector of time series, X (T ) = (X 1 , X 2 , . . . , X T ) T . We are interested in modeling the dependence of X t on its previous observations X (t−1) , and we consider it to be postulated by an autoregressive model with lag p, where p is an integer smaller than T: X t = φ 0 + φ 1 X t−1 + · · · + φ p X t−p + ε t , (1) where ε (t) = (ε 1 , . . . , ε t ) T is independent of X (t) = (X 1 , . . . , X t ) T with each ε t having zero mean and variance σ 2 , φ 0 is a constant drift, and φ = (φ 1 , . . . , φ p ) T is the regression coefficient. The additive form in (1) and the zero mean assumption of ε t show that φ 0 and φ are constrained by φ 0 = E(X t ) − φ T E(X t−1 ), (2) where X t−1 = (X t−1 , . . . , X t−p ) T . To make the process of X t stationary, φ 1 , . . . , φ p are further constrained such that all the roots of the equation in z, z p − φ 1 z p−1 − · · · − φ p = 0, have absolute values smaller than 1 (Brockwell and Davis 2002, Sec. 3.1). For example, a stationary AR(1) process requires that |φ 1 | < 1, and a stationary AR(2) process needs that (φ 1 + φ 2 ) < 1, (φ 2 − φ 1 ) < 1 and |φ 2 | < 1.
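The AR(1) and AR(2) stationarity conditions stated above are simple enough to verify directly. Below is a minimal illustrative sketch (ours, not part of the paper; function names are hypothetical):

```python
def ar1_stationary(phi1):
    # AR(1): stationary iff |phi_1| < 1
    return abs(phi1) < 1

def ar2_stationary(phi1, phi2):
    # AR(2): stationary iff phi_1 + phi_2 < 1, phi_2 - phi_1 < 1, |phi_2| < 1
    return (phi1 + phi2) < 1 and (phi2 - phi1) < 1 and abs(phi2) < 1
```

For instance, ar2_stationary(0.5, 0.3) holds, while ar2_stationary(0.9, 0.3) fails because φ 1 + φ 2 = 1.2 ≥ 1.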
Here we are interested in the estimation of parameters, φ and φ 0 . Let µ denote the mean E(X t ) of the time series, which equals φ 0 1−φ 1 −...−φp if X t is (weakly) stationary. When p = 1, the stationarity of a time series implies Var(X t ) = σ 2 1−φ 2 1 for t = 1, . . . , T . The estimation of the parameters in the AR(p) time series model (1) can be carried out by the least squares method. To see this, we first focus on estimation of φ = (φ 1 , . . . , φ p ) T . Let S(φ) = T t=p+1 {X t − (φ 0 + p j=1 φ j X t−j )} 2 be the sum of the squared difference between X t and its linearly combined history with lag p. Then applying the constraint (2) gives To minimize S(φ) with respect to φ, we solve ∂S(φ) ∂φ = 0 for φ and obtain the solution where for t = 1, . . . , T , E(X t ) can be estimated by 1 T T t=1 X t , which is denoted as µ. Next, by the constraint (2), replacing E(X t ) by µ gives an estimator of φ 0 : Re-expressing (1) as t = X t − (φ 0 + p j=1 φ j X t−j ) and by the definition of S(φ), we may estimate Var( t ) = σ 2 by with E(X t ) estimated by µ. Estimators (3)-(5) can be derived in an alternative way. First, by the stationarity of the X t , for k = 0, . . . , p and p ≤ t, Cov(X t , X t−k ) is time-independent and let γ k denote it; it is clear that γ 0 represents Var(X t ) for any t. Let Γ be the autocovariance matrix Let γ = ( γ 1 , · · · , γ p ) T with γ k = 1 T −k T t=k+1 (X t − µ)(X t−k − µ) being an estimator of γ k for k = 0, . . . , p, and let Γ be the estimator of Γ with γ k replaced by γ k for k = 0, . . . , p − 1. Next, we examine the summation terms in (3) and (5) by using the fact that as T → ∞, Then, (3)-(5) motivate an alternative method of finding estimators for φ, φ 0 , and σ 2 , by solving the estimating equations: for φ, φ 0 , and σ 2 . Let φ, φ 0 and σ 2 denote the resultant estimators of φ, φ 0 , and σ 2 , respectively. 
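For the AR(1) case, solving the estimating equations reduces to the moment relations φ 1 = γ 1 /γ 0 , φ 0 = (1 − φ 1 )µ and σ 2 = γ 0 (1 − φ 1 2 ). The following simulation sketch (ours, purely illustrative) implements these relations with sample moments:

```python
import random

def simulate_ar1(phi0, phi1, sigma, T, seed=1):
    # generate a stationary AR(1) path, started at the stationary mean
    rng = random.Random(seed)
    x = [phi0 / (1 - phi1)]
    for _ in range(T - 1):
        x.append(phi0 + phi1 * x[-1] + rng.gauss(0.0, sigma))
    return x

def fit_ar1(x):
    # moment-based AR(1) fit: phi1 = g1/g0, phi0 = (1 - phi1)*mu,
    # sigma^2 = g0*(1 - phi1^2)
    T = len(x)
    mu = sum(x) / T
    g0 = sum((v - mu) ** 2 for v in x) / T
    g1 = sum((x[t] - mu) * (x[t - 1] - mu) for t in range(1, T)) / (T - 1)
    phi1 = g1 / g0
    return (1 - phi1) * mu, phi1, g0 * (1 - phi1 ** 2)

phi0_hat, phi1_hat, sigma2_hat = fit_ar1(simulate_ar1(1.0, 0.6, 1.0, 20000))
```

With the true values (φ 0 , φ 1 , σ 2 ) = (1, 0.6, 1), the recovered estimates are close to the truth for a long series.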
These estimators are asymptotically equivalent to the least squares estimators as T → ∞, and hence, they are consistent (Box et al. 2015, Ch.7, A.7.4). Estimating equations (6) offer a unified estimation framework through their connections with not only least squares estimation but also the maximum likelihood method under the assumption of Gaussian errors as well as the Yule-Walker method. Similar to the least squares method, finding estimators using one of those approaches is asymptotically equivalent to solving (6) for φ, φ 0 and σ 2 (Box et al. 2015, Ch.7, A.7.4). Suppose that for t = 1, . . . , T , the observation of X t is subject to measurement error and the precise measurement of X t may not be observed, but its surrogate measurement X * t is available. We consider two measurement error models. The first measurement error model takes an additive form: for t = 1, . . . , T , X * t = α 0 + α 1 X t + e t , (7) where the error term e t is independent of X t with mean 0 and time-independent variance σ 2 e and is assumed to be independent across t = 1, . . . , T , and α = (α 0 , α 1 ) T is the parameter vector. Here, α 0 represents the systematic error and α 1 represents the constant inflation (or shrinkage) due to the measurement error. For instance, if α 0 = 0, then setting α 1 < 1 (or α 1 > 1) features the scenario where X * t tends to be smaller (or larger) than X t if the noise term is ignored. This model generalizes the classical additive model considered by Staudenmayer and Buonaccorsi (2005) who considered the case with α 0 = 0 and α 1 = 1. By the stationarity of the X t , we note that model (7) yields E(X * t ) = α 0 + α 1 µ and Var(X * t ) = α 2 1 Var(X t ) + σ 2 e , (8) so the variability of the X * t can be greater or smaller than that of the X t , depending on the value of α 1 . The second measurement error model assumes a multiplicative form: for t = 1, . . .
, T , X * t = β 0 X t u t , (9) where β 0 is a positive scaling parameter, and the u t are the error terms which are independent of each other as well as of the X t , and have mean one and time-independent variance σ 2 u . Depending on the distribution of the error term u t , (9) can feature different types of discrepancy between X t and X * t . The stationarity of the X t together with model (9) implies E(X * t ) = β 0 µ and Var(X * t ) = β 2 0 {(1 + σ 2 u )Var(X t ) + σ 2 u µ 2 }, (10) where we use the independence of X t and u t . Since E(X * t ) is time-independent for both (7) and (9), in the following discussion, we let µ * denote E(X * t ) for t = 1, . . . , T . The modeling of the measurement error process by (7) or (9) introduces extra parameters {α 0 , α 1 , σ 2 e } or {β 0 , σ 2 u }, where the variance of the error term is bounded by the variability of X * t together with other quantities. Clearly, (8) shows that σ 2 e < Var(X * t ) and (10) implies that σ 2 u < Var(X * t )/(β 2 0 µ 2 ). Estimating equations (6) are useful when measurements of X t are available. However, due to the measurement error, X t is not observed, so (6) cannot be directly used for estimating the parameters of model (1). As the surrogate X * t for X t is available, one may attempt to employ the naive analysis of model (1) with X t replaced by X * t . Here we study the impact of measurement error on the naive analysis disregarding the difference between X t and X * t . We start with the AR(1) model, i.e., model (1) with p = 1. If we naively replace X t in (1) by X * t , then the time series model (1) becomes X * t = φ * 0 + φ * 1 X * t−1 + ε * t , where (φ * 0 , φ * 1 ) T and ε * t denote possible differences from the corresponding quantities in model (1). To estimate φ * 0 and φ * 1 , we may employ the ordinary least squares (OLS) method. Theorem 1 Assume the stationarity of the time series. If the measurement error process satisfies (7), then φ * 1 = ω 1 φ 1 for an attenuation factor 0 < ω 1 < 1, and φ * 0 differs from φ 0 in general. The proof of the theorem is included in Supplementary Appendix A.2.
This theorem essentially implies that the naive estimator under the additive form in (7) is inconsistent because φ * 1 ≠ φ 1 and φ * 0 ≠ φ 0 . The naive estimator φ * 1 attenuates, and the attenuation factor ω 1 depends on the parameters α 1 and σ 2 e of the measurement error model (7) as well as φ 1 and σ 2 in the time series model (1). The coefficient α 1 in the measurement error model (7) affects both naive estimators φ * 1 and φ * 0 , while the intercept α 0 influences the estimation of φ * 0 only, but not φ * 1 or Var(ε * ). Theorem 2 If the time series is stationary and the measurement error process satisfies (9), then φ * 1 = ω 2 φ 1 for an attenuation factor 0 < ω 2 < 1. The proof of the theorem is included in Supplementary Appendix A.3. This theorem shows the attenuation effect of the measurement error on the estimation of φ 1 . The constant scaling parameter β 0 in the measurement error model (9) does not influence the estimation of φ 1 but affects the estimation of φ 0 and σ 2 . The attenuation factor ω 2 is determined by the magnitude σ 2 u of the measurement error as well as the values of φ 0 , φ 1 , and σ 2 of the time series model (1). We now extend the discussion in Section 3.2 to the AR(p) model with p ≥ 2. Replacing X t with X * t in (1) gives the working model X * t = φ * 0 + φ * 1 X * t−1 + · · · + φ * p X * t−p + ε * t , (13) where φ * = (φ * 1 , . . . , φ * p ) T and ε * t may differ from the corresponding symbols in (1). Mimicking the procedure of using (6) with X t replaced by X * t to estimate φ * , φ * 0 and σ * 2 in (13), we let φ * = ( φ * 1 , . . . , φ * p ) T , φ * 0 and σ * 2 denote the resultant estimators. Similar to the γ k and µ, we define µ * = (1/T ) Σ T t=1 X * t and the γ * k as the sample autocovariances of the X * t . We now discuss the asymptotic results of the naive estimators under different measurement error models. Theorem 3 Let 1 p be the p × 1 unit vector and let I p be the p × p identity matrix. Define γ * = α 2 1 γ, γ * 0 = α 2 1 γ 0 + σ 2 e , φ * = α 2 1 (α 2 1 Γ + σ 2 e I p ) −1 γ, φ * 0 = (1 − φ * · 1 p )(α 0 + α 1 µ) and σ 2 * = α 2 1 γ 0 + σ 2 e − α 4 1 γ T (α 2 1 Γ + σ 2 e I p ) −1 γ.
Under regularity conditions, if the time series is stationary and the measurement error process satisfies (7), then the naive estimators converge in probability to the quantities defined above, with asymptotic covariance matrix Q 1 . The elements of Q 1 are given, for p ≥ 1, in terms of q jk , the (j, k) element of the asymptotic covariance matrix of ( γ 0 , γ T ) T (Brockwell et al. 1991, Sec. 7.3), for (j, k) = (0, 0), (0, p), (p, p) and (p, r) with r ≠ 0 and r ≠ p, with η = E(ε 4 t )/σ 4 . The proof of Theorem 3 is presented in Supplementary Appendix A.4. Similar to the results in Theorem 1, the intercept α 0 only influences φ 0 and does not influence φ. Theorem 4 Under regularity conditions, if the time series is stationary and the measurement error process satisfies (9), then the naive estimators converge in probability with asymptotic covariance matrix Q 2 as T → ∞. The elements of Q 2 are given in terms of the q jk in (14), for (j, k) = (0, 0), (0, p), (p, p) and (p, r) with r ≠ 0 and r ≠ p, and a limiting quantity v p . The proof of the theorem is presented in Supplementary Appendix A.5. The multiplicative measurement error u t contributes to the bias of the parameter estimation for φ, while the scaling parameter β 0 has no effect on the naive estimator φ * . In the presence of measurement error, measurements of the X t are not always available but surrogate measurements X * t are available. It may be tempting to conduct a naive analysis by implementing (6) with the X t replaced by the X * t , or equivalently with µ and the γ k replaced by µ * and the γ * k , respectively, to find estimators of φ, φ 0 and σ 2 . However, by Theorems 3-4, such a procedure typically yields biased estimators. In this section, we develop new estimators accounting for the measurement error effects described by either the additive model (7) or the multiplicative model (9).
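Before turning to the corrections, the attenuation effect can be illustrated numerically. The sketch below (ours, not the paper's code) simulates an AR(1) series with classical additive error (α 0 = 0, α 1 = 1), for which Theorem 1 gives the naive limit φ 1 γ 0 /(γ 0 + σ 2 e ):

```python
import random

def autocov(x, k):
    # sample lag-k autocovariance
    T = len(x)
    mu = sum(x) / T
    return sum((x[t] - mu) * (x[t - k] - mu) for t in range(k, T)) / (T - k)

rng = random.Random(11)
phi1, sigma, sigma_e, T = 0.6, 1.0, 1.0, 20000
x = [0.0]
for _ in range(T - 1):
    x.append(phi1 * x[-1] + rng.gauss(0.0, sigma))
x_star = [v + rng.gauss(0.0, sigma_e) for v in x]  # model (7) with alpha0=0, alpha1=1

naive_phi1 = autocov(x_star, 1) / autocov(x_star, 0)  # naive AR(1) estimate
gamma0 = sigma ** 2 / (1 - phi1 ** 2)
limit = phi1 * gamma0 / (gamma0 + sigma_e ** 2)       # attenuated limit
```

The naive estimate lands near the attenuated limit (about 0.37 here) rather than the true φ 1 = 0.6.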
Our idea is still to employ (6) to find consistent estimators of φ, φ 0 and σ 2 , but instead of replacing µ and the γ k with µ * and the γ * k as in the naive analysis, we replace µ and the γ k in (6) with new functions of the X * t , denoted as µ and the γ k , which adjust for the measurement error effects. Specifically, if we can find µ and the γ k such that they resemble µ and the γ k in the sense that as T → ∞, µ and µ have the same limit in probability, and γ k and γ k have the same limit in probability for k = 0, . . . , p, then substituting µ and the γ k with µ and the γ k in (6) yields consistent estimators of φ, φ 0 and σ 2 . With the availability of the γ k satisfying (15), let Γ denote Γ with the γ k replaced by the γ k . Then provided regularity conditions, consistent estimators of φ, φ 0 and σ 2 can be obtained by solving the estimating equations for φ, φ 0 , and σ 2 : It is immediate to obtain the following result. Theorem 5 Assume regularity conditions hold and the time series are stationary. If µ and the γ k are functions of the X * t with t = 1, . . . , T and they satisfy (15), and let φ, φ 0 , and σ 2 denote the estimators for φ, φ 0 and σ 2 , respectively, obtained by solving (16). Then, as where G is the matrix of derivatives of φ with respect to the components of ( γ * 0 , γ * T ) T . Here Q = Q 1 , the matrix in Theorem 3, if measurement error follows the model (7); and Q = Q 2 , the matrix in Theorem 4, if measurement error follows the model (9). Now we discuss explicitly how to determine µ and the γ k under the measurement error model (7) or (9). With (7), take for k = 1, . . . , p. By the results in Theorem 3(1) and Theorem 4(1), it can be easily verified that these µ and the γ k satisfy (15). We conclude this section with a procedure of estimating the asymptotic covariance matrix for the estimator φ. While Theorem 5 presents the sandwich form of the asymptotic covariance matrix of φ, its evaluation involves lengthy calculations. 
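For the AR(1) case under the additive model (7) with known (α 0 , α 1 , σ 2 e ), the corrected moments invert the limits in Theorem 3(1). A minimal sketch (ours; names hypothetical):

```python
import random

def autocov(x, k):
    T = len(x)
    mu = sum(x) / T
    return sum((x[t] - mu) * (x[t - k] - mu) for t in range(k, T)) / (T - k)

def corrected_ar1_fit(x_star, alpha0, alpha1, sigma_e2):
    # invert the moment relations implied by model (7):
    #   mu* = alpha0 + alpha1*mu, gamma*_0 = alpha1^2*gamma_0 + sigma_e^2,
    #   gamma*_1 = alpha1^2*gamma_1
    T = len(x_star)
    mu = (sum(x_star) / T - alpha0) / alpha1
    g0 = (autocov(x_star, 0) - sigma_e2) / alpha1 ** 2
    g1 = autocov(x_star, 1) / alpha1 ** 2
    phi1 = g1 / g0
    return (1 - phi1) * mu, phi1, g0 * (1 - phi1 ** 2)

rng = random.Random(5)
x = [2.5]  # stationary mean of an AR(1) with phi0=1, phi1=0.6
for _ in range(20000 - 1):
    x.append(1.0 + 0.6 * x[-1] + rng.gauss(0.0, 1.0))
x_star = [0.2 + 1.5 * v + rng.gauss(0.0, 0.8) for v in x]  # model (7)

phi0_hat, phi1_hat, sigma2_hat = corrected_ar1_fit(x_star, 0.2, 1.5, 0.8 ** 2)
```

Unlike the naive fit, the corrected estimates recover (φ 0 , φ 1 , σ 2 ) ≈ (1, 0.6, 1) despite the contamination.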
We may alternatively employ the block bootstrap algorithm (Lahiri 1999) to obtain variance estimates for φ using the following steps. Firstly, we set a positive integer, say N , as the number of bootstrap samples; N can be set as a large number such as 1000. Next, we iterate through the following five steps:

Step 1: At iteration n ∈ {1, . . . , N }, we initialize a null time series X (n,0) of dimension 0 and specify a block length, say b, which is an integer between 0 and T . Initialize m = 1.

Step 2: Sample an index, say i, from {0, . . . , T − b}, and then define X (m) add to be the block of b consecutive observations starting from index i + 1.

Step 3: Update the previous time series X (n,m−1) by appending X (m) add to it, and let X (n,m) denote the new time series; increase m by 1.

Step 4: If the dimension of X (n,m) is smaller than T , then return to Steps 2 and 3; otherwise drop the elements in the time series with index greater than T to ensure the dimension of X (n,m) is identical to T , and then go to Step 5.

Step 5: Obtain an estimate φ (n) of parameter φ by applying the time series X (n,m) to (16). If n < N , then set n to be n + 1 and go back to Step 1 to repeat; otherwise stop.

Let φ̄ = (1/N ) Σ N n=1 φ (n) be the sample mean of the bootstrap estimates. The bootstrap variance of φ is then given by (1/(N − 1)) Σ N n=1 ( φ (n) − φ̄)( φ (n) − φ̄) T . Under model (1), the h-step forecasting of X T +h is based on its history of lag p, {X T +h−1 , . . . , X T +h−p }, by using the conditional expectation E(X T +h |x T +h−1 , . . . , x T +h−p ), denoted X T +h , where for j = T + h − 1, . . . , T + h − p, x j is the observed value of X j if j ≤ T , and x j is the predicted value X j if j > T . This prediction minimizes the squared prediction error E( X T +h − X T +h ) 2 (e.g., Box et al. 2015, p.131). If no measurement error is involved, due to the zero mean of the random error term ε t in the AR(p) model (1), for h = 1, . . . , H, the conditional expectation can be calculated by X T +h = φ 0 + Σ p j=1 φ j x T +h−j . (17) When measurement error appears, the observed values x j for j = T, . . . , T − p + 1 in (17) are no longer available, but their surrogates X * j are available.
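The block bootstrap steps above can be sketched as follows (an illustrative pure-Python version, ours; shown for the lag-1 coefficient of an AR(1) fit rather than the full estimating equations (16)):

```python
import random

def fit_phi1(x):
    # lag-1 autocorrelation, i.e. the AR(1) coefficient estimate
    T = len(x)
    mu = sum(x) / T
    g0 = sum((v - mu) ** 2 for v in x) / T
    g1 = sum((x[t] - mu) * (x[t - 1] - mu) for t in range(1, T)) / (T - 1)
    return g1 / g0

def block_bootstrap_se(x, b, N, seed=9):
    # Steps 1-5: glue randomly chosen blocks of length b until the resampled
    # series reaches length T, trim, re-estimate, and repeat N times
    rng = random.Random(seed)
    T = len(x)
    estimates = []
    for _ in range(N):
        resampled = []
        while len(resampled) < T:
            i = rng.randrange(T - b + 1)  # sample a block start from {0, ..., T-b}
            resampled.extend(x[i:i + b])
        estimates.append(fit_phi1(resampled[:T]))
    mean = sum(estimates) / N
    return (sum((e - mean) ** 2 for e in estimates) / (N - 1)) ** 0.5

rng = random.Random(3)
x = [0.0]
for _ in range(2000 - 1):
    x.append(0.6 * x[-1] + rng.gauss(0.0, 1.0))
se = block_bootstrap_se(x, b=20, N=200)
```

Keeping blocks intact (rather than resampling single observations) preserves the serial dependence within each block, which is why this scheme is appropriate for time series.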
We now provide a sensible estimate of X j by using the measurement error model for characterizing the relationship between X j and X * j . If the measurement error follows (7), we "estimate" X j by X j = (X * j − α 0 )/α 1 ; (18) if the measurement error follows (9), then X j is "estimated" by X j = X * j /β 0 . (19) These "estimates" are unbiased in the sense that E( X j ) = X j for j = T, . . . , T − p + 1. Consequently, for h = 1, . . . , H, X T +h is predicted as in (17) with the x j replaced by the X j for j ≤ T . (20) In contrast to the observed values {x T , . . . , x T −p+1 }, also referred to as the initial values of the forecasting of X T +1 , . . . , X T +H , the estimates determined by (18) or (19) introduce additional prediction error which should be characterized. Without loss of generality, we consider p = 1 to illustrate the recursive calculation of the prediction error; the prediction error with higher orders of the autoregressive process can be derived recursively in a similar way but with more complex expressions. If the measurement error follows (7), the mean squared prediction error of the 1-step prediction is given by P (1) e = φ 2 1 σ 2 e /α 2 1 + σ 2 , where the last step is due to the independence between e t and ε t+1 , as well as E(e 2 t ) = σ 2 e and E(ε 2 t ) = σ 2 . Then, the h-step prediction error is given by P (h) e = φ 2h 1 σ 2 e /α 2 1 + Σ h−1 i=0 φ 2i 1 σ 2 , (21) where the last step comes from the recursive evaluation of P (h−1) e . Similarly, if the measurement error follows (9), the mean squared prediction error is derived in the same manner, where we use the independence of ε t+1 , u t and X t , E(u t ) = 1, and Var(X t ) = σ 2 /(1 − φ 2 1 ) due to the stationary AR(1) process. Hence, P (h) e = φ 2h 1 {σ 2 /(1 − φ 2 1 ) + µ 2 }σ 2 u + Σ h−1 i=0 φ 2i 1 σ 2 . (22) The evaluation of the mean squared prediction error P (h) e is carried out by replacing the parameters with their estimators. We comment that the common second term in (21) and (22), Σ h−1 i=0 φ 2i 1 σ 2 , is the mean squared prediction error for the AR(1) model in error-free settings (e.g., Box et al. 2015, p.152), which equals σ 2 (1 − φ 2h 1 )/(1 − φ 2 1 ). For an α with 0 < α < 1, the h-step (1 − α)-prediction interval is constructed as X T +h ± q α/2 (P (h) e ) 1/2 , where q α/2 is the α/2-level quantile of the distribution of X T +h − X T +h .
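For p = 1 under the additive model (7), the forecasting recursion and the accompanying prediction-error recursion can be sketched as follows (our illustrative code; the initial-value variance term assumes X T is estimated by (18), so its error is e T /α 1 ):

```python
def forecast_ar1(x_star_T, phi0, phi1, alpha0, alpha1, sigma2, sigma_e2, H):
    # adjust the initial value via (18): X~_T = (X*_T - alpha0)/alpha1
    x_prev = (x_star_T - alpha0) / alpha1
    init_var = sigma_e2 / alpha1 ** 2  # variance of the initial-value error e_T/alpha1
    forecasts, mses = [], []
    for h in range(1, H + 1):
        x_prev = phi0 + phi1 * x_prev          # h-step conditional mean, as in (17)
        forecasts.append(x_prev)
        # propagated initial-value error plus the error-free AR(1) accumulation
        mses.append(phi1 ** (2 * h) * init_var
                    + sum(phi1 ** (2 * i) * sigma2 for i in range(h)))
    return forecasts, mses

forecasts, mses = forecast_ar1(2.0, 1.0, 0.5, 0.0, 1.0, 1.0, 0.25, 3)
```

As expected, the mean squared prediction error grows with the horizon h, while the contribution of the noisy initial value is damped by the factor φ 1 2h .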
In practice, under the normality assumption of ε t and e t , one can take q α/2 to be the α/2-level quantile of the standard normal distribution (Brockwell and Davis 2002, p.108). We apply the proposed methods to British Columbia, Ontario, Quebec, and Alberta, the four provinces in Canada which experience severe situations. The daily confirmed cases and fatalities are taken from "1Point3Acres.com" (https://coronavirus.1point3acres.com/). In epidemiology, the mortality rate, defined as the proportion of cumulative deaths of the disease in the total number of people diagnosed with the disease (Kanchan et al. 2015), is often used to measure the severity of an infectious disease. For COVID-19, determining the mortality rate is not trivial due to the difficulty in precisely determining the number of infected cases. While the first two ways may help more reasonably estimate mortality rates than the third definition, these calculated rates still differ from the true mortality rates because of under-reported cases, which are primarily due to limited test capacity and undetected asymptomatic infections. To reflect the discrepancy between the reported and the true mortality rates for each province, for each definition of the mortality rate, we let X 1,t , X 2,t , X 3,t , and X 4,t represent the true mortality rate on day t for British Columbia, Ontario, Quebec and Alberta, respectively; and let X * 1,t , X * 2,t , X * 3,t and X * 4,t denote the reported mortality rate on day t in British Columbia, Ontario, Quebec and Alberta, respectively. The objective is to use the reported mortality rates {X * i,t : t = 1, . . . , 31} to infer the true mortality rates X i,t , which are modeled by (1) separately for i = 1, . . . , 4. In addition, we want to forecast the true mortality rate of COVID-19 for a future time period. Due to the undetected asymptomatic cases and untested cases with light symptoms, the reported mortality rates X * i,t are typically overestimated (i.e., X * i,t ≥ X i,t ) for i = 1, . . . , 4.
As there is no exact information to guide us on how to characterize the relationship between X * i,t and X i,t , here we conduct sensitivity studies by considering measurement error model (7) or (9). We use the observed data X * i,t from April 3, 2020 to May 4, 2020, i.e., {X * i,t : t = 1, . . . , T i } with T 1 = T 2 = 31, to estimate the model parameters in (1) with measurement error effects accounted for, and then forecast the mortality rate of COVID-19, from May 5, 2020 to May 9, 2020, in British Columbia, Ontario, Quebec and Alberta, Canada. Figure 1 displays the trajectories of the mortality rates of COVID-19 in the four provinces that are obtained from the three definitions. To assess the stationarity of the X * i,t , we conduct the augmented Dickey-Fuller (ADF) tests (Cheung and Lai 1995). Supplementary Table 4 presents the test statistics and p-values of the ADF test for each time series, where "TSV" represents a test statistic value. [ Place Figure 1 About Here ] To determine the lag value p for the autoregressive model (1) used for the time series {X i,t : t = 1, . . . , T i } with T 1 = T 2 = 31 for i = 1, . . . , 4, we fit the naive model (13) with ε * t assumed to follow a normal distribution N (0, σ * 2 ), and use the AIC criterion. The results are summarized in Supplementary Table 5, where no differencing or 1-order differencing is applied; the entries with "-" indicate that the corresponding model is not applicable due to the ADF test results. We take those lag values for an AR(p) model to feature the true mortality rate X i,t for each definition and i = 1, . . . , 4. To be specific, for the British Columbia data, with Definition 1 we consider two models: an AR(1) model for the time series with 1-order differencing and an AR(2) model for the time series with no differencing; with Definitions 2 and 3, we consider AR(2) and AR(1) models, respectively, for the time series with 1-order differencing.
For the Ontario data, we consider AR(1) and AR(4) models for the time series with 1-order differencing in Definitions 1 and 3, respectively, and an AR(2) model for Definition 2 with no transformation. For the Quebec data, we consider AR(1) and AR(2) models for the time series with 1-order differencing in Definitions 1 and 2, respectively. For the Alberta data, we consider an AR(1) model for the time series with 1-order differencing for both Definitions 1 and 2. As there are no additional data available for estimating the parameters of model (7) or (9), we conduct sensitivity analyses using the findings in the literature. Different studies showed different estimates of the asymptomatic infection rate, ranging from 17.9% to 78.3% (Kimball 2020; Day 2020). To accommodate the heterogeneity of different studies, He et al. (2020) carried out a meta-analysis and obtained an estimate of the asymptomatic infection rate of 46%. If under-reported confirmed cases are only caused by undetected asymptomatic cases, then X t = (1 − τ A )X * t , or equivalently, X * t = X t /(1 − τ A ), (24) where τ A represents the rate of asymptomatic infections. Now we use (24) as a starting point to conduct sensitivity analyses. In the multiplicative model (9), motivated by (24), we set β 0 = 1/(1 − 46%), with 46% being the value from the meta-analysis of He et al. (2020). To see different degrees of error, we consider σ 2 u to take a small value, say σ 2 u1 , and a large value, say σ 2 u2 , which is alternatively reflected by the change of the coefficient of variation, CV = σ u /E(u t ), of the error term u t from σ u1 × 100% to σ u2 × 100%. When using the additive model (7) to characterize the measurement error process, motivated by (24), we set α 0 = 0 and α 1 = 1/(1 − 46%), and let σ 2 e take a small value, say σ 2 e1 , and a large value, say σ 2 e2 , to feature an increasing degree of measurement error.
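The adjustment (24) is straightforward to apply; a small sketch (ours, for illustration), with τ A = 46% taken from the meta-analysis of He et al. (2020):

```python
TAU_A = 0.46  # asymptomatic infection rate from the meta-analysis of He et al. (2020)

def adjust_reported_rates(reported, tau_a=TAU_A):
    # equation (24): X_t = (1 - tau_A) * X*_t
    return [(1 - tau_a) * r for r in reported]

# implied scaling alpha_1 = 1/(1 - tau_A) used in the sensitivity analyses
alpha1 = 1 / (1 - TAU_A)

adjusted = adjust_reported_rates([0.10, 0.05])
```

For example, a reported rate of 10% is deflated to 5.4% under this assumption, and the implied scaling is roughly 1.85.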
Due to the constraints on the parameters discussed for (8) and (10), we set the values for σ u1 , σ u2 , σ e1 , and σ e2 case by case for each definition and for each province, which are recorded in Supplementary Table 6. The model fitting results are reported in Tables 1-2 and Supplementary Table 7 for the three definitions of mortality rates, where the point estimates (EST), the associated standard errors (SE), and the p-values for the model parameters are included. Table 1 shows that with Definition 1, the estimates of φ 0 in absolute value from the proposed method are smaller than those of the naive method, while the estimates of φ 1 produced from the proposed and naive methods exhibit an opposite pattern. As expected, the standard errors for the proposed method are generally larger than those of the naive method. However, both methods find no evidence to support that φ 0 and φ 1 are different from zero for the data of British Columbia and Ontario, suggesting that the mortality rates of these two provinces remain statistically unchanged. At the significance level 0.1, the naive method and the proposed method show different evidence for the data of Quebec and Alberta. The naive method suggests a likely downward trend, with p-values 0.071 and 0.061 for testing φ 0 for Quebec and Alberta, respectively. The proposed method, on the other hand, shows that φ 0 is insignificant for these two provinces. Table 2 displays the results for Definition 2. For the British Columbia data, the estimates of the three parameters φ 1 , φ 2 and φ 3 produced from the proposed method are smaller than those yielded from the naive method, whereas the standard errors output from the proposed method are larger than those from the naive method. However, at the significance level 0.05, both methods find no evidence to show the significance of φ 0 , φ 1 and φ 2 , suggesting that the mortality rate of British Columbia remains unchanged with time.
Similar findings are revealed for the Alberta data, except that the parameter estimates output from the proposed method are larger than those produced from the naive method. For the Ontario and Quebec data, the findings from the two methods are quite different. For Ontario, both methods show that φ 0 is insignificant and φ 1 is significant. The evidence for φ 2 , however, depends on the nature of the measurement error. In contrast, the findings for Quebec do not tend to show a definite direction, and they vary with the model form or degree of the measurement error process. With the fitted model for each time series in Section 5.3, we forecast the true mortality rate for the subsequent five days (May 5 - May 9) using the method described in Section 4.2. Specifically, since the true mortality rates are not observable, we "estimate" them using (18) and (19), respectively, for the measurement error models (7) and (9), and then we forecast the values of X i,32 , X i,33 , X i,34 , X i,35 , and X i,36 using (20). To quantify the forecasting performance, we calculate the prediction errors reported in Supplementary Tables 8-10, where H is set as 5. For h = 1, . . . , H, we report the observed prediction error (X T +h − X T +h ) 2 and the expected prediction error defined in (21) and (22). The forecasts are displayed in different colors for the measurement error models (7) and (9), respectively, together with prediction areas marked as shaded parts, as well as the prediction results obtained from the naive method by using (20) with naive estimates of φ (marked in dark yellow). In comparison, we display the reported mortality rate (in black) from April 3, 2020 to May 9, 2020, as well as the adjusted mortality rates obtained from (24). The results for British Columbia are presented in Figure 2 and Supplementary Figures 4-6.
With Definition 1, the methods with measurement error effects accommodated suggest that the mortality rate in the past and its forecast values are around 4%, whereas the results obtained from the method without accounting for measurement error effects indicate that the mortality rates over time are higher than 6%. With Definition 2, the methods with or without accounting for measurement error effects reveal that the mortality rates over time are, respectively, below 3.5% and above 5%. With Definition 3, the methods with or without accounting for measurement error effects indicate that the mortality rates over time are around 3% and above 4%, respectively.

[ Place Figure 3 About Here ]

The results for Ontario are presented in Figure 3 and Supplementary Figures 7-8. With Definition 1, the methods with measurement error effects accommodated suggest that the mortality rate is around 7% over time, while the reported mortality rate over time is about 12.5%. With Definition 2, the methods with and without incorporating the feature of measurement error indicate that the mortality rate in the past and its forecast values are, respectively, below 6% and around 10%. With Definition 3, the mortality rate increases substantially over time. The methods with measurement error effects accommodated suggest that the mortality rate increases from 2% to above 4%, whereas the reported mortality rate shows that the rate increases from below 4% to above 8%.

The results for Quebec are presented in Supplementary Figures 9-10. With Definition 1, the methods with measurement error effects accommodated show that the mortality rate is around 6.5% over time, whereas the method without considering measurement error indicates that the mortality rate is over 10%. With Definition 2, the methods with or without addressing the measurement error effects show that the mortality rates over time are, respectively, below 6% and above 7.5%.
The results for Alberta are presented in Supplementary Figures 11-12. With Definition 1, the methods with and without measurement error accommodated suggest that the mortality rates are, respectively, around 2% and 4% over time. With Definition 2, the methods with or without addressing the measurement error effects show that the historical mortality rate and its predictions are, respectively, below 2% and above 2%.

The specification of the lag p for model (1) of the true mortality rates {X_{i,t} : t = 1, ..., T} is based on (23), which is derived from the reported mortality rates {X*_{i,t} : t = 1, ..., T} rather than from {X_{i,t} : t = 1, ..., T} itself. This discrepancy introduces the possibility of model misspecification when featuring the series X_{i,t} using (1). To investigate this, we conduct a sensitivity analysis by considering an AR(p) model with a different value of p for the X_{i,t} from Definition 1. As Table 5 indicates the feasibility of using AR(1) for all four provinces, here we further employ the AR(2) model to forecast for the period from May 5 to May 9. In Table 3, we report the observed and expected prediction errors of the forecasts using AR(2) models in comparison with AR(1) models. Comparing the different lag orders of the autoregressive models, we find that in terms of the observed prediction error, the selected AR(1) models perform better than the AR(2) models for the data of Ontario and Alberta, and the results for British Columbia and Quebec are fairly similar. We also notice that both the observed prediction error and the expected prediction error associated with the proposed method tend to become smaller as the degree of measurement error increases for British Columbia, Ontario, and Quebec.

[ Place Table 3 About Here ]

In this article, we investigate the impact of measurement error on time series analysis under autoregressive models and establish analytic results under the additive and multiplicative measurement error models.
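The lag-order comparison described above can be mimicked on synthetic data: fit AR(1) and AR(2) by Yule-Walker, forecast a held-out tail of five points, and compare observed prediction errors. This is a self-contained sketch on a simulated series, not the paper's estimating-equation procedure, and the function names are ours.

```python
# Compare AR(1) and AR(2) forecasts by observed prediction error
# on a synthetic AR(1) series (standing in for a mortality-rate series).
import random

random.seed(1)
phi0, phi1 = 0.5, 0.8
x = [phi0 / (1 - phi1)]                  # start at the stationary mean
for _ in range(400):
    x.append(phi0 + phi1 * x[-1] + random.gauss(0, 0.1))
train, test = x[:-5], x[-5:]

def acov(series, k):
    """Sample autocovariance at lag k."""
    n = len(series)
    m = sum(series) / n
    return sum((series[t] - m) * (series[t - k] - m) for t in range(k, n)) / n

def fit_ar(series, p):
    """Yule-Walker estimates (intercept, [phi_1..phi_p]) for p in {1, 2}."""
    m = sum(series) / len(series)
    g = [acov(series, k) for k in range(p + 1)]
    if p == 1:
        phis = [g[1] / g[0]]
    else:                                # AR(2), solved in closed form
        det = g[0] ** 2 - g[1] ** 2
        phis = [g[1] * (g[0] - g[2]) / det,
                (g[0] * g[2] - g[1] ** 2) / det]
    return m * (1 - sum(phis)), phis

def forecast(series, c, phis, horizon):
    """Iterate the fitted AR recursion forward for `horizon` steps."""
    hist = list(series)
    out = []
    for _ in range(horizon):
        nxt = c + sum(ph * hist[-1 - i] for i, ph in enumerate(phis))
        hist.append(nxt)
        out.append(nxt)
    return out

err = {}
for p in (1, 2):
    c, phis = fit_ar(train, p)
    preds = forecast(train, c, phis, 5)
    err[p] = sum((a - b) ** 2 for a, b in zip(test, preds))
```

The dictionary `err` then holds the total observed prediction error for each candidate lag order, mirroring the comparison reported in Table 3.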
We propose an estimating equation method to correct for the biases induced by the naive analysis, which disregards the differences between the true measurements and their surrogate measurements. We rigorously establish the theoretical results for the proposed method. As a genuine application, we apply the proposed method to analyze the mortality rates of COVID-19 in four provinces, British Columbia, Ontario, Quebec, and Alberta, which have the most severe virus outbreaks in Canada. The real data analysis clearly demonstrates that incorporating measurement error in the analysis can uncover considerably different results. Our method offers flexibility and robustness in that no distributional assumptions are required to describe the measurement error process or the time series autoregressive process.

While our research is motivated by the error-contaminated nature of COVID-19 data, the proposed method can be applied to handle other problems involving error-contaminated time series. Our development here is directed at using autoregressive models to delineate time series data. The same principles can be applied to other model forms, such as moving average models or autoregressive moving average models, which may be used to handle error-prone time series data, although the technical details can be more notationally involved. When checking the stationarity of a time series, we apply the ADF test to the observed time series X*_t; this is mainly driven by the unavailability of the true values of X_t, as well as the fact that the weak stationarity of the observed time series implies the weak stationarity of the true time series if measurement error is featured by (7) or (9). It would be interesting to rigorously develop a formal test, similar to the ADF test, to handle time series subject to measurement error.
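For intuition, the regression underlying the Dickey-Fuller test can be coded directly. The simplified sketch below computes only the t-statistic of X*_{t−1} in the regression of ΔX*_t on an intercept and X*_{t−1}; a full ADF test additionally includes lagged differences and compares the statistic with Dickey-Fuller (not Gaussian) critical values. The series and function name are illustrative, not from the paper.

```python
# Non-augmented Dickey-Fuller statistic: t-statistic of b in the OLS fit of
#   dX*_t = a + b * X*_{t-1} + error.
# Strongly negative values speak against a unit root (i.e. for stationarity).
import math
import random

def df_statistic(series):
    y = [series[t] - series[t - 1] for t in range(1, len(series))]  # dX*_t
    x = [series[t - 1] for t in range(1, len(series))]              # X*_{t-1}
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)                       # error variance
    return b / math.sqrt(s2 / sxx)                                  # t-statistic of b

# Illustrative observed series: a stationary AR(1) with phi1 = 0.6.
random.seed(7)
xstar = [0.0]
for _ in range(300):
    xstar.append(0.5 + 0.6 * xstar[-1] + random.gauss(0, 1))
stat = df_statistic(xstar)
```

For a stationary AR(1) the slope b estimates φ_1 − 1 < 0, so the statistic comes out strongly negative, which is the behaviour the ADF test exploits.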
Table 2: Definition 2: the parameter estimation under different measurement error models. The AR(2) model with "no differencing" is used to fit the data of Ontario, the AR(1) model with "order-1 differencing" is used to fit the data of Alberta, and the AR(2) model with "order-1 differencing" is used to fit the data of British Columbia and Quebec.

Figure caption (fragment): forecasts based on the additive (in blue) or multiplicative (in red) measurement error model versus the naive model (in dark yellow); the reported mortality rates (in black) and the adjusted true mortality rate accounting for the asymptomatic cases (in green).

(R4) For any p, the summation of the third moments of the X_t is O(T).

While the two processes {X_t : t = 1, ..., T} and {X*_t : t = 1, ..., T} are constrained by the measurement error model (7) or (9), they can both be assumed to be stationary without inducing conflicting requirements on the associated processes. Obviously, the weak stationarity of {X_t : t = 1, ..., T} implies the weak stationarity of {X*_t : t = 1, ..., T} if they are linked by (7) or (9). Condition (R3) says that as the time series becomes long enough, the average of the covariances between any paired variables is negligible. Condition (R4) requires that the summation of the third moments of X_t be O(T), which is needed in Theorem 4 when φ_0 = 0; this condition can be satisfied if E(ε_t^3) = 0, for example.

Applying the weak law of large numbers to φ̂*_1 given by (12), we obtain that the estimator φ̂*_1 converges in probability to Cov(X*_t, X*_{t−1})/Var(X*_t), denoted φ*_1. We examine φ*_1 by using the AR(1) model (1) and the measurement error model (7):

φ*_1 = Cov(X*_t, X*_{t−1})/Var(X*_t) = Cov(α_0 + α_1 X_t + e_t, α_0 + α_1 X_{t−1} + e_{t−1})/Var(X*_t) = α_1^2 Cov(X_t, X_{t−1})/Var(X*_t) = α_1^2 φ_1 Var(X_{t−1})/Var(X*_t),

where the second step is due to (7), the third step is because of the independence among the X_t and the e_t, and the fourth step is because of (1).
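The naive limits derived in this part of the appendix, φ*_1 = φ_1 ω_1 under the additive model (7) and φ*_1 = φ_1 ω_2 under the multiplicative model (9), can be checked numerically. The simulation sketch below takes α_0 = 0 and α_1 = 1 in (7) and β_0 = 1 in (9), with parameter values of our own choosing; the helper function is ours, not from the paper.

```python
# Simulate an AR(1) process, contaminate it under models (7) and (9),
# and compare the naive lag-1 Yule-Walker estimates with the theoretical
# attenuated limits phi1 * omega1 and phi1 * omega2.
import random

random.seed(3)
phi0, phi1, sigma = 1.0, 0.5, 1.0
sigma_e, sigma_u, beta0 = 1.0, 0.3, 1.0
T = 200_000

mu = phi0 / (1 - phi1)                       # stationary mean E(X_t)
x = [mu]
xstar_add, xstar_mult = [], []
for _ in range(T):
    x.append(phi0 + phi1 * x[-1] + random.gauss(0, sigma))
    xstar_add.append(x[-1] + random.gauss(0, sigma_e))   # model (7), alpha0=0, alpha1=1
    u = 1.0 + random.gauss(0, sigma_u)                   # E(u_t) = 1, Var(u_t) = sigma_u^2
    xstar_mult.append(beta0 * u * x[-1])                 # model (9)

def lag1_yw(series):
    """Naive lag-1 Yule-Walker estimate Cov(X*_t, X*_{t-1}) / Var(X*_t)."""
    n = len(series)
    m = sum(series) / n
    g0 = sum((s - m) ** 2 for s in series) / n
    g1 = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, n)) / n
    return g1 / g0

var_x = sigma ** 2 / (1 - phi1 ** 2)         # Var(X_t) = sigma^2 / (1 - phi1^2)
omega1 = var_x / (var_x + sigma_e ** 2)
omega2 = var_x / (sigma_u ** 2 * var_x + sigma_u ** 2 * mu ** 2 + var_x)
naive_add, naive_mult = lag1_yw(xstar_add), lag1_yw(xstar_mult)
```

With these settings both naive estimates fall visibly below φ_1 = 0.5, matching the attenuation factors ω_1 and ω_2 up to Monte Carlo error.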
Since the time series {X_t} is stationary, it follows that Var(X_t) = Var(X_{t−1}) = σ^2/(1 − φ_1^2), and hence φ*_1 = φ_1 ω_1 with ω_1 = α_1^2 Var(X_t)/{α_1^2 Var(X_t) + σ_e^2}. Next, applying Slutsky's theorem to (12), we have that as T → ∞, the limit equals {α_0 + α_1 φ_0/(1 − φ_1)}(1 − φ_1 ω_1) by (S.1) and the fact that E(X*_t) = α_0 + α_1 φ_0/(1 − φ_1). Finally, plugging the AR(1) model (1) into the measurement error model (7), we obtain (S.2). On the other hand, plugging the measurement error model (7) into the working model (11), we obtain (S.3). Then equating (S.2) and (S.3), and using the independence assumption for X_{t−1}, e_t, and ε_t, we obtain that

Var(ε*_t) = φ_1^2 α_1^2 (1 − ω_1)^2 Var(X_{t−1}) + (1 − ω_1 φ_1)^2 Var(e_t) + α_1^2 Var(ε_t) = φ_1^2 α_1^2 (1 − ω_1)^2 σ^2/(1 − φ_1^2) + (1 − ω_1 φ_1)^2 σ_e^2 + α_1^2 σ^2.

As noted at the beginning of A.2, as T → ∞, φ̂*_1 converges in probability to φ*_1. Now we further examine φ*_1 by using the AR(1) model (1) and the measurement error model (9), where the second step is due to the measurement error model (9), the seventh step is because u_t, u_{t−1}, and X_{t−1} are mutually independent, and the second last step is due to E(u_t) = 1. Since the time series {X_t} is stationary, it follows that E(X_t) = E(X_{t−1}) = φ_0/(1 − φ_1) and Var(X_t) = Var(X_{t−1}) = σ^2/(1 − φ_1^2). Hence (S.4) becomes (S.5). Next, applying Slutsky's theorem to (12) gives the limit as T → ∞ by (S.5), together with E(X*_t) = β_0 φ_0/(1 − φ_1). Finally, plugging the AR(1) model (1) into the measurement error model (9), we obtain (S.6). On the other hand, plugging the measurement error model (9) into the working model (11), we obtain (S.7). Then equating (S.6) and (S.7) gives the result, where the second step is because of the independence assumption together with E(u_{t−1}^2) = E(u_t^2) and E(u_{t−1}) = E(u_t), so that Var(X_{t−1} u_t) = Var(X_{t−1} u_{t−1}), and the second last step is due to ω_2 = Var(X_{t−1})/{Var(u_{t−1}) Var(X_{t−1}) + Var(u_{t−1}) E^2(X_{t−1}) + Var(X_{t−1})} in (S.5). For k = 1, . . .
, p, applying the weak law of large numbers to γ̂*_k, we obtain that as T → ∞, the estimator γ̂*_k converges in probability to Cov(X*_t, X*_{t−k}), denoted γ*_k. Next, we examine γ*_k. By the form of the measurement error model (7), we have that γ*_k = α_1^2 γ_k for k = 1, ..., p, and by (8), Var(X*_t) = α_1^2 γ_0 + σ_e^2, which is denoted as γ*_0. Thus, Theorem 3(1) follows.

First, by Theorem 3(1), we write the γ*_k in terms of the γ_k. Then the naive estimator φ̂* is obtained by replacing the γ_k in (6) with the γ̂*_k; again, replacing the γ_k in (6) with the γ*_k gives the limit φ* of the naive estimator, where φ_k and φ*_k are respectively the kth elements of φ and φ*. The third step is because of (S.9) as well as the model form (7), and the last step is due to the stationarity of the time series {X_t}, so that E(X_t) = µ. Finally, noting that the naive estimator σ̂*^2 is given by σ̂*^2 = γ̂*_0 − 2 φ̂*^T γ̂* + φ̂*^T Γ̂* φ̂* by applying a version similar to (6), we obtain the limit, where the second step is due to (8), (S.8), and (S.9).

Step 1: We show certain identities before proving Theorem 3(3).

1. By model (7), we have that, where the first step is because µ̂* = (1/T) Σ_{t=1}^T X*_t, and in the last step, ē = (1/T) Σ_{t=1}^T e_t.

2. For any t and s, we have that, where the second step is due to the independence of e_t and X_t, and the last step is by E(e_s − ē) = 0.

3. By the independence of e_t and e_s for t ≠ s, we have that, where the second step is due to the independence of the e_t and the X_t, and the last step is by E(e_s − ē) = 0.

For the next identity, the third step is due to µ̂ − µ = (1/T) Σ_{s=1}^T (X_s − µ), the fourth step is because E(µ̂ − µ) = 0 by the stationarity of the time series, the second last step is due to lim_{T→∞} Var(µ̂) = 0 (Brockwell et al. 1991, Theorem 7.1.1), and the last step is due to Condition (R3).

6. Similar to (S.14), we have that, where the last step is due to Condition (R3) and lim_{T→∞} Var(µ̂) = 0 (Brockwell et al. 1991, Theorem 7.1.1).

7. E{(e_t − ē)^2} = σ_e^2.

8.
By the independence of e_t and X_t, for any s and t, we have that, where the last step is due to E(X_t − µ) = 0 and E(e_t − ē) = 0.

9. For any t ≠ s, Cov{(e_t − ē)^2, (e_s − ē)^2} = 0; the case t = s is handled directly.

11. For any t, (S.20) holds, where the second step is because of E(X_t − µ) = 0 and the independence of X_t and e_t, and the third step is due to (S.16) and (S.15). Hence the identity follows, and similar arguments give the analogous versions. In addition, by (S.16), we obtain the corresponding limit.

Step 2: Now we prove the results in (3).

1°. We first show the derivation of q*_{100}, which reduces to α_1^4 q_{00} together with the limits evaluated in Step 1, where the second step is due to (S.10), the third step is because of (S.11) and (S.17), the fifth step is due to (S.12) and (S.18), the sixth step is because of (S.13) and (S.18), and the last step is because of (S.16) and (S.17).

2°. We derive the value of q*_{10p}, where the second step is due to (S.10), the third step is because of (S.11) and (S.17), the fourth step is by the definition of q_{0p} in (S.19), the fifth step is due to (S.12), and the last step results from (S.16) and (S.15).

For k = 1, ..., p, applying the weak law of large numbers to γ̂*_k, we obtain that as T → ∞, the estimator γ̂*_k converges in probability to Cov(X*_t, X*_{t−k}), which is denoted as γ*_k. Next, we examine γ*_k. By the form of the measurement error model (9), we have that γ*_k = β_0^2 γ_k for k = 1, ..., p, and by (10), Var(X*_t) = β_0^2 {(σ_u^2 + 1) γ_0 + σ_u^2 µ^2}, which is denoted as γ*_0. Thus, Theorem 4(1) follows. First, by Theorem 4(1), we write the γ*_k in terms of the γ_k.
Then the naive estimator φ̂* is obtained by replacing the γ_k in (6) with the γ̂*_k; again, replacing the γ_k in (6) with the γ*_k gives the limit φ* of the naive estimator, where φ_k and φ*_k are respectively the kth elements of φ and φ*. The third step is because φ̂_k = φ_k + o_p(1) by (S.26) as well as the model form (9), and the last step is due to the stationarity of the time series {X_t}, so that E(X_t) = µ. Finally, noting that the naive estimator σ̂*^2 is given by σ̂*^2 = γ̂*_0 − 2 φ̂*^T γ̂* + φ̂*^T Γ̂* φ̂* by applying a version similar to (6), we obtain the limit.

Step 1: We show that the following limits hold as T → ∞. With some simple algebra, the first step is because X_t and u_t are independent; the second last step is due to E(u_t^2) = Var(u_t) + E^2(u_t) = σ_u^2 + 1, derived similarly to the second and third steps in (S.35); and the last step is because of the definition of q_{0p}, where E(u_t^3) and the related moments are time-independent, as derived from Conditions (R1) and (R2) together with the independence between u_t and X_t.

4. Analogous to the derivations in (S.35) and (S.36), we derive the summation of Cov{u_t u_{t+p} (X_t − µ)(X_{t+p} − µ), u_s u_{s+r} (X_s − µ)(X_{s+r} − µ)} for p > 0, r > 0, and p ≠ r, where the third step is derived analogously to the second step of (S.36) together with E(u_t u_{t+p} u_s u_{s+r}) = 1, and the last step is due to the definition of q_{pr}, the relevant moments being time-independent by Conditions (R1) and (R2).

5. Similar to the derivations in (S.35), (S.36), and (S.37), we derive the corresponding summation, where the last step is by the definition of q_{pp}, due to the stationarity of the time series and the fact that Var{(X_t − µ)(X_{t+p} − µ)} and E{(X_t − µ)(X_{t+p} − µ)^2 (X_{t+2p} − µ)} are time-independent, resulting from Conditions (R1) and (R2).

6. For any t, s, and p, we have that, where the last step is because E(X_s − µ) = 0.

7. For any t and s, we have that, where the second step is because of the independence between u_t and X_t and the fact that E(X_t − µ) = 0.
Then, E{u_t(u_t − 1) u_s(u_s − 1)} = σ_u^4 for t ≠ s, and E{u_t^2 (u_t − 1)^2} = E(u_t^4) − 2E(u_t^3) + σ_u^2 + 1 for any t. By (S.40), we have that, where the last step is because lim_{T→∞} Var(µ̂) = 0 (Brockwell et al. 1991, Theorem 7.1.1).

8. For any t, s, and p > 0, we have that, where the second step is because of the independence between u_t and X_t and the fact that E(X_t − µ) = 0. Then, E{u_t(u_t − 1) u_{s+p}(u_s − 1)} = 0 for t ≠ s, and E{u_t(u_t − 1)^2 u_{t+p}} = E{u_t(u_t − 1)^2} = E{(u_t − 1)^3} + σ_u^2 for s = t.

9. By the independence of u_t and u_s, for t ≠ s, we have (S.43), and a corresponding identity for any t.

10. By the independence of u_t and u_s, for s ≠ t, s ≠ t + p, and any p, we have the analogous identity, as well as versions for any t and p > 0.

11. For any t and s, with r ≠ p and r > 0, we have the corresponding identity. By the independence of u_t and u_s, for t ≠ s and any p, we have (S.48), and for any t and p > 0, (S.49).

12. For any t, we have that, where the last step is because E(X_t − µ) = 0. In addition, for any t and p, we have a similar identity; and for any t, we have that, where the second step is by (S.34), the third step is because of (S.39) and (S.50), and the last step is because of (S.38), (S.48), (S.49), and (S.52).

Figure caption (fragment): forecasts (May 5 - May 9) based on the additive (in blue) or multiplicative (in red) measurement error model versus the naive model (in dark yellow); the reported mortality rates (in black) and the adjusted true mortality rate accounting for the asymptomatic cases (in green).

Real estimates of mortality following COVID-19 infection.
The Lancet Infectious Diseases.
Time Series Analysis: Forecasting and Control.
Introduction to Time Series and Forecasting.
Time Series: Theory and Methods.
Lag order and critical values of the augmented Dickey-Fuller test.
COVID-19: four fifths of cases are asymptomatic, China figures indicate.
Estimation in autoregressive model with measurement error.
Estimation of the basic reproduction number, average incubation time, asymptomatic infection rate, and case fatality rate for COVID-19: Meta-analysis and sensitivity analysis.
Mortality: Statistics.
Asymptomatic and presymptomatic SARS-CoV-2 infections in residents of a long-term care skilled nursing facility, King County.
Theoretical comparisons of block bootstrap methods.
Measurement error in linear autoregressive models.
A unified approach to the measurement error problem in time series models.
Prediction and forecasting in linear models with measurement error.

3°. We derive the value of q*_{1pr} for p > 0, r > 0, and p ≠ r. The summands involve {α_1 (X_{t+p} − µ)(e_t − ē) + (e_t − ē)(e_{t+p} − ē)} and {α_1 (X_{s+r} − µ)(e_s − ē) + (e_s − ē)(e_{s+r} − ē)}, where the second step is due to (S.10), the third step is because of (S.11) and a version similar to (S.17), the fourth step is because of (S.22) and the relevant definition, and the last step is from (S.20) and (S.21).

4°. Finally, we present the derivation of q*_{1pp} for p ≠ 0. We examine each term in (S.28) as T → ∞ separately. First, we consider I_1. Next, we examine the second term I_2 in (S.28): since T^{−1/2} p Var(X_t) → 0 as T → ∞ (Brockwell et al. 1991, p. 230), this term vanishes. Finally, we examine I_3 in (S.28). In addition, by the weak law of large numbers, together with Condition (R2) and the central limit theorem for strictly stationary p-dependent sequences (Brockwell et al. 1991, Theorem 6.4.2), we obtain the stated limit.

Step 2: We show that as T → ∞, the asymptotic covariance matrix takes the stated form, where the last step is due to (S.27). Hence, the (r, q) element of the limiting matrix follows.

Step 3: We show certain identities to be used for proving Theorem 4(3):

1.
By model (9), we have that, where the first step is by the definition µ̂* = (1/T) Σ_{t=1}^T X*_t.

2. The second and third steps are due to the independence between u_t and X_t. In the last step, we use the definition of q_{00} and the fact that E(u_t^4) and E{(X_t − µ)^4} are time-independent, which is derived from Conditions (R1) and (R2) together with the independence between u_t and X_t.

3. Similar to the derivation in (S.35), we now derive the summation of the covariances involving β_0^2 u_t^2 (X_t − µ)^2.

Step 4: Now we prove the results in (3).

1°. We first show the derivation of q*_{200}, where the second step is due to (S.34), the third step is because of (S.50), and the last step is by definition.

2°. Then we derive the value of q*_{20p}: the covariance terms Cov{(u_t − 1)^2, (u_s − 1)(u_{s+p} − 1)} contribute β_0^4 q_{0p} (σ_u^2 + 1) + β_0^4 {E(u_t^3) − (σ_u^2 + 1)} [E{(X_t − µ)^3 (X_{t+p} − µ)} + E{(X_t − µ)^3 (X_{t−p} − µ)}], where the second step is by (S.34), the third step is because of (S.39) and (S.50), and the second last step is because of (S.36), (S.47), (S.46), (S.42), and (S.51).

3°. Then we derive the value of q*_{2pr} for r ≠ p: q*_{2pr} = lim