Faculty Working Paper 92-0107 
 
 Preliminary Test Estimation for the 
 Second Order Autoregression 
 
 Robert Bohrer Kunjung Lai 
 
 Department of Statistics Department of Statistics 
 
 University of Illinois University of Illinois 
 
 Thomas A. Yancey 
 
 Department of Economics 
 University of Illinois 
 
 Bureau of Economic and Business Research 
 
 College of Commerce and Business Administration 
 
 University of Illinois at Urbana-Champaign 
 
BEBR 
 
 FACULTY WORKING PAPER NO. 92-0107 
 
 College of Commerce and Business Administration 
 
 University of Illinois at Urbana-Champaign 
 
 February 1992 
 
Digitized by the Internet Archive 
 
 in 2012 with funding from 
 
 University of Illinois Urbana-Champaign 
 
 http://www.archive.org/details/preliminaryteste92107bohr 
 
PRELIMINARY TEST ESTIMATION FOR THE SECOND 
 ORDER AUTOREGRESSION 
 
 Robert Bohrer Kunjung Lai Thomas A. Yancey 
 
 Dept. of Statistics Dept. of Statistics Dept. of Economics 
 
 Univ. of Illinois Univ. of Illinois Univ. of Illinois 
 
 Champaign 61820 Champaign 61820 Champaign 61820 
 
 Key Words and Phrases: preliminary test; autoregressive model; asymp- 
 totic squared error risk; Monte Carlo. 
 
 ABSTRACT 
 
 Consequences of preliminary test model identification procedures in 
 time series analysis are examined in the context of a squared error risk func- 
 tion. Asymptotic risk comparisons and Monte Carlo studies are used in 
 comparing these procedures. 
 
 1. INTRODUCTION 
 
 In order to obtain parameter estimators for statistical models with desir- 
 able statistical properties, assumptions are made about the properties of the 
 random elements and the functional form of the model generating the data. 
 Since the investigator is unsure of the applicability of these assumptions, null 
 hypotheses reflecting the assumptions are tested, and the model is retained 
 or altered and reestimated depending on the outcomes of these tests. This 
 procedure leads to preliminary test estimators whose properties include the 
 outcomes of these preliminary tests. Most of the work on preliminary test 
 estimators to date has concentrated on the standard linear statistical model 
 and some extensions of it. A series of papers and books, Miyazaki, Judge and 
 Yancey (1986), Judge and Yancey (1986), Yancey, Judge and Bohrer (1989), 
 Giles and Clarke (1989), Judge, Bohrer and Yancey (1990), Giles (1991) and 
 Yancey and Bohrer (1992) concentrate on inequality constraint conjectures 
 for the parameters of the linear model and extensions with various assump- 
 tions about the generation of the error term. 
 
 In this paper we are interested in determining the order of an 
 autoregressive model by testing hypotheses about which parameters in the 
 model are zero and altering the model if necessary. Since pretesting is 
 commonly done to determine the correct autoregressive model in time series 
 analysis in econometrics, we compare the risk function of the pretest 
 estimator with that of the maximum likelihood estimator, mle. We use a 
 second order autoregressive model, AR(2), which reduces to an AR(1) or an 
 AR(0) model depending on where we are in the parameter space. 
 
 The paper is organized as follows: Section 2 discusses the model and the 
 estimation procedures. Section 3 presents the risk of the mle procedure for 
 the initial model. Section 4 discusses the risk of a pseudo pretest estimator 
 which honors the outcome of the hypothesis test, but uses only the initial mle. 
 Section 5 presents the true pretest estimator risk where the model is altered 
 and reestimated if the hypothesis is accepted. Section 6 is the summary and 
 conclusions. 
 
2. DEFINITION OF FRAMEWORK AND PROCEDURES 
 
 For $t = 1, \ldots, n$, $X_t$ is a stationary normal second order 
 autoregression, that is, 

 $$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + Z_t,$$ 

 where the $Z_t$ are independent, identically distributed normal 
 $N(0, \sigma^2)$ random variables. Stationarity requires that the zeros of 
 $s^2 - \phi_1 s - \phi_2$ both lie within the unit circle. Denote the maximum 
 likelihood estimators for the autoregressive parameters as $\hat\phi_1$ and 
 $\hat\phi_2$. Brockwell and Davis (1987) derived the asymptotic distribution 
 for these maximum likelihood estimators as unbiased with 
 $\mathrm{Var}(\hat\phi_1) = \mathrm{Var}(\hat\phi_2) = (1 - \phi_2^2)/n$ and 
 $\mathrm{Cov}(\hat\phi_1, \hat\phi_2) = -\phi_1(1 + \phi_2)/n$. 
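 The stationarity condition and the asymptotic covariance quoted above can be 
 checked numerically. The following is a minimal Python sketch; the function 
 names `is_stationary` and `asymptotic_cov` are illustrative and do not come 
 from the paper. 

```python
import cmath


def is_stationary(phi1, phi2):
    """True when both zeros of s^2 - phi1*s - phi2 lie strictly inside the unit circle."""
    disc = cmath.sqrt(phi1 * phi1 + 4.0 * phi2)
    roots = ((phi1 + disc) / 2.0, (phi1 - disc) / 2.0)
    return all(abs(r) < 1.0 for r in roots)


def asymptotic_cov(phi1, phi2, n):
    """Large-sample covariance matrix of the AR(2) maximum likelihood estimators."""
    var = (1.0 - phi2 ** 2) / n          # common variance of both estimators
    cov = -phi1 * (1.0 + phi2) / n       # covariance between the two estimators
    return [[var, cov], [cov, var]]
```

 For example, `asymptotic_cov(0.5, 0.3, 100)` gives a common variance of 
 0.0091 and a covariance of -0.0065. 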
 
 Denote $\phi = (\phi_1, \phi_2)$ and $p = (p_1, p_2)$. The goal is to 
 estimate $\phi$ to minimize the squared error risk of the estimator $p$; that 
 is, the risk of the estimator $p$ at $\phi$ is 
 $R(p, \phi) = E(p_1 - \phi_1)^2 + E(p_2 - \phi_2)^2$. Three procedures 
 are considered. Procedure zero is not a preliminary test estimator; rather it 
 merely estimates $\phi$ by the maximum likelihood estimators 
 $\hat\phi = (\hat\phi_1, \hat\phi_2)$. 
 
 Procedure one gives the preliminary test estimator 
 $\tilde\phi = (\tilde\phi_1, \tilde\phi_2)$. First, it does a test, $T_0$, of 
 size $\alpha$ for the null hypothesis that $\phi_1 = \phi_2 = 0$. If $T_0$ 
 accepts the null hypothesis, then set $\tilde\phi_1 = \tilde\phi_2 = 0$. If 
 $T_0$ rejects the null hypothesis, then the test $T_1$ of size $\alpha$ is 
 done for the first order, i.e., that $\phi_2 = 0$. If $T_1$ accepts, then set 
 $\tilde\phi_1 = \hat\phi_1$ and $\tilde\phi_2 = 0$. If $T_1$ rejects, then set 
 $\tilde\phi_1 = \hat\phi_1$ and $\tilde\phi_2 = \hat\phi_2$. 
 
 The tests $T_0$ and $T_1$ are chi-square tests based on the asymptotic 
 distribution for $\hat\phi_1$ and $\hat\phi_2$; namely, if the null hypothesis 
 that $\phi_1 = \phi_2 = 0$ is true, then the large sample distribution of 
 $\hat\phi_1$ and $\hat\phi_2$ is that of independent normal $N(0, 1/n)$. Thus 
 the acceptance region for $T_0$ is 
 $C = [\hat\phi_1^2 + \hat\phi_2^2 < \chi^2_\alpha(2)/n]$. Also, under the 
 hypothesis of first order, the large sample distribution of $\hat\phi_2$ is 
 normal $N(0, 1/n)$. Thus the acceptance region for $T_1$ is the strip 
 $S = [\hat\phi_2^2 < \chi^2_\alpha(1)/n]$. Let $C'$ be the complement of $C$ 
 and $S'$ be the complement of $S$. Thus 
 $(\tilde\phi_1, \tilde\phi_2) = (0, 0)$ on $C$; 
 $(\tilde\phi_1, \tilde\phi_2) = (\hat\phi_1, 0)$ on $C' \cap S$; and 
 $(\tilde\phi_1, \tilde\phi_2) = (\hat\phi_1, \hat\phi_2)$ on $C' \cap S'$. 
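 The decision rule of procedure one can be written out as a short Python 
 sketch. The names `chi2_critical` and `procedure_one` are illustrative, not 
 from the paper; the critical values use the closed forms available for one 
 and two degrees of freedom, so only the standard library is needed. 

```python
import math
from statistics import NormalDist


def chi2_critical(alpha, df):
    """Upper-alpha chi-square critical value for df = 1 or 2 (closed forms)."""
    if df == 1:
        # chi-square(1) is the square of a standard normal variable
        return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    if df == 2:
        # chi-square(2) has upper tail probability exp(-x / 2)
        return -2.0 * math.log(alpha)
    raise ValueError("only df = 1 or 2 are needed here")


def procedure_one(phi1_hat, phi2_hat, n, alpha=0.05):
    """Pretest estimate built from the AR(2) estimates (phi1_hat, phi2_hat)."""
    in_C = phi1_hat ** 2 + phi2_hat ** 2 < chi2_critical(alpha, 2) / n
    if in_C:                      # T0 accepts: AR(0)
        return (0.0, 0.0)
    in_S = phi2_hat ** 2 < chi2_critical(alpha, 1) / n
    if in_S:                      # T0 rejects, T1 accepts: AR(1)
        return (phi1_hat, 0.0)
    return (phi1_hat, phi2_hat)   # both tests reject: AR(2)
```

 With n = 100 and alpha = 0.05, the estimate (0.1, 0.1) falls in C, the 
 estimate (0.5, 0.1) falls in the strip S outside C, and (0.5, 0.3) falls 
 outside both regions. 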
 
 Procedure two is very similar to procedure one, but leads to two pretest 
 estimators. The only difference is in the case where $T_0$ rejects and $T_1$ 
 accepts. In procedure two $\hat\phi_1$ is replaced by $\hat\phi_1^{*}$, which 
 is the maximum likelihood estimator for $\phi_1$ assuming a first order 
 model, rather than using $\hat\phi_1$. Thus, procedure two builds the 
 estimate $(\tilde{\tilde\phi}_1, \tilde{\tilde\phi}_2) = (0, 0)$ on $C$; 
 $(\tilde{\tilde\phi}_1, \tilde{\tilde\phi}_2) = (\hat\phi_1^{*}, 0)$ on 
 $C' \cap S$; and 
 $(\tilde{\tilde\phi}_1, \tilde{\tilde\phi}_2) = (\hat\phi_1, \hat\phi_2)$ on 
 $C' \cap S'$. 
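 Procedure two changes only the middle branch of the rule. In the Python 
 sketch below, `procedure_two` and `lag1_autocorrelation` are illustrative 
 names, and the lag-one sample autocorrelation is used as a simple stand-in 
 for the first order maximum likelihood estimator; that substitution is an 
 assumption of this sketch, not the fitting method used in the paper. 

```python
import math
from statistics import NormalDist


def lag1_autocorrelation(x):
    """Yule-Walker estimate of phi_1 for a zero-mean AR(1) fit (stand-in for the AR(1) mle)."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x)
    return num / den


def procedure_two(phi1_hat, phi2_hat, phi1_star, n, alpha=0.05):
    """Pretest estimate that refits phi_1 under the first order model on C' and S."""
    c_bound = -2.0 * math.log(alpha) / n                          # chi2_alpha(2) / n
    s_bound = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2 / n    # chi2_alpha(1) / n
    if phi1_hat ** 2 + phi2_hat ** 2 < c_bound:
        return (0.0, 0.0)            # T0 accepts
    if phi2_hat ** 2 < s_bound:
        return (phi1_star, 0.0)      # T1 accepts: use the first order estimate
    return (phi1_hat, phi2_hat)      # both tests reject
```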
 
 3. ASYMPTOTIC RISK OF PROCEDURE ZERO 
 
 The asymptotic risk of procedure zero is 

 $$R(\hat\phi, \phi) = \mathrm{Var}(\hat\phi_1) + \mathrm{Var}(\hat\phi_2) = 2(1 - \phi_2^2)/n.$$ 
 
 4. ASYMPTOTIC RISK OF PROCEDURE ONE 
 
 For $A$ any two-dimensional set, denote the indicator function $I_A(p)$, 
 which takes the value one when $p$ falls in $A$ and is zero otherwise. The 
 risk $R(\tilde\phi, \phi)$ of procedure one is the expectation with respect to 
 the asymptotic distribution for $\hat\phi$ of the loss 

 $$L(\tilde\phi, \phi) = (\phi_1^2 + \phi_2^2)\, I_C(\hat\phi) + ((\hat\phi_1 - \phi_1)^2 + \phi_2^2)\, I_{C' \cap S}(\hat\phi) + ((\hat\phi_1 - \phi_1)^2 + (\hat\phi_2 - \phi_2)^2)\, I_{C' \cap S'}(\hat\phi).$$ 

 This risk can be calculated by numerical integration. 
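 As a rough illustration of that integration, the Python sketch below 
 averages the loss over the asymptotic bivariate normal distribution using a 
 plain midpoint rule on a box of six standard deviations around the true 
 parameter. This is not the algorithm used in the paper; the function name 
 `procedure_one_risk`, the grid size, and the box width are assumptions of the 
 sketch, and the parameter point is assumed to lie strictly inside the 
 stationary region. 

```python
import math
from statistics import NormalDist


def procedure_one_risk(phi1, phi2, n=100, alpha=0.05, grid=400):
    """Midpoint-rule approximation to the asymptotic risk of procedure one."""
    var = (1.0 - phi2 ** 2) / n
    cov = -phi1 * (1.0 + phi2) / n
    det = var * var - cov * cov                      # determinant of the covariance matrix
    c_bound = -2.0 * math.log(alpha) / n             # chi2_alpha(2) / n
    s_bound = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2 / n   # chi2_alpha(1) / n
    half = 6.0 * math.sqrt(var)                      # half-width of the integration box
    h = 2.0 * half / grid
    norm = 1.0 / (2.0 * math.pi * math.sqrt(det))
    risk = 0.0
    for i in range(grid):
        d1 = -half + (i + 0.5) * h                   # estimate minus phi1
        p1 = phi1 + d1
        for j in range(grid):
            d2 = -half + (j + 0.5) * h               # estimate minus phi2
            p2 = phi2 + d2
            quad = (var * d1 * d1 - 2.0 * cov * d1 * d2 + var * d2 * d2) / det
            density = norm * math.exp(-0.5 * quad)
            if p1 * p1 + p2 * p2 < c_bound:          # estimate in C
                loss = phi1 ** 2 + phi2 ** 2
            elif p2 * p2 < s_bound:                  # estimate in C' and S
                loss = d1 * d1 + phi2 ** 2
            else:                                    # estimate in C' and S'
                loss = d1 * d1 + d2 * d2
            risk += loss * density * h * h
    return risk
```

 Far from both C and S, for instance at (0, 0.8) with n = 100, the result 
 should agree closely with the procedure zero risk $2(1 - \phi_2^2)/n = 0.0072$. 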
 
 4.1 NOTES ON COMPUTATION 
 
 Calculations are accomplished using an algorithm for bivariate 
 integration. The outer integral is over $\hat\phi_2$, and the plane is 
 partitioned into nine sets including $C$, $C' \cap S \cap [\hat\phi_1 > 0]$, 
 $C' \cap S \cap [\hat\phi_1 < 0]$, and six sets convenient for partitioning 
 $C' \cap S'$. One such bivariate integration algorithm 
 is that of Tavernini as described by Milton (1972). Tavernini's algorithm 
 uses the Newton-Cotes method and has an approximate bound for the er- 
 ror. Subdivision is terminated when successive Simpson rule approximations 
 are sufficiently close to each other. This leaves the possibility of premature 
 termination if the integrand is minuscule over most of the region of the in- 
 tegration. For successful use, integration must be confined to subsets of the 
 desired integration region on which the integrand is not too close to zero. 
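 The premature termination described above is easy to reproduce with a 
 generic adaptive Simpson rule. The Python sketch below is a textbook 
 one-dimensional version, not Tavernini's bivariate algorithm, and the test 
 integrand `peak` is a made-up example: a narrow bump that is essentially zero 
 at every point sampled in the first subdivision of a wide interval. 

```python
import math


def simpson(f, a, b):
    """Three-point Simpson approximation on [a, b]."""
    m = 0.5 * (a + b)
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b))


def adaptive_simpson(f, a, b, tol=1e-8, depth=40):
    """Subdivide until successive Simpson approximations agree to within tol."""
    m = 0.5 * (a + b)
    whole = simpson(f, a, b)
    halves = simpson(f, a, m) + simpson(f, m, b)
    if depth == 0 or abs(halves - whole) < 15.0 * tol:
        return halves + (halves - whole) / 15.0
    return (adaptive_simpson(f, a, m, tol / 2.0, depth - 1)
            + adaptive_simpson(f, m, b, tol / 2.0, depth - 1))


def peak(x):
    """Narrow bump centered at 3.3; negligible at -100, -50, 0, 50 and 100."""
    return math.exp(-50.0 * (x - 3.3) ** 2)


wide = adaptive_simpson(peak, -100.0, 100.0)   # stops at once and misses the bump
narrow = adaptive_simpson(peak, 2.0, 5.0)      # region confined to where the bump lives
exact = math.sqrt(math.pi / 50.0)              # integral of the bump over the real line
```

 Here `wide` comes out essentially zero while `narrow` matches `exact`, which 
 is why the integration has to be confined to subsets on which the integrand 
 is not too close to zero. 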
 
 4.2 EXAMPLE 
 
 Let $\alpha = 0.05$ and the sample size be $n = 100$. The asymptotic risk 
 $R(\tilde\phi, \phi)$ is evaluated for stationary pairs $(\phi_1, \phi_2)$ 
 such that $\phi_i = -0.9, -0.8, \ldots, 0.8, 0.9$. Stationary pairs are those 
 for which $\pm\phi_1 + \phi_2 < 1$. Of more interest than the asymptotic risk 
 of procedure one is the comparison of this risk with the asymptotic risk of 
 procedure zero. This is measured by the relative regret 
 $(R(\tilde\phi, \phi) - R(\hat\phi, \phi)) / R(\hat\phi, \phi)$ for procedure 
 one and shown in Figure 1. 
 
 The preliminary test procedure one is preferred to the non-preliminary 
 test procedure zero wherever the relative regret is negative. Figure 1 shows 
 a contour plot of the relative regret in the $(\phi_1, \phi_2)$ plane. The 
 fact that procedures zero and one are nearly identical at $\phi$ values away 
 from $C \cup S$ is shown by the small values of regret there. For other 
 $\phi$ values, preliminary test procedure one is preferred to procedure zero 
 near the origin and along the $\phi_2 = 0$ axis. This follows because 
 procedure one gains from the preliminary test when the hypothesis test makes 
 the correct decision. In an annulus around the boundary of region 
 $C = [\,|\hat\phi| < 0.245\,]$, the regret is positive because of the 
 probability of using $\tilde\phi = 0$ when $\phi$ is not zero. 
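 The radius 0.245 follows directly from the acceptance region of the first 
 test with $\alpha = 0.05$ and $n = 100$. A short Python check is given below; 
 the variable names are illustrative, and the half-width of the strip S is 
 computed alongside for comparison. 

```python
import math
from statistics import NormalDist

alpha, n = 0.05, 100

# C is the disc where phi1_hat^2 + phi2_hat^2 < chi2_alpha(2) / n, and
# chi2_alpha(2) = -2 * log(alpha) because chi-square(2) has tail exp(-x / 2).
radius_C = math.sqrt(-2.0 * math.log(alpha) / n)

# S is the strip where phi2_hat^2 < chi2_alpha(1) / n, and chi2_alpha(1) is
# the square of the two-sided standard normal critical value.
half_width_S = NormalDist().inv_cdf(1.0 - alpha / 2.0) / math.sqrt(n)

print(round(radius_C, 3), round(half_width_S, 3))  # 0.245 0.196
```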
 
 FIG. 1: Contour plot of the regret of procedure one. 
 
4.3 MONTE CARLO VERIFICATION OF THE ASYMPTOTIC 
 RISK OF PROCEDURE ONE 
 
 It is of interest to assess whether the asymptotic risk 
 $R(\tilde\phi, \phi)$ is a good approximation to the risk with finite sample 
 sizes. To this end, 100 samples of n observations are generated for each of 
 the 55 systematically sampled points shown in Figure 2. 
 
 FIG. 2: The 55 sampled $(\phi_1, \phi_2)$ points. 
 
 For each sample of n points the loss is calculated, and the risk is 
 estimated by the average $\bar{R}(\tilde\phi, \phi)$ of these losses. In 
 addition, a sample variance $V_1$ of these 100 losses is calculated. The 
 measure of the fit of the data to the asymptotic theory is given by the 
 p-level of the $\chi^2(1)$ statistic 
 $100(\bar{R}(\tilde\phi, \phi) - R(\tilde\phi, \phi))^2 / V_1$. The larger the 
 p-level, the better the fit. If the asymptotic theory were exact, these 
 p-levels would be uniformly distributed on the unit interval. Using the 
 maximum likelihood fitting procedure provided by the arimamle command for 
 autoregressions in Splus on the Sun 3/50 system, the 55 p-levels are 
 calculated for n=100 and n=400. The empirical cumulative distribution for 
 these p-levels is shown in Figure 3 along with the ramp-shaped cumulative 
 distribution for the uniform on the unit interval. For both n=100 and n=400, 
 the Kolmogorov-Smirnov test of size 0.05 would accept the hypothesis of the 
 uniform distribution. It is somewhat surprising that the Kolmogorov-Smirnov 
 statistic is larger for the larger sample size. 
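 The two summary computations in this check, the p-level at one parameter 
 point and the Kolmogorov-Smirnov distance of a set of p-levels from the 
 uniform distribution, can be sketched in Python as follows. The function 
 names `p_level` and `ks_uniform` are illustrative, and the inputs would come 
 from a Monte Carlo run that is not reproduced here. 

```python
import math


def p_level(mc_risk, asymptotic_risk, loss_variance, runs=100):
    """Upper tail probability of the chi-square(1) fit statistic."""
    stat = runs * (mc_risk - asymptotic_risk) ** 2 / loss_variance
    # P(chi-square(1) > stat) equals erfc(sqrt(stat / 2))
    return math.erfc(math.sqrt(stat / 2.0))


def ks_uniform(p_levels):
    """Kolmogorov-Smirnov distance between the empirical cdf and the uniform cdf."""
    p = sorted(p_levels)
    m = len(p)
    return max(max((i + 1) / m - v, v - i / m) for i, v in enumerate(p))
```

 For large m, the size 0.05 Kolmogorov-Smirnov critical value is roughly 
 $1.36/\sqrt{m}$, which is about 0.18 for the 55 points used here. 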
 
 FIG. 3: The empirical cumulative distributions of p-levels (panels: n=400 
 and n=100; horizontal axis: p-level). 
 
 Two computing comments are noteworthy. First, for each $\phi$ value 
 and for both n=100 and n=400, the 100 runs of procedure one require 
 about 3 hours of computing time. Second, a similar Monte Carlo study was 
 conducted with the older Splus version of maximum likelihood fitting, 
 using the arima.mle command, and the empirical distribution of p-levels was 
 found to reject the hypothesis of uniform distribution at significance level 
 0.01. 
 
 5. SOME MONTE CARLO EVIDENCE ABOUT PROCEDURE TWO 
 
 Procedure two differs from procedure one only for $\hat\phi$ values which 
 lie in $C' \cap S$. Thus one would not expect much difference in 
 $R(\tilde\phi, \phi) - R(\tilde{\tilde\phi}, \phi)$ for $\phi$ values away 
 from $C' \cap S$. With $\alpha = 0.05$ and n=100, a Monte Carlo study for the 
 difference of risks was conducted over the 55 points in Figure 2. One 
 hundred runs were observed. A statistic for testing the hypothesis of no 
 difference between the risks of procedures one and two is 
 $100(\bar{R}(\tilde\phi, \phi) - \bar{R}(\tilde{\tilde\phi}, \phi))^2 / (V_1 + V_2)$, 
 where $V_i$ is the sample variance of the 100 losses of procedure i. Under 
 the null hypothesis, this statistic has the $\chi^2(1)$ distribution, and 
 p-levels are uniformly distributed on the unit interval. Under the null 
 hypothesis, one expects 5.5 p-levels less than 0.10. Among the 55 points, 9 
 have p-levels less than 0.10. Most of these were near the strip $S$ and had 
 $\bar{R}(\tilde{\tilde\phi}, \phi)$ less than $\bar{R}(\tilde\phi, \phi)$. For 
 further evidence on comparisons of procedures one and two on the region 
 $C' \cap S$, 100 Monte Carlo runs were done for each of 10 points with 
 $\phi_2 = 0$ and $\phi_1$ in $C' \cap S$. For all 10 points, the risk of 
 procedure two was less than that of procedure one by from 5 to 43 percent. 
 In summary, it appears that procedures one and two are not significantly 
 different except for $\phi$ values near the strip $S$, where procedure two 
 does somewhat better than procedure one. 
 
6. SUMMARY AND CONCLUSIONS 
 
 We have considered the time series problem of determining the order 
 of an AR process by testing the significance of coefficients of lagged variables 
 as is commonly done to identify the model. Starting with a stationary AR(2) 
 model, procedures zero, one and two were used and the risk functions under 
 squared error loss were obtained. It was found that although there was 
 little difference between the risks of procedures one and two over most of 
 the parameter space, procedure two did lead to a smaller risk in regions 
 where they differed. For both procedures, the properties of the estimators 
 are recognized explicitly, which is not done with the usual diagnostic and 
 identification procedure used in time series analysis. Further analysis of the 
 pretesting done in the identification of time series models should consider the 
 moving average, MA, and ARMA models. One difficulty with this extension 
 is that the asymptotic theory used here for the AR(2) model breaks down 
 for the ARMA(1,1) model. 
 
BIBLIOGRAPHY 
 
 Brockwell, P. J. and Davis, R. A. (1987). Time Series: Theory and Methods, 
 Springer-Verlag, New York. 
 
 Giles, D. E. A. and Clarke, J. A. (1989). Preliminary-test estimation of the 
 scale parameter in a mis-specified regression model. Economics Letters, 
 30, 201-205. 
 
 Giles, J. A. (1991). Pretesting in the misspecified regression model. 
 Communications in Statistics: Theory and Methods, 20 (10), 3221-3238. 
 
 Judge, G. G., Bohrer, R. and Yancey, T. A. (1990). Some statistical impli- 
 cations of multivariate inequality constrained testing. Communications 
 in Statistics: Theory and Methods, 19 (2), 413-430. 
 
 Judge, G. G. and Yancey, T. A. (1986). Improved Methods of Inference in 
 Econometrics, North-Holland Publishing, Amsterdam. 
 
 Milton, R. C. (1972). Computer evaluation of the multivariate normal in- 
 tegral. Technometrics, 14, 881-889. 
 
 Miyazaki, S., Judge, G. G. and Yancey, T. A. (1986). Estimation of location 
 parameters under nonnormal errors and quadratic loss. Journal of 
 Business and Economic Statistics, 4(2), 263-268. 
 
 Yancey, T. A., Judge, G. G. and Bohrer, R. (1989). Sampling performance 
 of some joint one-sided preliminary test estimators under squared error 
 loss. Econometrica, 57(5), 1221-1228. 
 
 Yancey, T. A. and Bohrer, R. (1992). "Risk and power for inequality pretest 
 estimators: general case in two dimensions" in William E. Griffiths, 
 Helmut Lütkepohl and Mary Ellen Bock (eds) Readings in Econometric 
 Theory and Practice: A Volume in Honor of George Judge. North-Holland 
 Publishing, Amsterdam. 
 