BEBR FACULTY WORKING PAPER NO. 92-0107

College of Commerce and Business Administration
Bureau of Economic and Business Research
University of Illinois at Urbana-Champaign

February 1992

PRELIMINARY TEST ESTIMATION FOR THE SECOND ORDER AUTOREGRESSION

Robert Bohrer, Dept. of Statistics, Univ. of Illinois, Champaign 61820
Kunjung Lai, Dept. of Statistics, Univ. of Illinois, Champaign 61820
Thomas A. Yancey, Dept. of Economics, Univ. of Illinois, Champaign 61820

Key Words and Phrases: preliminary test; autoregressive model; asymptotic squared error risk; Monte Carlo.
ABSTRACT

Consequences of preliminary test model identification procedures in time series analysis are examined in the context of a squared error risk function. Asymptotic risk comparisons and Monte Carlo studies are used in comparing these procedures.

1. INTRODUCTION

In order to obtain parameter estimators for statistical models with desirable statistical properties, assumptions are made about the properties of the random elements and the functional form of the model generating the data. Since the investigator is unsure of the applicability of these assumptions, null hypotheses reflecting the assumptions are tested, and the model is retained or altered and reestimated depending on the outcomes of these tests. This procedure leads to preliminary test estimators whose properties include the outcomes of these preliminary tests. Most of the work on preliminary test estimators to date has concentrated on the standard linear statistical model and some extensions of it. A series of papers and books, Miyazaki, Judge and Yancey (1986), Judge and Yancey (1986), Yancey, Judge and Bohrer (1989), Giles and Clarke (1989), Judge, Bohrer and Yancey (1990), Giles (1991) and Yancey and Bohrer (1992) concentrate on inequality constraint conjectures for the parameters of the linear model and extensions with various assumptions about the generation of the error term.

In this paper we are interested in determining the order of an autoregressive model based on testing hypotheses to decide which parameters in the model are zero, and the model is altered if necessary. Since pretesting is commonly done to determine the correct autoregressive model in time series analysis in econometrics, we are interested in comparing the risk function of the pretest estimator with that of the maximum likelihood estimator, mle. We use a second order autoregressive model, AR(2), which may be an AR(1) or an AR(0) model depending on where we are in the parameter space.
The paper is organized as follows: Section 2 discusses the model and the estimation procedures. Section 3 presents the risk of the mle procedure for the initial model. Section 4 discusses the risk of a pseudo pretest estimator which honors the outcome of the hypothesis test, but uses only the initial mle. Section 5 presents the true pretest estimator risk where the model is altered and reestimated if the hypothesis is accepted. Section 6 is the summary and conclusions.

2. DEFINITION OF FRAMEWORK AND PROCEDURES

For $t = 1, \ldots, n$, $X_t$ is a stationary normal second order autoregression, that is,

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + Z_t,$$

where the $Z_t$ are independent, identically distributed normal $N(0, \sigma^2)$ random variables. Stationarity requires that the zeros of $s^2 - \phi_1 s - \phi_2 = 0$ both be within the unit circle. Denote the maximum likelihood estimators for the autoregressive parameters as $\hat\phi_1$ and $\hat\phi_2$. Brockwell and Davis (1987) derived the asymptotic distribution of these maximum likelihood estimators: they are asymptotically unbiased and jointly normal, with $\mathrm{Var}(\hat\phi_1) = \mathrm{Var}(\hat\phi_2) = (1 - \phi_2^2)/n$ and $\mathrm{Cov}(\hat\phi_1, \hat\phi_2) = -\phi_1(1 + \phi_2)/n$.

Denote $\phi = (\phi_1, \phi_2)$ and $p = (p_1, p_2)$. The goal is to estimate $\phi$ so as to minimize the squared error risk of the estimator $p$; that is, the risk of the estimator $p$ at $\phi$ is

$$R(p, \phi) = E(p_1 - \phi_1)^2 + E(p_2 - \phi_2)^2.$$

Three procedures are considered. Procedure zero is not a preliminary test estimator; rather it merely estimates $\phi$ by the mle $\hat\phi = (\hat\phi_1, \hat\phi_2)$.

Procedure one gives the preliminary test estimator $\tilde\phi = (\tilde\phi_1, \tilde\phi_2)$. First, it does a test, $T_0$, of size $\alpha$ for the null hypothesis that $\phi_1 = \phi_2 = 0$. If $T_0$ accepts the null hypothesis, then set $\tilde\phi_1 = \tilde\phi_2 = 0$. If $T_0$ rejects the null hypothesis, then the test $T_1$ of size $\alpha$ is done for the first order, i.e., for the null hypothesis that $\phi_2 = 0$. If $T_1$ accepts, then set $\tilde\phi_1 = \hat\phi_1$ and $\tilde\phi_2 = 0$. If $T_1$ rejects, then set $\tilde\phi_1 = \hat\phi_1$ and $\tilde\phi_2 = \hat\phi_2$.

The tests $T_0$ and $T_1$ are chi-square tests based on the asymptotic distribution for $\hat\phi$. If $\phi_1 = \phi_2 = 0$ is true, then the large sample distribution of $\hat\phi_1$ and $\hat\phi_2$ is that of independent normal $N(0, 1/n)$ variables. Thus the acceptance region for $T_0$ is the disc $C = [\hat\phi_1^2 + \hat\phi_2^2 < \chi^2_\alpha(2)/n]$, where $\chi^2_\alpha(k)$ is the upper $\alpha$ point of the chi-square distribution with $k$ degrees of freedom. If $\phi_2 = 0$ is true, then the large sample distribution of $\hat\phi_2$ is normal $N(0, 1/n)$. Thus the acceptance region for $T_1$ is the strip $S = [\hat\phi_2^2 < \chi^2_\alpha(1)/n]$. Let $C'$ be the complement of $C$ and $S'$ be the complement of $S$. Thus $(\tilde\phi_1, \tilde\phi_2) = (0, 0)$ on $C$; $(\tilde\phi_1, \tilde\phi_2) = (\hat\phi_1, 0)$ on $C' \cap S$; and $(\tilde\phi_1, \tilde\phi_2) = (\hat\phi_1, \hat\phi_2)$ on $C' \cap S'$.

Procedure two is very similar to procedure one, but leads to true pretest estimators. The only difference is in the case where $T_0$ rejects and $T_1$ accepts. In procedure two, $\hat\phi_1$ is replaced by $\bar\phi_1$, which is the maximum likelihood estimator for $\phi_1$ assuming a first order model, rather than using $\hat\phi_1$. Thus, procedure two builds the estimate $(\tilde{\tilde\phi}_1, \tilde{\tilde\phi}_2) = (0, 0)$ on $C$; $(\tilde{\tilde\phi}_1, \tilde{\tilde\phi}_2) = (\bar\phi_1, 0)$ on $C' \cap S$; and $(\tilde{\tilde\phi}_1, \tilde{\tilde\phi}_2) = (\hat\phi_1, \hat\phi_2)$ on $C' \cap S'$.

3. ASYMPTOTIC RISK OF PROCEDURE ZERO

The asymptotic risk of procedure zero is

$$R(\hat\phi, \phi) = \mathrm{Var}(\hat\phi_1) + \mathrm{Var}(\hat\phi_2) = 2(1 - \phi_2^2)/n.$$

4. ASYMPTOTIC RISK OF PROCEDURE ONE

For $A$ any two-dimensional set, denote the indicator function $I_A(p)$, which takes the value one when $p$ falls in $A$ and is zero otherwise. The risk $R(\tilde\phi, \phi)$ of procedure one is the expectation with respect to the asymptotic distribution for $\hat\phi$ of the loss

$$L(\tilde\phi, \phi) = (\phi_1^2 + \phi_2^2)\, I_C(\hat\phi) + \left((\hat\phi_1 - \phi_1)^2 + \phi_2^2\right) I_{C' \cap S}(\hat\phi) + \left((\hat\phi_1 - \phi_1)^2 + (\hat\phi_2 - \phi_2)^2\right) I_{C' \cap S'}(\hat\phi).$$

This risk can be calculated by numerical integration.

4.1 NOTES ON COMPUTATION

Calculations are accomplished using an algorithm for bivariate integration.
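As a rough cross-check on such a quadrature, the same expectation can be approximated by drawing the mle from its asymptotic bivariate normal distribution and averaging the squared error loss of procedure one. The Python sketch below is an illustration only and is not the computation used in this paper, which relies on bivariate integration. The function names, the number of draws, and the seed are hypothetical, and the chi-square points use closed forms that hold only for one and two degrees of freedom.

```python
import math
import random
from statistics import NormalDist


def chi2_upper(alpha, df):
    """Upper-alpha point of the chi-square distribution, df = 1 or 2 only."""
    if df == 1:
        # chi-square(1) is the square of a standard normal variable
        return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    if df == 2:
        # chi-square(2) is exponential with mean 2
        return -2.0 * math.log(alpha)
    raise ValueError("only df = 1 or df = 2 is supported in this sketch")


def procedure_one(phi_hat, n, alpha=0.05):
    """Pretest estimate built from the mle phi_hat = (phi1_hat, phi2_hat)."""
    p1, p2 = phi_hat
    if p1 * p1 + p2 * p2 < chi2_upper(alpha, 2) / n:  # region C: T0 accepts
        return (0.0, 0.0)
    if p2 * p2 < chi2_upper(alpha, 1) / n:  # region C' and S: T1 accepts
        return (p1, 0.0)
    return (p1, p2)  # region C' and S': both tests reject


def risk_zero(phi, n):
    """Asymptotic risk of procedure zero, 2 (1 - phi2^2) / n."""
    return 2.0 * (1.0 - phi[1] ** 2) / n


def draw_mle(phi, n, rng):
    """One draw of the mle from its asymptotic bivariate normal law."""
    phi1, phi2 = phi
    if not (abs(phi2) < 1 and phi2 + phi1 < 1 and phi2 - phi1 < 1):
        raise ValueError("phi is outside the stationary region")
    a = 1.0 - phi2 ** 2          # n * variance of each component
    b = -phi1 * (1.0 + phi2)     # n * covariance
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    e1 = math.sqrt(a / n) * g1   # Cholesky factor applied by hand
    e2 = (b / a) * e1 + math.sqrt((a - b * b / a) / n) * g2
    return (phi1 + e1, phi2 + e2)


def mc_risk_one(phi, n, alpha=0.05, draws=20000, seed=0):
    """Monte Carlo approximation of the asymptotic risk of procedure one."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        est = procedure_one(draw_mle(phi, n, rng), n, alpha)
        total += (est[0] - phi[0]) ** 2 + (est[1] - phi[1]) ** 2
    return total / draws
```

For example, `mc_risk_one((0.0, 0.0), 100)` can be compared with `risk_zero((0.0, 0.0), 100)`, which equals 0.02. The Monte Carlo error shrinks only at the rate of one over the square root of the number of draws, so this sketch is a sanity check rather than a replacement for the quadrature.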
In this bivariate integration, the outer integral is over $\hat\phi_2$, and the plane is partitioned into nine sets including $C$, $C' \cap S \cap [\hat\phi_1 > 0]$, $C' \cap S \cap [\hat\phi_1 < 0]$, and six sets convenient for partitioning $C' \cap S'$. One such bivariate integration algorithm is that of Tavernini as described by Milton (1972). Tavernini's algorithm uses the Newton-Cotes method and has an approximate bound for the error. Subdivision is terminated when successive Simpson rule approximations are sufficiently close to each other. This leaves the possibility of premature termination if the integrand is minuscule over most of the region of integration. For successful use, integration must be confined to subsets of the desired integration region on which the integrand is not too close to zero.

4.2 EXAMPLE

Let $\alpha = 0.05$ and the sample size be $n = 100$. The asymptotic risk $R(\tilde\phi, \phi)$ is evaluated for stationary pairs $(\phi_1, \phi_2)$ such that $\phi_i = -0.9, -0.8, \ldots, 0.8, 0.9$. Stationary pairs are those for which $\pm\phi_1 + \phi_2 < 1$ and $|\phi_2| < 1$. Of more interest than the asymptotic risk of procedure one is the comparison of this risk with the asymptotic risk of procedure zero. This is measured by the relative regret

$$\left(R(\tilde\phi, \phi) - R(\hat\phi, \phi)\right) / R(\hat\phi, \phi)$$

for procedure one and shown in Figure 1. The preliminary test procedure one is preferred to the non-preliminary test procedure zero where the relative regret is negative.

Figure 1 shows a contour plot of the relative regret in the $(\phi_1, \phi_2)$ plane. The fact that procedures zero and one are nearly identical at $\phi$ values away from $C \cup S$ is shown by the small values of regret there. For other $\phi$ values, preliminary test procedure one is preferred to procedure zero near the origin and along the [...]

[...] $R(\tilde\phi, \phi)$ is a good approximation to the risk with finite sample sizes. To this end, 100 samples of $n$ observations are generated for each of the 55 systematically sampled points shown in Figure 2.

FIG. 2: The 55 sampled $(\phi_1, \phi_2)$ points.

[...] $(\hat R(\tilde\phi, \phi) - R(\tilde\phi, \phi))^2 / V_1$. The larger the p-level, the better the fit. If the asymptotic theory were exact, these p-levels would be uniformly distributed on the unit interval. Using the maximum likelihood fitting procedure for autoregressions provided by the arima.mle command from Splus on the Sun 3/50 system, the 55 p-levels are calculated for n=100 and n=400. The empirical cumulative distribution of these p-levels is shown in Figure 3 along with the ramp-shaped cumulative distribution of the uniform distribution on the unit interval. For both n=100 and n=400, the Kolmogorov-Smirnov test of size 0.05 would accept the hypothesis of the uniform distribution. It is somewhat surprising that the Kolmogorov-Smirnov statistic is larger for the larger sample size.

FIG. 3: The empirical cumulative distributions of p-levels (panels for n=400 and n=100; horizontal axis: p-level).

Two computing comments are noteworthy. First, for each $\phi$ value and for both n=100 and n=400, the 100 runs of procedure one require about 3 hours of computing time. Second, a similar Monte Carlo study was conducted with the older Splus version of maximum likelihood fitting using the arima.mle command, and for that study the hypothesis that the p-levels are uniformly distributed was rejected at significance level 0.01.

5. SOME MONTE CARLO EVIDENCE ABOUT PROCEDURE TWO

Procedure two differs from procedure one only for $\hat\phi$ values which lie in $C' \cap S$. Thus one would not expect much difference between $R(\tilde\phi, \phi)$ and $R(\tilde{\tilde\phi}, \phi)$ for $\phi$ values away from $C' \cap S$. With $\alpha = 0.05$ and n=100, a Monte Carlo study of the difference of risks was conducted over the 55 points in Figure 2. One hundred runs were observed. A statistic for testing the hypothesis of no difference between the risks of procedures one and two is $100(\hat R(\tilde\phi, \phi) - \hat R(\tilde{\tilde\phi}, \phi))$ [...] p-levels are uniformly distributed on the unit interval. Under the null hypothesis, one expects 5.5 p-levels less than 0.10.
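The expected count quoted above is simple arithmetic: with 55 p-levels and a nominal probability of 0.10 for each, one expects 55 x 0.10 = 5.5 of them to fall below 0.10. The Python sketch below reproduces that figure and also gives a binomial upper tail probability for any observed count, such as a count of 9 out of 55. Treating the 55 p-levels as independent is an assumption made here for illustration, and the function names are hypothetical.

```python
from math import comb


def expected_small_p(n_points, level):
    """Expected number of p-levels below `level` when they are uniform."""
    return n_points * level


def binomial_upper_tail(k, n_points, level):
    """P(X >= k) for X ~ Binomial(n_points, level), assuming independence."""
    return sum(
        comb(n_points, j) * level ** j * (1.0 - level) ** (n_points - j)
        for j in range(k, n_points + 1)
    )


# Expected count for 55 points at level 0.10
print(expected_small_p(55, 0.10))

# Probability of seeing 9 or more small p-levels among 55 under uniformity
print(binomial_upper_tail(9, 55, 0.10))
```

A small tail probability for the observed count would suggest more small p-levels than the null hypothesis of equal risks predicts.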
Among the 55 points, 9 have p-levels less than 0.10. Most of these were near the strip $S$ and had $\hat R(\tilde{\tilde\phi}, \phi) < \hat R(\tilde\phi, \phi)$. For further evidence on comparisons of procedures one and two on the region $C' \cap S$, 100 Monte Carlo runs were done for each of 10 points with $\phi_2 = 0$ and $\phi_1$ such that $\phi$ lies in $C' \cap S$. For all 10 points, the risk of procedure two was less than that of procedure one, by 5 to 43 percent. In summary, it appears that procedures one and two are not significantly different except for $\phi$ values near the strip $S$, where procedure two does somewhat better than procedure one.

6. SUMMARY AND CONCLUSIONS

We have considered the time series problem of determining the order of an AR process by testing the significance of coefficients of lagged variables, as is commonly done to identify the model. Starting with a stationary AR(2) model, procedures zero, one and two were used and the risk functions under squared error loss were obtained. It was found that although there was little difference between the risks of procedures one and two over most of the parameter space, procedure two did lead to a smaller risk in regions where they differed. For both procedures, the properties of the estimators are recognized explicitly, which is not done with the usual diagnostic and identification procedure used in time series analysis.

Further analysis of the pretesting done in the identification of time series models should consider the moving average, MA, and ARMA models. One difficulty with this extension is that the asymptotic theory used here for the AR(2) model breaks down for the ARMA(1,1) model.

BIBLIOGRAPHY

Brockwell, P. J. and Davis, R. A. (1987). Time Series: Theory and Methods, Springer-Verlag, New York.

Giles, D. E. A. and Clarke, J. A. (1989). Preliminary-test estimation of the scale parameter in a mis-specified regression model. Economics Letters, 30, 201-205.

Giles, J. A. (1991). Pretesting in the misspecified regression model. Communications in Statistics: Theory and Methods, 20 (10), 3221-3238.

Judge, G. G., Bohrer, R. and Yancey, T. A. (1990). Some statistical implications of multivariate inequality constrained testing. Communications in Statistics: Theory and Methods, 19 (2), 413-430.

Judge, G. G. and Yancey, T. A. (1986). Improved Methods of Inference in Econometrics, North-Holland Publishing, Amsterdam.

Milton, R. C. (1972). Computer evaluation of the multivariate normal integral. Technometrics, 14, 881-889.

Miyazaki, S., Judge, G. G. and Yancey, T. A. (1986). Estimation of location parameters under nonnormal errors and quadratic loss. Journal of Business and Economic Statistics, 4 (2), 263-268.

Yancey, T. A., Judge, G. G. and Bohrer, R. (1989). Sampling performance of some joint one-sided preliminary test estimators under squared error loss. Econometrica, 57 (5), 1221-1228.

Yancey, T. A. and Bohrer, R. (1992). "Risk and power for inequality pretest estimators: general case in two dimensions" in William E. Griffiths, Helmut Lütkepohl and Mary Ellen Bock (eds), Readings in Econometric Theory and Practice: A Volume in Honor of George Judge. North-Holland Publishing, Amsterdam.