Digitized by the Internet Archive in 2012 with funding from University of Illinois Urbana-Champaign
http://archive.org/details/stochasticanalysOObull

Center for Advanced Computation
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
URBANA, ILLINOIS 61801

CAC DOCUMENT NO. 208

STOCHASTIC ANALYSIS OF UNCERTAINTY IN A U.S. INPUT-OUTPUT MODEL

by
Clark W. Bullard
Donna L. Amado
Dan L. Putnam
Anthony V. Sebald

September 1976

Table of Contents

1.0 INTRODUCTION
2.0 DATA BASE PREPARATION
    2.1 The Model
    2.2 The Data
3.0 METHODOLOGY
    3.1 Point of View for Stochastic Error Analysis
    3.2 Sampling Random Variables
    3.3 Aggregating Random Variables
    3.4 Results Saved for Analysis
    3.5 Stopping Rule
        3.5.1 An Outline of the Approach
        3.5.2 A Mathematical Overview
        3.5.3 An Example
        3.5.4 Mathematical Derivations
4.0 ANALYSIS OF RESULTS
    4.1 Goodness of Fit
    4.2 Confidence Intervals
    4.3 Variability of the Elements in the Result Set
    4.4 Bias on Elements of the Result Set
    4.5 Sensitivity of Simulation Results to Assumptions on Input Uncertainties
    4.6 The Effect of Aggregation
    4.7 Results for the 101 Sector Model

ERRATA NOTICE: Pages 32 - 36 were inadvertently numbered incorrectly. The continuity of the text is correct.

Appendix A: BASE YEAR UNCERTAINTY ESTIMATES
    A.1 BEA DATA
        A.1.1 INTRODUCTION
            A.1.1.1 Sources of Error
            A.1.1.2 Effects of Scale
        A.1.2 METHOD
            A.1.2.1 Quantities Estimated
            A.1.2.2 Degree of Detail
            A.1.2.3 Interview Techniques
            A.1.2.4 Bias
            A.1.2.5 Effect of Numerical Magnitude
        A.1.3 UNCERTAINTY ESTIMATES
            A.1.3.1 Direct Allocations
            A.1.3.2 Gross Domestic Output
            A.1.3.3 Transfers
            A.1.3.4 Margins
        A.1.4 CONCLUSIONS, APPLICATIONS, AND LIMITATIONS
    A.2 DIRECT ENERGY ALLOCATIONS
    A.3 DISPERSION FACTORS FOR SMALL-MAGNITUDE FIGURES
    A.4 PROBABLE VALUES OF "ZERO" ELEMENTS
Appendix B: SECTOR DEFINITIONS
Appendix C: TREATMENT OF ZERO VALUES
    C.1 VARIANCE OF A "FOLDED" NORMAL DISTRIBUTION
    C.2 VARIANCE OF A LOGNORMAL DISTRIBUTION
Appendix D: RANDOM NUMBER GENERATOR VERIFICATION
    D.1 INTRODUCTION
    D.2 INDEPENDENCE BETWEEN MONTE CARLO SAMPLES
    D.3 INDEPENDENCE BETWEEN INPUTS TO A MONTE CARLO SAMPLE
    D.4 TESTING FOR NORMALITY
REFERENCES

1.0 INTRODUCTION

This report describes a stochastic parametric sensitivity analysis of a detailed structural model of the U.S. economic system. Model parameters defining the structure of the system at the time of measurement were derived from physical observations of the system. Use of such models is becoming increasingly prevalent for mid- to long-range studies and policy analyses in government planning at all levels. Resource scarcity, foreign policy contingencies and other factors have made rapid structural change the object of analysis, not something one can assume away. Effective use of such models requires an understanding of the effects of parametric change and uncertainty.

We are concerned here with a linear static input-output model of the U.S. economy. Its parameters are derived from data on interindustry transactions compiled by the U.S. Department of Commerce.
Due to the size and complexity of the economic system, funding limitations and measurement lags, these parameters are seven years out of date when published. Parametric uncertainty therefore can arise from two sources: observation of the system during the base year and structural changes during the seven year lag period. Estimates of uncertainty in the base year parameters were compiled by Bullard (1976) and are discussed in Appendix A.

The effect of parametric uncertainty on model outputs has been discussed by Sebald (1974) and Bullard and Sebald (1975). These papers quantified the maximum error tolerances that would result from the worst-case distribution of parametric errors. For this model, it was found that the process of matrix inversion could magnify input errors by more than [...], emphasizing the need for developing a methodology that could quantify the extent to which parametric errors cancel one another.

The Monte Carlo simulation analysis described here was designed to answer that question. Base year interindustry transactions were characterized as random variables and the model parameters were derived from them. The results from each simulation were used to update a set of sufficient statistics to yield unbiased estimates of means, variances and some covariances. The simulations were performed to evaluate both the effect of doubling error tolerances on inputs and the effect of changing the structure of the model to enhance its usefulness for predictive work.

After 200 simulations, the preliminary results were analyzed in order to determine the cost-effectiveness of proceeding with more simulations. In all cases, this a priori determination of the confidence intervals on final results showed that 1000 runs would be adequate. These estimates were then verified when the simulation had been completed.

Chapter 2 describes the preparation of the data base and estimation of uncertainty on base year transactions.
Chapter 3 details the simulation methodology, the criteria for determining "acceptability" of simulated parameters and derivation of the stopping rule. Chapter 4 presents the results of all simulations, and discusses the effects of aggregation, magnitude of input uncertainty and other variables.

2.0 DATA BASE PREPARATION

2.1 The Model

The linear static input-output model of the U.S. economic system is described in detail by Bullard and Herendeen (1975). It is based on the theory developed by Leontief (1941), and relies largely on data assembled by the U.S. Department of Commerce, Bureau of Economic Analysis (BEA). Data are expressed in constant dollars, which act as a surrogate for physical units. In this particular model, however, the inputs of energy to all sectors are expressed in physical units, to take account of the fact that energy is sold to different users at different prices.

The governing equation of the model is

    $(I - A)\,X = Y$    (2.1-1)

where $X$ is an N-order vector of gross domestic outputs for each sector, $Y$ is the vector of total final demands for the output of each sector, and $A$ is the matrix of parameters describing the technology of producing goods and services during the base year. A typical element $A_{ij}$ represents the amount of input from sector $i$ required directly by sector $j$ to produce one unit of its output. These parameters are derived from base year observations of interindustry transactions, $T_{ij}$ (amount of output from sector $i$ sold directly to sector $j$):

    $A_{ij} \equiv T_{ij} / X_j$    (2.1-2)

In turn, these interindustry transactions are defined as the sum

    $T_{ij} = DA_{ij} + MDT_{ij} + TF_{ij}$    (2.1-3)

where $DA_{ij}$ is the amount of product $i$ sold directly to sector $j$, $MDT_{ij}$ represents the transportation or trade margin $i$ on all inputs to sector $j$, and $TF_{ij}$ represents the amount of product $j$ produced as a secondary output by sector $i$.
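Eqs. (2.1-1) and (2.1-2) can be illustrated with a small numerical sketch. The two-sector transactions table below is hypothetical and chosen only so the arithmetic is easy to follow; the helper names `technical_coefficients` and `solve_leontief` are illustrative and are not taken from the original computer programs.

```python
def technical_coefficients(T, X):
    """Eq. (2.1-2): A[i][j] = T[i][j] / X[j]."""
    n = len(X)
    return [[T[i][j] / X[j] for j in range(n)] for i in range(n)]


def solve_leontief(A, Y):
    """Solve (I - A) X = Y, eq. (2.1-1), by Gaussian elimination
    with partial pivoting (adequate for a small illustrative system)."""
    n = len(Y)
    # Augmented matrix [I - A | Y].
    M = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] + [Y[i]]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    X = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * X[j] for j in range(i + 1, n))
        X[i] = (M[i][n] - s) / M[i][i]
    return X


# Hypothetical two-sector data (illustrative numbers only).
T = [[20.0, 30.0],
     [40.0, 10.0]]          # interindustry transactions T[i][j]
X = [100.0, 200.0]          # gross domestic outputs
A = technical_coefficients(T, X)
# Final demand implied by the row accounting identity Y_i = X_i - sum_j T_ij.
Y = [X[i] - sum(T[i]) for i in range(len(X))]
X_check = solve_leontief(A, Y)   # should reproduce X
```

Solving the system with the implied final demand recovers the original outputs, which is a convenient consistency check on any constructed data set.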
2.2 The Data

Estimates of all elements of the above matrices are collected and assembled by BEA at the 484 sector level of detail. Before publication, however, they are aggregated to about 360 sectors. BEA personnel responsible for this compilation were interviewed; their subjective estimates of uncertainty on all base year transactions are given in Appendix A.

Before proceeding with the Monte Carlo simulation, these data were aggregated to 90 and then to 30 sectors. (Names of sectors at each level of aggregation are given in Appendix B.) The 30 sector data base was used for the development and verification of all the computer programs. The 90 sector data base was used for the main simulation. This degree of aggregation was chosen for economic reasons (matrix inversion is an expensive $N^3$ operation) and because it corresponds most closely to the most widely distributed and used version of the BEA input-output tables. These are published at the 83 sector level of detail, while the 90 sector model used here retains more detail in the transportation and energy sectors of the economy.

Aggregating an input-output data base is a nontrivial operation since it must be done prior to the operations in (2.1-3). After aggregating the three matrices independently and summing to obtain T, X and A are computed using eqs. (2.1-1) and (2.1-2).

A 101 sector data base was also constructed by replacing the 5 energy sectors in the 90 order model by 16 energy supply and service sectors. The rationale and development of the 101 sector model are described by Bullard and Sebald (1975). Its purpose is to more accurately predict energy consumption in future years by explicitly modeling fuel substitutability. In effect, this model recognizes that end uses of energy (space heating, lighting, air conditioning, etc.) are less substitutable than the fuels themselves by permitting the non-energy sectors to purchase only end uses of energy, while fuels are sold only to the end use sectors.
Note that the former are very stable over time while the latter are not. The most variable coefficients or parameters involved in energy consumption are thereby confined to the few representing sales of fuels to the end use sectors, which can be estimated independently using models designed expressly for that purpose.

3.0 METHODOLOGY

3.1 Point of View for Stochastic Error Analysis

There are several ways to interpret this problem, and the point of view affects both the methodology and the interpretation of results. One way is to act as a simulator of BEA's activities from data collection through matrix inversion. In an alternative viewpoint, the analyst attempts an a priori determination of the effect of mathematical transformations on uncertain observations. In either case, this information enables the analyst to assess the usefulness of the data for modeling purposes. We have adopted the latter point of view.

Within this framework, the analyst receives signals from the economic system associated with each interindustry transaction as well as total output and value added. Actually, each of these signals from an industry is the sum of many signals from individual establishments. The signals appear to be independent; that is to say, the signals tell us little about their correlation.* The analyst's only information on these correlations comes from accounting identities requiring income to equal outgo.

Each signal is characterized by BEA in terms of upper and lower bounds and a "published value" representing their estimates of where the true value is most likely to lie. We then characterize BEA's knowledge of the transactions as random variables. The distributions are inputs for the Monte Carlo analysis which transforms them into a set of numbers comprising the solution set.
Each element of the solution set is characterized by a set of statistics** which are then compared with the deterministic result.

* Due to the size and complexity of the economic system, frequent measurement is economically prohibitive so no information is available from time series analysis.
** Mean, variance

Each input variable is first sampled independently, but some effort is then made to assure that the external balance conditions are satisfied. This is what BEA does in their deterministic approach, and we make a similar attempt with our Monte Carlo approach. It is unrealistic, however, to completely simulate BEA's activities, many of which are judgmental, undocumented, and not reproducible. The specific shortcuts taken are detailed in section 3.4.

3.2 Sampling Random Variables

All of the basic data (transactions, industry output, final demands) are characterized as random variables having either normal or lognormal distributions.* As discussed in Appendix A, Section 4, entries which have been truncated to zero by BEA are modeled with a "folded normal" random variable, which is simply the absolute value of a normal random variable with mean 0. Non-zero cells are modeled using either normal or lognormal random variables, with the former used in those cases where the published value is relatively accurate. In situations where the data is less well known, an analyst will tend to use a multiplicative factor to bound his estimate rather than an additive error bound. A lognormal distribution is appropriate in such a case because of its property of multiplicative symmetry about the median. That is, if $X_0$ is the median of a lognormal random variable $X$, then $\mathrm{Prob}(X > X_0 D) = \mathrm{Prob}(X < X_0/D)$ for any factor $D$. For example, if an analyst states that his estimate has probability $\alpha$ of being correct within a factor of $D$, then a lognormal random variable with $\alpha = \mathrm{Prob}(X_0/D < X \le X_0 D)$ will be used to model the situation.

* In a few cases a negative entry in the data is modeled by the negative of a lognormal random variable (which necessarily takes only positive values). This set of circumstances is handled so much like the usual lognormal case that it is not discussed separately in what follows.

This section outlines a procedure for sampling from random variables such that

1) The sample will be drawn from a folded normal, normal or lognormal population.
2) The distributions will be truncated to prevent samples that are absurd (e.g., negative transactions). Truncation eliminates samples in the upper and lower 0.15% tails in the normal and lognormal cases and in the upper 0.3% tail in the folded normal case. This corresponds to the percentage of probability outside 3 standard deviations from the mean in a normal population.
3) The expected value of the sampled result is equal to the published value, $M$, of the entry in question (except in the folded normal case where the published value is zero).
4) Before truncation, the random variable $X$ from which we sample has a confidence interval defined by a parameter $b$, $\delta$ or $D$.
   a. Folded Normal Case: $\mathrm{Prob}(X < b) = .997$ (i.e., $b$ amounts to 3 standard deviations of the underlying normal random variable).
   b. Normal Case: $\mathrm{Prob}(\mu_X - \delta\mu_X \le X \le \mu_X + \delta\mu_X) = .997$ (i.e., $\delta$ amounts to 3 standard deviations of $X$ expressed as a fraction of the mean, $\mu_X = M$).
   c. Lognormal Case: $\mathrm{Prob}(X_0/D < X < X_0 D) = .997$

In all three cases the sampling procedure is based on a standard normal random variable (i.e., mean = 0 and variance = 1, denoted $N(0,1)$).*

* The standard normal random number generator used was the International Mathematical Statistical Library routine GGNRF. Tests of randomness and normality were performed for verification purposes and are described in Appendix D.

Truncation is achieved by sampling until a value $r$ is obtained which is less than 3 in absolute value.
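The sampling conditions listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's `random.Random.gauss` stands in for the GGNRF routine, the function names are hypothetical, and the parameter settings ($\sigma = b/3$; $\mu = M$, $\sigma = \delta M/3$; $\sigma = \ln D/3$, $\mu = \ln M - \sigma^2/2$) are the ones implied by conditions 3) and 4) and derived in the paragraphs that follow.

```python
import math
import random


def truncated_std_normal(rng):
    """Draw r from N(0,1), rejecting values with |r| >= 3."""
    while True:
        r = rng.gauss(0.0, 1.0)
        if abs(r) < 3.0:
            return r


def sample_folded_normal(b, rng):
    """Entry published as zero: |N(0, sigma^2)| with sigma = b/3."""
    sigma = b / 3.0
    return abs(sigma * truncated_std_normal(rng))


def sample_normal(M, delta, rng):
    """Well-known entry: mean M, 3 sigma equal to delta * M."""
    sigma = delta * M / 3.0
    return M + sigma * truncated_std_normal(rng)


def sample_lognormal(M, D, rng):
    """Poorly known entry: factor-of-D bounds about the median,
    with mu shifted so that the untruncated mean equals M."""
    sigma = math.log(D) / 3.0
    mu = math.log(M) - sigma ** 2 / 2.0
    return math.exp(mu + sigma * truncated_std_normal(rng))


rng = random.Random(1976)   # arbitrary seed, for repeatability only
draws = [sample_lognormal(100.0, 2.0, rng) for _ in range(20000)]
mean_draw = sum(draws) / len(draws)   # close to the published value 100
zero_cell = sample_folded_normal(3.0, rng)
```

Because of the rejection step every lognormal draw lies within a factor of $D$ of the median $M e^{-\sigma^2/2}$, and the sample mean stays very close to $M$; truncation removes only about 0.3% of the probability.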
In the folded normal case we set $\mu = 0$ and $\sigma = b/3$ so that $\mu + r\sigma$ is a sample from a truncated $N(\mu, \sigma^2)$ variable; the absolute value then satisfies the conditions for the folded normal sample.

In the normal case we set $\mu = M$, $\sigma = \delta M/3$ and then $\mu + r\sigma$ is used as the normal sample.

The situation for $X$ lognormal is slightly more complicated. In this case, $X = e^Y$ where $Y$ is a normal random variable. Let $\mu$ and $\sigma$ be the mean and standard deviation of $Y$. Then the median $X_0$ of $X$ is equal to $e^\mu$ so $\mu = \ln X_0$. Therefore, $\mathrm{Prob}(X_0/D < X \le X_0 D) = .997$ implies that

    $\mathrm{Prob}(\ln X_0 - \ln D \le Y \le \ln X_0 + \ln D) = \mathrm{Prob}(-\ln D \le Y - \mu \le \ln D) = .997$

so that $\ln D = 3\sigma$. Since we want the mean value to equal the published value, $\mu_X = e^{\mu + \sigma^2/2} = M$, we must set

    $\mu = \ln M - \sigma^2/2 = \ln M - (\ln D)^2/18$

To summarize, we sample for $X$ in the lognormal case by obtaining a truncated standard normal random number, $r$, and setting $X = e^{\mu + r\sigma}$ where $\sigma = (\ln D)/3$ and $\mu = \ln M - (\ln D)^2/18$. Comparing the three cases we have:

         X Folded Normal                X Normal                       X Lognormal
    mu   $0$                            $M$                            $\ln M - (\ln D)^2/18$
    sigma $b/3$                         $\delta M/3$                   $(\ln D)/3$
    X    truncated ABS($N(\mu,\sigma^2)$)   truncated $N(\mu,\sigma^2)$    truncated $e^{N(\mu,\sigma^2)}$

In the lognormal case the mean is not coincident with the median. To evaluate the error resulting from assuming they are equal, suppose an analyst gives a confidence interval for the true value $T$, in terms of his estimate $M$ and a factor $D$. That is, $\mathrm{Prob}(M/D < T \le MD) = .997$ where .997 is just the probability spanned by three standard deviations about the mean in a normal distribution. We have modeled this situation with a random variable $X$ with $\mu_X = M$ and

    $\mathrm{Prob}(X_0/D \le X < D X_0) = .997$

We want to show that $\mathrm{Prob}(M/D < X \le DM)$ is close to .997. In fact,

    $\mathrm{Prob}(M/D < X < DM) = \mathrm{Prob}(\ln M - \ln D < Y < \ln D + \ln M) = \mathrm{Prob}(\mu + \sigma^2/2 - 3\sigma < Y < \mu + \sigma^2/2 + 3\sigma)$ [...]

[...] Here we use a lognormal random variable specified by the parameter $D$ chosen to be consistent with $V$:

    $D = \exp\left(3\sqrt{\ln(1 + V/M^2)}\right)$ *

3.3 Constructing the Transactions Matrix

Fig.
3-1 shows graphically the relationship between the matrices of transactions (T), final demand (FD), imports (M) and gross domestic outputs (GDO):

    $\sum_{j=1}^{N} T_{ij} + \sum_{k=1}^{10} FD_{ik} - M_i = GDO_i$    (3.3-1)

* See Appendix C for details.

[Figure 3-1: relationship between the transactions (T), final demand (FD), imports (M) and gross domestic output (GDO) matrices]

These random variables are sampled from normal or lognormal distributions as described above. Each element in the first row ($i = 1$) is sampled first independently, just as BEA analysts receive these values from apparently independent sources. Since eq. (3.3-1) is an external balance condition that is not satisfied in general, we force this condition to be satisfied in much the same manner as BEA does. The lognormally distributed variables in the row are generally those obtained from unreliable sources or computed using surrogate variables. Therefore these values are scaled proportionately* to satisfy eq. (3.3-1).

Proceeding in this manner through N rows, a complete data set is constructed satisfying row constraints. The rows are not independent, however, because the value of all outputs (GDO) of a sector must equal the value of all commodity inputs (from the other N sectors) plus "value added" (a term, VA, accounting for wages, taxes, and profit). VA is measured independently by federal agencies and provides BEA analysts with another external condition to satisfy. Their method for satisfying this was too complex to model,** so a simpler check had to be devised for this Monte Carlo study. The method employed is based on the response of the BEA's director of the I-O study to the following question: "If the criterion for terminating the iterative process of balancing the I-O table were based on uncertainty of the VA values, how much could be tolerated?" The answer indicated that out of 90 sectors, at least 88 must be within ±20% of the "true" value.*** If the condition was not met, the matrix was rejected. This condition was never violated in the actual simulation.
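The row balancing step and the value-added acceptance check described above can be sketched as follows. This is a simplified illustration, not the original program: the function names are hypothetical, final demand and (negative) import entries are assumed to be folded into the same signed list as the transactions, and all lognormal entries in a row are assumed to share one common scale factor.

```python
def balance_row(entries, is_lognormal, gdo):
    """Scale the lognormally distributed entries of one row by a common
    factor so that the row total equals the sampled GDO (eq. 3.3-1).
    Normally distributed (well-known) entries are left unchanged."""
    fixed = sum(v for v, flag in zip(entries, is_lognormal) if not flag)
    adjustable = sum(v for v, flag in zip(entries, is_lognormal) if flag)
    if adjustable == 0.0:
        raise ValueError("row has no lognormal entries to absorb the imbalance")
    scale = (gdo - fixed) / adjustable
    return [v * scale if flag else v for v, flag in zip(entries, is_lognormal)]


def value_added_ok(va_simulated, va_reference, tol=0.20, max_outside=2):
    """Accept the matrix if at most `max_outside` sectors have a simulated
    VA differing from the reference VA by more than `tol` (88 of 90
    sectors within +/-20% corresponds to max_outside=2)."""
    outside = sum(1 for s, r in zip(va_simulated, va_reference)
                  if abs(s - r) > tol * abs(r))
    return outside <= max_outside


# Hypothetical sampled row: two well-known and two poorly known entries.
row = [30.0, 12.0, 8.0, 45.0]
flags = [False, True, True, False]
balanced = balance_row(row, flags, gdo=100.0)   # -> [30.0, 15.0, 10.0, 45.0]
```

A sampled GDO smaller than the sum of the well-known entries would give a negative scale factor; a production version would need to reject or resample such a row, a case the sketch does not handle.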
* In fact, BEA analysts actually estimate many of these uncertain values by computing the difference between GDO and the sum of the well known (normally distributed) variables and allocating proportional to some surrogate variables (e.g. employment).
** In the 1967 input-output study, consistency between row and column sums was assured by assigning responsibility for individual sectors to different analysts. After each independently estimated initial row values, the resulting columns were presented to each analyst for independent verification. After many iterations and some undocumented judgment decisions, the "published" values were agreed upon.
*** Philip M. Ritz (1976), Interindustry Economics Branch, Bureau of Economic Analysis, U.S. Department of Commerce, personal communication.

Next, the terms in eq. (3.3-1) are used to compute the coefficients

    $A_{ij} = T_{ij} / GDO_j$

and the Leontief inverse matrix $(I-A)^{-1}$ is finally calculated. Aside from checking the eigenvalues of $A$, there is no a priori check that can be performed to guarantee positivity of the inverse matrix.* Therefore, each inverse matrix is checked after it is computed to verify that every element is greater than zero. If it fails the test, all the randomly selected variables T, FD, M, GDO are discarded and a new set is selected. This is exactly the procedure employed by BEA. Again, the simulation was completed without this condition being violated.

3.4 Results Saved for Analysis

The simulation described here is expensive from a computational point of view since matrix inversion is an $N^3$ operation. For this reason, every simulated Leontief inverse matrix was saved on tape so it would be available for future analysis if necessary.** For purposes of this analysis, our attention was focused on the means, variances and confidence intervals for the elements of $(I-A)^{-1}$ and selected subsets and linear combinations thereof.
To calculate these, it was necessary to save a set of sufficient statistics on disk after each iteration: the running sum and the sum of the squares for each element of the following set of results, which we shall denote by $\Omega$:

1. The entire $(I-A)^{-1}$ matrix;
2. The total primary energy intensity vector, $e$; and
3. The sector output vector, $X$.

* If all variables were expressed in current-year dollars, some a priori tests are available. In the general case such as this one, where the energy sector outputs are expressed in physical units, no such tests exist.
** The tape will be delivered to EPRI under separate cover.

The total primary energy intensity vector is a linear combination of the energy rows of $(I-A)^{-1}$,* and a typical element $e_j$ represents the amount of basic energy resources required directly and indirectly to produce one unit of output from sector $j$ for final consumption. The sector outputs $X$ are computed from the simulated $(I-A)^{-1}$ matrix using the base year domestic final demands as weighting factors:

    $X_i = \sum_{j=1}^{N} (I-A)^{-1}_{ij} \left( \sum_{k=1}^{10} FD_{jk} - M_j \right)$    (3.4-1)

This is done because I-O models are frequently employed to estimate total sector outputs corresponding to a specified final bill of goods, and a significant amount of additional error cancellation may be achieved.

In order to ascertain the nature of the distribution of typical random variables, each simulated value was saved for some results. The variables saved were $X$, $e$, and the electricity sector row of $(I-A)^{-1}$. Goodness of fit tests performed on these variables are described in Section 4.

* The energy rows utilized are those corresponding to coal, crude oil and gas, and the fossil fuel equivalent of hydro and nuclear electricity: $e_j = (I-A)^{-1}_{\mathrm{coal},j} + (I-A)^{-1}_{\mathrm{crude},j} + 0.6\,(I-A)^{-1}_{\mathrm{elec},j}$.

Finally, since most applications of the particular models examined are in the area of energy policy analysis, it was decided to save sufficient statistics for recovering covariances of the energy sector rows of $(I-A)^{-1}$.
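The sufficient statistics described above (running sums, sums of squares, and sums of products for selected pairs) can be sketched as a small accumulator. The class name `RunningMoments` and its interface are hypothetical; the unbiased estimates use the standard textbook formulas and are not a transcription of the original program.

```python
class RunningMoments:
    """Accumulate sums, sums of squares and selected cross products so that
    unbiased means, variances and covariances can be recovered after any
    number of simulation runs without storing the runs themselves."""

    def __init__(self, size, pairs=()):
        self.n = 0
        self.s = [0.0] * size            # running sums
        self.ss = [0.0] * size           # running sums of squares
        self.sp = {p: 0.0 for p in pairs}  # running sums of products

    def update(self, values):
        self.n += 1
        for k, v in enumerate(values):
            self.s[k] += v
            self.ss[k] += v * v
        for (i, j) in self.sp:
            self.sp[(i, j)] += values[i] * values[j]

    def mean(self, k):
        return self.s[k] / self.n

    def variance(self, k):
        # Unbiased estimate: (sum x^2 - (sum x)^2 / n) / (n - 1).
        return (self.ss[k] - self.s[k] ** 2 / self.n) / (self.n - 1)

    def covariance(self, i, j):
        return (self.sp[(i, j)] - self.s[i] * self.s[j] / self.n) / (self.n - 1)


# Three hypothetical "runs" of a two-element result set.
stats = RunningMoments(size=2, pairs=[(0, 1)])
for run in ([1.0, 2.0], [2.0, 4.0], [3.0, 6.0]):
    stats.update(run)
```

The sum-of-squares form is compact but can lose precision when the variance is small relative to the mean; a modern implementation might prefer an incremental (Welford-style) update, which is a different technique from the one the text describes.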
Since all possible linear combinations were not of interest - only row and column combinations - storage requirements were considerably reduced. It was sufficient to save the running sums of products of all pairs of entries appearing together in such linear combinations. If other combinations are ever needed, they will be recoverable from the $(I-A)^{-1}$ matrices saved on an archive tape as described earlier.

With this set of results it is possible to estimate the total energy requirements to meet arbitrarily specified final demands, and to compute linear combinations of energy intensities similar to the "total primary" one described earlier.

3.5 Stopping Rule

One of the major difficulties associated with Monte Carlo simulation is knowing how many runs will be required to attain reasonable confidence intervals on the results of the simulation. There are two major problem areas. If one is considering whether or not to use Monte Carlo techniques, an estimate of the required number of runs is crucial to determination of simulation costs. It may be, for example, that reasonable confidence intervals may require a prohibitively expensive number of runs. The second problem arises after the decision has been made to use Monte Carlo methods. One needs to know when enough runs have been made.

In the first problem area, present practice dictates running several small scale simulations of a similar nature to the one of interest in order to be able to extrapolate the number of runs in the smaller cases to the probable runs needed in the larger. In the second area, good statistical practice dictates that before taking any samples, one must determine how to stop sampling in a way that doesn't bias results. Executing additional runs if the resulting confidence intervals are too large is considered unwise since one runs the risk of biasing the simulation results by stopping when the desired outcome occurs.
In this section we present a method for determining, based on a very small number of runs, the proper number of total runs the simulation should require. The method properly elucidates the cost/benefit tradeoff between the cost of additional runs and the benefits of increased accuracy. Information is displayed to the analyst in a way that facilitates his making a judgment on the proper number of runs to be made. Since this method is based on just the first few runs, biasing of the simulation is not a problem. Based on a very small number of runs, it is also a cost effective way to decide whether a Monte Carlo analysis is economically feasible.

In section 3.5.1 we outline the approach used, in section 3.5.2 a brief sketch of the mathematics involved is given, followed by an example in section 3.5.3. Mathematical derivations are given in section 3.5.4.

3.5.1 An Outline of the Approach

Suppose a relatively small number of simulation runs have been made and unbiased estimates of the second order statistics of all elements of a set of results, $\Omega$, have been calculated. Since the estimates are themselves random quantities, one can determine an interval about each estimate which contains the population value (e.g. mean or variance) with a certain probability. These intervals are called confidence intervals (CI) and we shall interest ourselves in intervals for which the corresponding probability is .95. Fig. 3.5-1 describes the situation within which one must interpret the results of a simulation. Generally, then, the simulation output is given in two ways: the unbiased estimates are tabulated and their confidence intervals (e.g. 95%) are also given.

Our strategy will be to make a few runs and then, based on the resulting estimates and CI's, determine how many runs the entire simulation will require. Two major effects occur with increasing sample size. First, the estimates $\hat\mu$ and $\hat\sigma$ will move around, ultimately converging to the correct values.
Second, the width of the CI's will decrease monotonically to zero as the number of runs goes to infinity. For purposes of the stopping rule, we have chosen to quantify the resulting simulation accuracy by monitoring a histogram of

    $B^+ \triangleq (3\hat\sigma + U)/\hat\mu$ *

for each stopping point considered. The algorithm therefore:

1) Draws a histogram of the actual $B^+$ values after an initial number of runs, $m_1$.
2) On the basis of the information after $m_1$ runs, draws a histogram of predicted $B^+$ values after $m_2$ runs, where $m_2 > m_1$ denotes a possible stopping point for the full blown simulation.
3) Repeats Step 2 for various $m_2$.

Typical results are displayed in Fig. 3.5-2.

* The symbols are defined in Figure 3.5-1.

3.5.2 A Mathematical Overview

As before, we denote the matrix of simulation output variables by $\Omega$. Each $X \in \Omega$ has an unknown distribution which is at best only approximately normal.

[Figure 3.5-1: Mean and Variance Estimates and their Confidence Intervals. $\hat\mu$ = unbiased estimate of the mean of the underlying distribution; $a$ and $b$ are the upper and lower confidence interval lengths for $\hat\mu$; $\hat\sigma$ = unbiased estimate of the standard deviation of the underlying distribution; $U$ and $L$ are the upper and lower confidence interval lengths for $3\hat\sigma$.]

[Figure 3.5-2: Stopping Rule Output Histograms for the 90 Order Simulation.]

We denote the set of samples of each $X \in \Omega$ by $\{x_j\}$. Although what follows could be done for the unknown density of $X \in \Omega$, this approach involves inaccuracies in the determination of the fourth central moment and is computationally quite expensive. Instead we have chosen to convert each $X \in \Omega$ to a normal random variable $Z$ by the transformation

    $z_i = \frac{1}{10} \sum_{j=10(i-1)+1}^{10i} x_j$    (3.5-1)

Since the samples $x_j$ are independent and identically distributed (iid), the central limit theorem implies that the $z_i$ are approximately $N(\mu_x, \sigma_x^2/10)$.
All statistical evaluations will be performed on $Z$ and the results will be back-transformed via (3.5-1) to $X$. Since the $Z$'s are normal, the unbiased estimates for their mean and variance are given by the well known relations [Winkler & Hayes (1970)]

    $\hat\mu = \frac{1}{n} \sum_{i=1}^{n} z_i$    (3.5-2)

    $\hat\sigma^2 = \frac{\sum_{i=1}^{n} (z_i - \hat\mu)^2}{n-1}$    (3.5-3)

where $n = m/10$ is the number of $Z$ sample points and $m$ is the total number of runs. We assume that even after the initial $m_1$ runs, the CI around $\hat\mu$ is very small. This has empirically been verified as a valid assumption and it permits us to evaluate $B^+$ for the $Z$ variables by only worrying about the upper CI on $\hat\sigma$, which we denote by $\hat\sigma_u$. Again due to the normality of the $Z$'s, $\hat\sigma_u^2$ is well known to be [Winkler & Hayes (1970)]

    $\hat\sigma_u^2 = \frac{(n-1)\,\hat\sigma^2}{\chi^2(.975;\, n-1)}$    (3.5-4)

where $\chi^2(.975;\, n-1)$ denotes the value in a chi-square distribution with $n-1$ degrees of freedom cutting off the upper .975 of sample values. $B^+$ for the $Z$ variables is then given by

    $B^+_n = \frac{3\hat\sigma_{u,n}}{\hat\mu_n} = \frac{3\hat\sigma_n}{\hat\mu_n} \sqrt{\frac{n-1}{\chi^2(.975;\, n-1)}}$    (3.5-5)

where the subscript $n$ denotes the number of $Z$ sample points in the simulation.

To evaluate $B^+_{n_2}$ we shall require $\hat\mu_{n_2}$ and $\hat\sigma_{n_2}$. Even with relatively small $n_1$,* $\hat\mu_{n_2} \approx \hat\mu_{n_1}$ since the CI even at $n_1$ is very short. Such is not the case with $\hat\sigma_{n_2}$. We can, however, upper bound $\hat\sigma_{n_2}$ if $\hat\sigma_{n_1}$ is known by noting that**

    $\frac{[(n_2 - 1)\,Q - (n_1 - 1)]\; n_1}{(n_2 - n_1)(n_1 - 1)}$

has an F distribution with $n_2 - n_1$ and $n_1$ degrees of freedom, where $Q \triangleq \hat\sigma^2_{n_2} / \hat\sigma^2_{n_1}$. We can then determine $K$ such that

    $P\{Q \le K\} = .50$    (3.5-6)

and then use

    $\hat\sigma_{n_2} = \sqrt{K}\,\hat\sigma_{n_1}$    (3.5-7)

in (3.5-5). It is clear from (3.5-5) and (3.5-7) that for $n_1$ and $n_2$ fixed, $B^+_{n_2}$ is linearly related to $B^+_{n_1}$ for each $Z$. The histogram of Figure 3.5-2 can then be generated by evaluating the actual histogram for $B^+$ after $n_1$ runs, fixing $n_2$, evaluating the constant multiplier, $C$, and multiplying each point in the histogram by $C$.

* $n_i = m_i/10$
** This is proved in section 3.5.4.
In particular,

    $B^+_{n_2} = C\, B^+_{n_1}$

where

    $C = \sqrt{\frac{K\,(n_2 - 1)}{\chi^2(.975;\, n_2 - 1)}} \Bigg/ \sqrt{\frac{n_1 - 1}{\chi^2(.975;\, n_1 - 1)}}$

and $K$ is given by (3.5-6).

The choice of the probability .5 in (3.5-6) is not arbitrary. We are interested in predicting the histogram of $B^+_{n_2}$ from the histogram of $B^+_{n_1}$. For each $Z \in \Omega$, define an indicator function $I_Z$ as follows:

    $I_Z = 1$ if $Q_Z > K$, and $I_Z = 0$ otherwise.

The number of $Z$'s for which $Q > K$ is then equal to

    $N \triangleq \sum_{Z \in \Omega} I_Z$

Provided .50 is used in (3.5-6), the expected value of $N$ is

    $E\{N\} = \sum_{Z \in \Omega} E\{I_Z\} = \tfrac{1}{2}\,|\Omega|$

where $|\Omega|$ denotes the number of elements in the set $\Omega$. The histogram of $B^+_{n_2}$ can be predicted from that for $B^+_{n_1}$ if the behavior around the decile points can be quantified. The fact that $E\{N\} = \tfrac{1}{2}|\Omega|$ indicates that the set of points around each decile in the histogram of $B^+_{n_1}$ should behave in the following way. The deciles of $B^+_{n_2}$ are approximately $C$ times the deciles of $B^+_{n_1}$ since for large $|\Omega|$ roughly half of the points around each $B^+_{n_1}$ decile is expected to change by more than a factor of $C$ with the completion of $n_2$ runs. The other half is expected to change by less than a factor of $C$. Provided some degree of independence exists among the entries, the decile at $n_2$ should therefore be approximately $C$ times the decile at $n_1$. The final step simply uses (3.5-1) to convert the $B^+_{n_2}$ histogram of values for the $Z$ variables in $\Omega$ back to a histogram for the associated $X$ variables. This just amounts to multiplying the decile values of $B^+_{n_2}$ by $\sqrt{10}$ since $\sqrt{10}\,\sigma_z = \sigma_x$ and $\mu_z = \mu_x$.

3.5.3 An Example

As an example, we shall discuss the actual 90 order stopping rule results. After 200 runs, the histograms of Figure 3.5-2 were generated.* As discussed above, $B = 3\sigma/\mu$ was chosen as the ordinate since it is a useful measure of the variability of each element in the result set. This figure predicts the variability of results to be expected after different possible stopping points.
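The quantity $B^+$ monitored by the stopping rule can be sketched for a single result element as follows. The sketch groups runs into means of 10 as in (3.5-1), applies the chi-square upper bound on $\hat\sigma$, and back-transforms by $\sqrt{10}$. Two points are assumptions rather than part of the original method: the chi-square value is obtained from the Wilson-Hilferty approximation (used here only to stay within the Python standard library), and the function names and input data are hypothetical. The multiplier $C$ is not computed because it requires the F-distribution quantile $K$ of (3.5-6).

```python
import math
from statistics import NormalDist, mean, variance


def batch_means(x, size=10):
    """Eq. (3.5-1): average consecutive groups of `size` runs."""
    n = len(x) // size
    return [sum(x[i * size:(i + 1) * size]) / size for i in range(n)]


def chi2_quantile_wh(p, dof):
    """Wilson-Hilferty approximation to the chi-square quantile with
    lower-tail probability p (an assumption of this sketch)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * math.sqrt(c)) ** 3


def b_plus(x, size=10):
    """B+ for one result element, expressed for the X variable."""
    z = batch_means(x, size)
    n = len(z)
    mu_hat = mean(z)
    var_hat = variance(z)                 # unbiased, divisor n - 1
    # Value cutting off the upper .975 of the distribution, i.e. the
    # lower 2.5% point, which gives the upper confidence limit on sigma.
    chi2_low = chi2_quantile_wh(0.025, n - 1)
    sigma_u = math.sqrt((n - 1) * var_hat / chi2_low)
    return math.sqrt(size) * 3.0 * sigma_u / mu_hat


# Hypothetical sequence of 200 simulated values of one result element.
x = [100.0 + ((37 * i) % 11) - 5 for i in range(200)]
result = b_plus(x)
```

Evaluating `b_plus` for every element of the result set after the initial runs yields the "actual" histogram of step 1; the predicted histograms of step 2 would then follow by scaling with $C$.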
The diminishing returns from increasing the number of runs are evident from a comparison of the marginal benefit of increasing from 200 to 600 runs with that obtained by increasing from 1000 to 2000 runs. This format permits the analyst to decide at which point the marginal benefit no longer justifies the increased cost. Clearly, this requires a nontrivial judgement by the analyst. Based on the relatively high cost per iteration in this simulation, 1000 was chosen as the stopping point.

In an attempt to quantify the accuracy of this stopping rule, comparisons of predicted and actual histograms were made at the 90 order. Results are given in Table 3.5-1. Similar tests were done at the 30 order where actuals were compared with predictions based on only 100 runs, and very small errors were observed.

* Although similar histograms for the total primary vector, e, and GDO were used in the determination of the proper stopping point, for purposes of this example, we shall concentrate on figure 3-5-1.

[Table 3.5-1. Comparison of predicted and actual histograms at the 90 order; contents not recoverable from the scan.]

3.5.4 Mathematical Derivations

We first demonstrate in this appendix that

$$\frac{(n_2 - n_1 - 1)\,n_1}{(n_2 - n_1)(n_1 - 1)}\,Q$$

has an $F(n_2 - n_1;\, n_1)$ distribution, where $Q \triangleq \hat{\sigma}_{n_2}^2 / \hat{\sigma}_{n_1}^2$, and then outline the method for calculating the K of (3.5-6). Consider the event $R \triangleq \{Q \le K\}$, where

$$\hat{\sigma}_n^2 = \frac{\sum_{i=1}^{n} y_i^2}{n-1} \quad \text{and} \quad y_i = z_i - \mu_z .$$

[The rest of the derivation is not recoverable from the scan. The text resumes in Appendix A, Sec. A.1.3.1.]

... 10 as discussed in Sec. A.3.

* Based primarily on interviews with Jerry Schluter, U.S. Department of Agriculture.
* Based primarily on interviews with Roy Seaton, Bureau of Economic Analysis.

Defense purchases are generally better known, due to more complete source data. Inputs from manufacturing sectors are assigned δ = .10 if they exceed $50 million. Transportation inputs were derived from outdated formulae that applied poorly to the Southeast Asia situation in 1967 and are assigned δ = .50. Other non-manufacturing inputs were assigned δ = .10 if they were above the $50 million threshold.

Non-defense purchases of inputs from non-manufacturing sectors were less well known, and were assigned D = 3 if they exceeded one percent of total inputs and D = 3 → 10 if they were smaller. Manufacturing inputs below the $50 million threshold were treated the same. Transportation inputs were assigned δ = .30.

State and local government purchases.* For health, welfare, education, and sanitation purchases, new construction and real estate inputs are assigned δ = .05 since they are obtained from census sources. Together with wages, these inputs account for nearly 75% of all inputs. Other inputs are assigned δ = .25 if they exceed 1% of total inputs, and D = 1.5 → 10 as per Sec. A.3 if they are equal to or smaller than 1%.

For public safety purchases, new construction and real estate are assigned δ = .05. Maintenance construction is known poorly; D = 1.5. Manufactured inputs greater than $2 million are assigned D = 1.5, and smaller inputs D = 1.5 → 10. Non-manufactured inputs are assigned D = 1.5 for those greater than $10 million, and D = 1.5 → 10 for the smaller ones.

Other state and local government purchases are also assigned δ = .05 for new construction and real estate, but also δ = .05 for maintenance construction since it is primarily highway maintenance, which is a Census number.

* Based primarily on interviews with John Wealty, Bureau of Economic Analysis.
Manufactured inputs greater than $5 million are assigned D = 1.5, and smaller figures D = 1.5 → 10 as per Sec. A.3. Non-manufactured inputs greater than $50 million are assigned D = 2, and D = 2 → 10 for smaller inputs.

Imports and exports.* Trade data for commodities (BEA sectors 1.00 - 64.00) are obtained from Census sources and are assigned δ = .05. Transportation and wholesale and retail trade data, including margins, were assigned δ = .25. Data on other items (services, etc.) involved in international trade were assigned D = 2, since they were obtained from balance of payments sample data. Small entries at the 368-sector level of detail, representing less than 1% of gross imports or exports, were assigned D = 2 → 10 as per Sec. A.3.

Inventory change. These figures are in general the least accurate of all final demand entries, and were assigned δ = .20 for manufactured goods and δ = .40 elsewhere.

"All other" direct allocations. Within the scope of this study it was impossible to identify those responsible for most entries in the input-output tables. Having taken care of most entries through the interviews described above, the remainder were handled as a group. The algorithm was designed to assign very tight tolerances to any transaction comprising a high percentage of total outputs or inputs, and to any sector's output which "by definition" had to be assigned to a particular cell. For example, the algorithm had to assign a very tight tolerance to sales from new residential construction to gross private capital formation, so it would be compatible with the tolerance assigned to that sector's gross domestic output. There are numerous other instances where census data might identify sales of butter to food processors or bakers, and the remainder is attributed to personal consumption expenditures.

* Based primarily on interviews with Robert Mangen, Bureau of Economic Analysis.
On the other hand, very small-magnitude transactions were assigned high uncertainty for the reasons discussed earlier.

The algorithm defined two fractions for each direct allocation: an input fraction, by normalizing with respect to the gross domestic output of the consuming sector; and an output fraction, by normalizing with respect to the gross domestic output of the producing sector. The algorithm applies the following tests in order and assigns δ or D as soon as a condition is satisfied: if both fractions exceed .95, then δ = .01; if only one exceeds .95, then δ = .02; if both exceed .80, δ = .05; if only one exceeds .80, δ = .10; if either fraction exceeds .05, then δ = .20; if either exceeds .01, then D = 1.5. If both are smaller than .01 it assigns D = 2 → 10 as per Sec. A.3.

A.1.3.2 Gross Domestic Output*

These figures are the best known because they are from the Census or other equally reliable sources (e.g., IRS) and are assigned δ = .01. The largest errors here probably stem from classification problems and possible confusion between company and establishment-based data.

A.1.3.3 Transfers**

If both the row and column sectors were manufacturing sectors, the source of this data was the Census Bureau, but the accuracy was less than that of direct allocations; assign δ = .20. All other transfers were assigned upper and lower bounds in the same manner as the corresponding cell in the direct allocations matrix.

* Based primarily on interviews with Gene Roberts and Phil Ritz, Bureau of Economic Analysis, and with Kenneth Hanson, Census of Manufactures Industry Division.
** Based primarily on interviews with Kenneth Hanson, Census of Manufactures Industry Division.

A.1.3.4 Margins

Transportation margins, by product type and mode, are obtained as totals and then prorated proportional to producers' prices across all purchasers of that commodity.
Then margins in each input are summed for each purchaser and added to the directly allocated inputs. For all transport modes, δ = .25 was assigned to the margins. Wholesale and retail trade margins may be expected to be more variable, and are sometimes computed as percentage markups over the already estimated transport margins. Therefore they are assigned δ = .35.

A.1.4 CONCLUSIONS, APPLICATIONS, AND LIMITATIONS

Earlier work using maximum-upper-bound analyses* had shown the dangers that might be encountered using results of input-output analyses. Therefore, these estimates of uncertainty on the actual data were needed to check the maximum error bounds on the particular results we were interested in using (e.g., elements of the energy sector rows of the 1967 Leontief inverse matrix). It soon became evident that the magnitude of the uncertainties in the parameter estimation process was such that our maximum upper bound analysis would yield unsatisfactory results.

* See for example the results presented by Bullard and Sebald (1975).

The above information is given to further illuminate the context in which these uncertainty estimates were made, and hopefully will discourage inappropriate applications of the results.

Finally, I repeat that the uncertainty estimates presented here are my own. I have listed many of the persons whom I interviewed, but they have not endorsed my interpretations of those interviews. If the absolute levels of the estimates are widely disputed (and I expect they will be), perhaps at least the relative levels will be accepted. On this basis we have performed stochastic error analyses on the 1967 U.S. input-output model for several cases, including doubling the error margins presented here, to determine the sensitivity of the results to systematic bias in the estimates.

A.2 DIRECT ENERGY ALLOCATIONS

Knecht (1975) estimated error tolerances on all physical-unit energy transactions.* These are coded in Table A.2-2, at the 90 sector level of detail, and in Tables A.2-3 and A.2-4 at the 101-sector level, and the codes are explained in Table A.2-1 below.

Table A.2-1
ENERGY TRANSACTION TOLERANCE CODES

Code                                   δ
00                                     (μ = 0 and 3σ = 10^11 Btu)
01, 02, 09, 13                         .05
04, 41, 16, 18, 19, 20, 24, 28         .10
03, 05, 06, 29, 30                     .15
07, 12, 14, 15, 17, 22, 23, 26, 27     .20
25                                     .25
08, 10, 11                             .30
10                                     .35

* Note that instead of the 368-sector level of aggregation, the results presented here are consistent with the slightly aggregated 357-sector breakdown described by Bullard & Herendeen (1975). Dummy sectors consuming no energy have been deleted and public and private sectors producing the same primary product have been combined.

TABLE A.2-2
TOLERANCE CODES FOR DIRECT ENERGY USE DATA (90 SECTOR)

Sector
Number  Sector Name
 1      COAL MINING
 2      CRUDE PETROLEUM, GAS
 3      REF'D PETROLEUM PROD'S
 4      ELECTRIC UTILITIES
 5      NATURAL GAS UTILITIES
 6      LIVESTOCK...PRODUCTS
 7      OTHER AGRIC'L PRODUCTS
 8      FORESTRY AND FISHERY..
 9      AG., FOR'Y...SERVICES
10      IRON...ORES MINING
11      NONFERROUS ORES MINING
12      STONE AND CLAY MINING.
13      CHEMICALS, ETC. MINING
14      NEW CONSTRUCTION
15      MAINT. AND REPAIR CON.
16      ORDNANCE AND ACCESSOR.
17      FOOD AND KINDRED PROD.
18      TOBACCO MANUFACTURING
19      FABRIC...THREAD MILLS
20      MISC. TEXTILE...FLOOR.
21      APPAREL
22      MISC. FAB. TEXTILE PRO
23      LUMBER...PROD'S, EXCEP
24      WOODEN CONTAINERS
25      HOUSEHOLD FURNITURE
26      OTHER FURNITURE AND
27      PAPER AND...EXCEPT...
28      PAPERB'D CONTAINERS
29      PRINTING AND PUBLISH'G
30      CHEMICALS AND...PROD'S
31      PLASTICS AND MATER'S
32      DRUGS,...PREPARATIONS
33      PAINTS AND PRODUCTS
34      PAVING MIXTURES AND...
35      ASPHALT FELTS AND COAT
36      RUBBER AND...PRODUCTS
37      LEATHER TANNING AND...
38      FOOTWEAR AND...PROD'S
39      GLASS AND GLASS PROD'S
40      STONE AND CLAY PROD'S
41      PRIM. IRON AND STEEL..
42      PRIM. NONFERROUS METAL
43      METAL CONTAINERS
44      HEAT., PLUMB...PROD'S
45      SCREW MACH. PROD'S,...
46      OTHER FAB. METAL PROD.
47      ENGINES AND TURBINES
48      FARM MACHINERY
49      CONSTRUCTION,...EQUIP.
50      MAT. HANDLING EQUIP.
Coal Energy Supplies Crude Oil Electric Gas 02 00 02 02 02 11 02 02 02 02 01 01 01 01 01 01 00 01 01 01 11 01 11 11 01 11 00 08 08 11 11 00 08 08 11 11 00 11 11 11 11 00 11 11 11 11 00 02 02 02 11 00 02 02 02 02 00 02 02 02 11 00 02 02 02 03 00 11 11 11 03 00 11 11 11 03 00 05 03 03 02 00 04 02 02 02 00 04 02 02 02 00 04 02 02 03 00 05 11 03 03 00 05 03 03 11 00 11 11 11 03 00 05 03 03 11 uo 05 03 11 03 00 05 03 03 03 00 05 11 03 02 00 04 02 02 02 00 04 02 02 03 00 05 03 03 02 02 04 02 02 02 00 04 02 02 02 00 04 02 02 02 00 04 02 02 11 00 06 02 02 02 00 06 02 02 02 00 04 02 02 03 00 05 11 11 05 00 05 03 03 02 00 04 02 02 02 00 04 02 02 02 00 04 02 . 02 02 00 04 02 02 02 00 06 02 02 03 00 05 03 03 02 00 04 02 02 03 00 07 03 03 02 . 00 04 02 02 02 00 04 02 02 03 00 05 03 03 03 00 05 11 03 53 TABLE A. 2-2 (continued) Sector Number 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 Sector Name Energy Supplies METALS SPEC. GEN. I M A C M J N OFF., SERV. ELEC. HOUSt H ELEC. RADIO, ELEC. MISC. MOTOR A1RCRA OTHER PROFES OPICAL MISC. RA1LRO ...HIG MOTOR WATER AIR TR PIPE L TRANSP COM'. MS RADIO WATER WHOLES FINANC REAL E HOTELS BUSINE AUTO. AMUSEM MED./ FED. G STATE BUS. T OFFICE PERS. GROSS. NET IN NET EX FED. G FED. G STATE. STATE. STATF. STATE. 0RK1N INDUS ND'JST E ShO C'OMP' 1ND. TRANS OLD A LI'jHT TV, CO*!PO ELEC. VEHIC FT AN TRANS SIONA . ..EQ M A N U F ADS A HwAY FREIG TRANS ANSPO INE T ORTAI EXCE AND T AND S ALE A E AND STATE AND. SS SE REPAI ENTS ED. S OV'T AND L RAV.. SUPP CONSU ..CAP V E N T PORTS OV'T. OV'T. . .GOV . .HFA . .GOV . .GOV 'j . . . E TRY.. RIAL. P PRO G M M A C H 1 ...AP PPLIA •G... COM. NENTS ..SUP HLES. D PAR . E9IJ L...S UIP. ACTUR ND PASS . HT TR PORTA RTATI RANSP ON SE PT RA V 3R0 AN. S ND RE INSU AND ..EXC "VICE R AND QUIP'T .EQUIP . .EQ'T DUCTS mCHINE NES PARAT. NCES EQUIP. EQUIP. PLIES ..EQ'T TS IP*1ENT UPPLIE AND. .. 1NG SERV'S TRAN. ANS. .. 
T10N ON ORTA'N RVICES DID... ADCAST ERV 'S TAIL.. RANCE RENTAL EPT... S SERV. ERV'S AND.. ENTERPRISES OCAL...EN'S .AND GIFTS LIES MP'N EXPEN. . FORMATION RY CHANGE ..DEFENSE ..OTHER 'T...EDUC 'N LTH, SAN. 'T.. .SAFETY •T OTHER sal Crude Oil Electric Gas U2 00 04 02 02 03 00 05 11 03 02 00 04 02 02 03 00 05 03 03 11 00 04 02 02 03 00 05 11 03 03 00 05 03 03 03 00 05 11 03 03 00 05 03 03 03 00 05 03 03 02 00 04 02 02 03 00 05 11 11 02 00 04 02 02 02 00 04 02 02 0? 00 04 02 02 03 00 05 11 11 03 00 04 02 02 03 00 05 03 11 09 00 09 09 11 11 00 09 09 11 11 00 09 11 11 0? 00 09 11 11 11 00 09 11 11 11 00 11 11 09 11 00 00 11 11 11 00 11 11 11 11 00 11 11 11 11 00 11 11 11 11 00 11 11 11 11 00 11 12 11 no 00 11 11 11 11 00 11 11 51 11 00 11 12 11 n 00 11 12 11 11 00 11 12 11 11 00 11 11 11 11 00 11 12 11 11 00 11 12 11 00 00 00 00 00 03 00 00 00 00 11 00 01 12 01 CO 00 00 00 00 01 01 01 00 01 01 01 01 01 01 11 00 11 11 11 11 00 11 11 11 11 00 11 11 11 11 00 11 12 11 11 . 00 11 12 11 11 00 11 12 11 5U TABLE A. 2-3 TOLERANCE CODES FOR ENERGY END USES DATA (101 SECTOR) Misc. Misc. Elec. Sector i Feed- Mot. Ther. Water Space Air Pow. Number Sector Name Coke Stocks Pow. Users Heat Heat Cond • Uses 1 2 COAL MINING 00 14 14 16 15 00 00 18 CRUDE PETROLEUM/- GAS 00 14 14 16 15 00 00 18 3 GASIFIED COAL 00 21 '21 21 21 21 21 21 k RtF'D PETROLEUM PROD'S 00 13 14 19 15 14 1 7 20 5 NATURAL GAS UTILITIES 00 14 14 00 14 22 14 23 6 FOSSIL ELECTRIC UTIL'S 00 14 14 30 14 22 14 23 7 NUCLEAR ELEC. UTIL'S 00 14 14 00 14 22 14 23 8 RENEWA3LE ELEC. UTIL'S 00 14 14 00 14 22 14 23 9 ORt-REDUC. FEEDSTOCKS 00 00 00 DO 00 00 00 00 10 CHEMICAL FEEDSTOCKS 00 00 DO 30 00 00 00 00 11 MOTIVE POWER 00 00 00 00 00 00 00 00 12 MISC. THERMAL USES 00 00 00 30 DO 00 UO 00 13 WATER HEAT 00 00 00 30 30 00 00 00 Ik SPACE HEAT 00 00 00 30 00 00 00 00 15 AIR-CONDITIONING 00 00 00 00 00 00 00 00 16 MISC. ELEC. 
POWER USES 00 00 00 00 DO 00 00 00 17 LIVESTOCK PRODUCTS CO u 14 00 14 22 00 23 18 OTHER AGRIC'L PRODUCTS 00 13 13 30 14 22 00 23 19 FORTSTRY AND FISHERY.. 00 14 14 00 14 22 00 23 20 AG.*- FOR*Y. -.SERVICES 00 14 14 00 14 22 00 23 21 IRON. ..ORES MINING 00 n 14 16 15 00 00 18 22 NONFfcRROUS ORES MINING 00 14 14 16 15 00 00 18 23 STONE AND CLAY MINING. 00 14 14 16 15 00 00 18 2U CHEMICALS, ETC. MINING 00 14 14 16 15 00 00 18 25 NEW CONSTRUCTION 00 24 14 25 25 25 00 23 26 MAINT. AND REPAIR CON. 00 24 14 25 25 00 00 23 27 ORDNANCE AND ACCESSOR. 00 14 14 29 15 14 17 30 28 FOOD AND KINDRED PROD. 13 14 14 19 15 14 17 20 29 TOBACCO MANUFACTURING 00 14 14 19 15 14 17 20 30 FA3RIC THREAD MILLS 00 14 14 19 15 14 17 20 31 MISC. TEXTILE.. .FLOOR. 00 14 14 29 15 14 17 30 32 APPAREL 00 14 14 29 15 14 17 30 33 MISC. FAB. TEXTILE PRO 00 14 14 29 15 . 14 17 30 3U LUMBER. ..PROD'S, EXCEP 00 14 14 29 15 14 17 30 35 WOODEN CONTAINERS 00 14 14 29 15 14 00 30 36 HOUSEHOLD FURNITURE 00 14 14 29 15 14 00 30 37 OTHER FURNITURE AND 00 14 14 29 15 14 00 30 38 PAPER AND.. .EXCEPT 00 13 14 19 15 14 17 20 39 PAPERB'D CONTAINERS... 00 14 14 19 15 14 17 20 UO Ul U2 »»3 kk h5 k6 hi PRINTING AND PUBLISH'G 00 14 14 29 15 14 17 30 CHEMICALS AND PROD'S 00 13 14 19 15 14 17 20 PLASTICS AND. . .MATER'S 00 13 14 19 15 14 17 20 DRUGS, PREPARATIONS 00 14 14 19 15 14 17 20 PAINTS AND PRODUCTS 00 13 14 19 15 14 17 20 PAVING MIXTURES AND... 00 13 14 19 15 14 00 20 ASPHALT FELTS AND COAT 00 13 14 19 15 14 00 20 RUBBER AND PRODUCTS 00 14 14 19 15 14 17 20 J*8 LEATHER TANNING AND 00 14 14 29 15 14 00 30 k9 FOOTWEAR AND PROD'S 00 14 14 29 15 14 00 30 50 GLASS AND GLASS PROD'S 00 14 14 19 15 14 17 20 51 STONE AND CLAY PROD'S 13 14 14 19 15 14 17 20 52 PRIM. IRON AND STEEL.. 13 14 14 19 15 14 17 20 53 PRIM. NONFERROUS METAL 13 14 14 19 15 14 17 20 55 TABLE A. 2- 3 (continued) Misc Misc. Elec Sector Feed- Mot. Ther. Hater Space Air- Pow. Number Sector Name Coke Stocks Pow. Uses. Heat Heat Cond. 
Uses 5k METAL CONTAINERS 00 14 14 19 15 14 OC 20 55 HEAT./ PLUMB. . .PROD'S 13 14 14 29 15 14 17 30 56 SCREW MACH. PROD 'S/ . . . CO 14 14 19 15 14 17 20 57 OTHER FAB. METAL TROD. 13 14 14 29 15 14 17 30 58 ENGINES AND TURBINES 13 14 14 19 15 14 17 20 59 FARM MACHINERY 13 14 14 19 15 14 1 7 20 60 CONSTRUCTION, ...EQUIP. 13 14 14 29 15 14 17 30 61 MAT. HANDLING E0U1P. 00 14 14 29 15 14 17 30 62 METAL WORKING EQUIP' T 00 14 14 19 15 14 17 20 63 SPEC. INDUSTRY EQUIP 13 14 14 29 15 14 1 7 30 6k GEN. INDUSTRIAL. ..EQ'T 13 14 14 19 15 14 17 20 65 MACHINE SHOP PRODUCTS 13 14 14 29 15 14 17 30 66 OFF./ C *! P ' G MACHINE 00 14 14 30 15 14 17 20 67 SERV. 1ND. MACHINES 00 14 14 29 15 14 17 30 68 ELEC. TRANS APPARAT. 00 14 14 29 15 14 17 30 69 HOUSEHOLD APPLIANCES 00 14 14 29 . 15 14 17 30 70 ELEC. LIGHT • C . . .EQUIP. 13 14 14 29 15 14 17 30 71 RADIO/ TV/ COM. E3UIP. 00 14 14 29 15 14 17 30 72 ELEC. COMPONENTS. .. 00 14 14 19 15 14 17 20 73 H1SC. ELEC. ..SUP-LIES 00 14 14 29 15 14 17 30 7k MOTOR VEHICHLES CQ'T 13 14 14 19 15 14 17 20 75 AIRCRAFT AND » A fi T S 00 14 14 19 15 14 17 20 76 OTHER TRANS. EQUIPMENT 00 14 14 19 15 14 17 20 77 PROFESSIONAL.'. .SUP PL IE 00 14 14 29 15 14 17 30 78 0P1CAL EQUIP. AND 00 14 14 19 15 14 17 20 79 MISC. MANUF ACTUPINu 00 14 14 29 15 14 17 30 80 RAILROADS AND ... SERV ' S 00 13 13 30 15 22 00 18 81 ...HIGHWAY PASS. T R A N . 00 13 13 30 15 22 00 1b 82 MOTOR FREIGHT TRANS 00 13 13 30 15 22 00 18 83 WATER TRANSPORTATION 00 13 13 30 15 22 00 18 81* AIR TRANSPORTATION 00 13 13 30 15 21 00 18 85 PIPE LINE TOANSPORTA'N 00 22 22 00 15 22 00 18 86 TRANSP0RTA10N SERVICES 00 90 00 30 15 22 00 18 87 COHVNS EXCEPT RADIO... 00 ?7 27 30 14 22 14 23 88 RADIO AND TV BROADCAST 00 27 27 00 14 22 14 23 89 WATER AND SAN. SERV'S 00 27 27 30 14 22 14 23 90 -WHOLESALE AND RETAIL.. 00 27 27 26 26 22 14 23 91 FINANCE AND INSURANCE 00 27 27 30 14 22 14 23 92 REAL ESTATE AND RENTAL 00 27 27 30 14 22 14 23 93 HOTELS AND EXCEPT... 
00 27 27 26 52 52 14 23
 94  BUSINESS SERVICES         00 27 27 30 14 22 14 23
 95  AUTO. REPAIR AND SERV.    00 27 27 30 14 22 14 23
 96  AMUSEMENTS                00 27 27 26 26 22 14 23
 97  MED., ED. SERV'S AND..    00 27 27 26 26 22 14 23
 98  FED. GOV'T ENTERPRISES    00 27 27 30 14 22 14 23
 99  STATE AND LOCAL EN'S      00 27 27 30 14 22 14 23
100  BUS. TRAV AND GIFTS       00 00 00 00 00 00 00 00
101  OFFICE SUPPLIES           00 00 00 30 00 00 00 00
102  PERS. CONSUMP'N EXPEN.    00 28 28 28 28 28 28 28
103  GROSS...CAP. FORMATION    00 00 00 30 00 00 00 00
104  NET INVENTORY CHANGE      00 00 00 00 00 00 00 00
105  NET EXPORTS               00 00 00 00 00 00 00 00
106  FED. GOV'T DEFENSE        00 13 13 26 26 22 14 23
107  FED. GOV'T OTHER          00 27 27 30 14 22 14 23
108  STATE...GOV'T...EDUC'N    00 27 27 26 26 22 14 23
109  STATE HEALTH, SAN.        00 27 27 26 26 22 14 23
110  STATE GOV'T SAFETY        00 27 27 00 14 22 14 23
111  STATE...GOV'T...OTHER     00 27 27 30 14 22 14 23

[Table A.2-4. Tolerance codes at the 101-sector level; contents not recoverable from the scan.]

A.3 DISPERSION FACTORS FOR SMALL-MAGNITUDE FIGURES

Often the uncertainty on a column of figures, $x_i$, would be described in terms of "a dispersion factor $D_1$ for $x_i = B$ increasing to a dispersion factor $D_2$ for the smallest value reported." Taking this lower bound to be $x_i = A$, where A = $10^[?] for the 1967 U.S.
input-output tables, and assuming a linear dependence of D(x) on log(x), we obtain the following expression for D as a function of $x_i$. Let

$$D(x) = a \log(x) + b$$

where

$$a = \frac{D_2 - D_1}{\log A - \log B} \quad \text{and} \quad b = \frac{D_1 \log A - D_2 \log B}{\log A - \log B} .$$

It is easy to verify that D(x) takes values $D_1$ and $D_2$ at x = B and x = A respectively. Obviously this is a crude approximation, but it actually may be even too refined when viewed from the perspective of the person estimating the uncertainty.

A.4 PROBABLE VALUES OF "ZERO" ELEMENTS

Since few transactions can be defined to be zero, the published figures truncated at $10^[?] may be misleading. It is probably true that if we examined in detail the transactions of all firms in the U.S. defined by a particular transaction cell in the I-O table, we would find at least one nonzero transaction. Therefore the following approximation was used to estimate the probable distribution of nonzero values between the lower and upper bounds [0, $10^[?]].

Let X be the absolute value of a normal random variable Y with mean 0 and σ = 10^[?]/[?]. Then X takes nearly all its values between 0 and 10^[?]. By truncating X at 10^[?], in the sense that larger values are discarded and resampled, the resulting random variable takes all of its values in [0, 10^[?]] with the great bulk of its unit probability accumulated near zero. For direct energy transactions, the cutoff was 10^[?] Btu, which corresponds to approximately the same dollar value. Details of the "folded normal" distribution are given in Appendix C.

Appendix B. Sector Definitions

TABLE B-1. 30 Sector Model

30 SECTOR
MODEL   BEA SECTOR           NAME
 1.     7                    Coal
 2.     8                    Crude Oil and Natural Gas
 3.     31.01                Refined Petroleum Products
 4.     68.01                Electric Utilities
 5.     68.02                Natural Gas Utilities
 6.     1-4                  Agriculture
 7.     5-10                 Mining
 8.     11-12                Construction
 9.     14, 15, 29           Food and Drugs
10.     16-19                Textiles and Apparel
11.     20-26                Wood and Paper Products
12.     27, 28, 30-32        Paint, Plastics and Oil Products
13.     33, 34               Leather and Shoes
14.     35, 36               Stone, Clay and Glass Products
15.     37-42                Metals and Metal Products
16.     43-52                Machinery
17.     53-58                Electrical Equipment and Appliances
18.     59-61                Cars, Planes and Transport Equipment
19.     62-64, 13            Miscellaneous Manufacturing
20.     65.01                Rail Transport
21.     65.02                Local Passenger Transport
22.     65.03                Truck Transport and Warehousing
23.     65.04                Water Transport
24.     65.05                Air Transport
25.     65.06                Pipeline Transport
26.     66-67                Radio, TV, Communications
27.     69                   Wholesale and Retail Trade
28.     70-71                Finance, Insurance and Real Estate
29.     65.07, 68.03, 72-79  Services
30.     81-82                Business Travel and Office Supplies

[Table residue: the remaining sector definition table(s) are not recoverable from the scan.]
[Beginning of footnote not recoverable from the scan.] ... If the covariance estimates are adjusted for the sample mean, then the tests for the hypothesized means and covariances can be made separately, each having valid significance level, even under certain natural discrepancies from the other hypothesis. Also, a theorem in section 20.6 of Cramer (1946) shows that the large sample distribution theory is exactly the same as described above.

... size times the maximum absolute difference between sample and hypothesized distribution functions. The P-level is defined as the probability that a truly normal sample would have achieved a larger value for the K-S statistic than the value actually observed. Given the independence hypothesis, the P-levels should be independent and uniformly distributed on the unit interval.
This fact allows the use of the following summary statistic: if $P_1, \ldots, P_n$ are independent and uniform on the unit interval, then $-2\sum_{i} \ln(P_i)$ is chi-square with 2n degrees of freedom.* This sample statistic and its own P-level are included in Table 1 along with the K-S statistics and their P-levels as an additional check and summary.

TABLE 1

Seed             Statistic   P-level
1                .744        .64
2                .869        .44
3                .548        .92
4                .807        .53
5                .772        .59
-2 Σ ln(P_i)     5.03        .89

* This derives from the fact that if P is uniform on [0,1], then $-2\ln(P)$ is exponentially distributed with distribution function $1 - e^{-x/2}$. The result now follows by verifying that the associated density function is that of a chi-square random variable with two degrees of freedom.

A plot of the sample distribution function with the worst fit (from seed #2 in this case) is shown in Figure 1. Even in this worst case note how closely the sample distribution function fits the hypothesized distribution.

The K-S tests described above provide a check on independence in the time domain; a check in the frequency domain was also performed. By summing the Fast Fourier Transform* of each of the shuffled sequences, sample integrated periodograms were obtained. Under the independence hypothesis, the integrated periodogram values should increase linearly from zero to one as the frequencies increase from zero to one-half. The Grenander-Rosenblatt (G-R) statistic may be used to measure the discrepancy of the sample integrated periodogram from linearity. If the hypothesis is true then a factor of $\sqrt{2048}/\sqrt{2} = 32$ times the maximum absolute difference between the sample integrated periodogram and twice the corresponding frequency should have the distribution calculated in Hannan (1967); departures from independence will tend to make the sample statistics too large to fit the distribution. The five sample statistics and corresponding P-levels are shown in Table 2 below along with the chi-square summary statistic defined in the last section.
Seed #1 had the worst P-level so a graph of the corresponding sample integrated periodogram is included in Figure 2.

* Computed with SOUPAC program FASPER available from Computing Services Office, University of Illinois, Urbana, Illinois 61801.

[Figure 1. Sample distribution function with the worst fit (seed #2).]

[Figure 2. Sample integrated periodogram for seed #1.]

TABLE 2

Seed             Statistic   P-level
1                2.243       .04
2                .725        .88
3                1.578       .33
4                1.004       .63
5                .671        .92
-2 Σ ln(P_i)     10.00       .44

The results of this series of tests, like those of the preceding section, are quite satisfactory and give no cause to suspect interdependence between simulation runs.
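The chi-square summary statistic used in Tables 1 and 2 can be checked with a short Python sketch. It is offered only as an illustration, not as the procedure actually coded for the study: the function names `chi2_sf_even` and `summary_statistic` are hypothetical, and the upper-tail probability is obtained from the closed-form series that holds for a chi-square variable with an even number of degrees of freedom, which is always the case here since the statistic has 2n degrees of freedom.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability P(chi-square > x) for even degrees of freedom.

    Uses the identity exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


def summary_statistic(p_levels):
    """Return (-2 * sum of ln P_i, P-level of that sum) for n P-levels."""
    stat = -2.0 * sum(math.log(p) for p in p_levels)
    return stat, chi2_sf_even(stat, 2 * len(p_levels))


if __name__ == "__main__":
    # P-levels of Table 2 as printed (two significant digits).
    table2 = [0.04, 0.88, 0.33, 0.63, 0.92]
    stat, p = summary_statistic(table2)
    print(f"{stat:.2f} {p:.2f}")
```

Run on the rounded P-levels of Table 2, the sketch gives a statistic of about 10.00 with a P-level of about .44, in agreement with the last row of that table.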
D.3 INDEPENDENCE BETWEEN INPUTS TO A MONTE CARLO SAMPLE

While the frequency domain tests of the last section actually include a test of the independence of sample inputs, further tests were performed and are described below.

The next two series of tests paralleled the two series of the last section in methodology, but this time attention was focused on the two sequences of 1024 numbers generated from each of the five seeds. In the time domain, the autocovariances at lags L = 1, ..., 512 of a sequence were adjusted by a factor of $\sqrt{1024 - L}$ and tested with the Kolmogorov-Smirnov statistic. Again, under the hypothesis of independence within each of the ten sequences of 1024 numbers, the adjusted autocovariances of each sequence should constitute a standard normal sample. The resulting sample statistics and P-levels along with the chi-square summary statistic and its P-level are shown in Table 3.

Similarly, the ten sequences were tested for independence in the frequency domain with the Fast Fourier Transform just as in the previous section. The sample statistics and summary statistic are shown in Table 4. Again, the worst cases are illustrated for the time and frequency domain tests in Figures 3 and 4.

TABLE 3

Sequence         Statistic   P-level
1A               1.132       .15
1B               1.214       .11
2A               1.077       .20
2B               .573        .90
3A               .882        .42
3B               1.106       .17
4A               .745        .64
4B               .657        .78
5A               .645        .80
5B               .491        .97
-2 Σ ln(P_i)     18.81       .53

[Figure 3. Worst case for the time domain test.]
[Figure 4: worst case for the frequency domain test]

TABLE 4

Sequence              Statistic   P-level
1A                     .659       .93
1B                    2.259       .05
2A                    2.439       .03
2B                    1.461       .29
3A                     .602       .96
3B                    1.524       .25
4A                    1.423       .31
4B                     .910       .71
5A                     .866       .75
5B                     .833       .79
$-2\sum \ln(P_i)$     22.55       .31

These results, like those of the preceding section, are quite acceptable. Overall then, the independence properties of GGNRF are very satisfactory.

D.4 TESTING FOR NORMALITY

A final series of tests was undertaken to examine the suitability of GGNRF in an application calling for normally distributed random numbers. First, the Kolmogorov-Smirnov goodness of fit test was applied to the ten sequences of 1024 numbers, with the results shown in Table 5. Following the same format as in previous sections, the sample statistics and their P-levels are given along with the summary chi-square statistic. The plot of the sample distribution function with the worst fit is shown in Figure 5.

TABLE 5

Sequence              Statistic   P-level
1A                    1.210       .10
1B                     .799       .55
2A                     .704       .70
2B                     .578       .89
3A                    1.111       .17
3B                    1.187       .12
4A                    1.095       .18
4B                     .590       .88
5A                    1.053       .22
5B                     .609       .85
$-2\sum \ln(P_i)$     21.57       .36

[Figure 5: sample distribution function with the worst fit]

Although these results indicate good fit, especially given the relatively large sample size, the sample variances and means were also checked. Given the standard normality hypothesis, the sample variances multiplied by 1024 should be chi-square distributed with 1023 degrees of freedom. The sample variances and the corresponding P-levels under this hypothesis are listed below in Table 6 along with the usual summary statistic.
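The variance check just described can be sketched in a few lines. This is a hedged illustration rather than the study's own computation: the chi-square tail probability is obtained from the Wilson-Hilferty normal approximation (a substitute chosen here because it needs only the standard library), the sample variance is assumed to use the divisor n about the sample mean, and the function names are hypothetical.

```python
import math

def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def chi2_sf_wilson_hilferty(x, df):
    """Approximate chi-square upper tail via the Wilson-Hilferty cube-root transform."""
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return normal_sf(z)

def sample_variance(x):
    """Sample variance about the sample mean with divisor n (assumed)."""
    n = len(x)
    mean = sum(x) / n
    return sum((v - mean) ** 2 for v in x) / n

def variance_p_level(variance, n=1024):
    """P-level of n * variance against chi-square with n - 1 degrees of freedom."""
    return chi2_sf_wilson_hilferty(n * variance, n - 1)

# Sample variances taken from two rows of Table 6.
for label, variance in [("2A", 1.106), ("5B", 0.998)]:
    print(label, f"{variance_p_level(variance):.3f}")
```

Under these assumptions the approximation gives roughly .009 for sequence 2A and .50 for sequence 5B, close to the P-levels listed for those sequences in Table 6.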
TABLE 6

Sequence              Statistic   P-level
1A                     .979       .77
1B                     .915       .974
2A                    1.106       .009
2B                     .936       .925
3A                     .997       .51
3B                    1.038       .19
4A                    1.033       .22
4B                    1.022       .30
5A                     .963       .79
5B                     .998       .50
$-2\sum \ln(P_i)$     22.11       .33

The sample means should be normal with mean 0 and variance 1/1024. Multiplying the sample means by a factor of 32 should result in numbers drawn from a standard normal population. However, the large size of the P-levels in Table 7 gives cause for suspicion that the hypothesis is not true. On the other hand, the sample size of 1024 was originally chosen to be large enough to signal even acceptably small deviations from ideal behavior; the very worst sample mean was only -.062. Furthermore, given the amount of testing undertaken in this study, it is to be expected that sooner or later some test results will go awry.

To shed more light on the matter, further tests of the mean tendency of GGNRF seemed appropriate. Since only one seed would be needed for the actual Monte Carlo application, seed #3 was selected for a more intensive examination. Eleven thousand numbers were drawn from seed #3 and then discarded in order to skip over the two strings of 1024 numbers previously tested. Then ten consecutive strings of 1024 numbers were drawn and their sample means were computed. The results are listed in Table 8. In addition, ten new seeds were selected and samples of 1024 numbers drawn from each. The data for these sequences are shown in Table 9.

TABLE 7

Sequence              Statistic   P-level
1A                    -.049       .94
1B                    -.019       .72
2A                     .0083      .40
2B                     .0069      .41
3A                     .034       .14
3B                    -.042       .91
4A                    -.067       .98
4B                    -.021       .75
5A                    -.056       .95
5B                    -.019       .72
$-2\sum \ln(P_i)$     9.89        .97

TABLE 8

String                Statistic   P-level
1                      .035       .13
2                      .013       .34
3                     -.029       .82
4                     -.0035      .54
5                     -.036       .87
6                     -.016       .70
7                     -.0054      .57
8                      .025       .21
9                      .064       .02
10                    -.0060      .58
$-2\sum \ln(P_i)$     21.925      .34

TABLE 9

Seed                  Statistic   P-level
1                      .041       .10
2                      .0053      .43
3                     -.04        .90
4                      .023       .23
5                     -.010       .63
6                     -.024       .78
7                      .029       .17
8                      .018       .28
9                      .0026      [illegible]
10                    -.0035      .54
$-2\sum \ln(P_i)$     19.14       .48

The results of these tests certainly do much to allay the suspicions raised by the results in Table 7. Neither would it seem that seed #3 has run into an area of systematically bad behavior (witness Table 8), nor would it seem that there is an overall bias in the generator (witness Table 9). These tests, together with the preceding K-S tests and sample variance tests, indicate that GGNRF has satisfactory distributional properties.

Overall then, GGNRF tests out as a satisfactory normal random number generator.

REFERENCES

1. C. Bullard and A. Sebald, "Effects of Parametric Uncertainty and Technological Change in Input-Output Models," CAC Document No. 156, Center for Advanced Computation, University of Illinois, Urbana, IL 61801, March 1975.
2. C. Bullard, A. Sebald, D. Putnam, D. Amado, "Stochastic Analysis of Uncertainty in a U.S. Input-Output Model," CAC Document (forthcoming).
3. C. Bullard and R. Herendeen, "Energy Cost of Consumption Decisions," Proc. IEEE, March 1975. Also available as Document 135, Center for Advanced Computation, University of Illinois, Urbana, IL 61801.
4. C. Bullard, "Uncertainty in the 1967 U.S. Input-Output Data," CAC Document No. 191, Center for Advanced Computation, University of Illinois, Urbana, IL 61801, April 1976.
5. H. Cramer, Mathematical Methods of Statistics, Princeton University Press, Princeton, NJ, 1948.
6. E. J. Hannan, Time Series Analysis, Science Paperbacks and Methuen & Co. Ltd, London, 1967.
7. W. Hayes and R. Winkler, Statistics: Probability, Inference and Decision, Holt Rinehart and Winston Inc., New York, 1970.
8. B. Jansson, Random Number Generators, Victor Pettersons Bokindustri Aktiebolag, Stockholm, 1966.
9. R. Knecht, "Reliability Measures of CAC-ERG Direct Energy Use Data," CAC Technical Memorandum No. 67, Center for Advanced Computation, University of Illinois, Urbana, IL 61801, November 1975.
10. H. Kuki, "A Fast Normal Random Number Generator," Computation Center, The University of Chicago, Chicago, IL, 1974.
11. W. Leontief, The Structure of the American Economy 1919-1939, Oxford University Press, New York, 1941.
12. O. Morganstern, On the Accuracy of Economic Observations, Princeton University Press, Princeton, NJ, 1950.
13. A. Sebald, "An Analysis of the Sensitivity of Large Scale Input-Output Models to Parametric Uncertainties," CAC Document No. 122, Center for Advanced Computation, University of Illinois, Urbana, IL 61801, November 1974.
14. D. Smith and D. Simpson, "Direct Energy Use in U.S. Economy, 1967," CAC Technical Memorandum No. 39, Center for Advanced Computation, University of Illinois, Urbana, IL 61801, January 1975.
15. M. A. Stephens, "Asymptotic Results for Goodness-of-Fit Statistics with Unknown Parameters," Annals of Statistics, V.4, no. 2, 1976, pp. 357-369.
16. U.S. Department of Commerce Bureau of Economic Analysis, Input-Output Structure of the U.S. Economy 1967, U.S. Government Printing Office, Washington, D.C., 1974.
17. U.S. Department of Commerce Bureau of Economic Analysis, Definitions and Conventions of the 1967 Input-Output Study, (mimeo) 1974.
18. S. S. Wilks, Mathematical Statistics, John Wiley & Sons, New York, 1962.