BEBR FACULTY WORKING PAPER NO. 89-1592

College of Commerce and Business Administration
Bureau of Economic and Business Research
University of Illinois at Urbana-Champaign
August 1989

An Empirical Comparison of Probit and ID3 Methods for Accounting Classification Research

John S. Chandler, Associate Professor
Ting-peng Liang, Assistant Professor
Ingoo Han, Ph.D. Student
Department of Accountancy

The authors thank James Gentry for his helpful comments on earlier versions of the paper. This research was partially supported by the Department of Accountancy of the University of Illinois at Urbana-Champaign.

ABSTRACT

This paper investigates some properties of applying Probit and ID3 methods to the analysis of accounting classification problems. The particular accounting problem examined is the LIFO/FIFO choice. Both original and hold-out samples are used to study the effects of the training sample size and the nature of the data set on the accuracy of classification. The results indicate that (1) Probit and ID3 identify different factors that affect LIFO/FIFO choice; (2) in hold-out tests, ID3 performs better when the sample size of the input data set is small relative to the total population, whereas Probit performs better when the sample size is relatively large; and (3) ID3 performs better when the input data set is dominated by nominal variables, whereas Probit performs better otherwise.

1. INTRODUCTION

In the past decade, Probit has been one of the primary methods for studying accounting classification problems such as LIFO/FIFO choices or bankruptcy prediction (e.g., Dopuch and Pincus 1988; Hagerman and Zmijewski 1979; Lee and Hsieh 1985). Although Probit has been argued to be theoretically superior to both multivariate discriminant analysis (MDA) and ordinary least squares regression in classification research (e.g., Dietrich and Kaplan 1982; Ohlson 1980), limitations exist when nominal variables are involved. (Counter-arguments to this superiority also exist. In a recent study, for example, Noreen (1988) shows that (1) the rejection regions for the Probit test statistics are not well specified for small samples, and (2) ordinary least squares regression seems to perform at least as well as Probit for the cases considered.) When nominal variables are present, dummy variables must be used to represent their different values, which may result in a violation of the normality assumption, i.e., that the relationship between the dependent variable and the independent variables is a cumulative normal distribution function (Aldrich and Nelson 1984). In addition, the assumption that the dependent variable is a linear function of the independent variables may be questionable when nominal variables exist.

Recently, nonparametric classification techniques have been considered as alternatives to traditional parametric methods in classification problems. For example, Marais, Patell, and Wolfson (1984) applied a recursive partitioning algorithm (RPA) to commercial loan classification and found it to be "a viable competitor to parametric methods such as polytomous Probit even when the assumptions underlying the parametric model are satisfied." Frydman, Altman, and Kao (1985) also report that RPA outperforms discriminant analysis in most original sample and hold-out comparisons.
In addition to making no assumptions about data distributions, nonparametric methods usually derive a decision tree that shows the interaction of variables, which may make the resulting model easier to use and to understand. After proper transformation, decision rules suitable for developing expert systems or rule-based decision support systems can be derived from the decision tree.

The primary purpose of this paper is to investigate the properties of another nonparametric algorithm, the ID3 method, in analyzing accounting problems. The ID3 algorithm is an inductive learning technique that derives decision models from data. It originated from Hunt, Martin, and Stone's work (1966) on conceptual learning and was later implemented and expanded by Quinlan (1979, 1982). The primary difference between ID3 and RPA is that the former uses a criterion derived from information theory to determine the relative importance of independent variables and constructs decision trees accordingly, whereas the latter minimizes the observed expected cost of misclassification. Recent studies on ID3 have provided evidence that it can outperform expert judgment and discriminant analysis (e.g., Braun and Chandler 1987; Messier and Hansen 1988). In this paper, we use both original and hold-out samples to investigate its sensitivity to training sample size and the nature of the data set. The particular accounting problem studied was the LIFO/FIFO decision.

Our empirical results include the following. First, ID3 and Probit identify different factors that affect LIFO/FIFO choice. This raises a concern about the effect of research methods on the interpretation of research findings. Second, in hold-out tests, ID3 performs better when the sample size of the input data set is small relative to the total population, whereas Probit performs better when the sample size is relatively large. Third, ID3 performs better when the input data set is dominated by nominal variables, whereas Probit performs better otherwise.

The remainder of this article is organized as follows. Section 2 describes the ID3 algorithm. Section 3 briefly reviews some methodological issues in LIFO/FIFO research. Section 4 discusses the first experiment, which compares the internal validity of the models (i.e., the degree to which the cases in the data set from which a model was derived are correctly classified by the model). Section 5 presents the results of the second experiment, in which hold-out samples are used to examine the external validity of the resulting models (i.e., the degree to which hold-out cases are correctly classified by a model). Section 6 summarizes the findings and discusses some implications.

2. THE ID3 ALGORITHM

The input to ID3 is a data set consisting of observed data on N cases (called the training sample). For each case, the input data include its actual group classification and the values of a finite number of factors potentially affecting that classification. The function of the algorithm is to induce from the observed data a model capable of identifying the relationships between the factors and the actual classification.
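As a concrete picture of this input format, the training sample can be represented as a list of N cases, each pairing the observed factor values with the case's actual classification. The following is a minimal sketch of our own; the field names are hypothetical, not drawn from the paper:

```python
# One record per case: the observed factor values plus the actual classification.
training_sample = [
    ({"industry": "lumber", "net_sales": 200}, "FIFO"),
    ({"industry": "metal", "net_sales": 1521}, "LIFO"),
    # ... one entry per case, N in total
]
```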
Instead of relying on sample distribution statistics, the algorithm uses entropy to measure the relative information content attributed to each factor and generates a decision tree model. The factor with the highest information content is considered the most important factor and is selected as the root node of the tree. Other factors are then examined based on their relative information content. In this section, we discuss the measurement of information content and the model construction process of the ID3 algorithm.

2.1. A Measurement of Information Content: Entropy

Entropy was originally developed to measure the amount of information transmitted in a communication process (Shannon and Weaver 1949). It indicates observational variety and, for a two-class problem, ranges in value from zero to one (Krippendorff 1986). Entropy is zero when all observations are of the same kind (i.e., no variety), and is one when observations have equal opportunities to be classified as any one of the classes (i.e., the maximum degree of variety). Entropy assumes nothing about the nature of the frequency or probability distribution and is, thus, nonparametric. When entropy is applied to classification problems, the entropy of a variable shows the extent to which the accuracy of a classification can be improved (or the uncertainty reduced) by introducing the variable. The purpose of the ID3 algorithm is to construct a decision tree capable of classifying all cases in the input data set.

Mathematically, entropy is a logarithmic function of relative frequencies or probabilities. Consider a data set of N cases, with each case described by a number of variables and a category. A given variable X classifies the N cases into k categories, C_1, ..., C_k, and has m values, V_1, ..., V_m. For a particular value V_j of X, there is a probability p_ij that V_j classifies a case into class C_i. The entropy of X = V_j is

    H(V_j) = - Σ_i p_ij log2 p_ij                                   (1)

The entropy of X, the weighted sum over all of its m values, is

    H(X) = Σ_j (N_j / N) H(V_j)                                     (2)

where N_j = the number of cases for which X = V_j.

For numerical variables, the calculation of entropy by ID3 includes two steps. First, a value is chosen to split the range of values for that variable into two regions: high and low. Second, the entropy of the variable is computed based on that split value. This process is performed for each possible split, and the value that minimizes entropy is selected as the split value for the variable. In other words, for each t (1 <= t < m), ID3 divides the sorted values of X into two subsets (V_1, ..., V_t) and (V_{t+1}, ..., V_m) and computes the entropy resulting from the split. If the division (V_1, ..., V_t) and (V_{t+1}, ..., V_m) has the lowest entropy, then we split the variable at S, where

    S = (V_t + V_{t+1}) / 2                                         (3)

Insert Table 1 Here

Table 1, for example, shows a set of highly simplified LIFO/FIFO data including one nominal variable and one integer variable. For the variable industry type, in the lumber industry six firms use FIFO and no firm uses LIFO, while in the metal industry one firm uses FIFO and seven firms use LIFO. Based on equations (1) and (2), the entropy of industry type can be calculated as follows:

    H(Industry = lumber) = - (6/6) log2 (6/6) - (0/6) log2 (0/6) = 0
    H(Industry = metal)  = - (1/8) log2 (1/8) - (7/8) log2 (7/8) = 0.54
    H(Industry) = (6/14)(0) + (8/14)(0.54) = 0.308

Since net sales is an integer variable, we need to find the split with the minimum entropy. Among the thirteen possible splits in the example, the optimum split is 450 million, which divides the values of net sales into two groups: (63, ..., 400) and (500, ..., 2300). The first group includes five FIFO firms and no LIFO firm, whereas the second group includes two FIFO firms and seven LIFO firms. Its entropy is

    H(Net sales) = (5/14)(0) + (9/14)[ - (2/9) log2 (2/9) - (7/9) log2 (7/9) ] = 0.491

The values of 0.308 and 0.491 indicate the resulting varieties after introducing industry type and net sales into the classification model, respectively.
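To make these calculations concrete, the following short Python sketch (ours, not part of the original study or its software) reproduces the two entropy figures above from the Table 1 data:

```python
import math

def entropy_of_partition(groups):
    """Weighted entropy H(X) of the class labels across the groups that a
    variable's values induce -- equations (1) and (2)."""
    n = sum(len(g) for g in groups)
    h = 0.0
    for g in groups:
        counts = {}
        for label in g:
            counts[label] = counts.get(label, 0) + 1
        h_v = -sum((c / len(g)) * math.log2(c / len(g)) for c in counts.values())
        h += (len(g) / n) * h_v
    return h

# Table 1 data: (industry, net sales in $ millions, inventory method).
cases = [("lumber", 200, "FIFO"), ("lumber", 152, "FIFO"), ("lumber", 312, "FIFO"),
         ("lumber", 600, "FIFO"), ("lumber", 63, "FIFO"), ("lumber", 400, "FIFO"),
         ("metal", 1000, "FIFO"), ("metal", 500, "LIFO"), ("metal", 1521, "LIFO"),
         ("metal", 2300, "LIFO"), ("metal", 1420, "LIFO"), ("metal", 650, "LIFO"),
         ("metal", 2000, "LIFO"), ("metal", 1500, "LIFO")]

# Nominal variable: one group per value.
by_industry = [[m for ind, _, m in cases if ind == v] for v in ("lumber", "metal")]
print(round(entropy_of_partition(by_industry), 3))  # 0.311 (0.308 in the text,
                                                    # which rounds 0.54 first)

# Numerical variable: try the midpoint of every pair of adjacent sorted values
# (equation (3)) and keep the split that minimizes the entropy.
values = sorted(s for _, s, _ in cases)
best_split = min(((a + b) / 2 for a, b in zip(values, values[1:])),
                 key=lambda s: entropy_of_partition(
                     [[m for _, x, m in cases if x <= s],
                      [m for _, x, m in cases if x > s]]))
print(best_split)  # 450.0, the split whose entropy is 0.491
```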
2.2. Model Construction Process

Since lower entropy implies a lower level of variety and lower uncertainty, the ID3 algorithm considers the variable with the lowest entropy the most important one and gives it higher priority in constructing models. The model construction process begins with the whole input data set, from which the root node of the classification tree is constructed. This includes several steps. First, the entropy of each variable is calculated based on the input data. Second, the variable with the minimum entropy is chosen as the root node of the tree. If the variable is nominal and has m levels, then the tree will have m branches at the first level, and all input cases will be divided into m groups according to their values of the root variable. For numerical variables, the tree will have two branches containing the cases whose values are higher than and lower than the split value, respectively.

After splitting the original cases, each of the m groups of input cases is considered a separate data set. If all cases in a group are in the same category, then no further analysis of the group is needed. This indicates that the preceding variable is capable of classifying those cases completely, and the category to which these cases belong becomes a leaf node of the tree. Otherwise, the entropies of the variables are calculated again based on the cases in the subgroup, and the variable with the minimum entropy is attached to the branch. The cases in the group are then further split based on the values of the selected variable. This process continues until no further improvement is possible.

In the previous example, the entropy of industry type is lower than that of net sales. Therefore, industry type forms the root node, which divides the firms into lumber and metal groups. Since all firms in the lumber group use FIFO, no further analysis is needed and the leaf node of this branch is FIFO. In the metal group, one firm uses FIFO and seven firms use LIFO. A further analysis splits the net sales of the firms into two groups: (500, ..., 1000) and (1420, ..., 2300). The first group includes one FIFO and two LIFO firms, while the second one includes five LIFO firms and no FIFO firm. The split value is 1210 and the entropy is calculated to be 0.344. Since the three firms in the first group are not all of the same class, we can further classify them into two categories: firms with net sales less than 825 (LIFO firms) and firms with net sales higher than 825 (FIFO firms). All firms in the second group are LIFO firms, so that group cannot be further decomposed. The process stops when all firms in the same group use the same inventory method. Figure 1 shows the resulting decision tree.

Insert Figure 1 Here
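The whole recursive decomposition just described fits in a few lines of code. The sketch below is our own illustration, not the implementation used in this study; applied to the Table 1 data, it reproduces the tree of Figure 1 (industry type at the root, then net-sales splits at 1210 and 825 in the metal branch):

```python
import math

def entropy(labels):
    """Equation (1): entropy of the class labels within one branch."""
    n = len(labels)
    return -sum(labels.count(l) / n * math.log2(labels.count(l) / n)
                for l in set(labels))

def weighted_entropy(groups):
    """Equation (2): entropy of a variable, weighted over its branches."""
    n = sum(len(g) for g in groups)
    return sum(len(g) / n * entropy([lab for _, lab in g]) for g in groups)

def grow_tree(cases):
    """cases: list of (attribute-dict, class-label) pairs -> nested-dict tree."""
    labels = [lab for _, lab in cases]
    if len(set(labels)) == 1:                 # pure group -> leaf node
        return labels[0]
    best = None                               # (entropy, variable, branch groups)
    for var in cases[0][0]:
        vals = sorted({attrs[var] for attrs, _ in cases})
        if len(vals) < 2:                     # constant here; cannot split on it
            continue
        if isinstance(vals[0], str):          # nominal: one branch per value
            groups = {v: [c for c in cases if c[0][var] == v] for v in vals}
        else:                                 # numeric: minimum-entropy midpoint (eq. 3)
            split = min(((a + b) / 2 for a, b in zip(vals, vals[1:])),
                        key=lambda s: weighted_entropy(
                            [[c for c in cases if c[0][var] <= s],
                             [c for c in cases if c[0][var] > s]]))
            groups = {f"{var} <= {split}": [c for c in cases if c[0][var] <= split],
                      f"{var} > {split}": [c for c in cases if c[0][var] > split]}
        h = weighted_entropy(groups.values())
        if best is None or h < best[0]:
            best = (h, var, groups)
    if best is None:                          # conflicting data; return majority class
        return max(set(labels), key=labels.count)
    _, var, groups = best
    return {var: {branch: grow_tree(g) for branch, g in groups.items()}}

cases = [({"industry": "lumber", "net_sales": s}, "FIFO")
         for s in (200, 152, 312, 600, 63, 400)]
cases += [({"industry": "metal", "net_sales": s}, m)
          for s, m in ((1000, "FIFO"), (500, "LIFO"), (1521, "LIFO"), (2300, "LIFO"),
                       (1420, "LIFO"), (650, "LIFO"), (2000, "LIFO"), (1500, "LIFO"))]
print(grow_tree(cases))
```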
2.3. A Comparison of ID3 and Probit

The Probit method uses statistical inference procedures to derive a linear model from a set of input data. The model estimates the likelihood that, given the input data, a case falls in a particular class. It rests on several assumptions. First, the dependent variable is categorical. Second, the relationship between the dependent variable and the independent variables is a cumulative normal distribution. Third, no two or more independent variables are perfectly correlated. Fourth, there is no serial correlation of the dependent variable among the cases. Based on these assumptions, Probit estimates the parameters of the linear model by maximum likelihood estimation (MLE) procedures (see, for example, Aldrich and Nelson [1984] for a detailed discussion).

ID3 differs from Probit in at least the following respects. First, the ID3 algorithm makes no assumption about the data distribution. In fact, the algorithm treats continuous variables as discrete and uses a recursive decomposition process to divide their values into several discrete ranges. Probit, on the other hand, assumes that the relationship between the dependent and independent variables is a cumulative normal distribution function. Therefore, the ID3 algorithm appears more appropriate when the normality assumption is likely to be violated, and Probit more appropriate otherwise.

Second, the ID3 algorithm generates decision tree models in which the weakness of one factor may not be compensated by the strength of another. Probit models, however, assume a linear compensatory relationship among the independent variables. This implies that ID3 may be more appropriate when the problem involves nominal variables that make a linear model inappropriate.

Third, the model construction process of ID3 is essentially an exhaustive decomposition process that tries to cover every instance, whereas the Probit method focuses on optimizing the probability of correct classification. Therefore, ID3 seems more likely to overfit the sample data and hence may be more sensitive to noise in the input data set.

Finally, the entropy function is a logarithmic function generally biased toward variables with more levels and against variables with fewer levels (Mingers 1987). In other words, variables with more levels are more likely to be given higher priority in the model construction process. Probit models do not have this bias in processing numerical variables, but they may favor attributes with fewer levels when dummy variables are used to handle nominal variables.

Given these differences, it is interesting to ask whether the two methods behave differently when applied to accounting classification problems. How different are the models derived from the two methods? Do different models have different internal and external validities? Which method is better? When and why does a particular method outperform the other? In the remaining sections, we describe two experiments investigating these issues in the context of LIFO/FIFO choices.
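For reference, the Probit specification summarized above can be stated explicitly in standard textbook notation (ours, not reproduced from the paper). With $y_i$ the binary classification of case $i$ and $x_i$ its vector of independent variables,

$$\Pr(y_i = 1 \mid x_i) \;=\; \Phi(x_i'\beta) \;=\; \int_{-\infty}^{x_i'\beta} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt,$$

and MLE selects the $\beta$ that maximizes the log likelihood

$$\ln L(\beta) \;=\; \sum_{i=1}^{N} \Big[\, y_i \ln \Phi(x_i'\beta) + (1 - y_i)\ln\big(1 - \Phi(x_i'\beta)\big) \Big].$$

The cumulative normal link $\Phi$ is precisely the normality assumption discussed above, and the linear index $x_i'\beta$ is the linear compensatory form that ID3's decision trees do not impose.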
3. BACKGROUND OF LIFO/FIFO RESEARCH

The choice of inventory accounting methods has been a research issue for the past decade. Theoretically, the LIFO method has tax advantages when inflation exists and is considered more attractive than the FIFO method. In practice, however, a majority of firms still adopt FIFO as their primary inventory accounting method. As a result, much research has been conducted to investigate the factors affecting the adoption of a certain method (e.g., Biddle [1980], Cushing and LeClere [1988], Dopuch and Pincus [1988], Lee and Hsieh [1985], Morse and Richardson [1983]).

Previous literature has examined at least three potential explanations of LIFO/FIFO choice: Ricardian costs, agency costs, and political costs (Lee and Hsieh 1985). The Ricardian hypothesis assumes that the inventory method choice is based on a firm's comparative advantage in tax minimization associated with the production-investment opportunity set. A particular method (e.g., LIFO) will be adopted if its tax savings exceed the implementation costs. Therefore, LIFO may be the optimal choice for some firms, whereas FIFO is the optimal choice for others. The agency cost hypothesis assumes that some firms remain on FIFO to report higher earnings because of managers' concerns about the impact of a LIFO switch on the securities market or on their compensation contracts (e.g., Abdel-Khalik 1985; Ricks 1982). Managers are willing to forgo potential tax savings to obtain other benefits. The political costs hypothesis assumes that a method will be chosen if the political costs at stake exceed the potential tax savings. For example, the dominating firm in an industry may choose LIFO to reduce its reported earnings and avoid becoming a target of the anti-trust laws.

Probit has been the major method used in previous studies to test these hypotheses. Empirical findings, however, are inconclusive in many respects. For example, the relative frequency of price increases was found to differ significantly between LIFO and FIFO firms by Lee and Hsieh [1985], but the effect was insignificant in Dopuch and Pincus [1988]. The inconsistency in previous research findings may be due to several reasons.

(1) Data effects: the data collected for hypothesis testing in different studies may have different characteristics. In selecting long-term LIFO and FIFO firms, for example, Lee and Hsieh (1985) chose firms using a certain method consecutively for more than seven years, whereas Dopuch and Pincus (1988) used 20 years as the criterion.

(2) Variable effects: the variables selected for examination may differ across studies. There is usually more than one variable that can be used to test a theory. For example, both net sales and total assets can be used as surrogate variables for firm size. In addition, the correlation between variables may make it difficult to clearly relate the significance of a variable to a single theory.

(3) Method effects: the Probit method used for hypothesis testing may have limitations that prevent it from providing unbiased results. In general, there are three potential biases in Probit models. First, the effect of nominal variables may be underestimated. Using dummy variables to handle nominal factors dilutes the overall effect of the factor. This bias is particularly significant in a multivariate analysis when the nominal variable has many levels. In LIFO/FIFO studies, for example, industry type was found significant in univariate analysis but insignificant in multivariate analysis (Lee and Hsieh 1985). Second, the linear compensatory model assumption may be inappropriate for studying LIFO/FIFO decisions. The linear compensatory model is appropriate only if we assume that the manager uses a weighted-sum strategy to make the LIFO/FIFO decision. Otherwise, we need to consider other functional forms. The decision tree model derived from ID3 may be an appropriate alternative form for other strategies, such as conjunctive selection, disjunctive selection, or elimination by aspects. Third, the cumulative normal assumption may be violated, which results in unreliable parameter estimates.
There are at least two factors that may cause a violation of the normality assumption: the nature of the data and the training sample size. When the decision is primarily affected by nominal variables, or the data distribution is extremely skewed, the normality assumption is likely to be violated. When the size of the input data set is small, the normality assumption is also likely to be violated.

Based on the discussion in the previous section, the ID3 algorithm does not have these biases (although it certainly may have other biases) and can be a promising alternative to Probit for investigating the inventory accounting decision. To compare the ID3 and Probit methods, we conducted two experiments. In the first study, we examined the data and variable effects of Probit models and compared the internal validity of the ID3 and Probit models. In the second experiment, we used hold-out samples to examine how training sample size and the nature of the data affected the external validity of these methods.

4. THE FIRST EXPERIMENT

The first experiment focuses on comparing the models resulting from ID3 and Probit. Data collected from the COMPUSTAT data base and the DRI tape are analyzed by both the Probit and ID3 methods. The results are then compared with previous empirical findings. Our primary purpose is to examine methodological issues such as the variable and method effects. Therefore, we have no intention of arguing whether previous LIFO/FIFO research findings are appropriate.

4.1 Data Collection

Data collection included two stages. An initial data base consisting of eighteen variables was constructed. This data base was then used to compile six data sets for the experiments.

4.1.1 Initial data base

Based on theories and previous research findings, eighteen explanatory variables considered to have an effect on LIFO/FIFO choices were selected; they included one nominal and seventeen numerical variables. This set of variables was chosen to reflect the following concerns.

(1) Nature of industry: Some industries have unique environments that favor a certain inventory accounting method. Most previous research uses two-digit SIC codes to represent the nature of industry. This variable was found significant in Eggleton, Penman, and Twombly (1976) and in the univariate analysis of Lee and Hsieh (1985).

(2) Firm size: The benefits of using LIFO are expected to be more significant for larger firms. Size was found important in Morse and Richardson (1983), Abdel-Khalik (1985), Cushing and LeClere (1988), and Dopuch and Pincus (1988). Three variables were used as surrogates for firm size in our study: net sales, net income, and total assets.

(3) Inflation and its variability: Higher and more stable inflation rates are expected to generate higher tax benefits from using LIFO. We used the average growth of input price to measure the inflation rate, and used the coefficient of variation (CV = standard deviation / mean) of input price and the CV of the growth of input price to measure price variability.

(4) Inventory and its variability: A stable and non-decreasing inventory level is expected to generate the maximum tax savings from LIFO adoption. We used average inventory to measure the inventory level and the CV of inventory to measure inventory variability. In general, inventory variability may be affected by the variability of demand and production. Firms with lower demand or production variability can more easily maintain a stable inventory level.
We used net sales growth, the CV of net sales growth, and the relative frequency of net sales growth to measure demand variability, and used the CV of net income and the CV of net sales to measure the operational variability of a firm.

(5) Inventory controllability: The tax savings obtained from using LIFO depend on the inventory controllability of a firm. The ability to control inventory is a favorable factor for a firm adopting LIFO. We used two ratios, inventory/net sales and inventory/total assets, to measure inventory controllability.

(6) Capital intensity: Lee and Hsieh (1985) argue that capital-intensive firms have higher fixed-to-variable-cost ratios and should have a stronger incentive to use LIFO. We included gross capital intensity in our variable set.

(7) Debt/equity ratio: A higher debt/equity ratio may force the firm's manager to increase current earnings, which favors FIFO over LIFO. We included the long-term debt/equity ratio as its surrogate measure.

Table 2 lists the seventeen numerical variables and indicates those tested in Lee & Hsieh (1985) and in Dopuch and Pincus (1988). Please note that our point is not to determine the best set of explanatory variables for LIFO/FIFO studies but to develop a set of LIFO/FIFO data on which the impact of different methodologies can be investigated.

Insert Table 2 Here

After determining the variables, data were collected from the COMPUSTAT data base. The inflation data were collected from the DRI tape. Since many firms switched from FIFO to LIFO in 1974 in response to the oil crisis, we set 1975 as the starting year for obtaining samples. The criterion for selection was that the firms must have used LIFO or FIFO consecutively for at least ten years. Initially, 220 FIFO firms and 60 LIFO firms were identified. Three of them were later eliminated because of missing data. These firms were distributed across 23 industries, as listed in Table 3. Table 4 shows the means and standard deviations of the numerical variables for LIFO and FIFO firms.

Insert Tables 3 and 4 Here

4.1.2 Testing data sets

Since more than one surrogate variable may reflect the same theoretical factor in our initial data base, high correlations among them exist. In order to test the variable and method effects in classification research, we compiled six data sets of different variables from the initial data base. Each resulting data set still has 277 cases. The procedures for composing these data sets were as follows. First, three sets of data with eight numerical variables each were selected after considering the multicollinearity issue. This allowed us to examine the effect of using different surrogate variables in model construction. Second, the nominal variable industry type was added to the three sets to form another three data sets. This allowed us to examine the effect of nominal variables in model construction. Table 5 lists the variables included in each data set.

Insert Table 5 Here

4.2 Data Analysis

For each data set, two analyses were applied to construct models from the data. First, Probit was applied to examine the effect of including different variables on hypothesis testing. The results, as shown in Tables 6 and 7, indicate that the variable effect does exist when Probit is applied. For example, long-term debt/equity is significant in model 1 but insignificant in models 2 and 3.
In addition, when the CV of net sales was replaced by the CV of net sales growth, the significance level of the CV of inventory decreased (models 1 and 3 in Table 6).

Insert Tables 6 and 7 Here

Another effect we observed is the impact of nominal variables. Comparing the models in Tables 6 and 7, we find that three variables become significant because of the presence of the nominal variable industry type: net sales, long-term debt/equity, and growth of input price. However, the significance of gross capital intensity decreases (models 1 and 3 in Tables 6 and 7). All of the dummy variables for the different industries were statistically insignificant. In summary, the results in Tables 6 and 7 suggest that the addition or deletion of a variable may change the significance levels of other variables and hence affect the reliability of hypothesis testing.

In the second analysis, the ID3 method was applied to the three data sets that included industry type. (The software used to run the ID3 algorithm is ACLS, which stands for Analog Concept Learning System.) The resulting decision-tree models are shown in Figures 2, 3, and 4. Taking the variables included in the decision rules to be significant factors, we find three differences between the Probit and ID3 models. First, the factors selected by the two methods were different. For example, industry type was considered the most significant factor in the ID3 models but was insignificant in the Probit models. Inventory/net sales was very significant in the Probit model (model 1 in Table 7) but appeared only in industries 2600, 3600, and 3700 in the ID3 model. Second, different factors were identified by ID3 for different industries. For instance, long-term debt/equity was found important in the printing, publishing, and allied industries (SIC code 2700), but irrelevant in the lumber (2400) or chemical (2800) industries. This implies that ID3 is capable of identifying the industry-specific nature of inventory accounting choices. Third, the ID3 models are relatively less sensitive to the addition or deletion of variables: a large portion of the decision trees remains the same across Figures 2, 3, and 4.

Insert Figures 2, 3, and 4 Here

In addition to the differences in model format and in the variables included in a model, the classification accuracy of the resulting models is also important. Table 8 shows a comparison of the classification accuracy of the models constructed by Probit and ID3. Here classification accuracy is measured by the percentage of the cases in the input data set that are correctly classified by the model. Generally speaking, the Probit models with industry type outperformed those without industry type, and the ID3 models outperformed the Probit models in terms of the percentage of firms correctly classified. Since the ID3 algorithm tries to cover all sample data in the process of model construction, the perfect classification is no surprise. In fact, this level is usually achieved unless conflicting data exist in the samples. A potential problem associated with the high classification accuracy of ID3 is that it may overfit the input data and hence may be heavily influenced by the noise present in the input data set. Therefore, it is necessary to conduct another experiment to compare prediction accuracy, i.e., the accuracy when the models are applied to hold-out samples, and to identify the circumstances in which a particular method is more appropriate.

Insert Table 8 Here
5. THE SECOND EXPERIMENT

The second experiment uses hold-out samples to compare the external validity of Probit and ID3. In order to examine the situations in which a particular method is better, two factors that may affect the applicability of a particular method were investigated: the nature of the data set and the training sample size.

The experimental design included three independent variables: data analysis method (METHOD), characteristics of the data set (DATA), and training sample size (SIZE). They were organized into a 2 x 2 x 3 factorial design. The methods investigated were Probit and ID3. The characteristics of the data set also had two levels: one was dominated by a nominal variable, the other was not. A data set is said to be dominated by a nominal variable if the variable alone can correctly classify a significant portion (e.g., 70%) of the input cases. The training sample sizes included three levels: large/small (L/S), medium/medium (M/M), and small/large (S/L). Large/small means using a large portion of the samples to derive the model for predicting a small number of holdouts. Medium/medium means using about half of the samples to predict the other half. Small/large means using a small portion of the samples to predict the remaining samples. The dependent variable was the prediction accuracy of the model derived in a particular setting, defined as follows:

    Prediction Accuracy = (number of hold-out cases correctly predicted) / (total number of hold-out cases)

The hypotheses tested in this experiment can be formulated as follows.

(1) Effect of data characteristics

Since Probit and ID3 differ substantially in many respects, we anticipate that they will perform differently in analyzing different types of data. In particular, we expect ID3 to perform better when a nominal factor has a significant effect on the decision outcome, and Probit to perform better otherwise. That is,

H1.1: In a situation where the actual classification is dominated by a nominal variable, ID3 performs better than Probit.

H1.2: In a situation where the actual classification is not dominated by a nominal variable, Probit performs better than ID3.

(2) Effect of training sample size

The normality assumption usually holds only when the training sample size is large. Since ID3 makes no assumption about the data distribution, we expect ID3 to be less sensitive to a decrease in sample size. That is,

H2: A decrease in the size of the training sample set has more effect on Probit than on ID3.

5.1. Data Collection

The data sets used to test the hypotheses were collected through a two-step process. First, two sets of data with different characteristics were compiled from the initial data set constructed in the first experiment. One was composed of firms in industries not dominated by a particular inventory accounting method, whereas the other consisted of firms in industries dominated by a single method. They represented different effects of the nominal variable industry type: the effect of the industry SIC code was relatively low in the first set and high in the second. The degree of industry dominance used to differentiate these two sets was 3/4. In other words, industries with more than three-fourths of their firms using the same method were classified as industry-dominated (DOM); the rest were classified as non-industry-dominated (NDOM). If we define the degree of industry dominance of a data set as the percentage of the firms in the data set whose actual inventory method can be correctly classified by observing the industry type only, the two data sets have degrees of industry dominance of 67.5% and 99.4%, respectively.
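The degree of industry dominance just defined is straightforward to compute: classify every firm by the majority method of its industry and count the proportion classified correctly. The following is a minimal sketch of our own (the illustrative data reuse the Table 1 counts):

```python
from collections import Counter, defaultdict

def industry_dominance(firms):
    """firms: list of (industry, method) pairs. Degree of industry dominance =
    fraction of firms whose method is correctly 'predicted' by observing only
    the majority method of their industry."""
    by_industry = defaultdict(list)
    for industry, method in firms:
        by_industry[industry].append(method)
    correct = sum(Counter(methods).most_common(1)[0][1]
                  for methods in by_industry.values())
    return correct / len(firms)

# e.g. the Table 1 sample: lumber 6 FIFO; metal 1 FIFO, 7 LIFO
sample = [("lumber", "FIFO")] * 6 + [("metal", "FIFO")] + [("metal", "LIFO")] * 7
print(industry_dominance(sample))  # 13/14 = 0.928...
```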
The industries with fewer than five firms in the original data set were eliminated to avoid potential biases in the next stage of the experiment. Table 9 lists the two-digit SIC codes and the number of firms included in these data sets.

Insert Table 9 Here

In the second step of the process, thirty data subsets of three different sizes were randomly sampled from each of the two data sets. The sample sizes of these subsets were divided into three levels: large, medium, and small. The large subsets included roughly two-thirds of the samples, the medium subsets included one-half of the samples, and the small subsets included one-third of the samples in the data set. Table 10 shows the sample sizes of these subsets. Since the industry-dominated and non-dominated data sets had different numbers of firms, their subsets also had different numbers of firms. These subsets were used as training data from which decision models were derived. The samples not included in a training subset formed a counterpart, a testing subset, for evaluating the prediction accuracy of the model derived from the training subset.

Insert Table 10 Here

5.2. Data Analysis

For each pair of training and testing data subsets, the following analysis was performed. First, Probit was used to derive a linear model from the training set. Second, the model was used to predict the LIFO/FIFO choices of the firms in the testing set and to calculate the prediction accuracy of the model. Third, ID3 was used to analyze the same training data set and derive a decision-tree model. Fourth, the resulting model was used to predict the corresponding testing data set to provide comparable results. This analysis was conducted over all sixty pairs of data subsets.

Table 11 shows the means and variances of prediction accuracy under the various settings. Table 11(a) shows the statistics involving a single factor. Table 11(b) shows the statistics involving the interaction of two factors (SIZE*DATA and METHOD*DATA). Table 11(c) shows the statistics involving the interaction of all three factors. The average prediction accuracy ranges from 0.6088 to 0.9000.
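For the Probit half of this procedure, one pass over a single training/testing pair can be sketched as follows. This is our illustration using the statsmodels library, which the original study did not necessarily use; `X_train`, `y_train`, `X_test`, and `y_test` are hypothetical arrays for one pair of subsets:

```python
import numpy as np
import statsmodels.api as sm

def probit_prediction_accuracy(X_train, y_train, X_test, y_test):
    """Fit a Probit model on the training subset and return the prediction
    accuracy on the corresponding hold-out (testing) subset."""
    model = sm.Probit(y_train, sm.add_constant(X_train)).fit(disp=0)
    p = model.predict(sm.add_constant(X_test))   # estimated P(class = 1)
    predicted = (p >= 0.5).astype(int)           # classify at the 0.5 cutoff
    return np.mean(predicted == y_test)

# Illustration on synthetic data, sized like a DOM large/small pair (98/49):
rng = np.random.default_rng(0)
X = rng.normal(size=(147, 2))
y = (X @ [1.0, -0.5] + rng.normal(size=147) > 0).astype(int)
train, test = slice(0, 98), slice(98, None)
print(probit_prediction_accuracy(X[train], y[train], X[test], y[test]))
```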
Insert Table 11 Here

One-way, two-way, and three-way analyses of variance (ANOVA) were performed to test the hypotheses. The results of the one-way ANOVA, as illustrated in Table 12, show that DATA (the characteristics of the data set) had a significant effect on the prediction accuracy (p = 0.01%, R^2 = 0.7062). Both methods performed better on DOM (the industry-dominated data set). This result is no surprise; it may be because DOM was less noisy. For example, the degree of industry dominance was by definition much higher in DOM than in NDOM, which increased the prediction accuracy. The result indicates that the less noisy a data set is, the more accurate the resulting model will be. The effects of METHOD and SIZE were not significant at the 5% level.

Insert Table 12 Here

Since the insignificance of METHOD and SIZE could be attributed to the overwhelming DATA effect, a two-way ANOVA was conducted on the DOM and NDOM data sets separately. The results, as shown in Table 13, indicate that METHOD was significant in both DOM (p = 4.93%) and NDOM (p = 0.01%), whereas SIZE and the interaction of SIZE and METHOD were significant in DOM only (p = 0.17% and p = 2.53%, respectively). Combining these findings with the descriptive statistics in Table 11(b), we find that the ID3 algorithm outperformed Probit in DOM (0.8910 versus 0.8663) but was significantly worse in NDOM (0.6193 versus 0.7244). This confirms the hypotheses on data characteristics, H1.1 and H1.2. In DOM, the prediction accuracy decreased significantly (p = 0.17%) when the sample size decreased. The same trend was observed in NDOM, but the effect was not significant at the 5% level (p = 7.14%). The significance of SIZE*METHOD in DOM indicates that the reduction of sample size had different effects on the two methods. In order to further understand the details of the interaction among the factors, a three-way ANOVA was conducted.

Insert Table 13 Here

Table 14 illustrates the results of the three-way ANOVA. The main effects of all three factors became significant in this analysis; in other words, they all had significant effects on the prediction accuracy of the derived model. In addition, the interactions of SIZE and METHOD and of METHOD and DATA were also significant, at p = 3.89% and p = 0.01%, respectively. The effect of SIZE*DATA and the interaction of the three factors were not significant at the 5% level, although the latter was significant at the 10% level (p = 8.47%).

Insert Table 14 Here

The significance of the interaction between SIZE and METHOD again supports our previous argument that the reduction of sample size had different effects on the two methods. Combining this result with the statistics in Table 11(c), we find that the prediction accuracy of Probit showed two sharp decreases. In NDOM, its accuracy decreased from 0.7666 in L/S to 0.7000 in M/M; in DOM, its accuracy decreased from 0.8918 in M/M to 0.8092 in S/L. This effect, as portrayed in Figure 5, was not seen in the ID3 case, although ID3's accuracy did decline slightly as the training sample size decreased. This result confirms hypothesis H2, that Probit is more sensitive to a reduction in training sample size. The significance of METHOD*DATA indicates that the characteristics of the data set affected the prediction accuracy of the two methods differently. This is consistent with the results of the two-way ANOVA supporting hypotheses H1.1 and H1.2.

Insert Figure 5 Here
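Given the 120 accuracy observations (60 subset pairs times two methods), the ANOVAs of Tables 12 through 14 can be run along the following lines. This is our sketch using statsmodels; the data frame `df` and its column names are hypothetical:

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# df: one row per training/testing pair and method, with
# SIZE in {"L/S", "M/M", "S/L"}, METHOD in {"ID3", "Probit"},
# DATA in {"NDOM", "DOM"}, and acc = prediction accuracy in [0, 1].
def three_way_anova(df: pd.DataFrame) -> pd.DataFrame:
    """Full-factorial 2 x 2 x 3 ANOVA of prediction accuracy, as in Table 14
    (all main effects plus all interactions)."""
    model = smf.ols("acc ~ C(SIZE) * C(METHOD) * C(DATA)", data=df).fit()
    return anova_lm(model, typ=2)
```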
6. IMPLICATIONS AND CONCLUSION

In this paper, we have presented a non-parametric method for accounting research and two experiments examining some methodological issues. In the first experiment, we found that the selection of variables affected the significance of variables and hence the interpretation of research findings. In order to reduce inconsistent findings in classification research, therefore, special attention must be paid to the selection of the variables to be included in a study. In addition, if previous findings from different research are to be compared, the effect of variable selection must be considered.

In the second experiment, we found that data characteristics, training sample size, and data analysis method had significant effects on the performance of the resulting model of LIFO/FIFO choice. In addition, the interactions between data characteristics and method, and between sample size and method, were also significant.

The implications of these findings are two-fold. First, concerning research on LIFO/FIFO choice, the effects of different data analysis methods and of the dominance of the industry SIC code need to be investigated. As observed in the first study, the industry SIC code was considered the most important factor in the decision model derived by ID3, but was insignificant in the model derived by Probit. Most previous research adopted Probit and tended to seek firm-specific economic reasons to explain the LIFO/FIFO decision. Such research may have been subject to the limitations of Probit in handling discrete variables, as discussed in Sections 2 and 3. Therefore, studies that use methods other than Probit, or that focus on the industry level by using either industry-specific data or data aggregated by industry, would be desirable.

Second, concerning accounting classification research in general, classification algorithms other than statistical methods may provide more reliable results under certain circumstances. Researchers need to consider all the alternative methods available in order to increase the reliability of their results. The ID3 algorithm studied in this article is only one representative of AI methods. Other algorithms and newer versions of ID3 may provide different results. Of course, there is no universally best methodology. Therefore, the selection of methodology becomes very important to the validity of the results.

If a choice is to be made between Probit and ID3, data characteristics and sample size are two major factors that need to be considered. In general, Probit performs better when the effect of nominal variables in the data set is less significant, and ID3 performs better otherwise. This is due to their assumptions about the data distribution and their criteria for constructing decision models. The normality assumption of Probit makes it more sensitive to a decrease in sample size and makes nominal variables difficult to handle. Its hurdle level, the sample size at which a sharp decrease in accuracy occurs, is higher than that of ID3. The lack of a normality assumption in ID3, however, leads to its poorer performance in handling large samples dominated by continuous variables; but its repetitive decomposition algorithm allows it to handle nominal variables well. From this brief analysis, we have compared the effects of using Probit and ID3 in studying the LIFO/FIFO choice and shown that Probit and ID3 are complementary methods for accounting classification research.

Due to the exploratory nature of this work and the complexity of the issue, further research needs to be conducted to fully understand the choice of methodology for accounting research. Directions include at least the following:

(1) Other data characteristics. In this work, we examined only the degree of dominance of a single nominal variable. The cases of multiple nominal variables and other criteria for classifying data characteristics need to be investigated.

(2) Other accounting problems. The results were obtained from LIFO/FIFO choice data. Further work may be done on other accounting classification problems. Bankruptcy prediction from financial reports, for example, may also involve both nominal and numerical variables and exhibit similar effects.

(3) Other methodologies. As stated previously, ID3 is only one representative of AI methods. There are other AI methods, such as Michalski's AQ approach (Michalski and Chilausky, 1980), and algorithms outside the AI area that may also be useful for accounting research and need to be examined.

REFERENCES
Abdel-Khalik, A.R. (1985), "The Effect of LIFO-Switching and Firm Ownership on Executives' Pay," Journal of Accounting Research, Autumn, pp. 427-447.

Abdel-Khalik, A.R. and J.C. McKeown (1978), "Understanding Accounting Changes in the Efficient Market: Evidence of Differential Reaction," The Accounting Review, pp. 851-868.

Aldrich, J.H. and F.D. Nelson (1984), Linear Probability, Logit and Probit Models, Beverly Hills, CA: Sage Publications.

Altman, E.I., R.B. Avery, R.A. Eisenbeis, and J.F. Sinkey, Jr. (1981), Application of Classification Techniques in Business, Banking, and Finance, Greenwich, CT: JAI Press.

Biddle, G.C. (1980), "Accounting Methods and Management Decisions: The Case of Inventory Costing and Inventory Policy," Supplement to the Journal of Accounting Research, pp. 235-280.

Braun, H. and J. Chandler (1987), "Predicting Stock Market Behavior Through Rule Induction: An Application of the Learning-From-Examples Approach," Decision Sciences, pp. 415-429.

Cushing, B. and M. LeClere (1988), "Evidence on the Determinants of Inventory Accounting Policy Choice," Working Paper, Pennsylvania State University.

Daley, L.A. and R.L. Vigeland (1983), "The Effects of Debt Covenants and Political Costs on the Choice of Accounting Methods: The Case of Accounting for R&D Costs," Journal of Accounting and Economics, 5, pp. 195-211.

Dhaliwal, D., G. Salamon, and E.D. Smith (1982), "The Effect of Owner versus Management Control on the Choice of Accounting Method," Journal of Accounting and Economics, July, pp. 41-53.

Dietrich, R. and R. Kaplan (1982), "Empirical Analysis of the Commercial Loan Classification Decision," The Accounting Review, 57:1, pp. 18-38.

Dopuch, N. and M. Pincus (1988), "Evidence on the Choice of Inventory Accounting Methods: LIFO versus FIFO," Journal of Accounting Research, Spring, pp. 28-59.

Eggleton, I.R.C., S.H. Penman, and J.R. Twombly (1976), "Accounting Changes and Stock Prices: An Examination of Selected Uncontrolled Variables," Journal of Accounting Research, pp. 66-88.

Frydman, H., E.I. Altman, and D.L. Kao (1985), "Introducing Recursive Partitioning for Financial Classification: The Case of Financial Distress," Journal of Finance, 40:1, pp. 269-291.

Hagerman, R.L. and M.E. Zmijewski (1979), "Some Economic Determinants of Accounting Policy Choice," Journal of Accounting and Economics, August, pp. 141-161.

Halperin, R.M. and W.N. Lanen (1987), "The Effects of the Thor Power Tool Decision on the LIFO/FIFO Choice," The Accounting Review, April, pp. 378-384.

Hunt, E.B., J. Martin, and P. Stone (1966), Experiments in Induction, New York: Academic Press.

Hunt, H.G. (1985), "Potential Determinants of Corporate Inventory Accounting Decisions," Journal of Accounting Research, Autumn.

Krippendorff, K. (1986), Information Theory: Structural Models for Qualitative Data, Beverly Hills, CA: Sage Publications.

Lee, C. and D. Hsieh (1985), "Choice of Inventory Accounting Methods: Comparative Analysis of Alternative Hypotheses," Journal of Accounting Research, Autumn, pp. 468-485.

Marais, M.L., J.M. Patell, and M.A. Wolfson (1984), "The Experimental Design of Classification Models: An Application of Recursive Partitioning and Bootstrapping to Commercial Bank Loan Classifications," Supplement to the Journal of Accounting Research, pp. 87-114.

Messier, W.F., Jr. and J.V. Hansen (1988), "Inducing Rules for Expert Systems Development: An Example Using Default and Bankruptcy Data," Management Science, December, pp. 1403-1415.
Michalski, R.S. and R.L. Chilausky (1980), "Learning By Being Told and Learning From Examples: An Experimental Comparison of the Two Methods of Knowledge Acquisition in the Context of Developing Expert Systems for Soybean Disease Diagnosis," International Journal of Policy Analysis and Information Systems, 4:2, pp. 125-161.

Mingers, J. (1987), "Expert Systems: Rule Induction With Statistical Data," Journal of the Operational Research Society, 38:1, pp. 39-47.

Morse, D. and G. Richardson (1983), "The LIFO/FIFO Decision," Journal of Accounting Research, Spring, pp. 106-127.

Noreen, E. (1988), "An Empirical Comparison of Probit and OLS Regression Hypothesis Tests," Journal of Accounting Research, Spring, pp. 119-133.

Ohlson, J. (1980), "Financial Ratios and the Probabilistic Prediction of Bankruptcy," Journal of Accounting Research, 18:1, pp. 109-131.

Quinlan, J.R. (1979), "Discovering Rules From Large Collections of Examples: A Case Study," in D. Michie (ed.), Expert Systems in the Micro Electronic Age, Edinburgh: Edinburgh University Press.

Quinlan, J.R. (1982), "Semi-autonomous Acquisition of Pattern-based Knowledge," in D. Michie (ed.), Introductory Readings in Expert Systems, London: Gordon & Breach.

Ricks, W. (1982), "The Market's Response to the 1974 LIFO Adoptions," Journal of Accounting Research, Autumn, Part I, pp. 367-387.

Shannon, C.E. and W. Weaver (1949), The Mathematical Theory of Communication, Urbana, IL: The University of Illinois Press.

Sunder, S. (1973), "Relationship between Accounting Changes and Stock Prices: Problems of Measurement and Some Empirical Evidence," Journal of Accounting Research (Supplement), pp. 1-45.

Industry Type   Net Sales   Accounting Method
Lumber          200         FIFO
Lumber          152         FIFO
Lumber          312         FIFO
Lumber          600         FIFO
Lumber          63          FIFO
Lumber          400         FIFO
Metal           1000        FIFO
Metal           500         LIFO
Metal           1521        LIFO
Metal           2300        LIFO
Metal           1420        LIFO
Metal           650         LIFO
Metal           2000        LIFO
Metal           1500        LIFO

Note: 1. Net sales are in millions of dollars.

Table 1. A Sample Data Set

Variable Name                        This Paper   Lee & Hsieh   Dopuch & Pincus
Net sales                                *             *               *
Total assets                             *             *
CV of net sales                          *
Relative frequency of sales growth       *
Net sales growth                         *
CV of net sales growth                   *
Inventory                                *
CV of inventory                          *             *               *
Inventory/Net sales                      *             *               *
Inventory/Total assets                   *             *               *
Net income                               *
CV of net income                         *
Long-term debt/Equity                    *             *               *
Gross capital intensity                  *             *               *
CV of input price                        *                             *
Growth of input price                    *             *
CV of growth of input price              *

Table 2. Numerical Variables Included in the Initial Data Set
SIC Code   Description                                                     FIFO Firms   LIFO Firms
2000       FOOD AND KINDRED PRODUCTS                                            6            1
2200       TEXTILE MILL PRODUCTS                                                3            3
2300       APPAREL AND OTHER FINISHED PRODUCTS MADE FROM
           FABRICS AND SIMILAR MATERIALS                                       14            3
2400       LUMBER AND WOOD PRODUCTS, EXCEPT FURNITURE                           5            1
2500       FURNITURE AND FIXTURES                                               1
2600       PAPER AND ALLIED PRODUCTS                                            3            2
2700       PRINTING, PUBLISHING, AND ALLIED INDUSTRIES                         10            3
2800       CHEMICALS AND ALLIED PRODUCTS                                       13            4
2900       PETROLEUM REFINING AND RELATED INDUSTRIES
3000       RUBBER AND MISCELLANEOUS PLASTIC PRODUCTS                            3            4
3100       LEATHER AND LEATHER PRODUCTS
3200       STONE, CLAY, GLASS, AND CONCRETE PRODUCTS
3300       PRIMARY METAL INDUSTRIES                                             1            7
3400       FABRICATED METAL PRODUCTS, EXCEPT MACHINERY AND
           TRANSPORTATION EQUIPMENT                                             8            7
3500       INDUSTRIAL AND COMMERCIAL MACHINERY AND COMPUTER
           EQUIPMENT                                                           14            7
3600       ELECTRONIC AND OTHER ELECTRICAL EQUIPMENT AND
           COMPONENTS, EXCEPT COMPUTER EQUIPMENT                               60            2
3700       TRANSPORTATION EQUIPMENT                                            15            1
3800       MEASURING, ANALYZING, AND CONTROLLING INSTRUMENTS;
           PHOTOGRAPHIC, MEDICAL AND OPTICAL GOODS; WATCHES
           AND CLOCKS                                                          20            0
3900       MISCELLANEOUS MANUFACTURING INDUSTRIES                               6            2
5000       WHOLESALE TRADE - DURABLE GOODS                                     12            4
5100       WHOLESALE TRADE - NONDURABLE GOODS                                  11            0
5300       GENERAL MERCHANDISE STORES                                           1
5900       MISCELLANEOUS RETAIL                                                 6            3
TOTAL                                                                         217           60

Table 3. Distribution of Sample Firms

                                       FIFO FIRMS              LIFO FIRMS
VARIABLES                            MEANS    ST. DEV       MEANS    ST. DEV
Net sales                            $341M    $885M         $1,247M  $3,079M
CV of net sales                      .3954    .2073         .3089    .1114
Net sales growth                     .1410    .1165         .0992    .1114
CV of net sales growth               3.723    11.36         3.155    7.418
Relative frequency of sales growth   .7839    .1798         .7852    .1556
Total assets                         $220M    $650M         $1,023M  $2,781M
Inventory                            $69M     $171M         $148M    $315M
CV of inventory                      .4144    .2265         .2776    .1164
Net income                           $30M     $97M          $92M     $284M
CV of net income                     .4498    20.516        1.051    7.2037
Long-term debt/Equity                .5094    .7222         .3643    .2637
Inventory/net sales                  .2126    .0910         .1636    .0727
Inventory/total assets               .3081    .1211         .2627    .1263
Gross capital intensity              .3141    .2164         .4649    .2880
CV of input price                    .1961    .0304         .2086    .0542
Growth of input price                .0679    .0132         .0734    .0171
CV of growth of input price          .6285    .4146         .6230    .3096

Table 4. Means and Standard Deviations for LIFO and FIFO Firms

Table 5. Variables Included in Each Data Set

Table 6. Models Derived From Data Sets Excluding Industry Type
Variable Name                  Model 1 (Data set 4)   Model 2 (Data set 5)   Model 3 (Data set 6)
Industry type (dummies)        coefficients < 0.01 and insignificant
Net sales                      1.781*                 --                     1.783*
CV of net sales                .993                   1.291                  --
CV of net sales growth         --                     --                     -1.119
Total assets                   --                     1.450                  --
CV of inventory                -2.742***              -3.082***              -3.154***
Long-term debt/equity          -3.188***              -3.017***              -2.837***
Inventory/net sales            -3.193***              --                     -3.221***
Inventory/total assets         --                     -1.097                 --
Gross capital intensity        1.471                  1.254                  1.331
Growth of input price          2.191*                 2.121*                 2.319*
CV of growth of input price    1.328                  1.333                  1.380
Log likelihood ratio           127.27                 115.94                 128.30

Note: * significant at least at the 5% level; ** significant at least at the 1% level;
*** significant at least at the .1% level.

Table 7. Models Derived From Data Sets Including Industry Type

Situation                                  Model 1   Model 2   Model 3
Probit method (no industry type)           83.03     81.59     82.67
Probit method (with industry type)         86.28     87.00     85.92
ID3 method (with industry type)            100.0     100.0     100.0

Table 8. Percentage of Correct Classification

Non-industry-dominated              Industry-dominated
SIC code   FIFO   LIFO              SIC code   FIFO   LIFO
22         3      3                 20         6      1
26         3      2                 23         14     3
27         10     3                 24         5      1
28         13     4                 33         1      7
30         3      4                 36         60     2
34         8      7                 37         15     1
35         14     7                 38         20     0
39         6      2                 51         11     0
50         12     4
59         6      3
Total      78     39                Total      132    15

Table 9. Composition of the Two Data Sets for the Second Experiment

            Non-industry-dominated         Industry-dominated
            Train   Test   Total           Train   Test   Total
Large       78      39     117             98      49     147
Medium*     58      58     116             78      78     146
Small       39      78     117             49      98     147

* One case was randomly held out to make the training and testing sample sizes equal.

Table 10. Sizes of the Training and Testing Data Subsets

(a) Means and variances by factor levels

Factor    Level    N    Mean    Variance
SIZE      L/S      40   .7969   .0150
SIZE      M/M      40   .7771   .0155
SIZE      S/L      40   .7518   .0150
METHOD    ID3      60   .7552   .0209
METHOD    Probit   60   .7953   .0091
DATA      NDOM     60   .6718   .0060
DATA      DOM      60   .8787   .0031

(b) Means and variances by factor levels and data sets

                          NDOM                      DOM
Factor    Level     N    Mean    Variance     N    Mean    Variance
SIZE      L/S       20   .6948   .0077        20   .8990   .0012
SIZE      M/M       20   .6630   .0035        20   .8911   .0010
SIZE      S/L       20   .6576   .0065        20   .8459   .0056
METHOD    ID3       30   .6193   .0033        30   .8910   .0011
METHOD    Probit    30   .7244   .0031        30   .8663   .0049

(c) Means and variances by experimental cells

                          NDOM                      DOM
SIZE    METHOD      N    Mean    Variance     N    Mean    Variance
L/S     ID3         10   .6230   .0028        10   .9000   .0010
L/S     Probit      10   .7666   .0020        10   .8980   .0016
M/M     ID3         10   .6261   .0016        10   .8940   .0012
M/M     Probit      10   .7000   .0027        10   .8918   .0010
S/L     ID3         10   .6088   .0060        10   .8827   .0011
S/L     Probit      10   .7064   .0024        10   .8092   .0078

Table 11. Means and Variances of the Prediction Accuracy

Factor    Source   DF    SS       MS       F         P (%)    R^2
SIZE      Model    2     .0410    .0205    1.35      26.36    .0225
          Error    117   1.7764   .0152
          Total    119   1.8174
METHOD    Model    1     .0484    .0484    3.23      7.5      .0266
          Error    118   1.7690   .0150
          Total    119   1.8174
DATA      Model    1     1.2835   1.2835   283.7**   .01      .7062
          Error    118   .5339    .0045
          Total    119   1.8174

** Significant at the .01% level

Table 12. One-way ANOVA Results

(a) Two-way ANOVA on the non-industry-dominated data

Source        DF   SS      MS      F        P (%)    R^2
SIZE          2    .0162   .0081   2.8      7.14     .5524
METHOD        1    .1655   .1655   56.8**   .01
SIZE*METHOD   2    .0126   .0063   2.2      12.55
ERROR         54   .1574   .0029
TOTAL         59   .3517

** Significant at the .01% level

(b) Two-way ANOVA on the industry-dominated data

Source        DF   SS      MS      F        P (%)   R^2
SIZE          2    .0328   .0164   7.23**   .17     .3282
METHOD        1    .0092   .0092   4.04*    4.93
SIZE*METHOD   2    .0179   .0089   3.94*    2.53
ERROR         54   .1224   .0023
TOTAL         59   .1822

* Significant at the 5% level; ** significant at the 1% level

Table 13. Two-way ANOVA Results
Source             DF    SS       MS       F         P (%)   R^2
SIZE               2     .0410    .0205    7.9**     .06     .8460
METHOD             1     .0484    .0484    18.7**    .01
DATA               1     1.2835   1.2835   495.4**   .01
SIZE*METHOD        2     .0173    .0087    3.4*      3.89
SIZE*DATA          2     .0080    .0040    1.5       21.85
METHOD*DATA        1     .1263    .1263    48.75**   .01
SIZE*METHOD*DATA   2     .0131    .0066    2.53      8.47
ERROR              103   .2798    .0026
TOTAL              119   1.8174

* Significant at the 5% level; ** significant at the .1% level

Table 14. Three-way ANOVA Results

Figure 1. A Sample Decision Tree (industry type at the root; the lumber branch ends in a FIFO leaf; the metal branch splits on net sales at 1210 and then at 825)

Figure 2. Decision Tree for Model 1

Figure 3. Decision Tree for Model 2

Figure 4. Decision Tree for Model 3

Figure 5. Prediction Accuracies of ID3 and Probit (accuracy versus training sample size L/S, M/M, S/L, plotted separately for the DOM and NDOM data sets)