College & Research Libraries • November 1978

GEORGE V. HODOWANEC

An Acquisition Rate Model for Academic Libraries

With circulation assumed to imply use and thus need, multiple regression analysis was employed to determine which variables best correlate with circulation. Three were identified: number of books added; full-time equivalent size of student body; and undergraduate and graduate courses offered. A "T" test showed no significant difference between the means of per student circulation differentiated by collection size and population mean of the entire sample. A similar "T" test for per student rate of acquisition revealed no significant difference between the means of individual libraries and the population mean. A regression equation recommending a predictive value for the number of books to be added was developed.

George V. Hodowanec is director, William Allen White Library, Emporia State University, Emporia, Kansas. The author acknowledges the critical review and valuable suggestions made by Jasper G. Schad, director of libraries, Wichita State University, and John M. Burger, professor of mathematics, Emporia State University.

HOW CAN LIBRARIANS QUANTIFY the acquisition rate and substantiate their request for annual funding of library resources? The criteria espoused in existing acquisition formulas are based on minimum collection size or the number of faculty and students as well as graduate and undergraduate programs. The recommended number of volumes to be added for each such component is based on empirical analysis. Can variables that may have an effect on the rate of acquisition be identified and analyzed and later put into some type of formula? This is the question analyzed in this study.

An assumption was made that, in spite of certain built-in weaknesses, circulation implies use, which, in turn, is a valid predictor of user needs. If this is the case, then what factors affect circulation? A longitudinal study showed that the rate of circulation of newly acquired materials drops off at a rate approximately equal to one-half of the previous year's circulation.¹ In general, new materials circulate more frequently than older materials.² It has also been shown that course-related materials circulate more frequently than books that are not subject-related to the courses offered.³,⁴

Based upon the responses of libraries in this study, the correlation coefficient between the number of books in the library and the number of books checked out is 0.72. However, the same two variables calculated on a per student basis yield a much lower correlation coefficient, that of 0.35. Both coefficients show the existence of a definite relationship between circulation and collection size. Naturally, the larger the collection the greater the number of books that will be circulated.⁵,⁶

However, it also has been shown that only a fraction of the collection meets the majority of user needs.⁷ Therefore, as collection size grows, the corresponding per student circulation does not increase at the same rate. For this reason, the correlation coefficient between collection size and circulation, calculated on a per student basis, is smaller. The mean number of volumes per student (PSV) in the collection in this sample was found to be 82.0 books; the mean number of books checked out was 25.4. It appears that many libraries circulate more than 25.4 books per student with fewer than 82.0 volumes per student in the collection.
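
Note 12 describes R² as the share of variance attributable to the independent variable. For a single predictor this share is the square of the correlation coefficient, so the two coefficients reported above can be compared on that scale. The following is a minimal Python sketch; the function name `shared_variance` is an illustrative choice and does not come from the study.

```python
def shared_variance(r: float) -> float:
    """Return r squared, the fraction of variance two variables share."""
    return r * r


# Correlation coefficients reported in the text.
r_totals = 0.72       # books held vs. books checked out (library totals)
r_per_student = 0.35  # the same two variables on a per student basis

print(f"totals:      {shared_variance(r_totals):.2f}")
print(f"per student: {shared_variance(r_per_student):.2f}")
```

Read this way, collection size accounts for roughly half of the variation in total circulation but only about 12 percent once both variables are expressed per student.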
With a lower correlation coefficient between library holdings (books) and circulation, calculated on per full-time equivalent student basis, it was concluded that there ought to be an acquisitions rate range for any university or college library that can be justified in terms of the frequency of circulation; and, conversely, acquiring materials beyond the suggested rate of acquisition would do little to further increase circulation. William B. Rouse's mathematical model for predicting circulation based on a recommended rate of annual acquisitions suggests that calculating such acquisition rate guidelines is feasible.⁸

In the most frequently quoted guidelines for collection size, the Clapp-Jordan, and its modified version, the Washington State formula, the acquisition budget is justified in terms of collection size. A specified number of books are to be added for every student, faculty member, and academic program until the collection reaches a prescribed size. Both formulas recognize the need for specifying the annual growth allocation. The Clapp-Jordan formula suggests an increment of 6 percent of the base collection, while the Washington State formula recommends a 5 percent increment of the minimum size of the collection calculated by the formula.

The acquisition growth rates recommended by both formulas are based on empirical analysis. The question is: How valid are these recommended figures? In one of the most thorough analyses of the Clapp-Jordan formula, McInnis concluded that this formula as stated is not statistically verifiable.⁹ The weight assigned to this formula, therefore, may be useful as a general guide but lacks statistical validation. Since the Washington State formula is a modification of the Clapp-Jordan formula, one can also question the statistical validation of this formula.

Melvin J. Voigt developed an acquisition rate model formula for large universities with extensive advanced graduate programs.
It is based upon an empirically developed base figure of forty thousand books, which is to be added annually by a university offering doctorates in at least ten areas. There are further adjustments, made by adding a specific number of volumes to the base figure, for any additional graduate programs or number of undergraduate students over the initial five thousand students or extensive involvement in sponsored research. It appears that the base figure as well as the adjustments for additional academic programs is related to the annual rate of publishing.¹⁰

The Voigt model emphasizes the annual rate of acquisition, which is not based upon the existing size of the collection; however, it is geared to larger academic libraries where annual acquisitions are less selective and more inclusive. The formula is rather general with no apparent statistical validation and with limitations as in the Clapp-Jordan formula.¹¹

METHOD

A questionnaire was mailed to 1,001 randomly selected academic libraries in the U.S. Only those institutions offering at least a bachelor's degree were included. About 400 questionnaires were returned after one follow-up. Not all questionnaires were completely filled out by responding libraries. Depending upon the nature of the comparison, usable responses varied from 97 to 325. This represents 5.4 percent to 18 percent of libraries in the U.S.

A twelve-variable correlation and multiple regression analysis was conducted to ascertain what factors affect the circulation rate. R² and the F test were calculated to determine the variance accounted for and the statistical significance of the variables analyzed.¹² A regression line was developed using circulation and acquisition variables calculated on a per FTE (full-time equivalent) student basis. The circulation and the acquisition range as reported by the responding libraries were compared with the corresponding acquisition range on the regression line.
Using the three variables with the highest correlations (see table 1), the regression equation for predicting the recommended rate of acquisition was developed. The responding libraries were grouped by collection size, and the average values of

1. the number of books added by each group of libraries;
2. the number of FTE students in each group of libraries; and
3. the number of the undergraduate and graduate courses offered by each institution with a given size of library collection

were used to calculate the recommended rate of acquisition for each group of libraries within a given collection size.

TABLE 1
SIMPLE CORRELATION AND ANALYSIS OF VARIANCE BETWEEN CIRCULATION AND SELECTED VARIABLES

                                                        df            df
Variable                          Circulation    R²     Regression    Residual    F test
Number of Books Added                 .74        .55        4            92        75.9*
Undergraduate/Graduate Courses        .71        .50        8            88        40.6*
FTE Size/Students                     .69        .48       11            85        28.6*

*p < .01.
R²: measure of the fluctuations accounted for by the introduction of the circulation factor.
df: degrees of freedom.
F test: measurement of significant dependency of the circulation vs. each of the variables listed in the far left column.

The libraries were grouped according to collection size to standardize comparative analysis. To arrive at a uniform unit of measure independent of the size of the student body, the rate of circulation and the rate of acquisition were figured on a per student basis. Thus PSC (per student circulation) represents the number of volumes circulated per FTE student as reported by the responding institution. Per student acquisition (PSA) was calculated in the same way. The mean per student circulation for the 292 libraries responding was 25.4 books per student.

Responding libraries whose number of volumes in the collection calculated on a per student basis, PSC, or PSA deviated from the mean by more than three standard deviations were eliminated from further calculations. Based upon statistical probability, a library can be expected to report such a deviation about once in 500 cases. Any more frequent occurrence is atypical. This working sample included about 300 libraries, with thirteen reporting that one of the three variables varied by more than three standard deviations from the mean. Retaining these thirteen libraries not only would have distorted the total sample but also would have been unrepresentative of this kind of sample.

RESULTS

Simple correlations for the three most significant variables and the analysis of variance for the same variables based upon a multiple stepwise regression analysis are shown in table 1. The F ratios are all significant at p < .01.

The F values include the intercorrelationary effects of other variables. As shown in the regressional degrees of freedom, the "number of books added" variable had three other variables affecting its F value, while the "FTE size/students" had ten other variables. To double-check against possible cumulative intercorrelationary effects of other variables on the variables listed in table 1, separate F tests were run for the "number of books added vs. circulation" and "FTE size/students vs. circulation." The values for F ratios are 403.8 and 533.3 respectively with 1 and 298 degrees of freedom,¹³ all significant values at p < .01.

After the variables were identified that most strongly correlated with circulation, a statistical test was applied to determine whether use as measured by circulation varies significantly with size of collection and, thus, indirectly with the size of the student body. There appears to be little deviation from the mean in PSC among the six groups of libraries, as is shown in table 2. The null hypothesis tested was as follows:

    The mean of individual groups of libraries differentiated by collection size does not vary significantly from the population mean.
A "T" test was applied, and the null hypothesis cannot be rejected at less than the .10 level of significance. This means that the six groups of libraries do not deviate significantly from the mean of the entire sample population; therefore, collection size is not a significant factor in PSC variations.¹⁴

TABLE 2
PER STUDENT CIRCULATION (PSC) MEANS GROUPED BY COLLECTION SIZE

Collection Size      PSC Mean     df     "t" Value*
0-99,999              23.65      123      -1.50
100,000-199,999       26.77       70       0.85
200,000-299,999       25.59       39       0.08
300,000-399,999       28.37       20       1.01
400,000-899,999       24.90       18      -0.17
900,000+              29.48       24       1.50

*p < .10

The second null hypothesis tested was to determine if there is any significant variation in the PSA rate between the means for libraries grouped by collection size and the mean of the entire population sample. The hypothesis tested was as follows:

    There is no significant difference between the mean of per student acquisition rate for each group of libraries differentiated by collection size and the mean for the entire population sample.

As shown in table 3, the hypothesis cannot be rejected in any of the library groups. This would indicate that the mean PSA expenditure for individual library groups differentiated by collection size is not significantly different from the PSA population mean for all the libraries in the sample. Therefore, expenditures for books on per student basis do not vary significantly between the smaller and the larger libraries. Naturally, the total amount spent varies with the size of the student body; however, since use is dependent upon continuous acquisitions, this dependency is proportionally uniform and does not differ significantly with collection size. The PSA population mean was 3.48 with a standard deviation of 2.5. The PSA reported by the responding libraries ranged between one and seven books.
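
The decision rule described in note 14 can be illustrated with the "t" values printed in table 2. The following Python sketch is hedged in two ways: the critical values are rounded figures from a standard t table rather than numbers given in the article, and treating the test as two-sided (.05 in each tail at the .10 level) is an assumption, since the article does not say. The names `CRITICAL_T`, `TABLE_2`, and `rejects_null` are illustrative.

```python
# Two-sided critical values of Student's t at the .10 level (.05 in each
# tail), taken from a standard t table and rounded to three decimals.
# Treating the test as two-sided is an assumption, not stated in the article.
CRITICAL_T = {123: 1.657, 70: 1.667, 39: 1.685, 24: 1.711, 20: 1.725, 18: 1.734}

# (collection size, df, reported t value) as printed in table 2.
TABLE_2 = [
    ("0-99,999", 123, -1.50),
    ("100,000-199,999", 70, 0.85),
    ("200,000-299,999", 39, 0.08),
    ("300,000-399,999", 20, 1.01),
    ("400,000-899,999", 18, -0.17),
    ("900,000+", 24, 1.50),
]


def rejects_null(t_value: float, df: int) -> bool:
    """True when |t| exceeds the critical value for the given df."""
    return abs(t_value) > CRITICAL_T[df]


rejected = {size: rejects_null(t, df) for size, df, t in TABLE_2}
for size, flag in rejected.items():
    print(f"{size:>16}: {'reject' if flag else 'cannot reject'}")
```

Under these assumptions none of the six groups exceeds its critical value, which agrees with the conclusion that collection size is not a significant factor in PSC variations.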
TABLE 3
PER STUDENT ACQUISITION (PSA) MEANS GROUPED BY COLLECTION SIZE

Collection Size      PSA Mean     df     "t" Value
0-99,999               3.73      121       1.10*
100,000-199,999        3.34       70      -0.47**
200,000-299,999        3.50       39       0.05**
300,000-399,999        3.28       20      -0.36**
400,000-899,999        3.48       18       0**
900,000+               3.40       24      -0.16**

*The null hypothesis cannot be rejected at less than .05 level of confidence.
**The null hypothesis cannot be rejected at less than .20 level of confidence.

The two variables that were found to have the best predictive potential for the recommended rate of per student acquisition were PSC (per student circulation) and UGC/ps (the number of undergraduate and graduate courses offered by the institution calculated on per FTE student basis). On the basis of the data provided by the responding libraries, the following multiple regression equation was developed to calculate the recommended number of books to be added on per FTE student basis:

    PSA = 1.98 + (0.0345)(PSC) + (2.39)(UGC/ps)

Where:
    PSA    = recommended value for the number of books to be added on per FTE student basis
    PSC    = per student circulation
    UGC/ps = number of undergraduate and graduate courses offered by the institution calculated on per FTE student basis

Other values in the equation represent equation constants derived during the process of developing the predictive multiple regression equation.

The predictive multiple regression equation enables one to calculate the recommended number of books to be added. It does not, however, offer any means of comparison between a particular library and other libraries that are similar in size but perhaps different due to unusual factors such as above-average size of student body, extremely large collections, special educational programs, or other distinguishing characteristics.

To provide each library with such means of comparison, the responding libraries were grouped by collection size, and average values were calculated for

1. the number of books added by each group;
2. the actual per student acquisition for each group;
3. the number of undergraduate and graduate courses offered; and
4. size of the student body.

The average value for any of the above categories was obtained by dividing the sum of reported values (such as the total number of books added by libraries with collection size 0-99,999 volumes) by the total number of FTE students. A similar process was used for obtaining other average values.

The predictive multiple regression equation was used to calculate the recommended number of books to be added using the average values as shown in table 4. It was reasoned that, if these figures are used to predict the recommended number of books to be added for each group of libraries grouped by collection size, a most representative predicted value for PSA will have been calculated. Any deviation from average values would have to be accounted for locally by the individual library.

Column B in table 4 gives the actual average number of books added as reported by the responding libraries. Column C gives the recommended number of books to be added, calculated for an average size of FTE student body. Columns D and E represent the same figures as columns B and C except they are given on a per student basis.

[TABLE 4: values not legible in the source; columns B through E are as described above.]

DISCUSSION

The assumption was made that circulation implies use and, therefore, predicts user needs. The effort was made to identify variables that correlate with the rate of acquisition and circulation. After such variables were identified, a multiple regression formula was developed showing that the predicted rate of acquisition can be best described on the basis of past circulation and the number of undergraduate and graduate courses offered by the institution.
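
The multiple regression formula summarized above can be applied mechanically. The following Python sketch reproduces the arithmetic printed in figures 1 and 2 from the PSC, UGC/ps, and FTE values reported there. The function names are illustrative, and truncating each product to two decimals is an assumption inferred from the printed intermediate values (0.84, 0.52, 1.39, 0.47); the article does not state a rounding rule.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

# Constants of the predictive equation as given in the article.
INTERCEPT = Decimal("1.98")
PSC_COEF = Decimal("0.0345")
UGC_COEF = Decimal("2.39")
TWO_PLACES = Decimal("0.01")


def recommended_psa(psc: str, ugc_ps: str, truncate: bool = True) -> Decimal:
    """PSA = 1.98 + (0.0345)(PSC) + (2.39)(UGC/ps).

    With truncate=True each product is cut to two decimals before summing,
    an assumption made here to match the figures' printed arithmetic.
    """
    psc_term = PSC_COEF * Decimal(psc)
    ugc_term = UGC_COEF * Decimal(ugc_ps)
    if truncate:
        psc_term = psc_term.quantize(TWO_PLACES, rounding=ROUND_DOWN)
        ugc_term = ugc_term.quantize(TWO_PLACES, rounding=ROUND_DOWN)
    return INTERCEPT + psc_term + ugc_term


def recommended_books(psa: Decimal, fte: int) -> int:
    """Recommended number of books: PSA times FTE students, rounded."""
    return int((psa * fte).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Case I (figure 1): PSC 24.51, UGC/ps 0.22, FTE 14,955.
psa_1 = recommended_psa("24.51", "0.22")
# Case II (figure 2): PSC 40.48, UGC/ps 0.20, FTE 4,491.
psa_2 = recommended_psa("40.48", "0.20")

print(psa_1, recommended_books(psa_1, 14955))
print(psa_2, recommended_books(psa_2, 4491))
```

Without the truncation step the Case I result would be about 3.35 rather than 3.34, so a library applying the formula to its own figures should expect differences of this size depending on how intermediate values are rounded.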
The recommended figure is more a measure of the average relationship than a suggestion of minimum rate of acquisition. It simply suggests that, given a particular set of conditions, the recommended rate of acquisition represents the best fit for that specific college or university library in relation to other libraries in the population sample.

One of the questions raised earlier concerned the acquisition rate range for a college or university library that could be justified in terms of use. Is it possible to identify such a range and show that acquiring materials beyond it would do little to further increase circulation? To answer this question, two equations with PSC as a dependent variable and PSA as an independent variable were developed. The linear equation with a moderate slope showed an incremental relationship between the PSA and PSC. The quadratic equation, which showed a higher correlation coefficient than the linear equation, was plotted and superimposed over the linear equation graph. It was concluded that the relationship between the PSC and PSA variables was represented better with a quadratic equation than a linear equation.

The linear equation demonstrated a continuous incremental relationship between PSA and PSC; the curvilinear equation showed PSC increase for corresponding PSA increment between 2.66 and 8.8 books per student. At the point where PSA equaled 8.8 and corresponding PSC equaled 33.7 books per student, the curvilinear equation reached the maximum, indicating that additional PSA will not yield a corresponding increase in PSC. The two equations and correlation coefficients (R) follow, and the range of values within which the increased PSA yielded a corresponding increase in PSC is shown in table 5.

    Ypsc = 20.5 + (1.3)(Xpsa)                        R = 0.24
    Ypsc = 13.9 + (4.45)(Xpsa) - (0.25)(Xpsa)²       R = 0.35

The comparison between two forms of equations of the same variables showed that the increased rate of PSA from 2.66 to 8.8 books resulted in a corresponding increase in PSC from 23.97 to 33.70 books. Further increase in PSA would not yield any further increase in PSC as shown by the curvilinear equation. Whether the increase in use (circulation) from 23.97 to 33.70 books checked out per student is justifiable in terms of per student acquisition increase from about 2.66 all the way up to 8.8 books per student is up to the individual library to determine.

The above range, naturally, reflects the central tendencies of "average" libraries. There are libraries with a smaller PSA rate and above-average circulation as well as libraries that buy more books per student than the recommended average and yet circulate fewer books per student than other, comparable libraries.

Two libraries were randomly selected to determine how close the four-variable multiple regression formula comes to the actual annual rate of acquisition as reported by the library (see figures 1 and 2).

To apply this formula, one has to calculate the PSC and UGC/ps for the individual library and multiply them by the constants. The constants for the predictive multiple regression equation are 0.0345 and 2.39 respectively. By adding these products to another constant, 1.98, one comes up with the recommended PSA. To calculate the recommended number of books to be added by the institution with a given FTE student size, one needs simply to multiply the calculated PSA by the number of FTE students.

The library in figure 1 acquired 86 percent of what is recommended by the multiple regression formula. However, the number of courses offered is 36 percent higher than the overall average for the number of courses offered by a university of this size.
If the overall average number of undergraduate and graduate courses offered by the university with this collection size is used (2,455) in place of the actual number of courses offered (3,346), then the recommended number of books to be added is 47,856, reducing the difference between the actual and recommended rate of book acquisitions from 14 percent to 10 percent. This clearly points to the conclusion that the courses offered by the institution have a definite effect on library use and, therefore, acquisition of books.

The number of volumes per student in this library's collection is within one-half standard deviation below the mean: not an outstanding, but a tolerable, condition.

The reported PSA rate for this library is well below the average PSA rate as recommended in table 4 or the recommended PSA rate calculated using the predictive multiple regression equation. Referring to the comparison of the linear and curvilinear equations (table 5) which show an incremental relationship between PSA and PSC, it appears that the circulation (and thus use) in this particular library would increase with corresponding increase in the rate of PSA. Its present PSC is 24.51, and the PSA is 2.89. If the library increased its PSA to the recommended PSA rate of 3.34, the corresponding student circulation could go up, according to the curvilinear equation, to 25.98, or roughly 26 books per student.

Following is the quadratic equation showing projected PSC based on the recommended PSA.

    Ypsc = 13.9 + (4.45)(Xpsa) - (0.25)(Xpsa)²
         = 13.9 + (4.45)(3.34) - (0.25)(3.34)²
         = 25.98

In the case shown in figure 2 the actual acquisition rate is 92 percent of the recommended number of books to be added. The PSA rate calculated by the predictive multiple regression equation is higher than the one recommended in table 4, possibly because the number of per student volumes in this collection is more than one standard deviation below the mean.
The number of books per student in the collection (PSV) for the entire population sample is 82 with a standard deviation of 48. This particular library's PSV is 24 books. At the same time, its per student circulation is 40.48 books, or almost 15 books per student above the mean. One possible explanation of these deviations would be that an overly small library collection forces heavy reliance upon a small fraction of the library's resources, such as the reserve book collection. Naturally, this is only an assumption and serves to illustrate that very few libraries will fit into most "average categories" as shown on table 4. Local peculiarities must be accounted for, using the mean values as a frame of reference.

    Library No.           585
    Circ.                 366,493
    Books Added           43,219
    FTE (as reported)     14,955
    U&G Courses           3,346
    UGC/ps                0.22
    PSC                   24.51
    PSA                   2.89
    Collection Size       783,515

    PSA = 1.98 + (0.0345)(PSC) + (2.39)(UGC/ps)
        = 1.98 + (0.0345)(24.51) + (2.39)(0.22)
        = 1.98 + 0.84 + 0.52
        = 3.34, or 49,950 books for the student body of 14,955

Fig. 1
Application of the Predictive Acquisition Rate Formula for a Randomly Selected Library: Case I

    Library No.           197
    Circ.                 181,816
    Books Added           15,875
    FTE (as reported)     4,491
    U&G Courses           904
    UGC/ps                0.20
    PSC                   40.48
    PSA                   3.53
    Collection Size       106,572

    PSA = 1.98 + (0.0345)(PSC) + (2.39)(UGC/ps)
        = 1.98 + (0.0345)(40.48) + (2.39)(0.20)
        = 1.98 + 1.39 + 0.47
        = 3.84, or 17,245 books for the student body of 4,491

Fig. 2
Application of the Predictive Acquisition Rate Formula for a Randomly Selected Library: Case II

IMPLICATIONS

The validity of the mathematical formula used to justify the acquisition rate must bear all inconsistencies inherent in the variables used to derive such a formula.
Referring to the above two libraries in particular, and to all libraries in general, one must account for the inaccuracies present in the data that weaken the predictive value of the dependent variable (number of books to be added). Factors that account for such inaccuracies include the following:

1. Similar courses are offered by more than one department.
2. Different institutions use a different frame of reference to calculate the FTE student body.
3. Circulation figures and acquisition figures are not arrived at uniformly by all libraries.
4. Each subject discipline has its own peculiarities and patterns of use.
5. Government documents are included as part of the total collection by some libraries and excluded by others.

The natural tendency is to attribute more to any mathematical formula than what it can possibly do. The multiple regression formula and correlation coefficients show that use and rate of acquisition are related. This relationship has been quantified to represent the best fit for the responding libraries. Naturally, it would be an error to assume that predictive values based on the practices of responding libraries reflect the best acquisition needs for all libraries.

Quantification of user needs is a very elusive area of research. The effort to quantify user information needs is based upon the assumption that circulation implies not only use but actual need. There is no way, for instance, to measure how frequently the user checks out a certain book simply because the exact book the reader wanted was not available. Therefore, not only must each library applying this formula carefully analyze its own peculiarities, but an effort to quantify acquisition rate must be validated with further research.

FURTHER RESEARCH

The acquisition rate formula is designed to provide a recommendation as to the number of books that should be acquired by a given library. Nothing was said concerning which books to acquire.
Since it has been shown that use is curriculum-related, efforts should be undertaken to study frequency of use as related to specific academic disciplines. Further correlationary analysis of the circulation patterns affected by the curricular programs and related to the publishing output in corresponding subject areas should give new insight into the desirable rates of acquisition.

REFERENCES

1. Stephen Bulick, K. Leon Montgomery, John Feltermann, and Allen Kent, "Use of Library Materials in Terms of Age," Journal of the American Society for Information Science 27:175-78 (May-June 1976).
2. H. E. Fussler and J. L. Simon, Patterns in the Use of Books in Large Research Libraries (Chicago: Univ. of Chicago, 1969).
3. William E. McGrath, "The Significance of Books Used According to a Classified Profile of Academic Departments," College & Research Libraries 33:212-19 (May 1972).
4. George M. Jenks, "Circulation and Its Relationship to the Book Collection and Academic Departments," College & Research Libraries 37:145-52 (March 1976).
5. William E. McGrath, "Predicting Book Circulation by Subject in a University Library," Collection Management 1:7-23 (Fall/Winter 1976-77).
6. Thomas John Pierce, "The Economics of Library Acquisitions: A Book Budget Allocation Model for University Libraries" (Ph.D. dissertation, Univ. of Notre Dame, 1976).
7. Richard W. Trueswell, "User Circulation Satisfaction vs. Size of Holdings at Three Academic Libraries," College & Research Libraries 30:204-13 (May 1969).
8. William B. Rouse, "Circulation Dynamics: A Planner's Model," Journal of the American Society for Information Science 25:258-63 (Nov.-Dec. 1974).
9. R. Marvin McInnis, "The Formula Approach to Library Size: An Empirical Study of Its Efficacy in Evaluating Research Libraries," College & Research Libraries 33:190-98 (May 1972).
10. Melvin J. Voigt, "Acquisition Rates in University Libraries," College & Research Libraries 36:263-71 (July 1975).
11. Michael Moran, "The Concept of Adequacy in University Libraries," College & Research Libraries 39:85-93 (March 1978).
12. R² (index of determination or coefficient of multiple determination) measures the extent of variance that comes from the independent variable(s). For instance, R² = .64 tells that 64 percent of the variance of the Y (dependent variable) must have come from the X (independent variable[s]). Thirty-six percent must have come from other variables. For more detailed discussion of R² the reader should consult Neil R. Ullman's Statistics: An Applied Approach (Lexington, Mass.: Xerox College Publishing, 1972), chapter 20. The F test uses a ratio of the mean square due to regression compared to the residual mean square. If this number is large, it indicates that the "dependent" variable is truly dependent upon the "independent" variable(s).
13. The df (degrees of freedom), the divisor of the sum of squares of the deviations, is used to obtain the best estimate of the population variance. For example, the total degrees of freedom for the sum of squares of the deviations of yᵢ (the observed values of y) from ȳ (the average of the sample values of y) for 97 observations would be 97 (the number of independent observations) minus 1 because ȳ is obtained from the sample rather than a known mean of the entire population. Thus the total number of degrees of freedom is 96. In obtaining the prediction equation using four prediction variables, the equivalent of x̄₁, x̄₂, x̄₃, and x̄₄ must be obtained from the data; as in ȳ, each "uses" one degree of freedom. This leaves 92 degrees of freedom for the sum of squares for (yᵢ - ŷᵢ), the residual sum of squares, where ŷᵢ is the predicted value associated with yᵢ. For more detailed discussion consult H. M. Blalock, Jr., Social Statistics (New York: McGraw-Hill, 1972), chapter 12.
14. The T test is used to measure whether the mean of the sample chosen is significantly different from the assumed population mean. The df, as explained above, refers to the number of variables free to vary. If the "t" value is less than the number given in the "t" distribution table with corresponding df, then the "t" value is not significant. It means, then, that there is no significant difference between the population mean and the means of any one of the groups. Consequently, the PSC mean does not differ significantly with collection size. For a more detailed discussion see Ullman's Statistics, chapter 15.