The University of Chicago

The Analysis of Mental Functions

A DISSERTATION submitted to the faculty of the Graduate School of Art and Literature in candidacy for the degree of Doctor of Philosophy (Department of Psychology)

By CURT ROSENOW

A Private Edition Distributed by The University of Chicago Libraries

A Trade Edition is Published by The Psychological Review Company, Princeton, N.J., as Psychological Monograph No. 000

A graceful custom offers me the welcome opportunity to give expression to a part of the sense of indebtedness which I feel. If this little work be found to have any merit at all, I believe that merit to lie largely in the point of view from which it is written. There is a sense in which that point of view is my own. There is another sense in which it is entirely due to the minds with which I have had contact. In that sense, it is my privilege to express my obligation to William James, whom I was not fortunate enough to know in person, to James R. Angell, to George Herbert Mead, and to Addison W. Moore. Many others have given their grateful testimony with regard to the spirit of scientific comradeship which prevails in the department over which Professor Angell presides. I wish to add to this my appreciation of his rare gift of appreciating and encouraging many divers and divergent types of mind. I feel that he who cannot work out his intellectual and psychological salvation here, can do so nowhere. My thanks are due to Prof. Harvey A. Carr for thorough and stimulating instruction.
I wish to say, furthermore, that my interest in the subject of modern statistical method is due to my contact with the enthusiasm and the ability of Dr. Beardsley Ruml. And I wish to voice my gratitude to Dr. H. D. Kitson, whose kindness and scientific courtesy in furnishing me with data have made this little study possible.

Curt Rosenow.
University of Chicago.
June 8, 1917

I. The Problem.

Whipple, in the introduction to his Manual, tells us that a "mental test" is the experimental determination for a given individual of some phase of his mental capacity, the scientific measurement of some one of his mental traits. He goes on to say that its purpose is practical and diagnostic rather than theoretical and analytic, but he recognizes that theory and analysis are likely to interact with practice and diagnosis in a way which is beneficial to both. Then, with a candor which cannot be praised too highly, he tells us that there is as yet no such thing as a science of mental tests. ". . . there is, at the present time, scarcely a single mental test that can be applied unequivocally as a psychical measuring rod . . . we too often do not know what we are measuring; and we too seldom realize the astounding complexity, variety and delicacy of form of our psychical nature." And finally we are told that the pressing need of the day is not the inventing of new tests, but the exhaustive investigation of those we already have.

I cannot subscribe too heartily to some of these sentiments. To be sure, Whipple thinks that rigid standardization and the setting up of norms is the most urgent need, whereas I believe that new tests of the right kind, and the evaluation of old tests, are what the situation calls for. It is all too true that we do not know what we are measuring, and it seems a rather futile pastime to standardize the measures of we know not what.
We need to devise tests so that we do know what we are measuring, and in order to do so we must subject our so-called tests to intensive study and analysis. Only so can standardization proceed intelligently.

What is it that we are measuring? In the first place, and it seems to me that some of our mental testers have quite lost sight of this obvious fact, we are measuring actual, factual performance at some definite specific task, like thinking of the "opposites" of a list of adjectives, or recalling the names of a number of familiar objects which have been seen during a brief interval of time. In the second place we postulate, assume, or fondly hope that this actual performance is symptomatic of ability in some wider range of mental ability which we call a mental function.[1] We hope that excellence at the "opposites" test is symptomatic of ability quickly to recall the appropriate associate in a much wider range of situations than the test situation. We expect that ability in the "Objects Seen" test is indicative of the capacity to perceive quickly and accurately a large number of perceptual factors any one of which may later become significant. At any rate we entertain some such hopes as this unless we are willing, stupidly, to define "associative power" as ability in the "opposites" test, and "quickness and accuracy of perception" as ability in the "Objects Seen" test.

But how are we to know whether our postulates are justifiable, our hopes fulfilled? Where is "Speed of Association," "Quickness of Perception" to be found? The answer, it would seem, is correlation. We must correlate performance in tests with performance in some larger line of activity which clearly is indicative of ability of one kind or another. It might perhaps be conceivable that we should find activities, suitable for this purpose, which are more or less clearly symptomatic of the various categories of the psychologist or even of the mental tester.
But be that as it may, practically it is quite impossible. We are face to face here with another of the difficulties to which Whipple calls attention, the infinite complexity and variety of life. We cannot deal with function directly. We are obliged to deal with concrete specific fact. Furthermore the need of a mass of quantitative data confines us to the fields in which men engage 'en masse' and, preferably, where quantitative data are available. Practically, the science of Mental Tests is in its infancy in the field of education, and is in embryo in psychiatry, criminology, and industry.

[1] The term "function" is used loosely to indicate almost any kind of psycho-physical process.

The data for this study were gathered in the field of education and we may proceed at once to the consideration of the criterion with which we are to correlate. We consider only two criteria. They are academic marks, and judgments of "intelligence." The latter are usually given by the same instructors who give the marks and occasionally by other judges alleged to be competent. Marks are a perfectly straightforward, objective form of achievement and, in spite of the many objections which can be made to them, they have been used extensively on account of their availability and on account of the quantitative form in which they are immediately given. Judgments have been used chiefly by those investigators who are dissatisfied with marks as a measure of intelligence, and they are supposed to reach "intelligence" in more direct if less objective fashion. Even the objection of lack of objectivity tends to disappear when the scale according to which judgments are made is the one originated by Karl Pearson to which Mr. Ruml has called the attention of psychologists in most interesting fashion.[2] However, the choice of a criterion depends on the duty we expect that criterion to perform, and inasmuch as my ideas on this subject are very different from those of Mr. Ruml, even though I also am inclined to prefer judgments to marks, it will pay us to look into that phase of the matter.

[2] Psychological Monograph, No. 105.

Mr. Ruml argues that the criterion is, so to speak, an absolute criterion. If it does not measure "intelligence," it at any rate defines it, and thus becomes the sole measure of the value of the tests. If we do not take this position, he would urge, we are reasoning in a circle. If we decide on a given criterion because it correlates more highly with a given set of tests than some other criterion, we are choosing the criterion on the basis of the tests rather than the reverse. Now such considerations are perfectly sound if one is interested in measuring intelligence. The present writer is not. He prefers to analyze it, and he conceives the analysis of the factors which enter into some objective performance such as the obtaining of marks or judgments from an instructor to be so useful and interesting a prolegomenon that perhaps it may render the more ambitious task superfluous. For this reason he prefers the criterion which gives the highest correlation with a given set of tests, for high correlations give the analyst something to work with, whereas, as we shall see, it is next to impossible to reason from low correlations. The writer believes that judgments will be productive of higher correlations with "good" tests (note the circle) because he thinks that in practice judgments amount to little but a revision of the marks and the correction of a few very flagrant cases, where achievement most evidently differs from ability to achieve. However this is a matter for future research to decide.
Whatever criterion gives the highest correlation with tests will be best for purposes of analysis quite apart from a priori considerations. In the present study marks were used as a criterion because judgments were not available. Otherwise a comparison would have been made.

If now we review the ground we have been over, it would seem that we are in a difficult position. On the one hand we have urged that tests should not be credited with symptomatic value for a priori reasons. On the other hand we have alleged that it is practically impossible to find other facts which do possess such value. How then, we may be asked, can one set of facts whose meaning is unknown to us serve as a measure for another set about the significance of which we are equally ignorant? If the conditions have been represented correctly, the problem is indeed difficult. And yet it is precisely this problem with which the present paper deals. So it will be worth while to dwell a little on the question whether the problem does indeed take this form, even though, to many minds, the position will scarcely need defence.

To anyone at all acquainted with the canons of scientific evidence it will be obvious that the good intention of the psychologist who devises a "test" cannot be accepted in lieu of its symptomatic value. We may grant that the psychologist has knowledge of the workings of the human mind which, in some respects, goes beyond that of the man of affairs. We cannot grant that this knowledge, basic and fundamental though it may be, makes him a competent judge of specific social and practical efficiency or of general intelligence. To be sure, perception, memory, imagination, association, attention, generalization, etc., enter formally into every operation of the mind.
But the relation of content to form is one of the major problems of education and of psychology, not one of its fundamental laws whose formula we know and can apply with confidence. The whole problem of transfer of training and of formal discipline confronts us here. Some transfer, or better some identity of function in divers activities, is indeed a presupposition which is necessary so that tests may be thinkable. But to ascertain the kind and degree of identity constitutes a problem which can be solved only by investigation.

Nor are things otherwise when we come to the consideration of the criterion. Whence can the authority be derived by virtue of which we credit any set of facts whatever with being a measure of "general intelligence," "mental ability," or any other of the loose, vague terms which are in current usage? How is the authority given, which enables us to take such a set of facts as a measure of specific ability? A moment's reflection will show that such authority is necessarily social. Now I do not think that a clear case can be made out for any of the criteria of intelligence which have been offered. The ordinary man is more likely, I think, to look upon excellence at school as the sign of special ability, rather than of general intelligence. And I question whether the average teacher working under average conditions is a competent judge of anything save average ability.

On the other hand, academic marks are surely a measure of ability of some kind, and it is equally certain that it would be better to say "abilities." In other words the obtaining of marks is a complex achievement. Unless we accept the opinion of the "man on the street" or of the average teacher, we must analyze this achievement into its causal factors, if we wish to gain a better understanding of what it does stand for. Now such an analysis can be made in terms of tests.
That is, the functions which are active in obtaining marks can be expressed in terms of the functions which are active in making scores in tests. And after such an analysis has been made, we shall be able to approach the problem of interpreting both the test and the criterion to better advantage.[2a] In a later portion of this paper the results of such an analysis will be presented with the purpose of showing what it really is and how it enables us to approach nearer to the problem of determining the meaning of tests jointly with the analysis of the criterion. At present we must give our attention to the method of analysis. I may say at once that it is the method of partial correlation and that it is largely for the purpose of calling attention to its possibilities that the present paper is written. But as this method is relatively unfamiliar, and inasmuch as my chief purpose is the arousal of interest, I have thought it best to keep the methodological discussion informal and non-technical.

[2a] The advantage gained by such an analysis may not be obvious to some readers. I trust it will become so as we proceed. At present I will say that I conceive the process of acquiring knowledge of the symptomatic worth of tests as a growth. If we desire logical demonstration, we must put into our definitions what we mean to take out of them.

The position of the writer, then, up to this point, is that the end we should have in view in correlating tests with a criterion is the analysis both of the test and of the criterion. He protests against the naif assumption of the mental tester that the test is, ipso facto, of a definite, more or less simple mental function. And he protests even more strongly against the more sophisticated, and hence more dangerous contention of the psychologist that the criterion, even though it is complex and vague in its significance, should be given artificial precision and simplicity by definition. The need is analysis. But it is a great deal easier to make this demand than to satisfy it. The difficulties with which the analyst has to deal lead us to considerations of another kind.

II. Discussion of the Method

The first difficulty which we encounter is wholly artificial. The statement is often made by writers with a tender metaphysical conscience that a coefficient of correlation tells us nothing of true causal relations. Be that as it may, it does tell us quite as much about them as any other quantitative statement aiming at representing relations. For example, v = gt, such writers would say, tells us nothing of the nature of gravity. It informs us merely that, on this earth, the velocity of falling bodies varies directly with the time, and that 'g' is the increment of velocity corresponding to a unit increment of time. Now, as we shall see, v = gt is nothing but a regression equation. We can substitute for 'v' performance at the criterion, for 't' performance at the test, and for 'g' the coefficient of regression, and we are furnished with information which is analogous, though not perfectly so, to the case of falling bodies. One difference is that in the one case further interpretation of the "law" is centuries old and familiar, and it therefore seems obvious and simple. In the other case we do not, in most cases, have any very plausible interpretation which goes beyond the facts.

We encounter a more serious difficulty in the following. The statement is frequently made that the coefficient of correlation is meaningless in the case of non-linear regression.[3] If this be true, the usefulness of correlation as a means of psychological analysis is seriously curtailed, for cases of strictly linear regression are rare. The "proof" of linearity simply means that non-linearity cannot be proven.[4] It does not and cannot show that a straight line is the most probable regression, unless the line (or lines) which passes through the means of the arrays is (are) actually a straight line. (The reader who is unfamiliar with this terminology is asked to reread this passage after he has read the next few pages.) Linearity is then assumed on account of its practical workability. But, fortunately, even when non-linearity can be proven, the statement is not true that the coefficient of correlation becomes meaningless. It merely loses some of its meaning. It is one of the great contributions of Yule to have shown the precise significance of the coefficient of correlation under any and all circumstances. This leads us to the consideration of points of a more technical kind.

[3] Brown. Essentials of Mental Measurement. Pp. 44-45.

[4] The test for linearity is a farce with say one hundred observations, for the value of $\eta$ can be made to vary within wide limits, for a single set of data, by varying the magnitude of the class-interval.

In what follows it is assumed that the reader has a slight degree of familiarity with the terminology and the mathematical theory of correlation. Although the discussion is elementary, it does not aim to be an elementary exposition of the subject. Still less does it pretend to give mathematical proofs. The reader who wishes to go into that phase of the subject is referred to the literature, and specific references will be made in their proper place. The aim is merely to present as simply as may be the points which the writer conceives to be essential for his purpose. Indeed if it should turn out that the reader who has had no acquaintance at all with the subject can follow the argument, the writer will be gratified. On the other hand I wish to guard very carefully against creating the impression that I am trying to pose as an expert mathematician.
I conceive myself to have barely enough proficiency so that I can seize on some of the essentials and attempt some manipulation of a simpler sort. I am particularly anxious to make this acknowledgment in view of a number of instances where it should be made and is not.

Let us now suppose that a number of observed concomitant variations of two variables are expressed as deviations from their respective means and are then plotted, using rectangular coordinates, the means of the variables being at the origin of coordinates. Let us suppose, furthermore, that the points so plotted form a smooth continuous curve of some kind. This is a state of affairs approximated in the exact sciences. If now we are able to determine the equation of this curve, it is clear that we have in such an equation an expression which portrays accurately the amount of concomitant variation which actually occurs. For example, if we were to plot the results of a number of observations on the distance which a freely falling body covers during varying durations of time, we would find that the points would form a parabola, whose equation would enable us to estimate distance from time and vice versa. Suppose now that instead of a smooth curve our points form a jagged irregular line, and that it is required to find the straight line which most closely approximates the actual line. This may be done by the method of least squares, i.e., upon the condition that the sum of the squares of the deviations from the straight line be a minimum. It may be shown that the line which satisfies this condition passes through the origin of coordinates. Its equation therefore will be x = by, where b is the tangent of the angle which the straight line makes with the y axis. This equation may then serve approximately the same purpose as the equation for the parabola in the case of falling bodies.
It represents, with absolute accuracy, the average amount of concomitant variation exhibited by our data. And if its use yields us a set of individual values which approximate the truth to the point of useful approximation, "regression" is said to be rectilinear. (Needless to say, this is not intended to be a technical definition of regression. I believe however that it gives a faithful account of its meaning.)

Any curve or line which "fits" the data to the point of useful approximation is called a curve of regression. Such a line satisfies the condition of least squares for some given type (shape) of line. It is often said to be the line which passes through the means of the columns or rows respectively. Such a line obviously satisfies the condition of least squares. As has just been said, if it is straight, regression is said to be rectilinear or, briefly, linear. Its equation is the regression equation. Its tangent with the appropriate axis is the coefficient of regression. There are, of course, two such lines, one passing through the means of the x arrays, the other through the means of the y arrays.

We are not at all concerned, for our present purposes, with ways and means of evaluating the coefficient of regression. It is important to point out that the coefficient of correlation and the coefficient of regression are identical in value, when the deviations of the variables from their means are expressed in terms of their respective standard deviations as the unit of measurement. It follows that the meaning of the coefficient of correlation, in so far as it is a means of diagnosis and a measure of relation, is exhausted by the meaning of the coefficient of regression.
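The identity between the two coefficients can be checked numerically. What follows is a minimal sketch in Python and no part of the original study: the six pairs of marks and test scores are invented for illustration, and the helper names (`deviations`, `std_dev`, `slope_through_origin`, `pearson_r`) are hypothetical. It fits the least-squares line x = by through the origin of the deviation scores, and then repeats the fit after each variable has been expressed in units of its own standard deviation.

```python
from math import sqrt

def deviations(values):
    """Express each observation as a deviation from the mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def std_dev(devs):
    """Standard deviation of a list of deviation scores (divisor n)."""
    return sqrt(sum(d * d for d in devs) / len(devs))

def slope_through_origin(x, y):
    """Least-squares b in x = b*y for deviation scores x and y."""
    return sum(a * b for a, b in zip(x, y)) / sum(b * b for b in y)

def pearson_r(x, y):
    """Product-moment coefficient of correlation from deviation scores."""
    return sum(a * b for a, b in zip(x, y)) / sqrt(
        sum(a * a for a in x) * sum(b * b for b in y))

# Invented data, for illustration only.
marks = [50, 55, 65, 60, 80, 85]   # criterion
test = [2, 4, 5, 7, 9, 12]         # test score

x = deviations(marks)
y = deviations(test)

b_raw = slope_through_origin(x, y)      # coefficient of regression, raw units
zx = [a / std_dev(x) for a in x]        # marks in standard-deviation units
zy = [b / std_dev(y) for b in y]        # test scores in standard-deviation units
b_std = slope_through_origin(zx, zy)    # coefficient of regression, standard units
r = pearson_r(x, y)                     # coefficient of correlation

print(f"b (raw units)      = {b_raw:.4f}")
print(f"b (standard units) = {b_std:.4f}")
print(f"r                  = {r:.4f}")
```

In raw units the regression coefficient depends on the scales of measurement; in standard units it agrees with r to within rounding error, which is the sense in which the meaning of the one is exhausted by the meaning of the other.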
Indeed it is possible to look on the coefficient of correlation as a convenient algebraic expression which enables us to find the value of the regression coefficient, to establish its validity or the degree of confidence which may be given to it, and to show the magnitude of error which may be looked for when it is used for prediction and diagnosis.

With this in mind let us now enter upon the closer examination of the claim that the coefficient of correlation is meaningless in cases of non-linear regression. In the first place we may admit without argument that the coefficient is meaningless with respect to the form of the true relation existing between the variables. If it is desired to describe the form, there can be no possible meaning in describing say a parabola by means of a straight line. Next we may note, without formal proof, that the coefficient will of necessity have a lower value than an expression which measures the amount of deviation from a curve of closer fit in analogous terms.[5] It follows that, taken merely as an indication that an actual relation does exist between two variables, r, the coefficient of correlation, is actually entitled to increased confidence if non-linear regression is shown. Indeed the mere proof of non-linear regression is in and of itself proof of the existence of a true relation, and also of the fact that it is greater than indicated by r. It can hardly be claimed that a positive assertion which errs only on the conservative side is meaningless. As a special case we may note that r = 0 does not necessarily indicate the absence of relation.

Again, in the case of strictly linear regression, the errors of estimate will be equal for every part of the line and will be symmetrical as to sign.[6] In the case of non-linear regression this will not be the case.
For example, if the "true" regression is a sine-curve, the errors of estimate will be least at the points of intersection with the straight line of best fit, and will tend to be of opposite sign at different parts of the line. In short, the errors will tend to be systematic. But, when all is said and done, the straight line of best fit is what it claims to be, and, in the case of more than two variables, predictions and analyses made by its use will, on the average, be closer to the truth than any other conclusion based on the data and arrived at by any of the practically possible means. The subject of non-linear regression for the psychologist amounts simply to this. If he is investigating the relation of two variables to each other he can get nearest to the truth by "fitting" a curve and determining its equation. Even in that case useful results are practically always obtainable by assuming linearity. But if one is dealing with a complex situation the only practical possibility with our present technique is to assume linearity. The results, when properly interpreted, will not be meaningless.

[5] Such an expression is the correlation ratio, $\eta$. It does not, however, give us the equation of the curve of closer fit and hence is of no use for diagnosis.

[6] The statement is true only for the types of regression usually found.

Suppose now, to take a hypothetical example, that we wish to ascertain the relation between crop yield and water supply in connection with a proposed irrigation scheme. We have available for the purpose data on the concomitant variation of rainfall and crop yield, and we find the coefficient of correlation of these two variables to be 0.40.[7] We are then able to estimate the most probable crop yield we may expect from supplying a certain amount of water per acre. We will also be able to estimate, to any desired degree of probability, within what limits the actual yield will fall.
In the case given, for any respectable degree of probability, the limits will be so wide as to render the information of little practical use.

[7] The illustration, greatly modified and expanded, is taken from Yule's Introduction to the Theory of Statistics.

Now it may be argued that the relation between water supply and crop yield in a climate in which the sun is always shining is different from the relation which exists where water supply and sunshine very probably are inversely related. We wish to know the relation between yield and moisture supplied when there is a constant amount of sunshine, for this relation may be very much closer. If now in addition to our other data, we have data on the concomitant variability of "sunshine," we shall be able to supply this information. For suppose that we use the regression equation describing the relation between sunshine and yield to estimate the yield per acre. We estimate say 60 bushels of wheat. Actually it turns out to be 30 bushels. The difference then is an error of estimate. But, as we shall see at once, it is more than that. For consider that the correlation which we found, however low or high its probability, is in any case more probable than any other value known to us at present. Were it the "true" value, the error of estimate would be due exclusively to causes other than variation in the amount of sunshine. It would, in fact, be an exact measure of the total effect of all operative causes, with the effect of sunshine eliminated. As it is, it is just such a measure to the highest degree of probability possible to us with our present technique, and on the basis of the data at hand. We may therefore call an error of estimate a residual.
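The sense in which an error of estimate is a residual "with the effect of sunshine eliminated" can be made concrete. The sketch below is an illustration only: the eight pairs of figures are invented, the helper names are hypothetical, and the sign of the residual is taken as actual minus estimated. It estimates yield from sunshine by the least-squares line through the origin of the deviation scores and then shows that the residuals retain no linear association with sunshine.

```python
def deviations(values):
    """Express each observation as a deviation from the mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def slope_through_origin(target, predictor):
    """Least-squares b in target = b * predictor, for deviation scores."""
    return (sum(t * p for t, p in zip(target, predictor))
            / sum(p * p for p in predictor))

# Invented figures, for illustration only.
sunshine = [7, 6, 4, 5, 8, 3, 6, 5]
crop_yield = [30, 42, 38, 55, 47, 60, 35, 50]

s = deviations(sunshine)
y = deviations(crop_yield)

b = slope_through_origin(y, s)              # regression of yield on sunshine
estimates = [b * d for d in s]              # yield estimated from sunshine alone
residuals = [actual - est for actual, est in zip(y, estimates)]

# The sum of products of residual and sunshine deviation vanishes, so the
# residuals are uncorrelated with sunshine: its linear effect is eliminated.
cross_product = sum(e * d for e, d in zip(residuals, s))

print("residuals:", [round(e, 2) for e in residuals])
print(f"sum of residual x sunshine products = {cross_product:.2e}")
```

Whatever variation the residuals still show must therefore be associated with facts other than sunshine, so far as a straight line can represent the association.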
Similarly, if we estimate the amount of rainfall during a given period from the amount of sunshine during that period, the residual will be a measure of the amount of rainfall associated during the period with all facts other than the somewhat obvious one that the sun is not shining when obstructed by clouds. (Obvious as it is, the fact is not irrelevant. At any rate, the reader is asked to fix his attention on the fact that a residual is a measure or representation of the association which exists between the fact we wish to estimate, and all other associated facts except one.) If now we actually compute all of the residuals which represent the relation of yield to all facts other than sunshine, and also all of the residuals representing the relation of rainfall to all facts other than sunshine, we shall have two sets of measures strictly analogous to our original data on the concomitant variation of yield and rainfall, except that the effect of sunshine has been eliminated from both measures. Now, if we compute the coefficient of correlation from these data, we shall have a measure of the relation which exists between yield and rainfall, the effect of sunshine being eliminated. In the notation of Yule,[8] if crop yield is $X_1$, rainfall $X_2$, and sunshine $X_3$, we shall have the value of $r_{12.3}$.

[8] It may be desirable to give some account of this notation, sufficient to render its use in the text intelligible. All variables are denoted by subscripts of X, such as $X_1$, $X_2$, etc. The coefficient of correlation between any two is indicated by writing their subscripts beneath the symbol r, e.g. $r_{12}$. The coefficient of correlation between two variables, after the influence of other variables has been eliminated, is called a coefficient of partial correlation, and is written as follows. Let the variables whose relation is being expressed be $X_1$ and $X_2$, and let the variables which have been eliminated be $X_3, X_4, X_5, \ldots, X_n$. Then the coefficient of partial correlation is written $r_{12.345 \ldots n}$. The subscripts denoting the variables whose relation is being expressed are called "primary" subscripts and are written to the left of the point. Those denoting the eliminated variables are called "secondary" and are written to the right. $r_{12}$ is called a coefficient of zero order, $r_{12.3}$ is a coefficient of the first order, $r_{12.34}$ of the second order, etc. In general, the order of a coefficient denotes the number of secondary subscripts. The coefficient of multiple correlation, the meaning of which will be discussed below, denotes the relation which exists between a single variable, and the results which are obtained by estimating the values of that variable from a number of others by means of the regression equation. Its symbol is R, and is not to be confused with Spearman's R. The single variable is called "dependent," the others "independent." R is written $R_{1(234 \ldots n)}$, where 1 is the subscript of the dependent variable, and 2, 3, 4, etc., the subscripts of the independent variables.

To be sure, this is not the method which is actually used. If it were, the amount of arithmetic would be almost infinite in complex cases. But Yule has shown that the method which is used is equivalent to such a method both in meaning and result.[9] And, to my mind, this fact shows in the clearest fashion the meaning of partial correlation. Indeed it is largely for the sake of this single point which shows, I think, the simple causal reasoning, the simple logic, which underlies the complex mathematics which befuddles us, that the previous discussion has been given.

It follows that every claim we have made for the coefficient of correlation when only two variables are taken into account, is valid for the partial coefficient. That is, (1) its meaning in cases of non-linear regression is clear and definite, (2) its validity or "significance" can be computed in the ordinary way, (3) the probable magnitude of average error incident to its use can be computed in the ordinary way. Indeed, as Brown says and says truly, "The full significance of correlation in psychology is to be found in the general theory of multiple correlation, of which the correlation of two variables is only a special case."[10]

[9] G. U. Yule. Proc. Roy. Soc. Series A, vol. 79, 1907. "On the theory of normal correlation for any number of variables treated by a new system of notation."

[10] Wm. Brown. Essentials of Mental Measurement, p. 128.

We may express the relation of crop yield to both rainfall and sunshine in a single equation by simply adding the yield due to rain with sunshine constant to the yield due to sunshine with rain constant. Obviously such a sum is due to the combined effect of the two. That is,

$x_1 = b_{12.3} x_2 + b_{13.2} x_3$.

This is a partial regression equation. It would be represented graphically by a plane. Now the differences between the values of $x_1$ obtained by the use of this equation, and the actual values of $x_1$, will be residuals containing the part of $x_1$ not associated with either $x_2$ or $x_3$. These residuals might in turn be used to find the relation of a fourth variable to $x_1$, with $x_2$ and $x_3$ eliminated, and so on indefinitely.

As has been said, the correlation which exists between the actual values of $x_1$ and the values estimated from an equation of partial regression is called multiple correlation, and the symbol of its coefficient is R. It is a measure of the closeness with which $x_1$ can be estimated from $x_2$, $x_3$, etc. It has some very useful properties which are important for the purposes of this paper and which we will discuss later.

Before leaving this part of the discussion, I feel that I am under moral obligation to call the attention of the reader to a statement by the highest authority on the theory of correlation, Prof. Karl Pearson. "The method (multiple correlation) . . .
does assume that linearity applies within the degree of useful approximation. . . . The general linearity ought to be tested in all cases. Nothing can be learned of association by assuming linearity in a case with a regression line like A, much in a case like B." 11 (See diagram, page 15.) I wish to say that I realize fully my audacity — perhaps impertinence would be a more fitting phrase — in commenting on this statement. I am perfectly willing to follow blindly any course indicated by Karl Pearson. But, I am equally willing to do this when Mr. Yule leads the way. Now it does not seem to me that there is any real conflict of authority here and my interest in the subject compels me to point this out. It seems to me that the issue hinges on the meaning of the phrase "point of useful approxima- tion." Mr. Yule has shown, and so far as I know his proof has not been challenged, that r retains an average significance under any and all conditions having to do with regression. If this be useful, the assumption of linearity is legitimate provided only an average significance is attached to the result. In a case like A there isn't any linear association. But if now the average slope 11 K. Pearson. Biom. vol. 8, 1911-1912, p. 439. "On the general theory of the influence of selection on correlation and variation." THE ANALYSIS OF MENTAL FUNCTIONS 15 I e c of the curve be changed (see diagram C) there will be such association and r will be its measure. The absence of associa- tion, on the average, was due to its real absence, not to the form of the regression. Whether or no such results are "useful" will be determined by the special conditions of the particular problem in hand. At any rate, they have a definite meaning. Before leaving the topic, let me repeat what I said at its intro- duction. I have tried throughout to make clear the meaning of certain phases of the topic which I deem essential for my pur- poses. I have not tried to prove anything. 
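Although nothing is proved here, the equivalence relied upon above (that correlating the two sets of residuals gives the same value as the working formula for $r_{12.3}$) can at least be checked numerically. The following is a minimal sketch under stated assumptions: the data are invented, the helper names `corr` and `residuals` are hypothetical, and the "working formula" is taken to be the familiar first-order expression $(r_{12} - r_{13} r_{23}) / \sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}$, which the text itself does not write out.

```python
import math
import random

def corr(a, b):
    """Product-moment correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def residuals(y, x):
    """Residuals of y after a least-squares straight-line estimate from x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]

# Invented data (an assumption for illustration only):
# sunshine influences both rainfall and crop yield.
rng = random.Random(0)
sunshine = [rng.gauss(0, 1) for _ in range(200)]
rainfall = [-0.6 * s + rng.gauss(0, 1) for s in sunshine]
crop_yield = [0.5 * s + 0.4 * r + rng.gauss(0, 1)
              for s, r in zip(sunshine, rainfall)]

# Route 1: correlate the two sets of residuals, as described in the text.
by_residuals = corr(residuals(crop_yield, sunshine),
                    residuals(rainfall, sunshine))

# Route 2: the first-order formula from the three zero-order coefficients.
r12 = corr(crop_yield, rainfall)
r13 = corr(crop_yield, sunshine)
r23 = corr(rainfall, sunshine)
by_formula = (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

print(round(by_residuals, 6), round(by_formula, 6))
```

The two routes agree to within rounding error of the machine for any set of data, which is the practical meaning of the statement that the shorter method is equivalent to the method of residuals.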
The reader has been asked to accept the statements made on the authority of Yule.¹²

¹² In addition to references given above, see G. U. Yule. Proc. Roy. Soc. Vol. 60, 1897. "On the significance of Bravais' formulae for regression, etc."

What now is the significance of partial correlation for the Mental Test situation? In the first place it furnishes us with the only means at present available for the analysis of the test and the criterion. For suppose that we find a correlation of +0.28 for the Logical Memory test with academic marks. The previous discussion should have made it clear that this is not a measure of the extent to which "logical memory" is a factor in "general intelligence." As any instructor will testify, marks are not even a very good measure of academic intelligence, and a cursory examination of the test in say Whipple's Manual will show that most of the logic is contained in the name. This is further emphasized when the test is evaluated quantitatively, as it must be for purposes of correlation. Mechanical memory is by no means confined to the learning of nonsense syllables and digits. If it were, it would not be a problem for education. I can conceive of at least the possibility of so organizing a nonsense syllable test that, as a criterion of logical memory, it would be superior to the test which bears that name. At any rate, it is not at all certain, a priori, that a student who depends largely on a rather verbal and mechanical type of memory is at a serious disadvantage either in this test or in the matter of marks.

If now we are able to devise a test which, a priori, carries a somewhat stronger presumption of logical memory, we have at any rate some material for analysis. For let $X_1$ stand for Marks, $X_2$ for the original logical memory test, and $X_3$ for the supposedly more logical test; then $r_{12.3}$ carries a stronger imputation of being rather verbal than $r_{12}$, and there is a greater probability that $r_{13.2}$ stands for a more logical type. For that which the two tests have in common has been eliminated in each case in so far as it is associated with marks, and therefore the relation to marks of that which is peculiar to each stands out more clearly. We see that we have here the promise of a fruitful way of combining experimental and statistical research. For a slight modification of the test, or even a change in the method of scoring, may lead to results of significance with regard to the nature of the test. This will be illustrated later. Speaking generally, a partial coefficient ($r_{12.34 \ldots n}$) is more easily interpreted than a coefficient of zero order ($r_{12}$), for in the case of $r_{12}$ we have to face the vague question why $X_2$ is related to $X_1$, while in the other case we may ask what in $X_2$ is related to $X_1$ that is peculiar to it, and has nothing to do with $X_3$, $X_4$, etc. In other words, we have more data on which to base analysis.

Another use to which partial correlation might be put is in connection with so-called "practical" diagnostic work. If a number of tests have been given, it is usually desired to combine them into a single measure of their diagnostic value, with reference to a single criterion such as marks. The method sometimes used is that of expressing all the scores as deviations from their mean with their respective standard deviation as the unit of measurement. The scores for each test for each individual are then added, and these combination scores are correlated with the criterion.¹³ The method is somewhat faulty. It attaches equal importance to all tests regardless of their correlation with say marks, and it is perfectly obvious that this is false. Worse, when handled wrongly, it may even serve to conceal linear relations easily discernible in the data. For, to take an extreme case, suppose we combine $r_{12} = +1.00$ and $r_{13} = -1.00$ according to this method.
The result, if the two distributions happen to be nearly parallel, will approximate zero, and we shall have succeeded in converting two perfect diagnostic tools into an absolutely useless one. This source of error can be obviated by computing all correlations of type r and reversing the signs of the scores of all tests showing a negative correlation, reversing the meaning to correspond. At its best, the method is purely empirical and the meaning of the coefficient obtained by its use is neither clear nor definite (except mathematically). We cannot reason from it, we cannot use it as an analytic tool. If it does not possess diagnostic value, it possesses no value whatever.

¹³ R. S. Woodworth. Psy. Rev., 1912, p. 97.

The best method of combining a number of tests is to find their regression equation. R, the coefficient of multiple correlation, will be the indirect measure of the diagnostic value of this equation. As we shall see, R is exceedingly valuable for analysis, is more easily calculated than the regression equation, and, unless we wish to apply diagnosis to the case of single individuals, it renders the finding of the regression equation unnecessary. In so far as he knows, the use of R for such purposes is original with the writer. When regression is truly linear, R gives us, within the limits of accuracy of sampling, a measure of the actual relation which exists between two complex sets of facts, the criterion and the tests. Besides, it will enable us to make an accurate analysis of the criterion in terms of the tests, in so far as it is associated with the tests. When regression is non-linear, the results have the same significance to a lesser degree of approximation.

In view of all this, it may seem somewhat surprising that the method has not found more frequent application. One reason for this is perhaps that the subject is somewhat difficult and not well understood generally.
Another reason is probably the large amount of arithmetical labor called for in complex cases. Yule states that the working out of a case involving eight variables is practically beyond the powers of a single individual.¹⁴ Kelley¹⁵ states that in the case of eight variables it is practically necessary to resort to an approximation.¹⁶ But Kelley himself has reduced the amount of mechanical labor materially through the publication of a very useful set of tables, contained in the bulletin referred to. I myself have devised a scheme of procedure, involving the use of R, which makes it possible to reach all of the results which Kelley reaches, and perhaps a little more, with approximately half the work indicated by him. A full exposition of this scheme will be found in the appendix. These mechanical improvements, as well as the fact that a complete working out of all possible relations is not necessary, bring the method within the reach of a single individual. For the purposes of the present paper I worked out a case of sixteen variables without resorting to approximation. It took me a little over two months, but I did a lot of useless work. I could do this work now in five or six weeks at the most. Of course I was working with a comparatively small number of observations, but, after the coefficients of zero order have been found, the amount of arithmetic does not depend on the number of observations. But be that as it may, the method must come into use if scientific analysis is ever to take the place of blind fumbling about. For it is the only method available at present that holds out even a hope of making systematic progress in attacking situations as complex as those with which we have to deal.

¹⁴ G. U. Yule. Roy. Stat. Soc. Journ., vol. 60, p. 182.
¹⁵ Kelley is the only American psychologist who has exploited the method of partial correlation. See his "Educational Guidance," Teachers College, Columbia University, Contributions to Education, No. 71. In the opinion of the writer, Kelley's work loses some of its value through his failure to call attention to some of the difficulties and sources of misunderstanding with which the subject is hedged about. These difficulties are still in front of us.
¹⁶ Bull. No. 27, U. of Tex., May 1916, p. 18.

For the present, however, let us dwell rather on the limitations of the method. In the first place it would be quite erroneous to suppose that the magnitude of R, and consequently its value, can be indefinitely increased by simply increasing the number of tests. To take another illustration from Yule,¹⁷ if $r_{12} = 0.8$, $r_{13} = 0.4$, and $r_{23} = 0.5$, it would be quite natural to suppose that $X_1$ could be estimated with greater accuracy from both $X_2$ and $X_3$ than from $X_2$ alone. But this would be quite wrong, because $r_{13.2} = 0$. In other words, everything in $X_3$ that has diagnostic value, or that is associated with $X_1$, is contained in $X_2$. Pearson dwells at length on this point.¹⁸ For example, if an infinite number of variables are correlated equally with each other, the value of r being 0.5, he shows that R with reference to any one of them is 0.71. In the case of ten such variables, R = 0.67; for five variables, R = 0.65. The difference in diagnostic value between 0.65 and 0.71 is negligible. How serious a difficulty this is will appear later. At present let us note that the trouble is not with the method, but with the material with which it deals.

The other difficulty to which I wish to call attention is one of interpretation. R is essentially positive regardless of the signs of the coefficients of the regression equation. It is therefore subject to biased error, that is, errors due to fluctuations of sampling will not tend to neutralize each other, but will be cumulative.
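The two numerical illustrations just given can be checked directly. The sketch below is offered only as an aid to the reader and is not taken from Yule or Pearson: the function names are invented, the first two formulas are the familiar ones for three variables, and the third is the standard closed form for equally correlated variables, which is assumed (the source does not confirm it) to underlie Pearson's figures. "Five" and "ten" variables are taken to mean the number of independent variables.

```python
import math

def partial_r(r_ab, r_ac, r_bc):
    """First-order partial correlation r_{ab.c} from zero-order coefficients."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))

def multiple_r_two(r12, r13, r23):
    """Multiple correlation R_{1(23)}: X1 estimated from X2 and X3 together."""
    return math.sqrt((r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2))

def multiple_r_equal(rho, k):
    """R of one variable on k others when every pair is correlated rho."""
    return math.sqrt(k * rho**2 / (1 + (k - 1) * rho))

# Yule's illustration: r12 = 0.8, r13 = 0.4, r23 = 0.5.
r13_2 = partial_r(0.4, 0.8, 0.5)       # X3 with X1, after X2 is eliminated
R1_23 = multiple_r_two(0.8, 0.4, 0.5)  # X1 estimated from X2 and X3 jointly

# Pearson's illustration: every correlation equal to 0.5.
R_five = multiple_r_equal(0.5, 5)
R_ten = multiple_r_equal(0.5, 10)
R_limit = math.sqrt(0.5)               # value approached as k grows without bound

print(round(r13_2, 2), round(R1_23, 2))
print(round(R_five, 2), round(R_ten, 2), round(R_limit, 2))
```

In Yule's case the multiple coefficient comes out equal to $r_{12}$ itself, which is another way of saying that nothing is gained by adding $X_3$ to $X_2$.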
Consequently the "probable error" of R is not a true measure of its validity. It should be compared to the value which R would take, owing to fluctuations of sampling alone, in the case of a number of really uncorrelated variables. Pearson, in the article referred to below, promises us a formula for finding such a value, but, to date, I have not found it in the literature accessible to me.¹⁹ Yule²⁰ publishes an approximation formula which gives the value. It is $(n-1)^{1/2} / N^{1/2}$, where n is the number of variables, and N the number of observations. Inspection of this formula will show that its value may easily become unpleasantly large. For example, if n = 16, and N = 92, as is the case in the problem I worked out in the present paper, then $R = \sqrt{15/92} = 0.40 \pm 0.06$. That is, although R is seven times as large as its probable error, it has no validity whatsoever. It is the failure to call attention to this fact which I alluded to in discussing the work of Kelley.

¹⁷ Intro. to the Theory of Statistics, p. 237.
¹⁸ K. Pearson. Biom. vol. 10, 1914-15, p. 181. "On certain errors with regard to Multiple Correlation occasionally made by those who have not adequately studied the subject."
¹⁹ Pearson, Biom. vol. 8, p. 437, op. cit., p. 18.
²⁰ Proc. Roy. Soc. 1907. Op. cit.

III. Application of the Method to Correct Data

In view of these two serious limitations I might perhaps be asked why I undertook so laborious a task as the computation which forms a part of this paper. The answer is simply that it is no part of my purpose to advertise a method for making a silk purse out of unsuitable material. There is in this paper no attempt to break the record for altitude. The value of R does not interest me except in so far as it is an instrument of analysis. My object was to sift a mass of typical material down to its significant constituents. The general character of the results I was fairly sure of before I undertook the work, but I could not know precisely what material would be retained by the meshes of the sieve. I could have reduced the number of variables to eight or nine and have been morally certain that I was not discarding anything of value, but it is questionable what weight my moral certainty would have carried with others. Besides, I fancied that a drastic concrete illustration of the difficulties which I have just called attention to would do no harm. Furthermore, I am very much interested in the subject of partial correlation and hope that the present paper, in conjunction with the appendix, will have a methodological value.

The particular material was used simply because, owing to the kindness and scientific attitude of Dr. Kitson, it was available. But tests are not the only things capable of analysis. Nor are they the only source of information on which diagnosis can be based. For example, if Kelley's work were based on more than thirty-three observations, it would show that the six tests which he uses, and combines with academic marks in grammar school and with judgments of ability by the teachers, are negligible for the purpose of predicting performance in High School in comparison with these other means of diagnosis.²¹ However, I ought to say that up to a time when the work was more than half finished I thought I would have a considerably greater number of observations at my disposal than proved to be the case. This brings us to the consideration of the material with which the present investigation deals.

This material was put at my disposition by Dr. Kitson. Neither he nor I dared to hope that the investigation would lead to results of final validity. In the first place the number of observations was not adequate. Besides, Dr. Kitson himself looks upon the stage of his work from which the data were taken as its pioneer stage. Since that time he has added new tests and has improved the old tests.
If it had been possible to include this later material, there is every reason to believe that more significant results would have been obtained. I do believe that needless duplication of identical functions is a feature of all lists of tests in actual use if they are at all extensive. In order to enable me to test this out, Dr. Kitson furnished me with such material as he had available. Even though the results do not have a direct bearing on his later work, or on similar work done by others, it was thought that the indirect light they would cast would have value.

The material then is the same as that obtained by Dr. Kitson for his "Scientific Study of the College Student."²² As may be seen by referring to this monograph, Dr. Kitson gave a large number of tests to the students of the College of Commerce and Administration at the University of Chicago. The work there described covers a period of two years. It has been continued and the results of two more years are now available. At the time my own work was done, the academic marks for the fourth year were not at hand. Besides, so many changes had been made in the tests themselves that their combination with the other three years did not seem feasible. There did not, however, appear to be any a priori reason why the first three years could not be combined, so that I expected to have 150 sets of observations on which to base any conclusions I might reach. I did not investigate the three years separately until, when the work was over half finished, certain differences forced themselves on my attention. Then I did investigate and found, amongst other differences, that the academic marks for the third year differed significantly from those of the other two years.

²¹ T. L. Kelley. Ed. Guid., p. 71 ff.
²² Psy. Rev. Mono., 1917, vol. 23, No. 1.
Inquiry amongst the members of the faculty resulted in conflicting evidence as to the reasons for this, so that I was compelled, regretfully, to discard these data. This reduced the number of observations, or subjects, to 92, and the period covered became precisely the period described in Dr. Kitson's Monograph, i.e., the academic years 1913-14 and 1914-15.

In this group there are included 80 freshmen and 12 sophomores; 39 freshmen and 6 sophomores in 1913, and 41 freshmen and 6 sophomores in 1914. The desirability of the inclusion of the sophomores may be questioned. However, it is the practice at the School of Commerce and Administration to test all students who have not previously had the tests, and in such a group there always are a certain number of individuals who come from other departments, or institutions, with advanced standing. Consequently such a group is representative in a definite sense.

As the Psychological Review Monographs are quite accessible, I deem myself absolved from the uninteresting task of copying the full description of these tests. Most of them are standard and well known. Also their names give a good indication of their general nature. As much additional description as seems essential will be given informally with the discussion. At present I give only the names in conjunction with the numbers which they were given in the present study, and a brief description of the more important tests. The list follows. As will be seen, the criterion, academic marks, is given No. 1.

(1) Academic Marks.
(2) Immediate memory for logical material, heard.
(3) Immediate memory for logical material, seen.
(4) Loss or gain of log. mat., heard, after two weeks.
(5) Loss or gain of log. mat., seen.
(6) Sentences Built.
(7) Hard Directions, printed, speed.
(8) Constant Increment, speed.
(9) Memory for objects seen.
(10) Number-checking (cancellation).
(11) Opposites, speed.
(12) Memory for numbers heard (span).
(13) Word building.
(14) Opposites, accuracy.
(15) Constant Increment, accuracy.
(16) Hard Directions, accuracy.

The description of the seven tests which will be of most interest to us follows:

No. 2. Logical Memory, immediate, auditory.
Materials: Blank sheet of paper and pencil.
Directions: "I am going to read you a rather long passage and shall ask you to listen very carefully, for when I have finished I wish you to reproduce the meaning of the passage. The passage is too long for you to remember word for word, but try to get the entire meaning, then in reproducing, use the same words as appear in the text whenever you can."
The Passage: The passage may be characterized as popular science.
Method of Scoring: "It will be noted that this passage contains a main proposition and three illustrations, the last one of which is amplified. For reproduction of the main proposition, two units were given; for mention of the first, second, and third illustration there were given 14, 13, and 14 units respectively. Thus by merely stating the main proposition and the illustrations the individual could score 43. In addition to these gross divisions, the passage was further divided into 81 ideas. Counting each one of these as two thirds of a unit, their united value is 54, which added to the 43 units mentioned, permits scoring on a basis of 97 points for correct reproduction of the passage."

No. 3. Logical Memory, immediate, visual.
Materials: See directions.
Directions: "On the reverse side of the paper before you will be found a long passage which I wish you to read carefully when I give the signal. Read it but once, then turn it over, and on the back of it write all you can recall of the passage. Be careful to read each sentence but once, then turn over the paper and reproduce the meaning as accurately as possible."
The Passage: May be characterized as popular psychology.
Scoring: Same as in No. 2.

No. 4. Loss or gain, Logical Memory, auditory.
Directions: "Write all you can recall of the passage I read to you at the last psychological examination, beginning 'More than once it has happened in the history of science.'"
Scoring: The papers were first scored as in No. 2. Then the difference between this score and the score of No. 2 was taken as the score of No. 4.

No. 5. Loss or gain, Logical Memory, visual.
Analogous to No. 4 in every way.

No. 9. Memory for Objects, visual.
Materials: Covered box twelve by twenty-three inches, containing the following objects fastened to the bottom: fountain-pen, pencil, twenty-five cent piece, envelope, inkwell, maroon ribbon, ruler, pen-filler, two-cent stamp, and key.
Directions: "I am going to show you a group of objects for six seconds, then will ask you to name them aloud from memory."
Scoring: The score represents the number of objects correctly reproduced.

No. 6. Sentences Built.
Directions: "I will give you five minutes in which to make as many sentences as possible containing three words which I will give you presently. For example, if I gave you the words money, river, Chicago, you might make a sentence like this: 'Chicago spends much money improving its river.' You may use either singular or plural forms of the words, nominative, objective, or possessive case. Simply use all three of the words in a sensible sentence and make as many different sentences as possible. The three words are: citizen, horse, decree."
Scoring: The score represents the number of sentences formed.
Extract from Dr. Kitson's comment: ". . . the papers which contained a relatively large number of sentences necessarily showed much sameness in subject matter and structure."

No. 8. Constant Increment.
Material: Card containing one hundred two-place numbers.
Directions: I am going to give you a list of 100 numbers and shall ask you to add four to each number as quickly as possible, giving the sum aloud. You may practice on this list: 22, 34, 92.
Begin at the top of each of the four columns and add four to each number. You need not be afraid to go fast for the test is easy and you are not likely to make mistakes. You should be accurate, however, because every error will take off one point from your score. The main thing is to add as rapidly as possible.
Scoring: The number of errors was the accuracy score. The number of seconds, the time score.

The materials accessible to me were the gross scores in all of the fifteen tests. In no case were the original records at hand, as they had been destroyed some time before. Had they been available, it would have been possible to push analysis further than was actually the case, and many of the suggestions made below could have been investigated. I was not, however, obliged to accept any empirical indices, as all indices which Dr. Kitson used were based on the gross scores. This, practically, is all the information we need for the present.

Now our problem in evaluating these tests is not simply that of determining their total relation to marks. If it were, we could solve it easily and directly by simply computing the fifteen correlations which indicate this relation. We should find, for example, that the correlation of the Auditory Logical Memory test with marks is +0.28, and that of Hard Directions with marks is +0.25, or, using the numbers assigned to these tests, $r_{12} = +0.28$, $r_{17} = +0.25$. But such numbers would tell us nothing of the nature of the relations in each case. The functions measured by $r_{12}$ and $r_{17}$ might in reality be identical, independent, or even mutually exclusive. There would be little sense in debating such an issue on a priori grounds when we can find directly that $r_{27} = +0.30$. We now have the beginnings of analysis, for we know, to the degree of probability of which our data permit, that the two functions are different in some respects and identical in others.
A similar line of reasoning might be applied to every one of these fifteen tests, or variables, paired successively with each one of the others. In Table 1 there will be found a complete list of all possible correlations of our fifteen tests amongst themselves and with marks, 120 all told. They are given in full because, aside from the raw data which are too bulky to print, they represent the complete data for this study. But, aside from this, the reader will be able to convince himself by a study of this table that the data in their present shape are far too complex to permit of conclusions more definite than the very vague one we have just stated with reference to No. 2 and No. 7, i.e., that they are alike in some respects and different in others. (Of course, if we have had practice with the method of partial correlation, we may be able to go a little further, for we would be able to guess with some accuracy the results of computations based on the data.) But some such problem as whether No. 2 has some characteristic peculiar to it, or whether it is exhaustively represented by the other fourteen, would be quite insoluble.

To answer such a question we must resort to partial correlation. Table 2 shows the effect of successively eliminating the effect of each variable, as we may conveniently call our tests, on the relation between No. 2 and No. 1.

Table 1
(Coefficients of zero order; decimal points omitted)

        1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
  1
  2    28
  3    17   44
  4    17  -19  -01
  5    10   06  -20   55
  6    26   05   14  -10  -07
  7    25   30   04   08   02   13
  8    21   03   08   00  -14   31   17
  9   -12   10   00   02   02   08  -03   25
 10    11   10   05   04  -10  -06   25   36   17
 11    10   41   04   06   09   10   17   23   14   01
 12    07   13   22   00  -01   09   22   01  -17   16  -09
 13    09   17   18   09   06   20   11   25   08   23   20   18
 14   -03   02  -08  -07  -10   11  -04   17   15  -18   18  -01   02
 15    08   03  -04  -06  -10   18  -04   40   15   04   07   01   03   12
 16    05   20   14   10   03   03   25   42   09  -04   21   13   10   16  -01

If the reader will now recall what was said about the association between "Residuals," the meaning of this table should be clear.
Table 2

$r_{12}$ = +0.28
$r_{12.16}$ = +0.28
$r_{12.15\,16}$ = +0.27
$r_{12.14\,15\,16}$ = +0.27
$r_{12.13\,14 \ldots 16}$ = +0.26
$r_{12.12\,13 \ldots 16}$ = +0.26
$r_{12.11\,12 \ldots 16}$ = +0.25
$r_{12.10 \ldots 16}$ = +0.24
$r_{12.9 \ldots 16}$ = +0.26
$r_{12.8 \ldots 16}$ = +0.31
$r_{12.7 \ldots 16}$ = +0.28
$r_{12.6 \ldots 16}$ = +0.28
$r_{12.5 \ldots 16}$ = +0.28
$r_{12.4 \ldots 16}$ = +0.36
$r_{12.3 \ldots 16}$ = +0.32

For example, $r_{12.16} = +0.28$ is a measure of the association of No. 1 and No. 2 in so far as neither No. 1 nor No. 2 is associated with No. 16 (accuracy, "Hard Directions"). But this value is identical with the one obtained before $X_{16}$ was eliminated. It follows, even though $r_{2\,16}$ is +0.20, that the relation of No. 2 to No. 16 is of no significance with reference to the relation which exists between No. 1 and No. 2. The converse, however, is not true. For $r_{1\,16} = 0.05$ and $r_{1\,16.2} = 0$. Therefore the relation of No. 1 to No. 16 is entirely accounted for by that which No. 2 and No. 16 have in common. These conclusions are subject to the limitations imposed on us by the small number of our observations and by the assumption of linear regression. We have discussed the second, and we will presently discuss the quantitative expression of the first of these limitations.

Similarly, $r_{12.34 \ldots 16} = +0.32$ is a measure of the relation of No. 1 and No. 2 in so far as both No. 1 and No. 2 are not associated with any of the other fourteen variables. This relation is peculiar to academic marks and to Immediate Logical Memory (Auditory) alone. This answers the question we propounded, and is analysis in a very real sense. But before we endeavor to push the analysis to its logical conclusion and try to ascertain the character of this elementary function which we have isolated, we must turn our attention to the disillusionizing subject of validity.
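The first steps of such a successive elimination can be retraced from the coefficients of zero order alone. The following is a minimal sketch, assuming the usual recursive rule by which a coefficient of any order is built from three coefficients of the next lower order; the dictionary `R0` and the function `partial` are hypothetical names, and only the six zero-order values needed here are entered (variables 1, 2, 15, and 16). Because those entries are rounded to two places, agreement with printed values can be expected only to about a hundredth.

```python
import math

# Zero-order coefficients for variables 1, 2, 15, 16 (keys are sorted pairs).
R0 = {(1, 2): 0.28, (1, 15): 0.08, (1, 16): 0.05,
      (2, 15): 0.03, (2, 16): 0.20, (15, 16): -0.01}

def partial(i, j, eliminated=()):
    """r_{ij.eliminated}, peeling off one secondary subscript at a time."""
    if not eliminated:
        return R0[(min(i, j), max(i, j))]
    k, rest = eliminated[0], eliminated[1:]
    r_ij = partial(i, j, rest)
    r_ik = partial(i, k, rest)
    r_jk = partial(j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik**2) * (1 - r_jk**2))

r12_16 = partial(1, 2, (16,))        # marks and No. 2, No. 16 eliminated
r12_15_16 = partial(1, 2, (15, 16))  # marks and No. 2, Nos. 15 and 16 eliminated
r1_16_2 = partial(1, 16, (2,))       # marks and No. 16, No. 2 eliminated

print(round(r12_16, 3), round(r12_15_16, 3), round(r1_16_2, 3))
```

The same function would carry the elimination through all fourteen secondary subscripts if the full set of 120 coefficients were entered, though the recursion as written recomputes lower-order values many times over and is meant only to show the logic.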
We have deliberately avoided this topic so far, because its discussion would have added nothing to our understanding of the causal reasoning which underlies the theory of correlation. Now that we have reached concrete results it can no longer be postponed.

If a number of samples be taken from a universe of discourse (viz., the universe of college freshmen of America), and constants such as the Mean, the Standard Deviation, and the Coefficient of Correlation be ascertained, the results will differ from what would have been obtained if the entire "universe" had served as a basis. Such deviations from the theoretical true value are called errors of sampling. Theoretically it is possible to ascertain the probability that any given error of sampling falls within given limits. Practically, in the case of r, this is done on the basis of the assumption that distribution is normal. The conventional expression given to facilitate the computation of this probability is the so-called probable error. In and of itself, however, it merely means that the deviation of r from its "true" value is as likely to be greater than the probable error as it is to be smaller. If a given r is equal to its probable error, the chances are 1 : 1 that it would arise by chance even though the "true" value be zero. If r = 2 P.E. (probable error), these chances are 1 : 5, roughly. For r = 3 P.E., they are 1 : 23, for r = 4 P.E., 1 : 143, etc. When r = 3 P.E., r is said to be "significant." Of course, such a standard is conventional and arbitrary, and not all authorities recommend the same ratio. In any case it is well to bear in mind the meaning of this "significance."

With these considerations in mind let us now return to the consideration of our data and results. In Table 3, column 1 (see below), are given the r's of marks with each one of our tests, followed by the probable error of r. They are the coefficients of zero order.
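The probable errors and the chances just quoted may be retraced as follows. This is a sketch under stated assumptions: the probable error of r is taken to be the conventional $0.6745\,(1 - r^2)/\sqrt{N}$, a formula the text does not itself write out, with N = 92; the chances are computed from the normal curve as "one chance in so many" that a deviation exceeds k probable errors in either direction; and the function names are invented for the purpose.

```python
import math

def probable_error_r(r, n):
    """Conventional probable error of a correlation coefficient."""
    return 0.6745 * (1 - r**2) / math.sqrt(n)

def one_chance_in(k):
    """x such that a normal deviation exceeds k probable errors,
    in either direction, about one time in x."""
    z = 0.6745 * k                     # k probable errors in standard-deviation units
    p = math.erfc(z / math.sqrt(2))    # two-sided tail probability
    return 1 / p

pe_028 = probable_error_r(0.28, 92)
pe_017 = probable_error_r(0.17, 92)
chances = {k: one_chance_in(k) for k in (1, 2, 3, 4)}

print(round(pe_028, 3), round(pe_017, 3))
for k, x in chances.items():
    print(k, round(x, 1))
```

On this reckoning a coefficient equal to its probable error is exceeded by chance about one time in two, which is the same thing as chances of 1 : 1; for three and four probable errors the figures come out close to the 23 and 143 quoted above.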
In column 2 the same values and their probable errors are given after the influence of the other fourteen has been eliminated by partial correlation. These are the coefficients of the 14th order. They represent the correlation with marks of what is unique to each test.

Table No. 3

    Test                              Zero order         14th order
     2. Log. Mem. Aud.                +0.28 ± 0.065      +0.32 ± 0.063
     3. Log. Mem. Vis.                +0.17 ± 0.068      +0.04 ± 0.070
     4. Loss or Gain in No. 2         +0.17 ± 0.068      +0.26 ± 0.066
     5. Loss or Gain in No. 3         +0.10 ± 0.070      +0.03 ± 0.070
     6. Sentences Built               +0.26 ± 0.066      +0.23 ± 0.067
     7. H. Directions, Speed          +0.25 ± 0.066      +0.09 ± 0.070
     8. Con. Increment, Speed         +0.21 ± 0.067      +0.21 ± 0.067
     9. Objects Seen                  -0.12 ± 0.069      -0.23 ± 0.067
    10. Number-Checking               +0.11 ± 0.069      ±0.01 ± 0.070
    11. Opposites, Speed              +0.10 ± 0.070      +0.12 ± 0.069
    12. Numbers Hard                  +0.07 ± 0.070      ±0.04 ± 0.070
    13. Words Built                   +0.09 ± 0.070      ±0.07 ± 0.070
    14. Opposites, Accuracy           -0.03 ± 0.070      ±0.01 ± 0.070
    15. Con. Increment, Accuracy      +0.08 ± 0.070      ±0.03 ± 0.070
    16. H. Directions, Accuracy       +0.05 ± 0.070      -0.14 ± 0.069

Let us return to tests No. 2 and No. 3, the logical memory tests. $r_{12} = +0.28$, $r_{13} = +0.17$. Does this show that "Auditory Presentation" is more highly correlated with marks than Visual Presentation? Not at all, for the difference is 0.11 and the probable error of this difference is 0.094. So the chances are about even that the difference is due to chance. But if we turn to the corresponding r's of the 14th order, we find that $r_{12.3\,4\ldots16} = +0.32$ and $r_{13.2\,4\ldots16} = +0.04$. The difference is 0.28, the probable error of the difference 0.094. Hence the chances that this difference is due to fluctuation of sampling alone are 1 : 23. By conventional standards, there is a valid difference between Auditory and Visual presentation. Now we can interpret. The difference is due to something that has not been eliminated. I can think of but three possibilities.
They are: (1) a difference in subject matter; (2) speed of presentation, for in the visual presentation the subject reads at his own rate and may violate the instructions against re-reading, whereas in auditory presentation he must accept the rate of speed of the experimenter; (3) a difference specific to the sense avenue, possibly in conjunction with the previous experience of the individual. Our data do not permit of a choice between these three possibilities, but it would be a simple matter so to control these conditions in another series of tests that interpretation would be narrowed down practically to a single possibility. Of course someone else might think of other possibilities. But he should remember that the cause of the difference cannot be anything involved in any of the other tests, unless it be a very marked difference of degree, and also that elimination has been from the criterion as well as from the test. (I have not thought it necessary to mention possibilities which would come under the head of obvious control of conditions common to all careful experimentation.)

Again, $r_{12} = +0.28$, $r_{14} = +0.17$. Recalling that No. 4 is the difference between the score in test No. 2 and the score made after two weeks, we may note that the factor of "immediate memory" has been eliminated not by partial correlation, but by the method of scoring. The change is so radical that it does not seem profitable to compare No. 2 and No. 4 as modifications of the same test. We may note, however, that $r_{14}$ is not significant by conventional standards, but becomes so as a partial of the 14th order, for $r_{14.2\,3\ldots16} = +0.26$. Although it is not apparent from table 3 alone, the same thing is true of tests No. 3 and No. 5. No. 3 and No. 5 are identical with No. 2 and No. 4 respectively except as to subject matter and mode of presentation. Their corresponding values, of zero order, are $r_{13} = +0.17$ and $r_{15} = +0.10$.
Their coefficients of the 14th order are 0.04 and 0.03. But we have already seen that Auditory presentation contains everything of significance in Visual presentation, so that there is nothing left to compare after No. 2 and No. 4 have been eliminated, as they have been in the 14th order. To make the comparison we must contrast the coefficients of the 12th order. We find $r_{13.5\,6\ldots16} = +0.19$ and $r_{15.3\,6\ldots16} = +0.20$, results which are quite similar in character to what we found in the case of No. 2 and No. 4.

Now the significance of No. 4 (and No. 5) has been covered up in some way. Consulting a table of some 1300 partial coefficients which were computed in order to get our results, but which are not published, we find that $r_{14.2} = +0.24$. The same fact might have been guessed from $r_{24} = -0.19$ (see table No. 1). The significance of No. 4, then, is obscured by the fact that subjects who make a high score "immediately" tend to forget more than those who do not. No doubt this is at least in part due to the fact that they have more to forget. Of course, different results as to $r_{24}$ might have been obtained if the loss or gain had been expressed in percentage terms. (E.g., if X = 80 and X after two weeks = 60, then the loss is 20/80 = 25%.) But this would have been as arbitrary as the method chosen. Besides, and the fact is interesting, using partial correlation made us relatively independent in the matter of selecting a unit of measurement, for if we had adopted percentages and had obtained, say, $r_{24} = 0$ (instead of -0.19), we would also have had a different value for $r_{14}$ and would have had the value of $r_{14.2} = +0.24$, as before, unless other factors entered.

Returning now to the interpretation of $r_{14.2\,3\ldots16}$, we have seen that it is significant and that it is a factor in $r_{12}$ which not only was itself hidden, but also operated to lower the value of $r_{12}$. This factor cannot be subject matter, for that is the same in both cases.
It cannot be "mode of presentation," for $X_4$ is not stated with reference to the material as presented, but with respect to the material as "immediately" retained. Besides, the original mode of presentation is the same in both cases. The factor may be that of interest in the tests themselves. Interested subjects will tend to rehearse their performances and will be likely to discuss the tests with others and to compare notes. Or else it is reasonable to argue that the subjects who have seized the essential significance of the passage will have an advantage in the matter of permanent retention over those who depend on a more verbal type of memory, whereas in test No. 2 as scored, only two units out of a possible 97 are allowed for the reproduction of the gist of the passage, and 54 units are allowed for the reproduction of "ideas" which may be nothing but words. 23 Again I am obliged to say that our data do not justify me in saying more. But it is obvious that a comparison of different methods of scoring would be likely to give a definite conclusion to our problem.

Let us now consider No. 8, the Constant Increment test. We note that $r_{18} = +0.21$ and $r_{18.2\,3\ldots16} = +0.21$. Both r's are significant, but it is difficult to draw a conclusion beyond the restatement of the fact that No. 8 contains a significant factor not contained by the other fourteen. As we shall see, even the 1300 coefficients referred to above do not contain the necessary information which would enable us to push analysis much further. But it is quite unthinkable that such a complex function as adding a constant increment under test conditions should resist analysis. Indeed analysis is possible on the same lines we have pursued so far. But now the amount of arithmetic necessary for analysis might easily become unthinkable. It is at this point that the method devised by the writer will enable us to proceed. Table No.
4, given below, is somewhat analogous to table No. 2. It shows the effect on the relation of marks to No. 8 of the elimination of each of the other fourteen tests for some one order. It is unlike table No. 2 in that elimination is not consistently successive or continuous. The reason for this is practical. To have the order of elimination successive and continuous for each of the fifteen correlations of marks with a test would involve the computation of 8400 coefficients, instead of the 1300 which we have. (See appendix.)

23 Kitson, op. cit., p. 25.

Table No. 4

    $r_{18}$                  = +0.21
    $r_{18.16}$               = +0.21
    $r_{18.15\,16}$           = +0.19
    $r_{18.14\,15\,16}$       = +0.20
    $r_{18.13\ldots16}$       = +0.18
    $r_{18.12\ldots16}$       = +0.19
    $r_{18.11\ldots16}$       = +0.18
    $r_{18.10\ldots16}$       = +0.17
    $r_{18.9\ldots16}$        = +0.18
    $r_{18.2}$                = +0.21
    $r_{18.2\,3}$             = +0.21
    $r_{18.2\,3\,4}$          = +0.21
    $r_{18.2\,3\,4\,5}$       = +0.21
    $r_{18.2\ldots6}$         = +0.13
    $r_{18.2\ldots7}$         = +0.12

We note that $r_{18}$ fluctuates about its original value except at the point where No. 6 (Sentences Built) is eliminated, and there it drops to +0.13. I am unable to offer any very convincing interpretation as to the factor which is alike in these two tests. Instead we will face the question how $r_{18.2\ldots16}$ regains, so to speak, its original value. Clearly table No. 4 does not furnish an answer. Neither do any of the 1300 coefficients from which it is an excerpt. The reason is as yet hidden somewhere in the infinite complexities of a situation which involves sixteen variables. Our only hope of finding this factor, without an inordinate amount of arithmetic, lies in reducing the number of variables. Now the factor, be it what it may, must be significantly related to marks. Therefore, if we can find the variables which, when combined, include everything which is so related, and if we can exclude those which merely duplicate some factor or set of factors, we will be that much nearer to the solution of the problem.
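Every elimination discussed so far is built up, one variable at a time, from the ordinary first-order formula for partial correlation. The following Python sketch assumes that standard formula (it is not printed in this section) and applies it to rounded zero-order values quoted earlier in the text; the function name is illustrative only.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z eliminated
    (the standard formula, assumed here)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Marks (1) and Loss or Gain (4) with Log. Mem. Aud. (2) eliminated,
# from r_14 = +0.17, r_12 = +0.28, r_24 = -0.19.
r14_2 = partial_r(0.17, 0.28, -0.19)

# Marks (1) and H. Directions Accuracy (16) with No. 2 eliminated,
# from r_1,16 = +0.05, r_12 = +0.28, r_2,16 = +0.20.
r1_16_2 = partial_r(0.05, 0.28, 0.20)

print(f"r_14.2    = {r14_2:+.3f}")
print(f"r_1,16.2  = {r1_16_2:+.3f}")
```

Because the inputs are rounded to two places, the results agree with the published values only to about that precision. Coefficients of higher order follow by applying the same function to partials of the next lower order.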
Let us recall that R, the coefficient of Multiple Correlation, is a measure of the relation of a number of combined variables to another variable. $R_{1(2\,3\,4\ldots16)}$, the relation of our fifteen tests to marks, is +0.55. If now we combine the five variables having the highest coefficients of the 14th order (see table No. 3, col. No. 2) we find by computation that $R_{1(2\,4\,6\,8\,9)} = +0.52$. The difference between the two R's is 0.03, probable error 0.09 (roughly), and therefore not only is not significant, but the chances are 5 : 1 in favor of its being due to chance. These five variables therefore contain everything that is significantly related to Marks, including the factor we are looking for. So if we eliminate successively the other four of them, No. 2 (Log. Mem. Aud.), No. 4 (Loss or Gain in No. 2), No. 6 (Sentences Built), and No. 9 (Objects Seen), from $r_{18}$, the correlation of marks with No. 8 (Const. Increment), our problem will be solved. Table No. 5 contains these values (see below). Inspection of it shows at once that $r_{18}$ drops, as before, when No. 6 is eliminated, but rises again when No. 9 is eliminated. The factor we are looking for is common to No. 8 and No. 9, and is not in No. 2, No. 4, or No. 6.

Table No. 5

    $r_{18}$                = +0.21
    $r_{18.2}$              = +0.21
    $r_{18.2\,4}$           = +0.21
    $r_{18.2\,4\,6}$        = +0.14
    $r_{18.2\,4\,6\,9}$     = +0.19

What is this factor? Well, I regret to say that we may save ourselves the trouble of interpretation, for the difference we have investigated is not based on a sufficient number of observations to be "significant." I have isolated this "factor" in order to exhibit what appears to me as the beauty of the analysis and in order to illustrate what can be done by means of indirect analysis (analysis by exclusion), by means of the manipulation of R, a method which, so far as I know, has not been suggested elsewhere.

Similar remarks are in order for tests No. 6 and No. 9. In both cases the correlation of the 14th order is significant.
In both cases we have a fairly high degree of probability that each test possesses something, peculiar to it alone, significantly correlated with marks. Beyond this point our 92 observations will not permit us to go statistically. We may go further, if we like, by way of a priori reasoning which has perhaps suggestive value. Moreover this value is enhanced by the fact that the elimination of fourteen fairly diverse tests limits us to suggestions which must be fairly concrete and, which is more to the point, verifiable by future investigations. Before the close of this paper I shall be guilty of a little speculation of this kind.

So far we have been occupied exclusively with analysis. To be sure, the paucity of our data, combined with the low value of our correlations, did not permit us to go very far. But I trust that the sort of thing which might be attained with more extensive data is clear. We may now turn to the subject of diagnosis and prognosis.

We have seen just now that five tests carry practically all the meaning, with reference to marks, contained in the fifteen tests. It follows that they carry also all the diagnostic value. We have shown in an earlier part of this paper that the value of R which may arise owing to fluctuations of sampling alone may easily become unpleasantly large. We found that for 16 variables (15 tests and a criterion) and 92 observations the "probable" value of R is +0.40 even when none of the tests have any actual relation to the criterion or to each other. Our result of +0.55, compared to this, has but little significance, and this is in itself a sufficient reason why it has little value for diagnosis. (I dare not say no value, on account of the argument we hear so often that an infinitesimal part of a loaf is better than no bread at all.) The case is more favorable if we combine our five "best" tests. Here the actual R is +0.52, and the corresponding "chance" value is +0.21.
The difference, at any rate, is significant, and, if anyone cares to do it, he may compute the weights which these five tests have in their regression equation and use the equation for the "practical" diagnosis of the ability of individual students. 24

24 A short method for computing weights will be found in the appendix.

But considerations of a more familiar sort show us the trivial nature of the diagnostic value, even in this case, in still more drastic fashion. Let us assume that our R = +0.55 not only is significant, but is absolutely correct for the entire "population" of freshmen in America. What then would be its diagnostic value? The probable error of R is now zero, but the "Standard Error of Estimate" remains. In discussing the logic of partial correlation we brought out the significance of this expression as being a "Residual." Now such a Residual, if it remains on our hands after we have estimated on the basis of all the information at hand (be that information supplied by one or by fifty tests), is an Error of Estimate due to unknown causes. The standard deviation of all such errors is therefore a measure of the accuracy attainable on the basis of such information. In terms of the standard deviation of the variable we are estimating (say academic marks), it is $\sqrt{1 - r^2}$. As r approaches zero, $\sqrt{1 - r^2}$ approaches 1. Or, in other words, all of our estimates tend to coincide at the mean. Or, in still other words, in the absence of relation our average error of estimate will be least if we estimate all deviations as zero. Now when r = 0.55, $\sqrt{1 - r^2} = 0.84$. The gain in diagnostic value is thus very slight. 25

The case now stands as follows. The values of R which we found had no significance for a combination of fifteen tests, and only a moderate significance for a combination of five tests. But even if these values were absolutely correct, their diagnostic value would be very low.
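The figure of 0.84 quoted above follows directly from the expression for the error of estimate. The short Python sketch below tabulates that expression for a few illustrative values of r; only r = 0.55 comes from the text, the other values and the function name are chosen here for illustration.

```python
import math


def relative_error_of_estimate(r):
    """Standard error of estimate in units of the standard deviation of the
    variable being estimated, i.e. sqrt(1 - r^2)."""
    return math.sqrt(1.0 - r * r)


if __name__ == "__main__":
    # r = 0.55 is the multiple correlation discussed in the text; the other
    # values are illustrative.
    for r in (0.0, 0.25, 0.55, 0.80, 0.95, 1.0):
        print(f"r = {r:.2f}   error of estimate = {relative_error_of_estimate(r):.2f}")
```

The table makes the point of the argument visible: the error of estimate stays near unity until r is quite high.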
On the other hand our analysis of the Logical Memory test retains such significance as we showed it to have and, even though the lack of material did not permit of pushing the statistical analysis very far, it has focussed our attention on five of the tests in a way which could not have been anticipated from the raw material, and has, in some measure, cleared the way for further investigations.

25 The formula for the error of estimate is, of course, perfectly familiar. But, whatever may be true of others, I had never realized how very rapidly it increases in value as r becomes less than unity. My attention was called to the fact by Mr. B. Ruml.

Is this then the last word that can be said for "practical" diagnosis? By no means. To be sure, if diagnosis of individual ability be the end in view, the material we have investigated is worthless. Furthermore, I believe that this is true of all similar work done by others, in so far as their results can be duplicated by other investigators. But then, if diagnosis of individual ability for the purpose of educational guidance of the individual student were the only use to which tests and correlations could be put, I would not give the subject five seconds of my time. The motivation of this position would lead us too far afield, and we may waive discussion. But the one who really needs guidance is the educator. And, even if the reader cannot agree with the first point, he will surely assent to the second. Now if it could be shown that a purely verbal type of memory has a correlation of zero with academic achievement at one institution or in one department, 0.25 at another, 0.50 at still another, etc., it would throw very little light on the "General Intelligence" of any individual student, but it would furnish a world of "guidance" to the educator.
For, assuming that it is undesirable to encourage and develop the "verbal" type, he would be able to direct his attention and energy to the task of making other institutions or departments conform to the standard set by the first. It is to facts in the mass that statistics properly applies. It is there it can and should be applied.

IV. Conclusions Based on the Data

It remains to regale the reader with the a priori speculation with which he has been threatened. We may conveniently do this by casting it into the form of "conclusions." At the same time we will restate the more valid conclusions which have resulted from the discussion.

(1) The whole collection of tests has a low diagnostic value.

(2) This is due only in part to the low value of the individual correlations. It is due very largely to the enormous amount of duplication. The fifteen tests are, to a large extent, all measures of the same thing or things. This fact enables us to concentrate our attention on five tests. (Hereafter, in discussing the correlations of the various tests, reference will be to the coefficients of the highest order, unless specifically stated otherwise.)

(3) The Logical Memory Test: The probability is 1300 : 1 that test No. 2 is significant. The probability is 23 : 1 that auditory presentation is "superior" to visual presentation. We need not repeat the discussion. We must not however interpret it to the effect that the "visual" test is of no value. It becomes superfluous when auditory presentation is used.

The probability is 140 : 1 that test No. 4 (loss or gain) is significant. This is a clue toward the differentiation of different types of memory. Comparison of various modes of scoring suggests itself. E.g., we may compare, in No. 2, an evaluation based simply on the number of words correctly repeated with other methods. The significance of No. 4 may also be due to the factor of "interest."
To a lesser degree of probability, similar statements can be made about test No. 5.

(4) The probability is about 25 : 1 that test No. 8 (Const. Increment) is significant. The instructions for this test emphasize speed more than those for any of the others where speed is a factor at all. Moreover the activity is far more mechanical and automatic, approximating simple reaction time after practice. Indeed for the faster subjects the limit of speed seems to be physiological rather than mental. Even some subjects who are only fairly fast give this impression, i.e., it looks as if they cannot talk as fast as they can add. This physiological reaction time may be the significant factor. It might be investigated by using 1 as the increment, adding only to single digits, and reducing the number to fifty. On the other hand the test, as given, is monotonous and long. It is hard work, and fast subjects are a little out of breath when they finish. As has been said, the raw data were not available, but I have noted elsewhere that some subjects increase their speed as they go along and spurt at the end, while others become slower and slower. By contrasting speed, say in four quarters, we might get a measure of some of the so-called character qualities, such as effort, perseverance, etc. Some such thing may be the significant factor in the Constant Increment Test. Of course, all this is highly speculative.

(5) The probability is about 30 : 1 that test No. 9 (Objects Seen) has negative significance. Of the ten objects presented, the fountain-pen, penfiller and inkwell, the envelope and the two cent stamp, the pencil and the ruler, are fairly well associated. The associations for the 25 cent piece, the maroon ribbon, and the key are more remote. The mean for all subjects was a little over seven, so that those who recalled more than the number of objects which are closely associated tended to have low marks.
Now it is well known that the capacity to note and to retain a mass of irrelevant details is very great in the hypnotic trance, and many authorities incline to the belief that the trance is simply a state of diffuse attention. It might very well be that a selective, discriminating type of mind will obtain low scores in this test. The suggestion is of course capable of experimental investigation by varying the nature of the objects.

(6) There is a probability of about 30 : 1 that test No. 6 (Sentences Built) is significant. I have no comment to offer.

(7) Accuracy appears to be of no significance except possibly as it enters into the logical memory tests. In the case of No. 16 (Hard Directions, Accuracy) there is even a probability of 5 : 1 that it has negative significance. Now it is fairly well established that speed and accuracy are positively correlated in many activities, and this is also borne out by our data. Thus $r_{8\,15} = +0.40$ (Const. Inc.), $r_{7\,16} = +0.25$ (H. Directions) and $r_{11\,14} = +0.18$ (Opposites). On the other hand the relation is probably inverse, within limits, for any given individual. Of course, our tests cannot show the deviation of an individual from his own point of maximum efficiency. So it would seem that in so far as speed is significant, it has associated with it all of the significance of accuracy. We have already seen that, in so far as speed is significant, it seems to be most adequately represented by the Constant Increment test.

If it is hard to say why a test does correlate, it is even more difficult to indicate why it does not. We may say in a general way that the ten tests whose coefficients of the highest order give no indication of significance are very likely influenced by general factors, such as ability to adapt to novel situations (such as the entire test situation is for the average subject), by interest, effort, etc.
We can only say that in so far as they have value they are more adequately represented by the five tests with the highest coefficients of the 14th order.

This is all that I find it useful to say of the specific results obtained in the present study. We will now turn to an exposition of the mechanical technique involved in analyzing a complex situation.

V. Appendix: A Contribution to the Mechanical Technique of Partial Correlation

In what follows it is assumed that the reader already has a working knowledge of the theory of correlation. To proceed on any other assumption would be impossible in a work of the scope of the present one. On the other hand it has been my aim to arrange the steps in such a way that anyone who understands the terminology can follow the steps mechanically, or at least by symmetry, without having to inquire about the whys and wherefores.

The general problem before us is this: Given a dependent variable and a number of independent ones, how can we obtain a maximum of information with a minimum of arithmetic? In so far as the schema devised by me is applicable to such a problem it can be divided into three sub-problems, which however form successive stages of what really is a single operation. These three stages are: (1) The finding of the coefficient of Multiple correlation. (2) The finding of the coefficients of partial correlation, with reference to the dependent variable, of the highest order. (3) The finding of the coefficients of regression (weights) for estimating the dependent variable.

For the sake of simplicity we will assume that all standard deviations of zero order are 1. Also, in order to avoid the cumbersome use of "n," let us suppose that we are dealing with five independent variables, and one dependent one. The method can be extended to any number whatever by symmetry.
Problem I

Find $R_{1(23456)}$. This may be done directly from the equation 26

$$1 - R^2_{1(23456)} = (1 - r^2_{12})(1 - r^2_{13.2})(1 - r^2_{14.23})(1 - r^2_{15.234})(1 - r^2_{16.2345}) \qquad \text{No. 1}$$

26 Yule, Intro., p. 248.

Beginning with the raw data, the work proceeds as follows:

(1) Find all coefficients of zero order.

(2) Write equation No. 1 as above, and rewrite it, reversing the order of subscripts as follows:

$$1 - R^2_{1(23456)} = (1 - r^2_{16})(1 - r^2_{15.6})(1 - r^2_{14.56})(1 - r^2_{13.456})(1 - r^2_{12.3456}) \qquad \text{No. 2}$$

(3) Compute the coefficients of equation No. 1 and No. 2. For this purpose there will be needed all coefficients having the secondary subscripts in these two equations, forty all told. That is, for equation No. 1, we will need all coefficients of type $r_{\_.2}$, $r_{\_.23}$, $r_{\_.234}$, $r_{\_.2345}$. It is convenient to prepare a list of these coefficients on which their values can be entered as fast as computed. Such a list can be written by symmetry with a minimum of thought, and about as fast as one can write. Giving subscripts only, it is:

Table I

    13.2                                   15.6
    14.2   14.23                           14.6   14.56
    15.2   15.23   15.234                  13.6   13.56   13.456
    16.2   16.23   16.234   16.2345        12.6   12.56   12.456   12.3456
    34.2                                   25.6
    35.2                                   24.6   24.56
    36.2                                   23.6   23.56   23.456
    45.2   45.23                           35.6
    46.2   46.23                           34.6   34.56
    56.2   56.23   56.234                  45.6

(4) Compute $1 - R^2$ according to equations No. 1 and No. 2. The two values should check. They afford an independent check on all of the previous arithmetic with exception of the computation of the coefficients of zero order.

(5) Compute R, or, preferably, look it up. 27

27 Use "Tables for Statisticians and Biometricians," K. Pearson, Table 8, pp. 20-21.

Problem II

To find the coefficients of the fourth order with reference to the dependent variable. There are 5 such coefficients. They are:

$$r_{12.3456} \qquad r_{13.2456} \qquad r_{14.2356} \qquad r_{15.2346} \qquad r_{16.2345}$$

$r_{12.3456}$ and $r_{16.2345}$ have been found in Problem No. 1. The other three can be found indirectly from the following consideration. Let $R_{1(-a)}$ be the Multiple coefficient showing the relation of $X_1$ to a combination of all the independent variables except $X_a$, e.g., $R_{1(-2)}$ is $R_{1(3456)}$; then

$$\frac{1 - R^2_{1(abc \ldots n)}}{1 - R^2_{1(-a)}} = 1 - r^2_{1a.bc \ldots n} \qquad \text{No. 3}$$

so that

$$\frac{1 - R^2_{1(23456)}}{1 - R^2_{1(2456)}} = 1 - r^2_{13.2456}$$

$$\frac{1 - R^2_{1(23456)}}{1 - R^2_{1(2356)}} = 1 - r^2_{14.2356}$$

$$\frac{1 - R^2_{1(23456)}}{1 - R^2_{1(2346)}} = 1 - r^2_{15.2346}$$

We may write,

$$1 - R^2_{1(2456)} = (1 - r^2_{16})(1 - r^2_{15.6})(1 - r^2_{14.56})(1 - r^2_{12.456})$$

$$1 - R^2_{1(2356)} = (1 - r^2_{16})(1 - r^2_{15.6})(1 - r^2_{13.56})(1 - r^2_{12.356})$$

$$1 - R^2_{1(2346)} = (1 - r^2_{12})(1 - r^2_{13.2})(1 - r^2_{14.23})(1 - r^2_{16.234})$$

Now all of these coefficients are in table I with exception of $r_{12.356}$, which may be computed from $r_{12.56}$, $r_{13.56}$, and $r_{23.56}$, which also are in table I. We are now able to compute all of the coefficients of the fourth order from equations of type No. 3.

It will be noted that coefficients found indirectly from equation No. 3 are indeterminate as to sign. However, unless the numerical magnitude of such coefficients is negligible, it will usually be possible to determine the sign by inspection.

Problem III

Find the coefficients of regression. The familiar expression for this coefficient is

$$b_{12.34 \ldots n} = r_{12.34 \ldots n} \, \frac{\sigma_{1.234 \ldots n}}{\sigma_{2.134 \ldots n}}$$

We need therefore, in addition to the correlation coefficients of the fourth order, all of the six standard deviations of the fifth order. These should be written as follows:

$$\sigma_{1.23456} = \left[ (1 - r^2_{16})(1 - r^2_{15.6})(1 - r^2_{14.56})(1 - r^2_{13.456})(1 - r^2_{12.3456}) \right]^{1/2}$$

$$\sigma_{2.13456} = \left[ (1 - r^2_{26})(1 - r^2_{25.6})(1 - r^2_{24.56})(1 - r^2_{23.456})(1 - r^2_{12.3456}) \right]^{1/2}$$

$$\sigma_{3.12456} = \left[ (1 - r^2_{36})(1 - r^2_{35.6})(1 - r^2_{34.56})(1 - r^2_{23.456})(1 - r^2_{13.2456}) \right]^{1/2}$$

$$\sigma_{4.12356} = \left[ (1 - r^2_{46})(1 - r^2_{45.6})(1 - r^2_{34.56})(1 - r^2_{24.356})(1 - r^2_{14.2356}) \right]^{1/2}$$

$$\sigma_{5.12346} = \left[ (1 - r^2_{25})(1 - r^2_{35.2})(1 - r^2_{45.23})(1 - r^2_{56.234})(1 - r^2_{15.2346}) \right]^{1/2}$$

$$\sigma_{6.12345} = \left[ (1 - r^2_{26})(1 - r^2_{36.2})(1 - r^2_{46.23})(1 - r^2_{56.234})(1 - r^2_{16.2345}) \right]^{1/2}$$

The only new coefficient in all of the above expressions is $r_{24.356}$. It can be found from $r_{23.56}$, $r_{24.56}$, and $r_{34.56}$, table I. The above six expressions for the standard deviations of the fifth order are written in such a way that the dependent variable will enter into the last term as a primary subscript, and from this point the variables are eliminated as nearly as possible in the same order in which they have been eliminated in equations No. 1 and No. 2, choosing by inspection the equation which is seen to be best for the purpose.

From this point on the computation of the regression coefficient proceeds in the usual way.

The method outlined will, of course, lend itself to the solution of a variety of problems. The only point at all novel is expressed by equation No. 3, problem No. 2. In spite of its very great simplicity, I have never seen it in print, probably because it has no theoretical interest. It does however afford a very useful shortcut to the arithmetic, and the rest of the schema is simply a systematic exploitation of this fact.

In comparing this schema with the one given in Yule's "Introduction," it should be borne in mind that Yule's schema provides for the finding of all possible relations. So far as I know, there is no short cut to this problem. I have simply taken advantage of the fact that in "test" work interest centers, or should center, on one or two dependent variables, not themselves tests.

The schema will show to advantage, however, if contrasted with Kelley's. 28 Kelley faces practically the same problem as I, except that he scarcely mentions analysis and puts the emphasis exclusively on diagnosis and prognosis.
He states 29 that the number of coefficients of partial correlation to be computed is 2 in the case of three variables, 15 in the case of 4 variables, 36 in the case of 5 variables, 78 in the case of 6 variables, etc. Our schema requires 2 coefficients in the case of 3 variables, 8 in the case of 4 variables, 22 in the case of 5 variables, 45 in the case of 6 variables, etc. Moreover, in our method, the symmetry of table I, which accomplishes practically all of the work, is very easily seen and reproduced, and it requires only a very moderate amount of practice and ingenuity to write the expressions in problems No. 1 and No. 2 to the best advantage. Over against this, Kelley does not state any guiding principle for selecting the coefficients to be computed to the best advantage and I am quite unable to see the symmetry without such aid. For example I am quite unable to say how many coefficients Kelley would need for, say, 8 variables. But then I am quite prepared to find that that is due to my own stupidity.

Finally I wish to call attention again to the very useful character of Kelley's tables. Using these tables in conjunction with a "millionaire" calculating machine, 75 coefficients per hour, correct to two places, or 40 coefficients, correct to three places, can easily be computed.

28 Op. cit., p. 23, this paper.
29 Op. cit., p. 14.
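The check in step (4) of Problem I and the shortcut of equation No. 3 can both be verified numerically. The Python sketch below is an illustration only: the correlations come from made-up random data (not the data of this study), the partial coefficients are built by the standard recursion rather than by the tabular schema of the appendix, and all function names are hypothetical.

```python
import math
import random


def sample_correlations(n_vars=6, n_obs=92, seed=1):
    """Zero-order correlations of made-up data, keyed by (i, j), 1-based."""
    rng = random.Random(seed)
    common = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    cols = [[c + rng.gauss(0.0, 1.0) for c in common] for _ in range(n_vars)]

    def corr(x, y):
        mx, my = sum(x) / n_obs, sum(y) / n_obs
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        syy = sum((b - my) ** 2 for b in y)
        return sxy / math.sqrt(sxx * syy)

    return {(i + 1, j + 1): corr(cols[i], cols[j])
            for i in range(n_vars) for j in range(n_vars) if i != j}


def partial(r0, i, j, eliminated=()):
    """Partial correlation of i and j, eliminating the listed variables
    one at a time by the standard first-order recursion."""
    if not eliminated:
        return r0[(i, j)]
    *rest, k = eliminated
    rij = partial(r0, i, j, rest)
    rik = partial(r0, i, k, rest)
    rjk = partial(r0, j, k, rest)
    return (rij - rik * rjk) / math.sqrt((1 - rik ** 2) * (1 - rjk ** 2))


def one_minus_R2(r0, dep, order):
    """1 - R^2 as a product of (1 - r^2) terms, eliminating in `order`."""
    prod = 1.0
    for pos, var in enumerate(order):
        prod *= 1.0 - partial(r0, dep, var, order[:pos]) ** 2
    return prod


r0 = sample_correlations()
eq1 = one_minus_R2(r0, 1, (2, 3, 4, 5, 6))   # equation No. 1
eq2 = one_minus_R2(r0, 1, (6, 5, 4, 3, 2))   # equation No. 2, order reversed


def eq3_gap(a):
    """Absolute difference between the two sides of equation No. 3."""
    others = tuple(v for v in (2, 3, 4, 5, 6) if v != a)
    lhs = eq1 / one_minus_R2(r0, 1, others)
    rhs = 1.0 - partial(r0, 1, a, others) ** 2
    return abs(lhs - rhs)


print("No. 1 vs No. 2:", abs(eq1 - eq2))
print("largest gap in No. 3:", max(eq3_gap(a) for a in (2, 3, 4, 5, 6)))
```

Both differences should vanish up to rounding error, which is the sense in which the two products "check." Note that equation No. 3 yields only the square of the partial coefficient, so its sign must be settled separately, as remarked in Problem II.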