THE GARY PUBLIC SCHOOLS

MEASUREMENT OF CLASSROOM PRODUCTS

By STUART A. COURTIS

GENERAL EDUCATION BOARD
61 Broadway
New York
1919

SURVEY OF THE GARY SCHOOLS

On invitation of the Board of Education and the Superintendent of Schools of Gary, Indiana, a survey of the Gary schools was made by the General Education Board. It is published in eight parts, as follows:

The Gary Schools: A General Account, by Abraham Flexner and Frank P. Bachman (25 Cents)
Organization and Administration, George D. Strayer and Frank P. Bachman (15 Cents)
Costs, Frank P. Bachman and Ralph Bowman (25 Cents)
Industrial Work, Charles R. Richards (25 Cents)
Household Arts, Eva W. White (10 Cents)
Physical Training and Play, Lee F. Hanmer (10 Cents)
Science Teaching, Otis W. Caldwell (10 Cents)
Measurement of Classroom Products, Stuart A. Courtis (30 Cents)

Any report will be sent postpaid on receipt of the amount above specified.

COPYRIGHT, 1919, BY General Education Board

CONTENTS

Introduction
Author's Preface
I. General Statement
II. Tests and Testing Conditions
III. Handwriting
    §1. General Results
    §2. Critical Discussion
IV. Spelling
    §1. General Results
    §2. Critical Discussion
V. Arithmetic
    §1. General Results
    §2. Critical Discussion
VI. Composition
    §1. General Results
    §2. Critical Discussion
VII. Reading
    §1. General Results
    §2. Critical Discussion
VIII. Factors Affecting Performance
IX. Conclusions
X. Appendix A
XI. Appendix B

INTRODUCTION

The Gary Plan

In the last few years both laymen and professional educators have engaged in a lively controversy as to the merits and defects, advantages and disadvantages of what has come to be called the Gary idea or the Gary plan. The rapidly increasing literature bearing on the subject is, however, deficient in details and too often partisan in tone. The present study was undertaken by the General Education Board at the request of the Gary school authorities for the purpose of presenting an accurate and comprehensive account of the Gary schools in their significant aspects.

In the several volumes in which the main features of the Gary schools are separately considered, the reader will observe that, after presenting facts, each of the authors discusses or — in technical phrase — attempts to evaluate the Gary plan from the angle of his particular interest. Facts were gathered in a patient, painstaking, and objective fashion; and those who want facts, and facts only, will, it is believed, find them in the descriptive and statistical portions of the respective studies. But the successive volumes will discuss principles, as well as state facts. That is, the authors will not only describe the Gary schools in the frankest manner, as they found them, but they will also endeavor to interpret them in the light of the large educational movement of which they are part. An educational conception may be sound or unsound; any particular effort to embody an educational conception may be adequate or inadequate, effective or ineffective. The public is interested in knowing whether the Gary schools as now conducted are efficient or inefficient; the public is also interested in knowing whether the plan as such is sound or unsound. The present study tries to do justice to both points.

What is the Gary plan?
Perhaps, in the first instance, the essential features of the Gary plan can be made clear, if, instead of trying to tell what the Gary plan is, we tell what it is not. Except for its recent origin and the unusual situation as respects its foreign population, Gary resembles many other industrial centers that are to be found throughout the country. Now, had Gary provided itself with the type of school commonly found in other small industrial American towns, we should find there half a dozen or more square brick "soap-box" buildings, each accommodating a dozen classes pursuing the usual book studies, a playground, with little or no equipment, perhaps a basement room for manual training, a laboratory, and a cooking room for the girls. Had Gary played safe, this is the sort of school and school equipment that it would now possess. Provided with this conventional school system, the town would have led a conventional school life — quiet, unoffending, and negatively happy — doing as many others do, doing it about as well as they do it and satisfied to do just that.

As contrasted with education of this meager type, the Gary plan is distinguished by two features, intimately connected with each other:

First — the enrichment and diversification of the curriculum;
Second — the administrative device that, for want of a better name, will be tentatively termed the duplicate school organization.

These two features must first be considered in general terms, if the reader is to understand the detailed description and discussion.

As to the curriculum and school activities.

While the practice of education has in large part continued to follow traditional paths, the progressive literature of the subject has abounded in constructive suggestions of far-reaching practical significance. Social, political, and industrial changes have forced upon the school responsibilities formerly laid upon the home.
Once the school had mainly to teach the elements of knowledge; now the school is charged with the physical, mental, and social training of the child. To meet these needs a changed and enriched curriculum, including community activities, facilities for recreation, shop work, and household arts, has been urged on the content side of school work; the transformation of school aims and discipline on the basis of modern psychology, ethics, and social philosophy has been for similar reasons recommended on the side of attitude and method.

These things have been in the air. Every one of them has been tried and is being practised in some form or other, somewhere or other. In probably every large city in the country efforts have been made, especially in the more recent school plants, to develop some of the features above mentioned. There has been a distinct, unmistakable, and general trend toward making the school a place where children "live" as well as "learn." This movement did not originate at Gary; nor is Gary its only evidence. It is none the less true that perhaps nowhere else have the schools so deliberately and explicitly avowed this modern policy. The Gary schools are officially described as "work, study, and play" schools — schools, that is, that try to respond adequately to a many-sided responsibility; how far and with what success, the successive reports of the Gary survey will show.

It must not, however, be supposed that the enriched curriculum was applied in its present form at the outset or that it is equally well developed in all the Gary schools. Far from it. There has been a distinct and uneven process of development at Gary; sometimes, as subsequent chapters will show, such rapid and unstable development that our account may in certain respects be obsolete before it is printed.
When the Emerson school was opened in 1909, the equipment in laboratories, shops, and museums, while doubtless superior to what was offered by other towns of the Gary type, could have been matched by what was to be found in many of the better favored larger towns and cities at the same period. The gymnasium, for example, was not more than one third its present size; the industrial work was not unprecedented in kind or extent; the boys had woodwork, the girls cooking and sewing. But progress was rapid: painting and printing were added in 1911; the foundry, forge, and machine shop in 1913. The opportunities for girls were enlarged by the addition of the cafeteria in 1913. The auditorium reached its present extended use as recently as the school year 1913-14. The Froebel school, first occupied in the fall of 1912, started with facilities similar to those previously introduced piecemeal into the Emerson.

These facilities, covering in their development a period of years, represent the effort to create an elementary school more nearly adequate to the needs of modern urban life. The curriculum is enriched by various activities in the fields of industry, science, and recreation. Questions as to the efficiency with which these varied activities have been administered will be discussed by the various contributors to the present study. Meanwhile, it is perhaps only fair to point out that the modern movement calls not only for additions to, but eliminations from, the curriculum and for a critical attitude toward the products of classroom teaching. How far, on the academic side, the Gary schools reflect this aspect of the modern movement will also presently appear.

The administrative device — the "duplicate" organization, noted above as the second characteristic feature of the Gary plan — stands on a somewhat different footing, as the following considerations make plain.

Once more, Mr.
Wirt was not the inventor of the intensive use of school buildings, though he was among the first — if not the very first — to perceive the purely educational advantage to which the situation could be turned. The rapidity with which American cities have grown has created a difficult problem for school administrators — the problem of providing space and instruction for children who increase in number faster than buildings are constructed. The problem has been handled in various ways. In one place, the regular school day has been shortened and two different sets of children attending at different hours have been taught daily in one building and by one group of teachers. Elsewhere, as in certain high schools, a complete double session has been conducted. The use of one set of schoolrooms for more than one set of children each day did not therefore originate at Gary.

Another point needs to be considered before we discuss the so-called duplicate feature of the Gary plan. In American colleges, subjects have commonly been taught by specialists, not by class teachers. The work is "departmentalized" — to use the technical term. There is a teacher of Latin, a teacher of mathematics, a teacher of physics, who together instruct every class — not a separate teacher of each class in all subjects. Latterly, departmentalization has spread from the college into the high school, until nowadays well organized high schools and the upper grades of elementary schools are quite generally "departmentalized," i.e., organized with special teachers for the several subjects, rather than with one teacher for each grade.

Out of these two elements, Gary has evolved an administrative device, the so-called duplicate school, which, from the standpoint of its present educational significance, does indeed represent a definite innovation.
For the sake of clearness, it will be well to explain the theory of the duplicate school by a simplified imaginary example:

Let us suppose that elementary school facilities have to be provided for, say, 1,600 children. If each class is to contain a maximum of 40 children, a schoolhouse of 40 rooms would formerly have been built, with perhaps a few additional rooms, little used, for special activities; except during the recess (12 to 1:30) each recitation room would be in practically continuous use in the old-line subjects from 9 to 3:30, when school is adjourned till next morning. A school plant of this kind may be represented by Figure I, each square representing a schoolroom.

FIGURE I REPRESENTS OLD-FASHIONED SCHOOLHOUSE
40 rooms for 40 classes, of 40 children each, i.e., facilities for the academic instruction of 1,600 children. A school yard and an extra room or two, little used, for special activities, are also usually found.

The "duplicate" school proposes a different solution. Instead of providing 40 classrooms for 40 classes, it requires 20 classrooms, capable of holding 800 children; and further, playgrounds, laboratories, shops, gardens, gymnasium, and auditorium, also capable of holding 800 children. If, now, 800 children use the classrooms while 800 are using the other facilities, morning and afternoon, the entire plant accommodates 1,600 pupils throughout the school day; and the curriculum is greatly enriched, since, without taking away anything from their classroom work, they are getting other branches also. A school thus equipped and organized may be represented by Figure II, in which A represents 20 classes taking care of 40 children each (800 children), and B represents special facilities taking care of 800 children. As A and B are in simultaneous operation, 1,600 children are cared for.

This method of visualizing the "duplicate" school serves to correct a common misconception.
The plan aims to intensify the use of schoolrooms; yet it would be incorrect to say that 20 classrooms, instead of 40, as under the old plan, accommodate 1,600 children. For while the number of classrooms has been reduced from 40 to 20, special facilities of equal capacity have been added in the form of auditorium, shops, playground, etc. The 20 classrooms apparently saved have been replaced by special facilities of one kind or another. The so-called duplicate organization and the longer school day make it possible to give larger facilities to twice as many children as the classrooms alone would accommodate. The duplicate school, as developed at Gary, is not therefore a device to relieve congestion or to reduce expense, but the natural result of efforts to provide a richer school life for all children.

FIGURE II REPRESENTS THE GARY EQUIPMENT
A: 20 classrooms for academic instruction of 20 classes of 40 children each (800 children) in the morning hours and an equal number in the afternoon (1,600 in all daily).
B: Special facilities, taking care of 800 children in the morning hours and an equal number in the afternoon hours (1,600 in all daily): Auditorium; Shops; Laboratories; Playground, gardens, gymnasium and library.

The enriched curriculum and the duplicate organization support each other. The social situation requires a scheme of education fairly adequate to the entire scope of the child's activities and possibilities; this cannot be achieved without a longer school day and a more varied school equipment. The duplicate school endeavors to give the longer day, the richer curriculum, and the more varied activities with the lowest possible investment in, and the most intensive use of, the school plant. The so-called duplicate school is thus a single school with two different types of facilities in more or less constant and simultaneous operation, morning and afternoon.

Such is the Gary plan in conception.
What about the execution? Is it realized at Gary? Does it work? What is involved as respects space, investment, etc., when ordinary classrooms are replaced by shops, playgrounds, and laboratories? Can a given equipment in the way of auditorium, shops, etc., handle precisely the same number of children accommodated in the classrooms without doing violence to their educational needs on the one hand, and without waste through temporary disuse of the special facilities, on the other? To what extent has Gary modified or reorganized on modern lines the treatment of the common classroom subjects? How efficient is instruction in the usual academic studies as well as in the newer or so-called modern subjects and activities? Is the plan economical in the sense that equal educational advantages cannot be procured by any other scheme except at greater cost? These and other questions as to the execution of the Gary plan are, as far as data were obtainable, discussed in the separate volumes making up the present survey.

The concrete questions above mentioned do not, however, exhaust the educational values of a given school situation. From every school system there come imponderable products, bad as well as good. Aside from all else, many observers of the Gary schools report one such imponderable in the form of a spiritual something which can hardly be included in a study of administration and eludes the testing of classroom work. These observers have no way of knowing whether Gary school costs are high or low; whether the pupils spell and add as well as children do elsewhere; but, however these things may be, they usually describe the pupils as characterized by self-possession, resourcefulness, and happiness to an unusual degree. While different schools and indeed different parts of the same school vary in this respect, the members of the survey staff agree that, on the whole, there is a basis of fact for these observations.
Gary is thus something more than a school organization characterized by the two main features above discussed.

The reason is not far to seek. Innovation is stimulating, just as conformity is deadening. Experiment is in this sense a thing wholesome in itself. Of course it must be held to strict accountability for results; and this study is the work of persons who, convinced of the necessity of educational progress, are at the same time solicitous that the outcome be carefully observed. The fact that customary school procedure does not rest upon a scientific basis, does not willingly submit itself to thorough scrutiny, is no reason for exempting educational innovations from strict accountability. The very reverse is indeed true; for otherwise innovation may imperil or sacrifice essential educational values, without actually knowing whether or not it has achieved definite values of its own. Faith in a new program does not absolve the reformer from a watchful and critical attitude toward results. Moreover, if the innovator formulates his purposes in definite terms and measures his results in the light of his professed aims, the conservative cannot permanently escape the same process. Gary, like all other educational experiments, must be held accountable in this fashion. Subject however to such accountability, the breaking of the conventional school framework, the introduction of new subject matter or equipment, even administrative reorganization, at Gary as elsewhere, tend to favor a fresher, more vigorous interest and spirit.

Defects will in the following pages be pointed out in the Gary schools — defects of organization, of administration, of instruction. But there is for the reasons just suggested something in the Gary schools over and above the Gary plan. Problems abound, as in every living and developing situation.
But the problems are the problems of life, and, as such, are in the long run perhaps more hopeful than the relatively smooth functioning of a stationary school system. Thus, notwithstanding the defects and shortcomings which this study will candidly point out, the experiment at Gary rightly observed and interpreted is both interesting and stimulating.

AUTHOR'S PREFACE

The present account of the measurement of certain products of classroom teaching at Gary presents in detail the results of the various tests given for the purpose of measuring the degree of efficiency with which the common school subjects were taught at the time of the investigation of the Gary schools. Because of the care with which the testing work at Gary was conducted, and the number and variety of the tests given, the data secured seem to the author to have thrown much light upon some of the problems fundamental to all measurement work. He believes that the testing movement has now reached a stage in which a critical study of the validity of the results secured may be both interesting and beneficial. Accordingly, he has ventured to discuss at considerable length in Section 2 of each chapter the technique of classroom testing and the various factors which affect the results. The report ought, therefore, to be of interest not only because it deals with Gary, but because it attempts a general critical discussion of tests and testing.

A volume of this character is never the product of a single mind. The writer would be ungrateful indeed did he not acknowledge his indebtedness for the great and varied assistance he has received: First and foremost, to the superintendent, teachers, and pupils of the Gary schools, without whose fullest cooperation the thoroughness of the testing work could not have been attained; to Messrs.
Ayres, Thorndike, and Judd, for aid in the interpretation of the data secured; and finally, and perhaps most of all, to the enthusiasm, disinterested service and fidelity of the six young men who served as his assistants, Messrs. Paul C. Packer, E. J. Ashbaugh, George C. Brandenburg, Leo J. Brueckner, J. W. Richardson, and E. H. Lauer. It is through their arduous labors, intelligent cooperation, conscientious performance of assigned tasks, that the original design of the survey has been carried out as planned.

For such errors in planning, execution, and expression, as may be found, the writer accepts full responsibility. They indicate merely his limitations and his inability to profit fully by the generous and loyal assistance which all have been glad to give.

MEASUREMENT OF CLASSROOM PRODUCTS

I. GENERAL STATEMENT

STATUS OF EDUCATIONAL MEASUREMENT

Throughout this report the reader will need to keep constantly in mind the fact that educational measurement is a recent development. The first reliable scale for measurement of any educational product was published early in 1910; measurement was first used in a large modern survey in 1912. Even to-day there are probably hundreds of educational workers who have never heard of measurement, and hundreds more who have not the faintest conception of the fundamental principles involved.

On the other hand, so rapid has been the development that a bureau for educational research was established in a city school system by September, 1913. To-day there is a National Association of Directors of Educational Research with a membership of thirty-odd men and women who are giving their time wholly, or mainly, to such work. Courses in educational measurement are given in most university schools of education and in many normal schools. No survey of a school system would to-day be attempted without making provision to secure objective evidence through educational measurement upon which to base conclusions.
However, the mushroom growth of the movement for measurement, the superficial use of untested tools, the hasty generalization from insufficient data do not make for confidence. There is at the present moment a very great danger that educational measurement will be discredited more by the reckless optimism of its friends than by the attacks of its enemies. Yet of the value of measurement itself there can be no question; the methods of science are not on trial. There is no question, even, of the applicability of methods of scientific measurement to educational problems. The one thing that is needed is time, time to study the measuring instruments themselves before accepting them as perfect, time to formulate fully a problem before attempting to solve it, time to gather reliable data and to digest them before arriving at conclusions.

LIMITATIONS OF MEASUREMENT

In the appraisal of educational innovations the ultimate question must ever be: "What is the effect upon the children?" However widely a school system may depart from established usage, however much its experimental modifications of either theory or practice may seem injudicious and undesirable, if it could be shown by impersonal, objective measurement that because of the changes that had been made, the graduates of the school system in question are better developed, better trained, and generally more desirable members of society, all other judgments would have to be reversed.

Unfortunately, it is not possible at the present time to measure completely and objectively the total educational product of a school system. For many years to come, expert opinion based upon inspection must be our only means of estimating the general worth of such educational experiments as that at Gary. But in certain phases of school work measurement is possible, and such measurement serves as a check upon the observations and opinions of the inspecting experts.
If, for example, inspection is made of the teaching of spelling and such teaching is judged to be faulty, while the objective tests of spelling ability prove that the children spell very much better than children of the same grade or age in conventional school systems, then the ability of the judges to pass upon the innovations may well be doubted. If, on the other hand, the results of the objective tests and the subjective opinion of the inspectors are in complete agreement, then the judgments of the experts in those matters in which no such checks are at present possible may be accepted with greater confidence.

Educational measurement in its present stage of development must necessarily deal with but few of the many products of educational work. It is more valuable as a check upon the more extended (but less reliable) subjective judgments of a survey staff than as a complete and independent determination of the merits or demerits of a school system. Therefore, this report should be read and interpreted in connection with that on Organization and Administration, and the chapters, in The Gary Schools: A General Account, dealing with the Course of Study and Instruction.

TITLE

The title chosen for this report calls for explanation. Measurement itself is so new, the total of our scientific knowledge of tests and testing so meager, that in spite of the limitations implied in the use of the word "certain," the title "Measurement of Certain Products of Classroom Teaching" is still too sweeping in its claims. Any ordinary writing lesson may awaken the dormant self-consciousness of a child to a sense of his own power, the mastery of a spelling difficulty may strengthen the fibers of his character, the deadliest grind found in the most mechanical school may, for certain individuals, serve to organize and direct their energies toward worthy aims.
Stimulation, character building and inspiration may be products of classroom teaching of the common branches no less than the grosser elements really measured by our educational tests. When, therefore, we undertake to measure classroom products and measure only efficiency in certain mechanical abilities we may miss wholly certain other products of equal or greater value. If all the ideas, which a strict regard for the truth would require, were to appear in the title, a statement something like the following would have to be used: "A report of an attempt to measure a few phases of certain products of classroom teaching in four of the largest public schools at Gary." The shorter title should be so read as to connote the longer.

Yet, after all, it may be that the shorter title does not so culpably misrepresent the truth. True stimulation awakens a child to his school opportunities no less than to those of the outside world. True character building is made manifest by the spirit in which daily tasks are performed, whether these tasks are in the school or in the home. True inspiration produces as substantial achievement in childhood as in the prime of life. Therefore, when measurement of the higher phases of the products of classroom teaching becomes possible, it may be that we shall find the higher and the lower so indissolubly linked together that from the measurement of one, the degree of development of the other may be inferred. The present study is, at least, an honest attempt to evaluate completely and thoroughly those elements in the situation which are now measurable.

RELIABILITY

Because the dangers and limitations of measurement were fully recognized, the attempt was made at Gary to secure results as reliable as it is possible to make them at present. Each subject was tested in more than one way, and great care was taken to control the conditions under which the tests were given and scored.
Tests of the product of teaching of the elementary schools were carried through the high school grades as well, both to determine how the abilities developed in the lower grades were affected by high school work, and to see whether or not any marked changes in product had occurred in recent years. From the point of view of thoroughness, therefore, and within their narrow field, the results are probably more complete and hence more reliable than those of previous surveys.

PLAN OF REPORT

No effort, time, or expense has been spared on this report. And yet, as the time comes for publication, there comes also the realization that what has been written may very easily be misunderstood. A survey report to be readable must condense and summarize its findings, but to be intelligible must give in full the data upon which its conclusions are based. Either course alone is sure to lead to misconception. Accordingly, the writer has written this report in three sections. Section 1 of each chapter is a concise, non-technical discussion of the significant aspects of the data secured in the present attempt to evaluate by measurement the effects of the Gary system upon the teaching of the fundamental branches in the Gary schools. Section 2 contains critical discussions of all that is involved in the testing process — analyses that will show clearly the reservations with which any set of conclusions should be put forward. Finally, in the appendices has been placed material of value to the student of education and essential to a careful study of the report, but devoid of interest except to the specialist. This material includes certain of the longer detailed tables, directions for scoring the tests, samples of record sheets, score cards, and other items of a similar nature.

INTERPRETATION

In interpreting the meaning of results of tests, three methods of procedure are possible.
One method — and the best, for it is the least ambiguous — is to state what was found, leaving it to the reader to judge for himself whether or not the product is satisfactory. The second method is to compare the results obtained with those from other cities where similar measurements have been made. This method is legitimate only when the two cities have been measured with equal care and under similar conditions. The third method is to record the investigator's own conclusions. This method is the most dangerous, for it is difficult to free opinion from the effects of personal bias. However, in this report all three methods have been followed. In succeeding chapters the direct and comparative data will be found in detail. In the final chapter the author's own conclusions from these data are presented.

II. TESTS AND TESTING CONDITIONS

The theory of the modern program at Gary, aiming to minister adequately to every need of the child — physical, intellectual, moral, industrial, and social — and to correlate closely school activities with those of real life, meets with general approval; but the vital question to be answered by the measurements described in this volume is: What effects do the actual ways in which the Gary schools carry out their program have upon certain educational products?

THE SCHOOLS TESTED

The four most representative schools of Gary, named in the order of size and equipment, are: Froebel, Emerson, Jefferson, and Beveridge. The Froebel and Emerson schools, built within the last ten years, are both architecturally adapted and fully equipped to carry out an extended and enriched program. The Jefferson and Beveridge schools, however, were in existence prior to the development of the present system. In both, in spite of alterations, the attempt to carry out the full implication of the Gary plan is handicapped by limitations of equipment.
Nevertheless, both have the longer school day and the new type of organization, and in both the regular academic work is supplemented by such additional special work as their facilities permit. Comparisons from one Gary school to another are thus possible and should serve to bring to light any effects caused by differences in equipment or organization. The remaining schools are small in size, limited in equipment, and at such distances from the center of the town that it seemed better to confine the testing to the four schools mentioned. Accordingly, but few tests were given in the small schools.

SOCIAL CONDITIONS

The sections of the city served by the various schools differ greatly in social conditions; hence, in making school to school comparisons these differences should be kept in mind. The Froebel children are mainly of foreign parentage and often come from homes far down on the social scale. As a result, difficulty with language is a factor which enters largely into all work at this center. The Emerson and Jefferson schools, however, draw from the better residential districts. Beveridge is in an older and poorer section of the city, and its children differ accordingly. The differences in the home conditions from school to school are, therefore, marked.

In regard to the related question as to the composition of the population, it is probable that the foreign born element in Gary is greater than in most American cities of the same size, but probably not greater than in other rapidly growing industrial centers.¹

SYSTEM OF GRADING

The regular school year in Gary is ten months, usually divided into three equal terms. There is a corresponding division of grades into A, B, and C groups. Thus the lowest group in the sixth grade is known as 6C, the next higher as 6B, and the highest group as 6A.
However, all three divisions are often found in the same class organization, taking the same work.² But class organization is not as stable at Gary as in conventional systems.³ Children tend to come and go, and the membership of a given class fluctuates correspondingly. In conducting a test it has happened that the grade label given by the principal to a class did not agree with that given by the teacher, and sometimes neither was the same as the grades written by the children on the test papers. Under the circumstances, it was decided to give to each class the official grade assigned to it at the close of school before promotions in June, 1916. Class No. 45 Froebel, for instance, will be called an eighth grade class, and throughout the tables that follow its scores form a part of the eighth grade group.

¹See The Gary Schools: A General Account.
²It might have happened that this plan of promotion would make comparisons with corresponding grades in other cities unfair, in that a third of the children would have been in a grade but a short time by the end of the year. Careful checking of the data, however, proves that owing to the manner of grouping children in classes, this is not the case; the special condition is favorable to Gary rather than otherwise.
³The various recitation groups or classes are numbered, as class 18, Jefferson. Two classes of the same grade in the same school have different numbers. Thus class 45, Froebel, is an eighth grade class, and class 46 is also an eighth grade class.

ATTENDANCE

Class No. 45 was a persistent unit throughout the testing period and has been so tabulated, but the individuals comprising class No. 45 varied more or less with every test. The actual number of individuals found in this class on the 20 different occasions on which it was visited from March 23 to June 9 varied from 11 to 23 (Figure 1).
While this particular class illustrates extreme variation above and below the official class membership, similar data were obtained for many other classes.¹ The irregularity in class No. 45 Froebel was due to many causes; some of the variations were brought about by such legitimate factors as sickness and withdrawal from school work; other cases represent legitimate variations caused by adjustment of programs to individual needs leading to attendance on other than official classes. For instance, individual "I" (Figure 1) on April 11 recited for some reason with class No. 46 instead of with class No. 45 in which he officially belonged. Similarly, individual "Z" was regularly tested in class No. 43 (7th grade), his official class, but twice appeared in class No. 45 (8th grade). Individual "B" appears on the official list¹ of class No. 46 and individual "S" in class No. 44, but neither is found in the records of the tests of these classes. Individuals "C," "J," and "L" represent still another type of variation; their names do not appear on any of the official lists of any classes, yet their test papers are on file as proof that on the days mentioned they were present in class No. 45.

¹The results for the other 4 eighth grade classes (combined) are: official membership 128, maximum attendance 134, minimum 91, median 119; names not on official list, 11; children tested twenty times in the twenty test days, 34 per cent. See also Table I, Appendix A.

Figure 1. Children Actually Tested in Class No. 45 Froebel
[Chart: attendance of class No. 45 Froebel, eighth grade, by individual and by test date, March 23 to June 9.]
Individuals are denoted by letters in the column at the left. The dates of the various tests are given at the top. Each small rectangle represents the attendance of one individual for one day. Black means absence. Abbreviations have the following meanings:
T means tested in the given class.
N. T. means not tested in the given class.
46 means class No. 46, also an eighth grade class.
44 means class No. 44, a seventh grade class.
43 means class No. 43, a seventh grade class.
Withdrawn means left school.
Enrolled or not enrolled refers to the "Official lists."
The official enrollment for class No. 45 was 16, the median attendance 15, the maximum attendance 23, the minimum 11, the number of different individuals found in the class during the twenty test days, 26. The superintendent states that this class was a small, irregular group which served as a temporary "catch all" for the grade. However, similar variations were found in other classes and grades, also.

In grades four to eight, for 51 classes with an average membership of 31, there were found present for the spelling test on May 2 and 3, on the average, 1.7 names per class which did not appear on the official enrollment.² That is, the tests were given to continually fluctuating groups.

¹Principals were asked to furnish complete lists by classes of all the children enrolled. Previously, children's names appearing on the teachers' registers had been copied on cards, and checked against the promotion lists for grades and against the census reports for age. When the lists furnished by the principals had been checked against these cards, they were adopted as "Official Class Lists" and are so referred to throughout this report.
²Percentage of "extra" pupils: Froebel, 7.5%; Jefferson, 6.5%; Emerson, 2%; Beveridge, 1%. There is some evidence tending to show that the "extra" pupils were present in larger numbers when the testing work was new. Whether the absent and extra groups represent real conditions or simply defects in the "official lists" cannot be determined. The "official lists" represent, at least, the best that could be done with such lists as the principals furnished.

No complete tabulation of the exact attendance by individuals in each test and class was made, because a study of attendance, as such, does not fall within the scope of this report; but plenty of incidental evidence that Figure 1 actually reflects conditions throughout all grades of the school during the testing periods has been accumulated during the process of checking records. For instance, the actual number of children tested on the 19 test days varied from 68 per cent to 90 per cent (average 83 per cent) (Table I, Figure 2). The writer estimates that about 80 children out of 100 enrolled were tested regularly,¹ that the remaining children varied from day to day.
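The attendance counts quoted above for class No. 45 and for the other four eighth grade classes may be easier to compare when each is expressed as a per cent of the official enrollment. The following is a minimal Python sketch and not part of the original survey; the function and variable names are illustrative assumptions, and only the counts themselves are taken from the report.

```python
def percent_of_enrollment(attendance: int, enrolled: int) -> float:
    """Attendance expressed as a percentage of the official enrollment."""
    if enrolled <= 0:
        raise ValueError("official enrollment must be positive")
    return 100.0 * attendance / enrolled


# Counts quoted in the report; the dictionary layout is an assumption.
CLASS_45 = {"enrolled": 16, "minimum": 11, "median": 15, "maximum": 23}
OTHER_EIGHTH = {"enrolled": 128, "minimum": 91, "median": 119, "maximum": 134}


def summarize(group: dict) -> dict:
    """Minimum, median, and maximum attendance as whole per cents of enrollment."""
    return {
        key: round(percent_of_enrollment(group[key], group["enrolled"]))
        for key in ("minimum", "median", "maximum")
    }


class_45_summary = summarize(CLASS_45)
other_eighth_summary = summarize(OTHER_EIGHTH)
```

On this reckoning attendance in class No. 45 ranged from about 69 to 144 per cent of its official enrollment, against about 71 to 105 per cent for the other four eighth grade classes combined.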
As part of either the constant or fluctuating group, there were approximately 5 per cent of the children who were either not enrolled at all, or who, from causes legitimate or otherwise, recited from day to day with classes other than their own. On the average, therefore, a single test measures only 83 per cent of the total number of children enrolled.

¹That is, three days out of four. For instance, the four Trabue Language Scales were given at Gary as follows: B, April 13; D, April 14; E, May 16; C, May 29. A class selected at random from the Emerson school proved to be class No. 12, 6th grade. From the tests, 43 names were secured (official membership, 38); 49 per cent of these children were present for all four tests, 26 per cent for three tests, 11 per cent for two tests, and 14 per cent for one test only. Approximately 55 per cent of the official membership were present for all four tests.

Figure 2. Percentage of Attendance at Time of Tests from March to June
[Chart: attendance, grades 4 to 8 inclusive, based on the relation between class attendance at time of tests and official enrollment. Horizontal axis: weeks from March to June; vertical axis: per cent of official enrollment.]
The scale along the base of the figure is the time scale. The scale along the vertical axis represents the per cent the actual attendance is of the official enrollment. The solid line represents the results obtained. The dotted line is the generalized¹ curve of attendance. The general percentage of attendance indicated by the dotted line is 85%. The reader should note that in the time scale a week during which there was no school has been omitted.
¹See XI of Appendix A, page 474.

COURSES OF STUDY, TIME ALLOTMENTS

An important series of facts bearing directly upon the interpretation of the results of tests are those connected with the amount and character of the instruction, with the amount and conditions of study, and with the relation of the special work to the academic training. For detailed information on these points the reader is referred to the other survey reports.

TESTING PROGRAM¹

The testing period extended from the third week in March to the end of the second week in June (Figure 2). The subjects covered were reading, writing, arithmetic, English composition, and spelling. Counting each separate test and each repetition of the same as one, the total number of tests given was 55, and the total number of papers scored and tabulated 69,282. With one or two minor exceptions, only well established standard tests were used (Figure 3) and these only in the fundamental subjects taught in the elementary grades.

¹See Table II, page 393, of Appendix A for a complete statement of the days upon which tests were given.

Figure 3. Tests Given at Gary
Reading: Oral (Gray); Silent (Kansas, Courtis, Trabue)
Writing: Cleveland Free-choice; Dictation; Composition
Arithmetic: Series B (Four Operations); Cleveland (Multiplication, Fractions)
English Composition: Original Story; Reproduction of Story
Spelling: Cleveland List Tests; Dictation Tests; Composition Test
Total: 55 tests, 69,282 papers.

TESTING CONDITIONS

The effort was made to complete in one day the giving of each test to the entire city. Owing to the departmentalization of school work, however, and the lack of facilities for testing large groups at one time, it was necessary to reach classes in particular rooms where conditions were suitable, so that in some cases a few of the classes had to be tested on the day following the general test. The children were not all tested at the same hour of the day, but experimental investigations¹ seem to show that the time of day is not a factor in determining results. The fatigue arising from routine work disappears before the stimulus of a change of work and the new situation.
At any rate the Gary results should not be affected by this factor more than the results from tests in other school systems. In general, the tests were given on Tuesdays and Thursdays, although a few exceptions occurred. Any one class within the grades tested was visited from 19 to 23 times.²

¹See: Heck, W. H., Journal of Educational Psychology, Vol. V, page 92.
²See discussion of Table II, Appendix A, page 394.

The conditions as to light, heat, materials, etc., under which the tests are given constitute an important group of factors. These are only partially under control and differ greatly from day to day. However, they differ no more for the testing work than for the regular school work. The tests were given in regular classrooms, the children used their usual pens and ink, or pencils, and sat at their usual desks. No unusual conditions were noted. One or two classes, mainly in the lowest grades, were tested in rooms with unfavorable desk conditions, but, when necessary and possible, adjustment of rooms was made to provide the children with suitable desks and suitable tools with which to work. However, in general, particular care was taken to test the children during periods when they were already engaged in academic work, that there might be as little disturbance as possible in changing from one type of work to another. That is, the children were not called in from the playground or from their shop work to be tested. The results secured cannot justly be attributed to unusual conditions of this character.

The manner in which tests are given is a factor determining the degree of response made by the children, and an exceedingly difficult factor to control. A child tested for the first time by a stranger may be thrown into a nervous panic which absolutely inhibits intelligent response to the test situation, although ordinarily the number of such cases does not exceed 10 per cent of the group at most.
Again, the exigencies of the testing work require adjustment of instructions and explanations to the varying conditions which arise in classes of different grades and at different hours of the day. In general, the tests were given by the author and his three assistants,¹ all of whom are men professionally interested in measurement and experienced in giving tests to school children. For some of the tests, however, it was necessary to use many more than four examiners. Through the kindness of Professor Judd, graduate students were secured as needed from the classes in the University of Chicago. These were given the necessary specific training the day before they were used as examiners. Every effort was made through frequent conferences and direct training to keep conditions uniform.

¹Messrs. Packer, Ashbaugh, and Brandenburg. Mr. Brueckner, Mr. Lauer, and Mr. Richardson assisted in the scoring and tabulations, and all had a part in the preparation of the report.

Particular care was given to the timing. Whenever possible the examiners used automatic timers, consisting of a clock with electrical connections so arranged that it could be set to give automatically the starting and stopping signals. For long or very short intervals, stop-watches and foot-ball timers were used. In many cases the teacher was given a timer also and asked to check the examiner's timing. The variations noted were small and often due to the difference in the reaction times of teachers and examiners. For the most part, the timing was satisfactorily done and the errors kept within 2 per cent of the total time interval. Only one gross error in timing was discovered and the results for that test and class were rejected. Variation in timing as an explanation of scores was thus reduced to a negligible factor.

Previous to the survey, very little measurement work had been done in the Gary schools.
To accustom the children to the taking of tests and to the examiners, as well as to give the members of the staff an opportunity to become acquainted with the buildings and the intricacies of the program and organization, the first three days were devoted to very simple psychological tests which could have no disturbing meanings to the children. The first day a test on copying figures was given and scored. The children were allowed to examine fully the test itself and the timing device. They scored each other's papers. The second day three trials of the same test were given, one after the other. The third day's work began with a test in canceling triangles, next the fifth trial of the test in copying figures was given, then a second trial of the test in canceling triangles. By this time the children understood the nature of the work, were fully adjusted to such details as starting together, turning over papers rapidly, and stopping promptly on signal. It may be said that the Gary results are too high because of this special preparation, but no part of low scores can justly be attributed to undue nervousness at being timed, or to undue fear of the tests themselves.

One factor, the effect of which it is difficult to evaluate, is the disturbance caused by the survey itself. Teachers were subject to inspection, tests, questionnaires, etc., for a period of several months, and such experiences are not conducive to whole-hearted teaching effort. As far as the testing work itself is concerned, the disturbance and loss of time were small¹ and probably more than offset by the stimulating effects of repeated measurement upon both teachers and children. The children seemed to enjoy the tests and frequently expressed the confident opinion that they had done well. The teachers and principals were interested also, and from the superintendent down to the children themselves there was full cooperation. There is every reason to believe, therefore, that the results secured represent fairly the work of the children, and, subject to the general qualifications to be made in Chapter VIII, constitute a fair measure of the children's abilities at the time the survey was made.

¹Allowing an average disturbance of 25 minutes per test day for 24 days, the total time taken for tests amounts to 10 hours. During the same period, 165 hours (55 days, 3 hours a day) were allotted to regular work in the subjects tested. That is, the testing work at most decreased the regular classroom instruction by 6 per cent during the actual time the tests were being given (11 weeks), and by less than 2 per cent if the entire year is taken as a base.

SCORING

Whenever possible, the tests were scored by the children in duplicate (that is, by two individuals) and later the scoring was checked by the examiners. It should be particularly noted, however, that in every case the original work was unmarked by the children, the scoring being done upon specially prepared answer cards. Most of the original material is still on file, unmarked, just as it came from the children. This made repeated scoring possible so that errors caused by faulty work have been almost completely eliminated.¹ Whenever the scoring involved more judgment than merely checking answers right or wrong, it was done entirely by members of the staff, and then often only after suitable training on standardized material. Special care was taken in scoring all eighth grade papers. The data given in the tables which follow may, therefore, be depended upon to represent correctly the actual results secured.

¹In spite of every precaution, a few minor errors are discovered at each rereading of data or proof.

TABULATIONS

Tabulation was carried on by paid, specially trained assistants. For the most part, these were students of the Detroit Normal School, members of the author's own classes in educational measurement.
Every care was exercised to check each result. In the general tables, however, fractions and small irregularities have been ignored. The results are correct only to the nearest tenth of an example, or to the nearest whole per cent. For the general discussions, curves have been smoothed, approximations used, and conclusions drawn from general tendencies rather than from minor irregularities. However, in the technical discussions, precise and detailed information is also given.

CONCLUSION

From the foregoing paragraphs it should be evident that the work of the most representative Gary schools was measured with due regard to proper control of essential conditions, and that equal care has been taken in scoring and tabulating the results.¹ In the succeeding chapters, in which the data are presented, little further reference to these phases of the testing will be made.

¹For a full discussion of this topic see Part II of the Seventeenth Yearbook of the National Society for the Study of Education (1918), particularly Chap. II.

III. HANDWRITING

§1. General Results

Handwriting has long held a prominent place in American schools. At Gary the annual time allotment is 329 hours, or 7 per cent of the total time given to the fundamental subjects. For fifty American cities the corresponding average allotment is 388 hours, which is also 7 per cent of the total.¹ Gary, therefore, is typically American in the emphasis put upon this school art.

SECURING AND SCORING SAMPLES

Samples of children's handwriting in grades 2 to 12 were secured in three different ways. The tests used were the Cleveland Free Choice Test, the Courtis Dictation Tests, and the Composition Test.² These were all given by the special examiners. The teachers took no part in the testing, although they were present in the rooms at the time the tests were given.
¹See The Gary Schools: A General Account.
²For the meaning of these terms, see pages 48 to 53 of this book.

The various samples were measured as to the two most fundamental characteristics of handwriting: rate, or number of letters written per minute; and quality (general merit) as determined by comparison with the Ayres Handwriting Scale. However, to free the results from any possible question as to the reliability of such scoring they will be presented first by means of representative samples.

Sample "A" (Figure 4, facing page 30) represents the characteristic end product of the training in the elementary grades. It is the writing of an eighth grade child in the free choice test, was written at the rate of 122 letters per minute, and is judged to be equal to quality 45 on the Ayres Scale. It represents approximately the median score¹ made by the entire eighth grade group (generalized eighth grade city wide score: free choice test, 44 Ayres; composition, 42 Ayres; dictation, 39 Ayres)²; that is, about half the eighth grade children at Gary wrote as well as, or better than, this sample, half wrote as poorly as, or worse than, this sample. The reader thus has the opportunity of judging for himself whether or not this performance under the test conditions represents a satisfactory result of eight years' training in writing.

¹The median is the mid-score, a score such that there are as many scores the same or larger as there are the same or smaller. More precisely, it is "that point on the scale of the frequency distribution on each side of which one-half of the measures fall." (Rugg.) An approximate method has been used in computing the medians in this report. This method yields correct results when the total number of scores is even; but when the total number of scores is odd, the result is in error (too large) by 1/(2n)th of a step, when "n" represents the frequency in which the median falls.
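The definition of the median quoted from Rugg can be made concrete for a grouped frequency distribution. The sketch below is in Python and rests on stated assumptions: the distribution shown is hypothetical and is not Gary data; each interval is taken to run from its lower bound up to, but not including, the next; and the "approximate" variant, which counts to the (N+1)/2th score when the total N is odd, is only one plausible reading of an approximate method, since the report does not spell its procedure out. Under that reading the approximate result exceeds the interpolated median by one step divided by twice the frequency of the interval containing the median.

```python
from typing import Sequence


def interpolated_point(lower_bounds: Sequence[float],
                       frequencies: Sequence[int],
                       step: float,
                       target: float) -> float:
    """Scale value below which `target` scores fall, by linear interpolation."""
    cumulative = 0
    for lower, freq in zip(lower_bounds, frequencies):
        if freq > 0 and cumulative + freq >= target:
            return lower + (target - cumulative) / freq * step
        cumulative += freq
    raise ValueError("target exceeds the number of scores")


def grouped_median(lower_bounds, frequencies, step):
    """Point on the scale with half of the measures on each side."""
    return interpolated_point(lower_bounds, frequencies, step,
                              sum(frequencies) / 2)


def approximate_median(lower_bounds, frequencies, step):
    """Assumed approximate method: count to (N + 1) / 2 when N is odd."""
    total = sum(frequencies)
    target = total / 2 if total % 2 == 0 else (total + 1) / 2
    return interpolated_point(lower_bounds, frequencies, step, target)


# Hypothetical distribution of 31 papers by quality (illustration only).
QUALITIES = [20, 30, 40, 50, 60]   # lower bound of each 10-point interval
COUNTS = [3, 8, 12, 6, 2]

exact = grouped_median(QUALITIES, COUNTS, 10)        # 43.75
approx = approximate_median(QUALITIES, COUNTS, 10)
excess = approx - exact                              # 10 / (2 * 12) of a point
```

With an even number of papers the two functions agree; with an odd number the difference shrinks as the frequency of the median interval grows, which is why such an error matters little in large groups.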
Throughout the report the non-technical reader may read "average" for "median" without serious error and with no change in the general thought expressed.
²For actual median quality, see Table XV, page 76. The best score made by any eighth grade class in any writing test was 48 Ayres, the lowest, 35 Ayres.

Two other examples, also from the free choice test, will serve to define further the quality of writing (Figure 5, page 31). Sample "B" represents writing of the quality (55 Ayres) that is equaled or exceeded by but 12 per cent of the eighth grade children, while sample "C" represents the quality of writing (30 Ayres) which is equaled or exceeded by 94 per cent of the eighth grade children. In other words, most (82 per cent) of the eighth grade writing falls between qualities "B" and "C." If the values were to be based on any one test, the quality of the samples would need to be changed somewhat, but in no case would the change amount to more than half a step on the Ayres Scale.

The same samples may be used to extend the illustration to other grades. For instance, in the composition test¹ the writing of but one twelfth grade student in four equals or exceeds the quality of sample "B," while the writing of but one twelfth grade student in twenty is as poor as, or worse than, sample "C." That is, approximately 75 per cent of the twelfth grade writing falls between samples "B" and "C." In similar fashion, approximately 45 per cent of the fourth grade writing falls between samples "A" and "C." Half of the fourth grade writing is worse in quality than sample "C."

The generalized city wide median scores² for both

¹See page 76.
²For explanation of the sense in which these terms are used see XI of Appendix A, page 472.
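The statement that most of the eighth grade writing falls between samples "B" and "C" is a difference of two cumulative percentages. As a hedged illustration in Python (the function name is an assumption introduced here; the figures 94 and 12 are those quoted above for the eighth grade):

```python
def share_between(pct_at_or_above_poorer: float,
                  pct_at_or_above_better: float) -> float:
    """Per cent of papers at or above the poorer sample but below the better one."""
    if not 0 <= pct_at_or_above_better <= pct_at_or_above_poorer <= 100:
        raise ValueError("expected 0 <= better <= poorer <= 100")
    return pct_at_or_above_poorer - pct_at_or_above_better


# Eighth grade: 94 per cent equal or exceed sample "C", 12 per cent sample "B".
eighth_grade_between = share_between(94, 12)
```

The same subtraction applies to any grade, provided both percentages are counted from the same end of the scale.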
[Chart: quality on the Ayres Scale plotted against rate in letters per minute, Gary and Grand Rapids.]
The scale along the base of the figure represents rate, or number of letters written per minute. The scale along the vertical axis represents quality on the Ayres Scale. The heavy solid line shows Gary results in the composition test; the heavy broken line indicates free choice test. The light solid line represents the Grand Rapids results in the composition test, light broken line in the free choice test. On all the curves positions of the various grade medians are indicated by figures.

Curves show that the fifth grade in the free choice test at Grand Rapids wrote at about the rate and with about the quality of the twelfth grade in the composition test at Gary. For both Gary and Grand Rapids there are only slight differences in quality in the free choice and composition tests. The Grand Rapids rate in the composition test is probably not comparable with the Gary rate, as at Grand Rapids the writing of the compositions was timed only in the most general way, while at Gary the time each composition was finished was noted. However, it is probable that the only effect of this difference would be to shift the position of the curve with reference to the rate axis, not to change its character. For a statement in regard to the conditions under which the Grand Rapids tests were made see Table IV.
quality of their work. However, the rate of writing at Gary corresponds closely to the results that have been obtained in other cities where children were writing freely and did not know that quality of handwriting was to be considered.¹

For the free choice test the papers were sorted by grades for quality and the average rate found for each quality, as was done in Cleveland. For instance, in the fifth grade at Gary, 128 papers of quality 20 averaged in rate 58 letters per minute, while 14 papers of quality 50 averaged 62 letters per minute, an increase of 4 letters per minute. But at Cleveland the papers of quality 20 averaged 73 letters per minute, while the papers of quality 50 averaged 57 letters per minute, a decrease of 16 letters per minute (Table V, page 43, Figure 10, page 44). In other words, in Cleveland the poor writers wrote rapidly and the good writers slowly, but at Gary, except for the very, very poor writers who wrote very slowly, the good and poor writers wrote at about the same rate. The difference is probably due to the difference in the effect of the training in the two cities.²

SCHOOL TO SCHOOL COMPARISONS

In the dictation test, Froebel has 7 classes markedly above the city average and 4 markedly below.
For the composition and free choice tests the figures are i ^Fourteenth Yearbook, National Society for the Study of Education, Part I, pages 56 and 70. *See also pages 252-253, Survey of St. Louis Public Schools. The St. Louis curves resemble Gary's, not Cleveland's, but at a lower rate level. HANDWRITING 43 above, 6 below, and 2 above, 7 below respectively. For the Emerson school the figures are 2 above, 5 below; 2 above, i below; 2 above, 2 below (Table VI, page 45). The other schools show similar variations. The conclusion to be drawn is that there is almost no trace of constant differences from school to school. The differences in the organization and administration of the four schools and TABLE V Rate — Quality Development m Gary and in Cleveland AVERAGE RATE FOR VARIOUS QUALITIES QUALITIES CASES FiFTS Grade No. OF Cases Eighth Grade GARY CLEVELAND GARY CLEVELAND 101 9 62 20 128 58 73 10 78 97 30 128 64 66 39 94 88 40 61 66 63 53 93 85 50 14 62* 57 23 92 81 60 6 63 57 7 93, 78 70 1 45 57 78 80 54 75 90 51 71 iln this report, the range of scores in each interval of a distribution is from the value given in the table up to but not including the next higher value. For instance, 10 in this table indicates a range of scores from quality lo.o to and including quality i9.0g+, but not quality 20.0. •Average 65 if one extremely low score is ignored. The table is to be read as follows: Of nine individual records at Gary, whose quality of writing ranged from 10 to 19 on Ajrres' Scale, the average rate of writing was 62 letters per minute. Of 128 cases in Gary of quality 20 to 29 the average rate of writing was 58 letters per minute, while at Cleveland the rate for samples of the same quality was 73 letters per minute. The number of cases in Cleveland is not known. Other results in the table are to be read in similar fashion. 
in the social conditions of their pupils are not reflected in any positive fashion in the results of the writing tests.

FIGURE 10
Rates of Writing in Gary and in Cleveland*
[Chart: relation between rate and quality.]

The horizontal axis represents quality. The vertical axis represents rate. The position of the circles represents both rate and quality. The number of letters written per minute for each quality is given in the circles. The solid line represents the eighth grade, the dotted line the fifth grade. Lines marked "G" represent Gary scores, and those marked "C" represent Cleveland scores.

Inferences: In Cleveland there is an inverse relation between rate and quality. The writing of the children who had the highest rate was the poorest, while those who wrote most slowly had the highest quality. At Gary, except for very low qualities, the rate of writing was the same for all qualities. The children who wrote most slowly at Gary had also the poorest writing.

*See Table V, page 43.

[Table VI, page 45: figures not legible in this copy.]

CONCLUSION

The results reported above indicate consistently that handwriting instruction at Gary is producing a very small effect upon the product. The improvement in quality is small and does not keep pace with the change in the rate of writing.

§2. Critical Discussion

CHARACTERISTICS OF HANDWRITING

Development of skill in handwriting is essentially the development of a motor habit. However, good writing is dependent not alone on the perfection of motor habits, but on the harmony between the visual, movement, pressure, and thought "controls" which keep the writing process going and direct it. In spite of the complexity of writing ability, performance in a writing test furnishes a simple record which is definitely objective and easily measurable.
This measurement may follow either of two lines: measurement of gross characteristics only, or measurement of analytical details. The gross characteristics are rate (number of letters written per unit of time) and quality (or goodness, general merit, etc.). For survey purposes measurement of gross characteristics alone is of importance, and only such was made in this case.

As the function of writing is to record thought in such a form that it can be easily understood by others, legibility is sometimes given as a quality to be measured. Legibility, however, is itself complex, being dependent upon the relative excellence of form, alignment, and other characteristics. In comparing two samples it is quite impossible to decide on the basis of subjective judgment alone whether or not one sample is more legible than the other. A real measure of legibility requires accurate measurement of the time taken to read a sample under carefully controlled conditions. On the other hand, any and every sample produces on a reader an impression of goodness or badness into which the many particular impressions blend. Accordingly, the expression "quality" or "general merit" will be used in place of legibility. That is, it is believed that whether measurement is made by the Thorndike or by the Ayres Scale, comparison proceeds on the basis of the impression produced by the samples as wholes, and not upon the basis of legibility alone.

TESTING CONDITIONS

Two direct problems are involved in measurement of handwriting: (1) control of the conditions under which the samples are secured, and (2) measurement of the rate and quality of the resultant writing.

Writing, as a motor habit, is under voluntary control. That is, an individual may, within limits, vary the rate and quality of his writing at will. In general, the more a person has to hurry, the less care he will be able to give to the formation of his letters, and vice versa.
Hence, the performance of an individual will vary as the conditions under which he writes vary. It becomes important, therefore, to choose such conditions that the samples secured may be of such a character that inferences as to the abilities of the children will be reliable.

The physical factors influencing writing may be dismissed at once. At Gary all tests were conducted in regular classrooms, and the writing was with pen and ink (pencil was used in the lowest grades and in a few other cases where ink was not available) on paper of good quality, so that temperature, humidity, ventilation, materials, etc., were those which usually prevail in school work. The main factors to be controlled were, therefore, two: incentive and subject matter.

METHODS OF SECURING SAMPLES

Teachers of writing often base their judgments as to children's ability upon samples secured by asking the children for their best writing. This emphasis on quality, as everyone well knows from his own experience, leads to a highly specialized performance quite unlike the usual writing of the individual. The purpose of this survey, however, was conceived as an effort to determine real ability,¹ not maximum performance. Therefore, quality was not emphasized in securing samples.

¹The reader should remember that the real ability of an individual is his median performance or effective ability. That test is to be judged the most perfect test of handwriting which reveals not the best writing of which the individual is capable, nor the worst which he will do, but the quality nearest like that shown by his penmanship under everyday conditions in which the writing activity is functioning normally. The author considers it of utmost importance for a correct understanding of the results secured from the tests of handwriting in this survey that the distinction between performance and ability be clear. The point of this footnote is that the ability of an individual in handwriting is not to be inferred from a carefully prepared letter of application for a position in which the quality of writing is known to be a factor determining employment, nor from hastily scribbled notes written while riding on a train, but from the kind of writing most often appearing in the daily work. Test conditions should be such that the samples secured show writing of this type. See also Chap. VIII.

A method in more general use is that of giving the children material to copy or write, and wording the instructions in such a way that the children understand they are free to determine for themselves the rate and quality of their writing. A test of this character is known as a free choice test. The instructions recommended by Starch are "write as well as you can and as rapidly as you can." In the Cleveland Survey, the teachers were told that papers would be marked for both speed and quality. The free choice test has been widely used. The instructions at Gary were made to conform to the Cleveland model.²

²See Judd, C. H., Measuring the Work of the Public Schools, Cleveland Survey.
³See Fourteenth Yearbook of the National Society for the Study of Education, page 72.

The objection to the free choice test is that the element of choice prevents a real measure of the efficiency of the teaching. For if, as Freeman has shown,³ certain levels of quality and certain rates of work are required for business life, the efficiency of school training should be judged by the percentage of the total product which measures up to the standard. In a free choice test a child capable of writing at the required rate and quality may choose to write at a much higher rate and with consequent sacrifice of quality, or may emphasize quality at the expense of rate.
In other words, for measurement of efficiency the children should write at the standard rate (since rate of writing may be controlled through the rate at which material is dictated), and the resultant writing should be measured for quality. If, however, the material used is of unusual spelling difficulty, or is not easily comprehended, such difficulties may invalidate the test as a measure of writing ability. The material dictated at Gary served also as a test of ability to spell certain words. The value of the test as a writing test will be discussed later.

A fourth method of securing samples which represent children's ordinary¹ writing, and probably the best, is to use material written for another purpose. At Gary, the papers written in the composition test, where quality of writing was not emphasized, were used also as samples of the children's writing. As a check upon these results, in certain classes reproductions of the simple story used as a test of comprehension in reading were also scored for handwriting.

¹It should be recognized, of course, that "ordinary" is here used to mean the kind of writing which the children have been in the custom of using for their written work in the composition class. It may or may not resemble the "ordinary" writing of the child out of school. In a class where the teacher of English composition has emphasized quality of writing the children might pay more attention to quality than in a class where the English teacher did not consider writing at all.

The second factor which affects the quality of the writing is the material written. Obviously, if much attention must be given to the understanding or spelling of unfamiliar words, little can be given to the writing. Most free choice tests use as material a familiar stanza, as "Mary had a little lamb," which is written again and again. In Cleveland the material for all grades was the first three sentences of Lincoln's Gettysburg speech.
The instructions to the teachers provided, however, that as a preliminary preparation the pupils should "read and copy this (material) until they were thoroughly familiar with it and practically knew it by heart." The same material was used in the Gary test and the same plan of preliminary preparation by the teacher¹ was followed. In Cleveland, the tests were given by the teacher, and in Gary, by specially trained examiners. No attempt was made in either Gary or Cleveland to find out how completely the teachers had availed themselves of the opportunity to practice on the test material.

¹See page 487.
²One day's preparation was provided for at Gary; at Cleveland the amount is not specified, but may have been as much as a week. See Measuring the Work of the Public Schools, page 235.

In the dictation exercises the words used to test spelling were taken from Ayres' thousand commonest words in written English. As far as possible, no test word was used which was not of less spelling difficulty than the other words in the sentences. It was expected that the dictation sentences would be easy material, well within the powers of the children, but owing to the limited abilities of the Gary children in spelling, it is probable that the material was too difficult to afford a true measure of the children's writing ability. Therefore, the results from the dictation tests should have the least weight in making decisions as to the character of the product of writing instruction at Gary.

The rate of dictation was based upon a number of determinations by Freeman,¹ Courtis,¹ and others, of the rate at which children write when writing freely (as in reproducing a story). In other words, the material was dictated at the rate at which children ordinarily write, in order to secure samples whose quality might correspond to the quality of their ordinary writing. This method prevents overemphasis on quality.
It forces some children to write at what is for them an abnormally high rate. The purpose of the test is not to secure the best writing of which the children are capable, but to determine how many of the children have been developed to the required quality level at the given rate level.² As the results of the free choice test show, the rates at which the material was dictated were almost exactly the rates of writing chosen by the children in the free choice test.³

¹Fourteenth Yearbook, Part I, National Society for the Study of Education, pages 56, 70, 76.
²The method of this test did not function at Gary because there was no tendency on the part of the children to overemphasize quality.

From the foregoing discussion it will be seen that conclusions as to writing abilities of the Gary children are based upon a series of measurements which give opportunity for significant variations in performance.

METHOD OF SCORING

The scores for rate of writing in the free choice test were determined as follows: Immediately upon the completion of a test the examiner had the children exchange papers. He then passed each child a score card.⁴ In all grades the children filled out the blanks on the card and then in grades five to twelve, by the aid of the count of the letters of the test passages printed on the back of the card, they determined the number of letters that had been written in two minutes. Later this count was verified by the examiners, mistakes noted and counted, and the rate computed.

It was assumed that the letters written were of equal difficulty, although this was known not to be the case. The letter "i" is much easier to make than "g," for example. However, in writing one hundred words the relative frequency of the different letters is so constant (Table VII) that the errors due to differences in the difficulty of the various letters are negligible.
The third and fourth grade scores may be in error by a small amount from this cause, but as the relative difficulty of the letters is unknown, it is not possible to apply a correction for this factor.

³In grades eleven and twelve the controlled rate was approximately ten letters per minute lower than in the free choice test.
⁴See Appendix B, page 488.

[Table VII, page 54: relative frequency of the different letters; figures not legible in this copy.]

A far greater source of error in determining rate of writing is found in the difficulty presented to the children by certain words. For instance, the rate of writing of the third grade in the free choice test was but five⁵ letters per minute. The fourth and other grade rates for these same tests are in substantial agreement (fourth grade 42-44). It is probable, therefore, that some factor was operating to depress the third grade scores in the free choice test. Upon examining the test material from this point of view, one is struck by the difficulty of the first phrase: "Fourscore and seven years ago." This is probably the whole cause of the low score. The material was too difficult for third grade children. However, no trace of similar effects is observable in other grades.

In the dictation tests, no measurement of rate was necessary as the tests were constructed to be dictated at a given rate.
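A minimal sketch of how such a fixed rate translates into a time allowance, assuming the construction formula is T = nr + n/10 (n letters written at r seconds per letter, plus one second of reading time for every ten letters). The function names are illustrative and not from the survey; the values of r are those reported for grades two to eight.

```python
# Seconds allowed per letter (r), by grade, as reported in the survey.
SECONDS_PER_LETTER = {2: 2.40, 3: 1.87, 4: 1.36, 5: 1.05, 6: 0.88, 7: 0.75, 8: 0.64}

# Assumed reading speed of the examiner: ten letters per second.
READING_LETTERS_PER_SECOND = 10


def dictation_time(n_letters: int, grade: int) -> float:
    """Total time allowed in seconds, T = n*r + n/10 (hypothetical helper)."""
    r = SECONDS_PER_LETTER[grade]
    return n_letters * r + n_letters / READING_LETTERS_PER_SECOND


def letters_per_minute(grade: int) -> float:
    """Writing rate implied by r alone, ignoring the reading allowance."""
    return 60 / SECONDS_PER_LETTER[grade]
```

Under these assumptions a passage of 100 letters in grade four is allowed 136 seconds of writing plus 10 seconds of reading, and the implied rate of about 44 letters per minute agrees with the fourth grade rate of 42-44 quoted above.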
The formula used in the construction of the tests was:

T = nr + n/10

in which T = the total time allowed, n = the number of letters to be written, and r = the rate in seconds per letter. The correction n/10 is an allowance made for the time needed for dictation, the rate of reading being ten letters per second. The value of r for the various grades was as follows:

Grades    2      3      4      5      6      7      8
r =      2.40   1.87   1.36   1.05   .88    .75    .64

⁵See Table IV, page 41.

In the composition and reproduction tests the children counted the number of words written and recorded their scores on their papers. These scores were later verified by the examiner and transferred to cards.

For the composition and reproduction tests the scores in words written per minute were converted to letters per minute by determining the average number of letters per word for a series of papers in each grade. After four or five papers the results are constant. In the sixth grade ten papers chosen at random were used (Table VIII, page 57). In other grades tabulations were carried to constant results only, not less than five papers being used in any grade. The average values finally selected for converting words per minute into letters per minute will be found in Table IX, page 58.

MEASUREMENT OF QUALITY

The Ayres Handwriting Scale ("Three Slant" edition) was employed to determine quality. This scale consists of a series of specimens of handwriting ranging in quality from very bad to very good. The samples were chosen on a basis of "legibility"; that is, careful records were made of the time required to read a large number of different samples and certain of these were chosen
The samples were chosen on a basis of "legibility"; that is, careful records were made of the time required to read a large number of different samples and certain of these were chosen HANDWRITING 57 Ah U w w 3 o « g oocoecoocoeoeoeococo tDtOOOOilO(M t> t- t- t- 00 0> oooon<'-ia5CDxf<,-(co ■<*000-^<-lO»0-rt00050 o :S s .a ^ ^ i3 3 a 4} £3 S -Fi ii> o 1 ^ *j i 3 vg m' g o p T) 3 TJCO to Ci ^r H >4 d tn s S -u ^ Q OS i^ \f fl (U pj o _g P-i o CO O o tn •s 00 O > ii OJ rQ X) fO O O Ilia S8 THE GARY SCHOOLS as units in a scale. The differences in legibility from sample to sample were made equal.^ In using the scale, a sample of writing is compared with the sample of the scale until one is found which corresponds in general quality with the sample being measured. The number of the scale sample is then noted and the specimen being measured is given the same value. If the sample being measured falls between two scale samples, it is given a value to correspond. RELIABILITY OF RESULTS In order to make the reported handwriting results as reliable as possible, the eighth grade papers in all TABLE rX Average Values for Changing Rate of Writing from Words Per Minute to Letters Per Minute GRADE composition REPRODUCTION TEST I TESTH TEST in 4 5 6 7 8 9 10 11 12 3.4 3.5 3.6 3.7 3.9 3.9 3.9 4.0 4.0 3.4 3.4 3.4 4.0 4.0 4.0 4.0 3.4 3.6 3.5 3.5 3.8 This table is to be read as follows: The rate of writing in the com- position test in the fourth grade was changed to letters per minute by multiplying the number of words written per minute by 3.4. The resulting values are the true rates within approximately i per cent. ^See Bulletin No. 113, Division of Education, Russell Sage Foundation. HANDWRITING 59 the writing tests were scored by from three to five judges and the median scores taken as the real quality of the samples. The scores are given in full for one class (Table X, page 60 and Figure 11, page 62). 
Although individual judges differ widely on some samples, yet the scores as a whole show close agreement. Of the 170 individual ratings on the 34 papers of the class, 73, or 43 per cent., agree exactly with the median scores, 57 more, or 34 per cent., fall within five points, or half a step of the scale, and only six, or 3.5 per cent., of the judgments differ more than one step from the median value. The average deviation of the individual judgments from the medians is 4.3 points, slightly less than half a step on the scale. But 47 of the 97 deviations were positive and 50 negative, so that the median score of the class as determined by each of the judges alone is closely the same as the median class score determined from the median score on each sample. For two of the judges the differences are zero. Two judges differ by one point only, and the other by six points. If the actual median scores are used, all the differences, except one, are zero. The effect of combining the scores of the different judges is to eliminate the wide variations which occur with each judge on certain samples.

The constancy of the general results is quite remarkable. The average deviation of the judges in 1901 judgments of the papers of five eighth grade classes is 3.9 points. That is, a single judgment will, on the average, differ from the median of five independent judgments by less than half a unit of the scale. Of 1901 judgments, 931, or 49 per cent., agreed exactly with the median value for the sample. Of the remaining judgments, 497 were positive and 473 were negative. All the evidences show, therefore, that the judges used the scale in a consistent manner, and that a class score, even when determined by a single judge, would not be in error more than a small part of one division of the scale.

[Table X, pages 60-61: scores assigned by the five judges to the papers of one eighth grade class; figures not legible in this copy.]

FIGURE 11
Variations in Quality Scores Assigned by Means of Ayres Scale¹
[Chart: diagrams for papers 8, 17, 31, and 34 and for the class as a whole at the left; distributions of deviations from the median for Judges A to E at the right.]

¹See Table X, page 60.

Diagrams at the left hand side of the figure represent individual papers. The numbers 8, 17, 31, 34, etc., refer to the numbers of the papers in Table X. Letters A, B, C, D, and E refer to the different judges. CL indicates values for the class as a whole. Each light line in the diagrams represents the value assigned the sample by one judge. Each heavy line represents the value adopted as the true value of the sample. The scale along the top of the figure represents quality on the Ayres Scale. The diagrams show that for the class as a whole the class score as determined from the scores of each single judge, except D, agrees closely with the value as determined from the combined scores of the five independent judges.

The diagram marked 8 represents the score of a sample upon which there was close agreement between the five judges. Paper 17 represents the scores for a sample in which Judge D showed a wide variation. Papers 31 and 34 represent samples on which there was little agreement between the different judges.

The diagrams on the right hand side of the figure show the distributions of the deviations from the scores adopted as the true value of the samples. The scale at the top shows the magnitude and quality of the deviations. The scale at the right of each distribution shows the number of deviations of a given type, as do also the figures written in the diagrams just above the base line.

The diagram for Judge A is to be read as follows: Out of the 34 papers, Judge A gave 3 scores which were 10 points higher than the proper value of the paper, 8 scores which were 5 points higher, 14 scores which agreed exactly with the true value, 8 scores which were 5 points lower than the true value, and 1 which was 15 points lower.

Note that Judges C and E have a tendency to score papers too low while Judge A has a tendency to score papers too high. Note also that in the distributions of the deviations for the class as a whole the distribution is symmetrical about the zero point, and that 130 out of 170 deviations are not greater than 5 points.

[Table, pages 64-65: figures not legible in this copy.]

The quality of the writing of grades other than the eighth was determined by the scoring of one judge. In twenty-two classes, however, mainly of the fourth and sixth grades, the papers were scored by two judges. The median difference in the median scores for the various classes as determined by the two judges independently is again a very small part of one step of the scale.
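The agreement figures reported in this section can be checked with a short sketch. The helper name `summarize` is illustrative and not from the survey; the inputs are the Judge A distribution as read from Figure 11 and the counts reported for the 1901 eighth grade judgments.

```python
def summarize(deviations):
    """Counts and average size of signed deviations from the median scores."""
    n = len(deviations)
    return {
        "n": n,
        "exact": sum(1 for d in deviations if d == 0),
        "positive": sum(1 for d in deviations if d > 0),
        "negative": sum(1 for d in deviations if d < 0),
        "average_deviation": sum(abs(d) for d in deviations) / n,
    }


# Judge A's deviations as read from Figure 11: 3 papers at +10, 8 at +5,
# 14 exact, 8 at -5, and 1 at -15.
judge_a = [10] * 3 + [5] * 8 + [0] * 14 + [-5] * 8 + [-15] * 1
summary_a = summarize(judge_a)

# Counts reported for all eighth grade judgments in the five classes.
reported = {"exact": 931, "positive": 497, "negative": 473}
total = sum(reported.values())
percent_exact = round(100 * reported["exact"] / total)
```

For Judge A this gives 14 exact agreements with 11 positive and 9 negative deviations, consistent with the remark that Judge A tends to score high. The reported counts sum to 1901, of which the exact agreements are 49 per cent.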
These figures make it possible to say that the scores of the various classes may be depended upon to represent the quality of "writing" actually found in the papers within a third of one division of the scale.

STANDARDS OF JUDGMENT

To give the reported results real objective validity, the Gary judges scored a set of "standard samples." Part of these were taken from the Thorndike Scale, and the rest are the samples published by Thorndike for the purposes of teacher training, known as Supplement A.¹ The values assigned these samples by the Gary judges are [...]

¹Teachers College Record, November, 1914.

[Pages 67 to 71, including the tables on those pages, are not legible in this copy.]

TABLE XIII (Continued)

This table is to be read as follows: If the relations of the individual scores for handwriting to the class median in the dictation test are compared with the relations of the score in handwriting of the same individuals to the class median in the composition test, it will be found that 40% of the individuals maintain the same positions in the two tests within one unit of variability; that is, within 2.5 Ayres units in the dictation test and 5 Ayres units in the composition test. Note that the coefficient of correspondence between the free choice test and the composition test is 86%. In other words, the free choice test reveals the quality of the ordinary writing of more than five sixths of the children, and, for Gary at least, is a good test.

[...] dictation test 2.5 points above. The three tests thus confirm each other in showing that the writing done by this class is slightly above the grade level for the city as a whole. On the other hand, class No.
10 Beveridge, 5A grade, is in the composition test 3.0 points below the general level, and in the free choice test 7.5 above the general level, and in the dictation test 2.0 points above. In this case the three tests give divergent results.

If the 57 class differences in quality of writing in the composition test are compared with the corresponding differences in the free choice test it will be found that in 29 cases the class scores are either both above or both below the city wide scores; in 28 cases there is disagreement, one of the two scores being above and the other below the city wide scores. The comparison of the results from the dictation and free choice test gives almost exactly the same results, although the actual classes in which disagreement is found are not the same as for the previous comparisons. Comparison of the results of the writing in the composition test with that in the dictation test reveals a little closer correspondence. Even here, however, there are 34 cases of agreement and 23 of disagreement (Table XIV, page 73 and Figure 12, page 74).

[Table XIV, page 73: figures not legible in this copy.]

FIGURE 12
Class Differences from the Median in the Composition and Free Choice Tests
[Chart: class numbers 1 to 57 along the horizontal axis.]

The heavy line through the center of the figure represents the city wide median scores for both tests. Distances above the line represent the amounts by which class scores are above the median; distances below the line represent the amounts by which class scores are below the median. The magnitudes of deviations are shown by the scales along the left hand vertical axis. The extreme variations are marked with letters to show the name of the school. B means Beveridge; F, Froebel; E, Emerson; and J, Jefferson. The solid line represents the results of the composition test (C), the dotted line the free choice test (F.C.).

The reader should note that in some cases the results of the different tests are in close agreement; in other cases the two give very different results. For instance, class No. 2 is shown as approximately 5 points below the city wide median by both tests. On the other hand, class No. 54 is shown nearly 10 points above the median by the free choice test, and nearly 10 points below the median by the composition test. In the main, the curves show that the two tests agree in all grades within 5 points.

On the other hand, it should be noted that for the whole 57 cases, the amount of divergence from the city wide scores is in most instances less than 5 points. In only 6 does the extreme divergence exceed 10 points. In other words, the three tests yield results which are in close agreement and the differences are relatively insignificant when it is remembered that the average difference of judgment in using the scale amounts to from 3 to 5 points. The rest of the difference is probably due to marked differences in training.
For example, in the graph it will be seen that in the lower grades the extremely high scores are often those of classes from the Beveridge school. The training in composition work in this school¹ is shown to be much below that of the city generally. Consequently the difference in the quality of writing in these tests is in some way probably related to the differences in training in English composition.

¹See Chapter VI, page 234.

From the table as a whole, therefore, it is possible to draw the conclusion that there is no general relationship between the three types of scores. For the Gary children, one of these tests is as suitable a measure of the relative quality of handwriting as either of the others. Whether or not this would be true in other schools in which children are given a different type of training is, of course, another question, and one that cannot be settled on the basis of the present data.

[Table XV, page 76: figures not legible in this copy.]

RANGE OF INDIVIDUAL ABILITY

One other point needs to be considered, the range of ability within the class. The distributions of the grade scores for quality of writing were found together with the standard deviations for certain grades, and the coefficient of variability, based upon the same (Table XV). The range of variation in grades four, six, and eight proved to be much less than in the other grades. But these are precisely the grades in which the papers were scored by more than one judge.
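The effect of multiple scoring on spread can be illustrated with a small sketch. It assumes the coefficient of variability is the standard deviation expressed as a percentage of the mean, which the text does not state explicitly, and it uses invented scores rather than the Gary data.

```python
from statistics import mean, median, pstdev


def coefficient_of_variability(scores):
    """Standard deviation as a percentage of the mean (assumed definition)."""
    return 100 * pstdev(scores) / mean(scores)


# Hypothetical Ayres-scale scores: one row per paper, one column per judge.
judgments = [
    [20, 30, 30, 35, 30],
    [45, 40, 40, 35, 40],
    [45, 50, 55, 50, 50],
    [75, 60, 60, 65, 55],
]

single_judge = [row[0] for row in judgments]   # scores of the first judge only
pooled = [median(row) for row in judgments]    # median of the five judges

cv_single = coefficient_of_variability(single_judge)
cv_pooled = coefficient_of_variability(pooled)
```

In this invented example the first judge's errors inflate the spread of the single-judge scores, and the pooled medians show a smaller coefficient. Real data need not behave the same way in every class.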
In other words, the effect of multiple scoring was to reduce the apparent variability. This effect must be taken into consideration when making comparisons with results from other school systems. For instance, in Rockford, Ill., the coefficient of variability for grades five to eight is 17 per cent., which is much lower than for the Gary eighth grade results (24 per cent.), but no details are given as to the number of judgments upon which each score was based, the amount of variation in the standards of the judges, and the like. Therefore, in making judgments in this survey, one should be governed rather by the samples of writing shown and by the comparisons from test to test and from grade to grade than by comparisons from city to city, except where tests are used to show general relations.

IV. SPELLING

§1. General Results

Spelling, like handwriting, is an ability which is considered easily measurable, and there are several scales and tests available for the careful evaluation of the results of teaching effort. Moreover, at Gary the annual time allotment for spelling is 496 hours as compared with 482 hours, the average time allotment of fifty American cities.

TESTS USED

Three methods of testing spelling were used at Gary. Conventional list tests were given to measure the conventional school product of the teaching of spelling. Next, timed dictation tests were used in an attempt to control the rate of writing, to prevent deliberation, and to insure both automatic spelling and attention to expression of thought. Finally, the papers written in the composition test were scored for errors in spelling. The conclusions as to spelling ability are thus based upon three very different types of results.

[Figure, page 80: contents illegible in the source.]
LIST TESTS

The words used in the list test were the same as those of the Cleveland Survey (Figure 13). The figure also shows the division of the Ayres Spelling Scale in which the words occur, and the standards of accuracy for the different grades as determined by Ayres' investigations. The words are so chosen that the accuracy of spelling should be the same from grade to grade; that is, the fourth grade children should, according to Ayres, be able to spell the words by which they are tested with the same degree of accuracy (76 per cent.) as the fifth grade children are able to spell the words in the fifth grade test (76 per cent.). In other words, the increase in difficulty in the spelling tests is supposed to keep pace exactly with the increase in spelling ability. The average¹ accuracy from grade to grade is thus constant.

¹The term "average" is used throughout this report in its popular sense, to indicate the arithmetical mean, or the sum of all the scores divided by their number.

In the Cleveland Survey there were no spelling tests in grades higher than the eighth. At Gary, however, the eighth grade words were given also to grades nine, ten, eleven, and twelve. It must be particularly noted that this repetition of the same words through several grades constitutes an entire change of method. The purpose of the change was to determine how rapidly the ability to spell the eighth grade words was developed by high school work.

The list tests were given by the room teacher in the presence of the examiner. No suggestions were made as to how the test should be conducted, other than to ask that it be given in the "usual manner." Accordingly some teachers dictated the words slowly, some rapidly. Some read the words twice and gave explanations of meaning, or illustrations of use that were helpful; others did nothing but read the list of words. That is, the conditions under which the tests were given varied from room to room.

TABLE XVI
Average Accuracy in List Words

          GARY RESULTS                  AYRES STANDARD                  CLEVELAND
GRADE     ACTUAL GRADE   GENERALIZED    AVERAGE OF       DIFFERENCES    ACTUAL
          AVERAGE        VALUES         84 CITIES                       RESULTS
2         51             51             77               -26            74
3         56             54             77               -21            78
4         53             54             76               -23            73
5         51             54             76               -25            75
6         58             56             76               -18            78
7         62             57             76               -14            76
8         53             57             76               -23            80
AVERAGE   54.9           55             76               -21            76

Scores of High School Classes in Spelling the Words of the Eighth Grade Test

GRADE                  9     10    11    12
Actual Score           57    71    79    80
Generalized Score¹     60    69    77    80

¹See Appendix A, page 474.

This table is to be read as follows: The average accuracy of spelling of the test words for the second grade by the second grade children at Gary was 51%. Ayres' standard score for these same words was 77%. The Gary second grade record, therefore, is 26% below the standard. The average score made by Cleveland second grade classes in spelling the same words was 74%.

On the average, the Gary children spell the test words with an accuracy of approximately 55 per cent. (Table XVI, page 82). The scores made by the high school classes on the eighth grade words show a gradual improvement. The record of the eighth grade class on the eighth grade words is 53 per cent., but by the end of the twelfth grade the same words are spelled with 80 per cent. accuracy.

COMPARATIVE DATA¹

For the list tests used at Gary, general standards are available, based upon tests in eighty-four American cities, as well as the results obtained in the Cleveland Survey, where precisely the same tests were used. For instance, Ayres' standard for the eighth grade words is 76 per cent., the eighth grade score in Cleveland was 80 per cent., and the Gary score was 53 per cent. (Table XVI). The Gary averages are uniformly about 20 per cent. below Ayres' standards. That is, the Gary scores parallel the Ayres standard, but at a much lower level (Figure 14, page 84).
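The "Differences" column and the averages of Table XVI can be checked with a short sketch. The figures are transcribed from the table; the function and variable names are illustrative only and do not come from the report. The sketch assumes, as the table note indicates, that each difference is the Gary actual grade average minus the Ayres standard.

```python
# Grade averages transcribed from Table XVI (per cent. of test words spelled correctly).
gary_actual = {2: 51, 3: 56, 4: 53, 5: 51, 6: 58, 7: 62, 8: 53}
ayres_standard = {2: 77, 3: 77, 4: 76, 5: 76, 6: 76, 7: 76, 8: 76}


def differences(actual, standard):
    """Return actual minus standard for every grade present in both tables."""
    return {g: actual[g] - standard[g] for g in actual if g in standard}


def mean(values):
    """Arithmetical mean: the sum of the scores divided by their number."""
    values = list(values)
    return sum(values) / len(values)


diffs = differences(gary_actual, ayres_standard)
gary_mean = mean(gary_actual.values())
diff_mean = mean(diffs.values())

for grade in sorted(diffs):
    print(f"grade {grade}: {diffs[grade]:+d}")
print(f"mean Gary score: {gary_mean:.1f}")
print(f"mean difference: {diff_mean:.1f}")
```

Run as written, the per-grade differences agree with the printed column, and the means round to the 54.9 and -21 shown in the "Average" row.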
¹See footnote, page 38.

[Figure 14. Gary Scores in the Cleveland List Spelling Test Compared with Ayres' Standards. Horizontal axis: grades. Vertical axis: average per cent. of accuracy.]

The scale along the base of the figure represents grades. The scale at the left of the figure shows average per cent. of accuracy of spelling. The solid line represents Gary scores (generalized). The dotted line represents actual grade averages, showing variation from grade to grade. The light solid line represents Ayres' standards based upon results secured in eighty-four American cities. The portion of the curve to the right of the vertical line represents results in the high school grades, in which the same eighth grade words were repeated from grade to grade. The Gary curve parallels the curve for Ayres' standards, but at a much lower level. The eleventh grade in the high school is the first Gary grade to spell the eighth grade test words as well as the eighth grade children in the average conventional school. In the graph as constructed the increase in difficulty of words from grade to grade is not shown. (Compare Diagram 21, Measuring the Work of the Public Schools, Cleveland Survey.)

[Figure 15. Eighth Grade Class Scores at Cleveland and Gary in Spelling the Same Twenty Words. Average accuracy: Gary 53, Cleveland 80. Horizontal axis: average per cent. of accuracy.]

Each rectangle represents the score of one class. The position of the rectangle over the scale along the base of the figure shows the average accuracy of spelling of that class. The Gary classes are shown in solid black. Four of the five Gary classes are lower than any class in Cleveland.

This result, in connection with the fact previously brought out in regard to the scores made by the high school classes, apparently means that such of the Gary children
as remain in school eventually learn to spell the common words used in the tests as well as they are spelled by the eighth grade children in the average conventional school, but about three years later.

Out of 90 eighth grade classes in Cleveland, only 15 have scores as low as the highest eighth grade class at Gary, while 4 out of 5 eighth grade classes at Gary are lower than any eighth grade class in Cleveland (Table XVII, Figure 15). An inspection of the individual scores of the eighth grade children reveals the fact that about 51 per cent. of these children misspelled more than half of the twenty test words. As measured by such list tests, therefore, the Gary children would seem to have little ability to spell the words shown in Figure 13.

DICTATION SPELLING TESTS

Dictation spelling tests were used to measure the development of spelling ability. Exactly the same tests were given in several successive grades. The improvement in ability from grade to grade, however, makes it impractical to continue with the same words throughout all the grades. For instance, by the fourth grade the degree of accuracy on second grade words reaches such a high level (94 per cent.) that the words are no longer an adequate test. To meet this situation, changes were made in certain grades from easy to more difficult material. In these grades the children took both the easy and the difficult test. The continuity of development in spelling ability, as revealed by the series of tests as a whole, is thus unbroken.

A second difference between the list and dictation tests should be noted. In the dictation test the words were given in sentences and a definite time allowed for the writing of each sentence. Thus the words "every" and "race" in the first test were dictated to the children in the sentence, "Every boy likes to see a race," and fourth grade children were allowed thirty-two seconds in which to write it.
The time allowances were changed from grade to grade to correspond to the increasing maturity of the children.

FIGURE 16
Dictation Spelling Tests and Scheme for Giving to Different Grades
Each Test Was Given to the Grade for Which Ayres' Standard Per Cent. of Accuracy Is Given Below¹

GRADES    2     3     4     5     6     7     8     9     10    11    12
Test 1    66%   84%   94%
Test 2                66%   79%   88%
Test 3                            66%   79%   88%
Test 4                                        65%   **    **    **    **

**No standards given.

Words Used in Dictation Sentences

      TEST 1     TEST 2        TEST 3         TEST 4
 1    forget     beg           victim         judgment*
 2    blue       vacation      ought          emergency*
 3    when       importance    occupy         athletic*
 4    seven      wait          senate         organization*
 5    every      eight         agreement      annual
 6    race       complaints    entitle        committee
 7    why        afraid        government     separate
 8    grand      destroy       responsible    receipt*
 9    girl       spend         Wednesday      especially
10    dark       carried       pleasant       recommend*
11    rest       rapid         majority       preliminary*
12    life       flight        organize       decision*
13    fine       family        minute         allege
14    noon       favor         century        principle
15    glad       which         piece          convenient
16    age        crowd         assist         proceed
17    name       engage        suggest        February
18    still      oblige        serious        cordially
19    made       degree        expense        immediate*
20    near       regard        business       disappoint

Ayres Scale set:
      J          O-P-Q         S-T-U          W-X-Y-Z

*Also given in list tests.

¹Ayres' standards are based on words dictated in lists, untimed, and are possibly too high by 5 per cent. for words in timed sentences. This is probably offset by the fact that the tests were given in May, while the standard values are for tests given at the middle of the year. (See Judd: Measuring the Work of the Public Schools, page 244.)

The complete scheme for the dictation tests and the actual words used therein are shown in Figure 16. It should be noted that the dictation tests were made to supplement the list test, both in the character of the spelling product tested and in the levels of ability measured.

In general, the results from the dictation test (Table XVIII, page 90) fully confirm those from the list tests.
The eighth grade score on the easy words for the grade was 69 per cent.; on the difficult words, 50 per cent. In grades two to four the total growth shown in the two year interval was 41 per cent. (second grade 42 per cent., fourth grade 83 per cent.). For grades four to six the growth was but 34 per cent., from grades six to eight 20 per cent., from grades eight to twelve [figure illegible] per cent. In other words, the results show that the growth is small from grade to grade and relatively decreases as the difficulty of the words increases. This fact is shown graphically (Figure 17, page 91) by the change in the slant of the development curves in the successive grades.

COMPARATIVE DATA¹

The Gary results are consistently lower than the Ayres general standards. The magnitude of the difference is approximately the same as for the list tests (20 per cent.; Table XVIII, Figure 17). The Gary scores are lower also than those resulting from the use in Detroit of exactly the same tests. Measurement by dictation tests thus confirms the conclusion previously reached in regard to the lack of development of spelling ability in the Gary children.

¹See page 38.

[Table XVIII, page 90: contents illegible in the source.]

[Figure 17. Results of Dictation Tests. Horizontal axis: grades. Vertical axis: per cent. of accuracy. A = Ayres, G = Gary.]

Scale along the base of the figure represents grades. Scale along the vertical axis represents per cent. of accuracy in dictation tests. Solid lines represent Gary results in Test 1 (grades two, three, and four), Test 2 (grades four, five, and six), Test 3 (grades six, seven, and eight), and Test 4 (grades eight, nine, ten, eleven, and twelve). Broken line represents Ayres' standards for the words used in the dictation tests. In grades eight, nine, ten, eleven, and twelve the broken line does not represent the Ayres' standards, but the scores made by the same grades in the list tests given previously. It should be noted that in the fourth, sixth, and eighth grades, where it was necessary to change from easy to difficult words, both tests were given.

Conclusions: The Gary scores are shown to be below Ayres' standards. As the difficulty of the words increases, the slant of the development curves from lower to high grades decreases. In the fourth grade, the Gary results on the easy test exceed those of the Ayres on the hard words. In the sixth grade the Gary results on the easy words are just equal to Ayres' standards on the hard words. In the eighth grade the Gary results on the easy words are very slightly above Ayres' standard for the difficult words. The four tests are, therefore, consistent in showing that the Gary children are from one to two years behind children in conventional schools in their ability to spell common words.

COMPOSITION SPELLING TESTS

As a check upon the formal spelling tests, misspellings in papers written in the composition test were tabulated. The errors noted were of two sorts: slips, or trivial mistakes, such as the omission of "d" from the word "and," and more serious misspellings, such as "peise" (piece). The number of words misspelled per thousand running words was called a spelling coefficient. Thus in the eighth grade the total number of running words in 122 papers was 27,610; the total number of all misspellings was 720.
Of this number, however, 140 were slips, leaving 580 words misspelled according to the rules adopted for the scoring. The spelling coefficient for total misspellings was 26.07 (720 ÷ 27.61) and for the slips 5.07 (140 ÷ 27.61). In other words, the general accuracy of the eighth grade spelling in the composition test was 97 per cent. (1.00 - .026) if all the mistakes were counted, or 98 per cent. (1.00 - .021)¹ if slips were not considered misspellings. Thus the results from the composition test seem directly to contradict the results of the formal tests in spelling.

¹(.026 - .005 = .021).

The full degree of this apparent contradiction is shown by the actual and generalized city wide median coefficients for both slips and misspellings in each grade (Table XIX below, Figure 18, page 94). The general accuracy of spelling is high (92 per cent. in the fourth grade,¹ the lowest tested) and the improvement is marked from grade to grade. The total errors in the twelfth grade amounted to but nine words misspelled per thousand words written. On the basis of such data, the spelling abilities of the children in Gary would seem to be practically perfect and the results reported in the previous tables grossly to misrepresent the true conditions.

¹(1.000 - .075 = .925).

TABLE XIX
Spelling Coefficients in Composition Tests

Coefficient found by dividing number of words misspelled by number of words written. Represents number of words misspelled per thousand words written. S stands for slips; M for misspellings; T for total errors. S and T are tabulated. M found by subtraction.

          ACTUAL CITY WIDE MEDIANS      GENERALIZED SCORES
GRADE     S       M       T             S       M       T
 4        16.8    57.0    73.8          17.0    58      75
 5        13.3    52.6    65.9          14.0    51      65
 6        10.6    43.1    53.7          10.0    41      51
 7         7.5    24.3    31.8           8.0    27      35
 8         6.1    13.6    19.7           6.0    17      23
 9         3.2    13.9    17.1           4.0    13      17
10         3.4     9.8    13.2           4.0     9      13
11         .84    8.56    9.40           3.0     7      10
12         3.0     6.0     9.0           3.0     6       9

This table is to be read as follows: The fourth grade children at Gary in their compositions made 16.8 minor errors (slips) in each 1,000 words written, misspelled 57 words per thousand, or made a total of 73.8 mistakes in spelling per thousand words written. In comparison with the scores of other grades and for the purpose of smoothing the development curves, the city wide scores for the fourth grade are taken as 17, 58 and 75 respectively.

[Figure 18. Total Errors in Spelling. Horizontal axis: grades. Vertical axis: number of words misspelled per thousand. Curves labelled "misspellings" and "slips."]

Grades are shown along the horizontal scale; the number of words misspelled per thousand along the vertical scale. The curve labelled "misspellings" is based upon total errors; that labelled "slips" upon the number of trivial errors. The true misspellings are represented by the differences between the two curves. Note the great changes from grade to grade and the small number of errors made by high school classes. Analysis of the character of these errors would indicate that the apparent improvement is due almost entirely to avoidance by children of the use of words which they cannot spell.

The explanation of this apparent conflict between the formal and the composition spelling tests is a matter of inference. Careful investigation, however, makes it probable that no contradiction between the results of the various tests really exists, and that the spelling in the composition test confirms the conclusions drawn from the results¹ of the formal tests.

In the first place it must be remembered that misspelling one word does not necessarily have the same significance that misspelling another word may have.
Consequently, instead of accepting the coefficients given in Table XIX, page 93, at their face value, there must be proper evaluation, both of the words used by the children and of the words misspelled. For instance, to say that 580 words out of 27,610 were misspelled is literally true, but entirely misleading. The sum of the frequencies of use of the fifty words which occur most often is 14,598 (Table XX, page 97), or more than 50 per cent. of the total number of words used. Needless to say, misspellings of such simple words as "a," "the" and "and" are very few. If the fifty most frequent words and their repetitions were eliminated from the total list of words used, the accuracy of spelling would be lowered from 580 words misspelled in 27,610 (98 per cent. accuracy) to 543² words misspelled in 13,012 (96 per cent. accuracy). That is, the composition test may not measure ability to spell on the same level of word difficulty as the formal tests.

¹An analysis of the 31 words used in the eighth grade formal spelling tests shows that but 3 words are rated by Jones (Concrete Investigation of the Material of English Spelling. W. Franklin Jones, University of South Dakota) as of fourth, or lower, grade difficulty, 7 are assigned to grades 5 or 6, 13 to grades 7 or 8, while 8 of the more difficult words do not appear in Jones' list at all.

²Words actually misspelled in remaining 13,012 running words.

On the other hand, the Gary children in their compositions used mainly second grade words¹ (Table XXI, page 100, and Figure 19, page 101). Eighty-six per cent. of the running words and 53 per cent. of the different words are of second grade difficulty. Also 55 per cent. of the 376 different words misspelled and 59 per cent. of the 580 misspellings were words which are classed as second grade words in Jones' vocabularies. Whenever they did have occasion to use more difficult words, these words were usually misspelled. Thus only 11 different words ranked as eighth grade words by Jones were used in the Gary eighth grade compositions. These 11 words were used a total of eighteen times and misspelled fifteen times, so that the average accuracy of spelling was about 17 per cent.

¹In the appendix, Sec. IV, page 406, will be found a list of the words misspelled in the Gary eighth grade compositions, excluding slips, the frequency with which they were used, and the frequency of their misspelling; also the grades in which such words are first used by 2 per cent. of the children in the conventional schools (according to Jones). The reader should examine this list and see for himself the actual words misspelled by the Gary eighth grade children. The words in Jones' "second grade" list are not necessarily easy words. In fact Jones states, "The very words that give most trouble in spelling are almost invariably found in the second or third grade lists, and faithfully reappear throughout the subsequent years." (However, see Table VIII B, Section IV, Appendix A, page 414, for the sense in which this statement is to be taken. Even the hundred "spelling demons" are easy words for the average eighth grade class.) An analysis of the first 250 words of this report (page 3) shows 43 per cent. second grade words, 5 per cent. third grade, 3 per cent. fourth grade, 3 per cent. fifth grade, and 9 per cent. sixth grade or higher, 10 per cent. N. L. D., 27 per cent. N. L. Hence the importance of examining the actual words misspelled by the Gary eighth grade children. (If these results are compared with those on page 230, the author will seem to be using nearly as many second grade words as the Gary eighth grade children. However, it should be remembered that Table LVI is based on 2,500 different words, and the results above on 141. The larger the total number of words, the smaller the percentage of second grade words. Thus, for the 396 different words used in the first 1,000 words of this report the author's percentage of second grade words drops to 35 per cent.)

TABLE XX
Fifty Words Used Most Frequently in the Eighth Grade Compositions

NO.  WORD     TIMES USED      NO.  WORD     TIMES USED      NO.  WORD     TIMES USED
                                   For'd         9,994           For'd        12,846
  1  the           1,703       18  when            216       35  me              133
  2  and           1,162       19  for             206       36  very            132
  3  we              891       20  but             193       37  am              129
  4  to              872       21  one             191       38  came            122
  5  a               756       22  so              188       39  home            118
  6  was             708       23  as              177       40  started         114
  7  I               652       24  out             173       41  us              113
  8  of              479       25  all             162       42  she             108
  9  in              452       26  about           160       43  our             107
 10  it              446       27  at              157       44  would           106
 11  my              293       28  up              155       45  with            100
 12  had             289       29  got             152       46  could            96
 13  were            287       30  they            152       47  down             96
 14  on              282       31  not             147       48  after            94
 15  went            244       32  there           145       49  her              93
 16  he              239       33  go              142       50  him              91
 17  that            239       34  then            136
     Total         9,994           Total        12,846           Total        14,598

It will be remembered that one of the spelling dictation tests given to the eighth grade (Test 3, page 88) was composed of 20 words chosen from sets S-T-U of Ayres' Scale. When the eighth grade misspellings were analyzed in terms of the Ayres' Scale (Table XXVII, page 127), it was found that 18 words almost equally distributed among sets S-T-U of Ayres' Scale had been used spontaneously by the children in their compositions. In other words, these 18 words constitute a spontaneous, self-imposed spelling test comparable with formal dictation test No. 3, the only difference being that the formal test was given to 127 eighth grade children, while at most only half this number used the words spontaneously.
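The arithmetic behind the spelling coefficient and the accuracy figures quoted in this section can be sketched briefly. The counts are those given in the text; the function names are illustrative and are not taken from the report. Note that 720 ÷ 27.61 comes to 26.08 when rounded to two places, so the printed 26.07 appears to have been truncated rather than rounded.

```python
def spelling_coefficient(misspelled, running_words):
    """Words misspelled per thousand running words (illustrative helper)."""
    return 1000 * misspelled / running_words


def accuracy_percent(misspelled, words_written):
    """Per cent. of the words written that were spelled correctly (illustrative helper)."""
    return 100 * (1 - misspelled / words_written)


# Eighth grade composition papers: 27,610 running words, 720 errors, 140 of them slips.
total_coefficient = spelling_coefficient(720, 27_610)
slip_coefficient = spelling_coefficient(140, 27_610)
print(f"total coefficient: {total_coefficient:.2f}")
print(f"slip coefficient:  {slip_coefficient:.2f}")

# Accuracy on all running words, then with the fifty commonest words removed.
all_words = accuracy_percent(580, 27_610)
without_common = accuracy_percent(543, 13_012)
print(f"all running words:          {all_words:.0f} per cent.")
print(f"fifty commonest excluded:   {without_common:.0f} per cent.")

# The eleven eighth grade words: used eighteen times, misspelled fifteen times.
hard_words = accuracy_percent(15, 18)
print(f"eighth grade (Jones) words: {hard_words:.0f} per cent.")
```

The same two helpers reproduce each step of the argument: accuracy looks nearly perfect on all running words, drops slightly when the commonest words are set aside, and falls to about 17 per cent. on the few difficult words the children attempted.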
The correspondence in the results is almost perfect, as is shown in the following table, which gives the average accuracy of spelling in the two tests:

                    GARY SCORE    AYRES' STANDARD    DIFFERENCE
Formal Test         69            88                 -19
Spontaneous Test    71            88                 -17

The apparent contradiction between the formal tests and the composition test is thus due to the fact that the Gary children in their compositions did not use many words of the same level of spelling difficulty as the formal tests.

However, there is a further point to be considered. The fact remains that the Gary eighth grade children made very few spelling mistakes in writing their compositions. The question arises, "Is this favorable result to be credited to the Gary teaching, or does it mean that in the absence of effective training the Gary children merely followed a natural tendency to avoid words which they did not know how to spell?"

Unfortunately, the comparative data which are necessary to determine this point are not available, but one item of the results appears significant. One hundred and fifty-eight, or 42 per cent., of the words misspelled in the eighth grade compositions were misspelled also in the fourth grade compositions.
That, in spite of the normal increase in vocabulary which takes place from the fourth to the eighth grade, so large a proportion of the misspellings should be words which have been used, and used repeatedly, through four years of school work, tends to confirm the inference in regard to the failure of the Gary training, and to suggest that the apparent favorable results are in no way either real or a credit to the Gary system.¹

¹The effect might possibly, though not probably, be due to the overlapping of grades, or the presence in the eighth grade of many children of fourth grade ability, and the presence in the fourth grade of many children of eighth grade ability.

In other words, the Gary eighth grade children in their compositions gave ample proof that they were unable to spell as well as the children in conventional schools the words of the Ayres' Scale which have been shown to be the most frequently used words in the English language.

[Table XXI, page 100: contents illegible in the source.]

[Figure 19. Character of the Words Misspelled in the Eighth Grade Compositions.]

The small square indicates the scale upon which the remaining figures are drawn and represents five words. The total area of all the other figures represents 6,426 running words, the total frequency of use of the words which were misspelled by some member of the group. The large square represents the words which are classified as second grade words by Jones. There were 209 second grade words misspelled by the eighth grade children. These words were used a total of 5,746 times, and were misspelled 344 times. The accuracy of the eighth grade spelling of the second grade words was, therefore, 94 per cent. The remaining diagrams show the relative frequency with which words classified by Jones as third, fourth, and fifth grade, etc., were used, and the relative accuracy of spelling. UNL means words which were not listed by Jones. D means derivatives, and words not counted because proper nouns, etc.

The figure as a whole shows that the words misspelled by the eighth grade children were mainly second grade words, and that as the difficulty of the words increased, the accuracy of the spelling decreased. The exception in the seventh grade is due to the frequent use of the words "exciting" and "accident," which were written on the board by the examiner in giving the instructions for the test.

SCHOOL TO SCHOOL COMPARISONS

Very small differences were found in spelling ability from school to school. In general, the Beveridge and Jefferson schools, which are less completely equipped to carry out a modern program, do rather better in spelling than the two larger schools, Emerson and Froebel, although the differences are relatively insignificant (Table XXII, page 102).

TABLE XXII
Class Deviations¹ from Generalized City Score

                                                           COMPOSITION TEST:
             DICTATION TEST          LIST TEST             TOTAL MISSPELLINGS*
SCHOOL       CLASSES                 CLASSES               CLASSES
             TESTED    +     -       TESTED    +     -     TESTED    -     +
Froebel      31        7     13      31        6     11    22        1     14
Emerson      17        1      5      18        7      5    11        6      1
Jefferson    18        2      6      18        8      5    13        4      4
Beveridge    18        6      6      18        8      2    12        5      6

¹Class scores which exceed by [amount illegible] the generalized city score.

*Note that in this division of the table the plus sign indicates less ability in spelling: for the larger the coefficient the greater the number of mistakes made.

This table is to be read as follows: Of 31 classes in the Froebel school measured in spelling by the dictation test, 7 classes were markedly above the corresponding city scores, and 13 below. In the list test 6 classes were above and 11 below. In the composition test 1 class had smaller coefficients than the average for the city and 14 larger. That is, in general, the children of the Froebel school are consistently below the average of the city in spelling ability, as was to be expected because of the greater amount of foreign parentage.

§2. Critical Discussion

SPELLING ABILITY

A spelling test is popularly supposed to measure a child's "general ability to spell." Many persons speak of "learning to spell" as if school training had for its purpose the development of a general ability to spell any word without regard to whether the word had ever been seen or spelled before. It is easy to show by experiment, however, that well trained adults can spell simple phonetic words which they have never seen or heard before only with an accuracy of (approximately) 30 per cent.¹ Therefore, the existence of general ability to spell may well be doubted.

¹From an unpublished study by the Department of Educational Research, Detroit Public Schools.

The primary purpose of spelling is to make a written record of the sounds used in oral language. To be sure, the word may not have been sounded by the writer, and may not be sounded by the reader, but all writing is capable of being translated into sound; conversely, all oral language may be recorded by means of appropriate letters or groups of letters. In an ideal system of phonography in which each sound would be represented by a single letter, and each letter by a single sound, a person thoroughly skilled in the use of the system would, conceivably, be able to spell any word correctly, whether
or not he had ever seen or heard the word before, and even if he did not know its meaning. Under such a system school training would develop general ability. The child with little skill in using the system could spell correctly only a few words. As skill increased, the number of words spelled correctly would also increase. Finally, when the individual had attained perfection in analyzing spoken words into their sound elements and in representing each element by its appropriate symbol, the person's spelling ability would also be perfect.

Unfortunately, English words are derived from many sources and English lacks an absolute phonetic system. English is, perhaps, more illogical in its spelling than any other of the modern languages. Ability to spell in English at present means merely the ability to reproduce certain conventional symbols which stand for given words. The fact that some entire words, and parts of many words, have a phonetic basis merely increases the confusion. It may be true that phonetic analysis makes it easier to learn to spell new words, but so far as general ability to spell is concerned, our present phonetic system probably adds to the difficulty of the situation.

Hence in the English language it is quite impossible to measure directly a child's general ability to spell. The most that can be done is to measure his ability to spell certain specific words. In other words, one cannot infer with certainty that because a child is able to spell certain words correctly he will also be able to spell correctly words of equal or less difficulty. The spelling of each word stands by itself. It is, of course, possible to test an individual with a large number of representative words and so finally determine the range of his spelling ability, but general ability to spell is seldom used with this meaning.

This point is so important and yet so often misunderstood that it merits more extended discussion.
Proof of its truth will be shown by analysis of the results of tests of certain classes at Gary. In grade 7, class No. 13 Emerson and class No. 14 Jefferson made almost exactly the same average score in accuracy (71 and 70 per cent., respectively) in spelling 20 given words, while class No. 44 Froebel made a very much lower score (52 per cent.) in spelling the same twenty words. Fourteen of the twenty words were taken from Ayres' Scale, Set U, as being words of equal spelling difficulty as determined by the actual performances of many thousands of children. Three words were taken from the next easier set (T) and three from the next more difficult set (V). However, a mere glance at an analysis of the results by words is all that is necessary to show that while the words may be equal for seventh grade children in general, they are most certainly not equal for the children in these classes. The accuracies on individual words vary from 95 per cent. to 21 per cent. (Table XXIII, page 107). Even if the analysis be restricted to the results of the class making the best record, a variation from word to word of 35 per cent. will be found. The probable explanation is, of course, that some of the words had been studied recently at Gary and others had not.

If a comparison of the results for the best and the worst of the three classes is made (Figure 20, page 109), the words being arranged in order of their difficulty on the basis of the per cent. of children spelling the words correctly in the class of low ability, beginning with the easiest words, it is found that this order is not the order of difficulty for the more able class.
It should be noted further that in spite of the fact that the average accuracy of one class based upon the entire twenty words is 19 per cent. higher than the other (the differences between sixth, seventh, and eighth grade scores at these points of the Ayres Scale range from 9 per cent. to 15 per cent.), the more able class spells "celebration" and "respectfully" correctly less often than the less able class. If the class scores had been based upon the average of the first eight words, the results would have been 78 per cent. and 74 per cent. respectively. That is, although the two classes can spell 8 out of 20 words with nearly the same degree of accuracy, it would not be safe to infer that they could spell the other 12 words equally well, although seventh grade children, in general, are able to do so.

If a similar comparison is made of the results for the two classes having closely the same average score in both tests (Figure 21, page 111), the great variation from word to word again shows that ability to spell one word is no guarantee of ability to spell another word

[Table XXIII: figures illegible in source.]

This table is to be read as follows: The word "folks" was taken from set "T" in the Ayres Scale. According to Ayres, the average accuracy of spelling the words in this set by seventh grade children is 79 per cent. Class No. 44 Froebel, however, made an average score of 86 per cent., class No. 14 Jefferson a score of 85 per cent., and class No. 13 Emerson a score of 87 per cent. That is, the score of class No. 44 Froebel was 7 per cent. above the standard, No. 14 Jefferson 6 per cent. above, and No. 13 Emerson 8 per cent. above. The reader should note that the average score of class No. 44 Froebel on the entire twenty words was about 52 per cent., which is much below standard, while class No. 14 Jefferson and class No.
13 Emerson have closely the same score (70 and 71 per cent.). Also that No. 44 Froebel actually spelled "respectfully" and "celebration" more accurately than No. 13 Emerson, although for twelve of the words the scores of No. 44 Froebel fall very much below the scores of the other two classes.

of equal difficulty. The spelling of each word must be learned individually.

It may be objected that no word was missed by the entire class, just as no word was correctly spelled by the entire class, so that in one sense the results show merely that there are larger differences in ability to spell certain words. From this point of view, ability to spell a certain word would depend partly upon general ability to spell, and partly upon a direct knowledge of the particular word. That is, of course, true. The most difficult word of all was spelled correctly by 21 per cent. of the least able class. However, "general ability to spell" is seldom interpreted in such a sense. Any group of children reaching the seventh grade will have had so many contacts with common words and so many experiences in spelling words that the group, as a whole,

FIGURE 20
Comparison of Class Scores on Individual Words

[Chart: Class Variation in Spelling Ability. Class averages: A, 71 per cent.; B, 52 per cent.]

The scale along the base of the figure shows the various words. The scale along the vertical axis of the figure shows the accuracy with which the different words were spelled. The words are arranged in the order of the accuracy with which they were spelled by Class B. That is, "folks" was the easiest word, "suggest" was the next harder word, and "elaborate" was the hardest word of all. The solid line (A) represents results from Class No. 13, Emerson, seventh grade. The dotted line (B) represents results from Class No. 44, Froebel, seventh grade. Average accuracy of spelling based on entire twenty words: Class A, 71 per cent.; Class B, 52 per cent.
The curves show that the order of difficulty of words for Class B was not that for Class A. The words in the test are given as of nearly equal difficulty by Ayres. For these two classes they range in difficulty from 92 per cent. to 21 per cent. The curves also show that, although the two classes make very closely the same scores on the first eight words, they differ widely on the remaining twelve words. Note that for the two words "respectfully" and "celebration" the poorer class makes higher scores than the other.

has a certain vague "general ability" to spell. In this sense the term means merely that in such a group there are sure to be some children who can spell at least part of a group of common seventh grade words. To most persons, however, general spelling ability is a term applied to the ability of individuals and implies that the individual's ability to spell seventh grade words in general may be determined by having him spell a random sampling of seventh grade words. The results shown above indicate clearly that inferences from one set of words to another set of words may be in error by amounts equal to three times the average yearly progress from grade to grade. Under the circumstances, each test consisting of but ten to twenty words must be considered by itself as a test of ability to spell certain words only.

The impossibility of making inferences in regard to an individual's ability to spell a word from his performance in spelling some other word of equal difficulty is brought out plainly by a study of the records for a single class. For instance, in class No. 13 Emerson (seventh grade) each word of the 20 in the test was missed by at least 3 of the 37 children, while no word was missed by more than 16 children.
On the other hand, only 2 of the children spelled all 20 words correctly, and of the 3 children who missed only one word, each missed a different word. The poor spellers of the class did not miss all the hard words, nor spell correctly all the easy words (Table XXIV, page 113).

FIGURE 21
Comparison of Class Scores on Individual Words

[Chart: Class Variation in Spelling Ability. Class averages: A (Emerson 13), 71 per cent.; C (Jefferson 14), 70 per cent.]

The scale along the base of the figure shows the various words. The scale along the vertical axis of the figure shows the accuracy with which the different words were spelled. The order of words is the same as in Figure 20, and Curve A is taken directly from that figure. (The comparison in Figure 20 was between two seventh grade classes having very different average scores.) In this figure the comparison is between classes having almost the identical average score. In spite of this fact, the curve shows that the scores for individual words vary greatly. For instance, Class C spells "citizen" very much better than Class A, but "arrangement" very much more poorly. The solid line (A) represents results from class No. 13 Emerson, seventh grade, as in Figure 20. The dotted line (C) represents results from class No. 14 Jefferson, seventh grade. Average accuracy of spelling based on entire twenty words: Class A, 71 per cent.; Class C, 70 per cent.

Equally significant is a comparison of the records of those whose average accuracy in spelling the 20 words was the same as the average accuracy of the entire class. Three out of the 37 children in the class had scores of 70 per cent., while the class average was 71 per cent. That is, each of the three missed 6 words, but in no case did all three miss the same words. In only 4 cases did two of the three make the same error (Table XXV, page 115).
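The count of different words misspelled by the three pupils of average ability follows from the figures just given, and the reckoning may be sketched as below. The function name and its arguments are illustrative only and do not appear in the report; the figures passed to it (three pupils, six misses each, four words missed by two pupils, none by all three) are those stated in the text.

```python
def distinct_words_missed(pupils, misses_each, missed_by_two, missed_by_all=0):
    """Count the distinct words misspelled by a group of up to three pupils.

    Each miss is first counted once per pupil. A word missed by exactly two
    pupils is therefore counted one time too many, and a word missed by all
    three pupils two times too many.
    """
    if pupils > 3:
        raise ValueError("this simple correction covers at most three pupils")
    total_misses = pupils * misses_each
    return total_misses - missed_by_two - 2 * missed_by_all


# Figures from the text: three pupils, six misses each, four words missed
# by two of the three, and no word missed by all three.
print(distinct_words_missed(3, 6, missed_by_two=4))  # prints 14
```

Three pupils with identical scores thus account between them for 14 of the 20 test words, which is the point of the comparison.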
The variation is so extreme that even to list the different words misspelled by three children of "average" ability is to list 14 out of the 20 words of the test. Similar records could be shown indefinitely. It should be evident, therefore, that each test measures precisely only the ability to spell the words in the test and under the conditions under which it was given.

On the other hand, as data accumulate, general tendencies may become apparent, and at last inferences may be safely made in regard to the general effects of teaching effort. It should, however, never be forgotten that such inferences are inferences only, and a different choice of words or a different form of test may yield very different results. When, however, more than one form of test is used and more than one choice of words is made without bringing to light any conflicting evidence,

[Table XXIV: figures illegible in source.]

This table is to be read as follows: The easiest word for the class was "suggest." The average accuracy of spelling of this word by seventh grade children, according to Ayres, is 73 per cent. The record of class No. 13 Emerson at Gary was 92 per cent. All four individuals whose records are shown spelled the word correctly. The reader should note that while the individual in the class who had the lowest score spelled the four easiest words correctly, she also spelled the difficult word "discussion" correctly, although she misspelled the equally difficult word "arrangement." The individual who had the next to the lowest score in the class spelled correctly words of every level of difficulty, from the easiest to the third hardest, and also missed words of every level of difficulty, from the second easiest to the hardest. Similar variations are shown in the other two records.
The reader should note also that judged by the first nine words the class score is higher than Ayres' standards, although on certain words the class score is from 15 to 20 per cent. below Ayres' standards.

inferences gain a reliability they could not possibly possess if they were based upon a single test. The term "general ability of the Gary children in spelling" will be used throughout the report to mean the final inferences drawn from the series of measurements of the ability to spell particular words.

As a spelling test by itself is a reliable measure of children's performances only for the given words, the selection of test words becomes an important matter. It would be as unfair to test a school system by unusual and difficult words as it would be to use as test words only those which had recently been taught in the classroom. Fortunately, in all written English certain words are used again and again, while others appear but seldom. Ayres has made a determination of the

TABLE XXV
Misspellings of Three Pupils¹

                        AVERAGE ACCURACY      PUPILS SPELLING CORRECTLY
      WORD              AYRES      CLASS      (OF A, B, AND C)
   1  Senate             73%        68%        1 (C)
   2  Majority           73         59         1
   3  Necessary          73         59         1
   4  Celebration        79         66         1
   5  Mere               73         68         2
   6  Respectfully       73         78         2
   7  Testimony          66         65         2
   8  Elaborate          73         57         2
   9  Discussion         66         73         2
  10  Arrangement        66         57         2
  11  Citizen            73         78         2
  12  Agreement          73         70         2
  13  Receive            73         68         2
  14  Suggest            73         92         2

[The source gives each pupil's record (0 or 100 per cent.) in separate columns A, B, and C; only the first row's column assignment is legible.]

¹These pupils had the same accuracy score as the class (70 to 71 per cent.).

This table is to be read as follows: The word "senate," whose average accuracy of spelling, according to Ayres, was 73 per cent., was spelled by the class with an accuracy of 68 per cent. It was misspelled by both individuals A and B, but spelled correctly by C.
The reader should note that individuals A, B, and C have the same average accuracy on the 20 words (70 per cent.), and this accuracy is the same as the average made by the class as a whole on the 20 words (71 per cent.). The three individuals, however, did not miss the same word, and in only four cases, the first four words in the table, did two of them miss the same word. To record the different words misspelled by these three individuals of average ability it is necessary to list 14 words, although each of the children misspelled but 6 words.

common words, and has also found the average accuracy with which they are spelled by American school children. As one of the main functions of the teaching of spelling must be to give children the ability to spell words which they use frequently, all test words were chosen from the Ayres Spelling Scale.

LIST TESTS

The conventional form of school examination of spelling ability is spelling words in lists. It has no counterpart in daily life. When a list of words is dictated slowly the child has time to recall consciously the possible ways of spelling a word. He has opportunity to work out by rule, by guess work, or by reasoning from analogy both the letters he uses and their order. The word actually written on the paper tells merely that the mental process employed by the child (automatic response, reason, or guess) has in the particular instance yielded the correct result. Spelling in daily life, however, has not this character. If one is conscious that he is uncertain of the spelling of a word, he consults a dictionary. The errors actually made are thus in automatic, unconscious spelling when the attention is concentrated upon the thought to be expressed so that the errors are unnoticed.
It would seem, therefore, that a test to be a real test of spelling ability must be given under such conditions that the results would reveal not whether children know how to spell words correctly, but whether or not their actual spelling habits when writing freely are correct. This is the reason for the use of more than one test.

DICTATION TESTS

The timed dictation of sentences is an unusual form of spelling test and a few points in regard to its construction need explanation. The tests consisted of ten sentences, each containing two test words. The sentences in any one test were made of equal length within approximately five letters and care was taken to employ no words of greater spelling difficulty than the test words. The rate of dictation was controlled, each grade being given the material and rate corresponding to the median rate of free writing at its grade. That is, the test was given for the purpose of determining how many of the children could spell the given words when writing rapidly.

A defect in this series of tests was that while most of the sentences were natural, a few were markedly artificial, due to the necessity of using certain words. Again, some of the test words occurred at the end of the sentence. The child who is naturally slow, and who in such a test is compelled to write at a rate higher than his natural rate, tends to omit the last words of the sentence. Such omissions were counted as misspellings. The test words should all have occurred at the beginning or middle of sentences.

Tabulations were made of the number of words omitted by children in the classes making the lowest scores in
each grade (Table XXVI, page 118). The maximum effect due to the omission of words is less than one word, or 5 per cent., except in classes where the accuracy of spelling falls below 50 per cent. In such cases, as Ayres has pointed out, the test ceases to be a spelling test and becomes a guessing contest. Only when more than half the children in a class are unfamiliar with a word are the omissions large. From the scores of children in other cities, it was not expected that the scores of the Gary children in any test would fall below the 50 per cent. level. Actual conditions were not foreseen in planning the test.

[Table XXVI: figures illegible in source. Surviving column headings: Accuracy in Spelling; No. of Children Making Each Type of Mistake; Total No. in Class.]

The table is to be read as follows: In the second grade class having the lowest accuracy there were 20 children. Of these, 19 misspelled one or more words, 18 omitted one or more words, 3 wrote one or more words illegibly, 7 substituted words for the words pronounced by the teacher. On 20 words the average mistakes in spelling were 6.9 words, in omissions 8.5 words, in substitutions .6 of a word, in illegibility .9 of a word. Total errors were 16.9, making the average accuracy 15.5 per cent. The omissions and substitutions are large only in those classes where the average accuracy is lower than 50 per cent. Note the record of even the second grade class when the class average was 57.5 per cent.

It must be remembered, also, that a large number of the omissions are really due to misspellings. When a child has to "stop and think" how a word is spelled, he is not able to spell, as spelling is defined in this test. The omission of words to catch up is, therefore, equivalent to misspelling.
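The arithmetic in the note to Table XXVI may be checked directly. The sketch below assumes only what the note states: that all four kinds of mistake are added together and that accuracy is the share of the 20 test words remaining. The function names are illustrative and are not taken from the report.

```python
def dictation_accuracy(words_in_test, misspelled, omitted, substituted, illegible):
    """Return (total errors, accuracy in per cent.) for a dictation test.

    Omissions, substitutions and illegible words are all counted as
    errors, as they were in the scoring of the Gary dictation tests.
    """
    total_errors = misspelled + omitted + substituted + illegible
    accuracy = 100 * (words_in_test - total_errors) / words_in_test
    return round(total_errors, 1), round(accuracy, 1)


def effect_of_omissions(words_in_test, words_omitted):
    """Per cent. of accuracy lost through the given number of omitted words."""
    return 100 * words_omitted / words_in_test


# Class averages quoted in the note to Table XXVI (second grade, 20 words).
print(dictation_accuracy(20, 6.9, 8.5, 0.6, 0.9))  # prints (16.9, 15.5)

# One omitted word on a 20-word test costs 5 per cent.
print(effect_of_omissions(20, 1))  # prints 5.0
```

A reader who prefers to leave omissions out of account may pass zero for that argument and compare the two results.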
There may be some, however, who are unwilling to accept this point of view. They should base their conclusions on the list tests which are untimed. Nevertheless, it is the opinion of the writer that the real value of a spelling test does not lie in the certainty with which the children show their maximum ability, but in the certainty with which it points out those children who fail to spell correctly in their ordinary written work.¹

¹Those interested in this point should note the correlations given in Table XXX, page 133.

COMPOSITION TEST

In checking the results of the different examiners for the same papers in the composition test, it was noted that there were frequent disagreements as to which words were misspelled. Some of the differences were due to mere errors in reading on the part of the scorers, but many were due to the inability of the examiners to agree as to what constituted misspelling. Finally, after repeated conferences of the scorers, definitions of misspellings and a number of arbitrary rules were adopted. The papers were then scored independently by two observers. Each of the two made a list of misspelled words without in any way marking the papers himself. The two lists were then compared by a third person, and all differences in scoring by the first two examiners were checked by this third person by reference to the original papers. In this way the scoring was made reliable, although perfection in scoring proved to be very difficult to attain. In considering results from similar tests in other school systems as a basis for comparison, it would be necessary to inquire whether equal care had been taken before such comparison could be considered valid.
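The checking procedure just described, two independent lists compared by a third person, can be sketched as a comparison of two sets. The function name, the form of the entries (paper number and word as written), and the sample entries are all hypothetical; the report records only the procedure, not any such data.

```python
def reconcile(scorer_one, scorer_two):
    """Compare two scorers' independent lists of misspelled words.

    Returns the entries both scorers listed, and the entries listed by
    only one of them, which the third person must check against the
    original papers.
    """
    first, second = set(scorer_one), set(scorer_two)
    agreed = sorted(first & second)
    disputed = sorted(first ^ second)
    return agreed, disputed


# Hypothetical entries: (paper number, word as written by the child).
scorer_one = [(1, "recieve"), (1, "thier"), (2, "seperate")]
scorer_two = [(1, "recieve"), (2, "seperate"), (2, "wich")]

agreed, disputed = reconcile(scorer_one, scorer_two)
print(agreed)    # entries needing no further check
print(disputed)  # entries referred to the third person
```

Only the disputed entries need be referred to the original papers, which is what makes the method practicable for a large number of compositions.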
The definition of misspelling finally adopted was the following:

Any variations of the character, number, or order of the letters called for in a given situation, except those which are plainly caused by grammatical errors, are to be considered misspellings and marked "E."

The part of this definition to be particularly noted is that the scorers were unable to decide whether or not a given word was misspelled by examination of the letters alone. The whole situation had to be considered. This will become plain as each qualifying rule is discussed.

The following rules were adopted and used in marking the papers:

(1) Make no record of any questionable grammatical errors.

If a child wrote, "I only done my duty," the substitution of "done" for "did" was considered a grammatical error, and not an error in spelling. In cases of doubt, the rule gives the child the benefit of the doubt.

(2) Regard all words in which illegible letters occur as misspelled, marking them "L."

Illegibility caused by poor writing thus operates to increase misspelling.

(3) Errors in simple words caused by the addition or omission of common letters, prefixes or suffixes, are to be counted as slips and marked "S."

Illustration: In "I walk home an ate my dinner," the omission of "ed" on "walked" and of "d" on "and" will be marked "S" for slips. A slip is caused by some form of inattention and is ordinarily a defect of the writing activity. It may not indicate lack of knowledge of how to spell, but is, nevertheless, an error in spelling. The examiners were divided in their opinions. Some contended that spelling ability is to be measured not by slips, but by errors in words of importance. The matter was adjusted as follows: Every error of any sort was noted. One tabulation was made of the total errors, and a second one of the slips alone. It is thus possible for the reader to decide the point for himself.
But whether slips are to be counted as misspellings or not, it is certain that they reveal one characteristic of the training. A well trained, careful child does not make such slips any more than he misspells words. A change in the number of slips from grade to grade is thus as much of an indication of the efficiency of training as a change in the number of words misspelled. Since a certain element of subjective judgment had to be reckoned with in the determination of which mistakes were to be considered slips, all doubtful cases were passed upon by three judges. No hard and fast line could be drawn between true misspellings and slips, but the restrictions given in the rule were adhered to throughout.

(4) The substitution of one homonym for another is to be counted as a misspelling.

Illustration: "Their" for "there," "fairy" for "ferry," were counted misspellings, even though the words actually appearing on the papers were correctly spelled.

(5) The substitution of one word for another, as "lighting" for "lightening," is to be decided in each case on its merits. If the context shows that the child made a mistake in the selection of the word to express his thoughts, no record is to be made of the mistake; but if the word in question is incorrect because of faulty spelling, it is to be counted as a misspelling, and marked "M."

Decision in these cases again involved subjective judgment, but as above, questionable cases were passed by three judges. The number of such substitutions was not large.

(6) Slight changes in common words will be marked as misspellings and not as slips.

For example: "than" for "them"; "were" for "where," etc.

(7) The omission of hyphens in compound words, the separation of words into parts, as "some where," or the faulty division of words at the end of a line, are not to be counted as misspellings if both parts of the word are correctly spelled.
(8) Proper names of unknown persons will be accepted as spelled correctly, but all misspellings of well known proper names will be recorded.

"Wasington" as the name of a boy playmate was accepted as correct, but as the name of the first President of the United States was counted misspelled.

(9) Faulty use of capitals, omissions of dots over "i's" or crosses over "t's" are not to be counted either as misspellings or slips.

After the spelling errors in the compositions had been determined by the above rules, the tabulations brought up an added difficulty. Children wrote papers of varying lengths, so that the possibility of error differed greatly. To reduce the results to a comparable basis, a coefficient of misspellings was computed for each child. This coefficient was taken as the number of words misspelled per thousand words written; that is, the actual number of errors in spelling was divided by the total number of words written, and the division carried to the nearest thousandth. A tabulation of a random selection of cases proved that there was no relation between the length of a paper and the size of the coefficient of misspellings; consequently the values thus determined were accepted as a measure of spelling ability.

It should be remembered, however, that such a measure is a gross measure only. To misspell a word written correctly by most of the other children in a grade is a much more serious error than misspelling an unusual word. A truly significant coefficient of spelling ability should be based upon the relative seriousness of the various misspellings. However, as no information bearing on this point was available, the gross coefficient was accepted at its face value. Hence the actual misspellings were also listed and analyzed.

An analysis of the words misspelled by the Gary eighth grade children in their compositions on the basis of Jones' vocabulary list is given in §1 of this chapter.
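The coefficient of misspellings defined above is a simple ratio, and a minimal sketch of it follows. The function name is illustrative, and the three papers used as examples are hypothetical figures chosen to show how papers of different length are brought to a common basis; they are not Gary results.

```python
def misspelling_coefficient(errors, words_written):
    """Words misspelled per thousand words written.

    The errors are divided by the total words written and the quotient
    is carried to the nearest thousandth, as the definition directs.
    """
    if words_written <= 0:
        raise ValueError("a paper must contain at least one word")
    if not 0 <= errors <= words_written:
        raise ValueError("errors must lie between 0 and the words written")
    return round(1000 * errors / words_written)


# Hypothetical papers: (errors, words written).
for errors, words in [(6, 240), (3, 120), (4, 150)]:
    print(errors, words, misspelling_coefficient(errors, words))
```

The first two hypothetical papers differ in length by half yet receive the same coefficient, 25 per thousand, which is the comparability the measure was meant to secure. The coefficient remains a gross measure, since every misspelling counts alike whatever the word.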
A similar analysis was made on the basis of Ayres' Spelling Scale. There are fewer words common to the Ayres Scale and the compositions than there are to Jones' lists and the compositions, but the tabulation fully confirms the conclusions reached previously (Table XXVII, page 127). The results show plainly that as the difficulty of the words increases the Gary scores fall more and more below Ayres' standard.

It may be objected that Ayres' standards are too high. They represent, however, the average scores made by the children of eighty four representative American cities, and are not standards arbitrarily set. To be sure, the tests by which the standards were determined were given under conditions no more uniform than can be secured by the transmission of instructions in correspondence. The standards have, however, been repeatedly checked by subsequent tests under carefully controlled conditions and have proved valid (Table XXVIII, page 128). The values set by Ayres approximate the values which represent the average spelling ability of children in various cities of the United States. The differences from Ayres' standards are small, and vary from grade to grade. In some cases they are negative, and in others positive. Many cities, indeed, show an average difference which is positive, and in amount equal to from 4 per cent. to 12 per cent. On the whole, therefore, comparisons with Ayres' standards are valid.

RANGE OF ABILITY

In discussing the results of educational tests the variation of individual ability within the grade is almost as much a matter of concern as the class score.
A well graded, efficient system, meeting the needs of individuals at every turn, may be expected to have compact grades of low variability. The range of variation within the grades at Gary is very great and the coefficients of variability large (Table XXIX, page 130).

[Table XXVII: figures illegible in source.]

[Table XXVIII: figures illegible in source.]

RELATION BETWEEN TESTS

A series of measurements of a single type of ability affords an opportunity for a study of the relation between the tests used. A number of interesting questions suggest themselves. Are the children who make low scores in the formal spelling tests those who misspell the largest number of words in their compositions? Will a dictation test or a list test reveal most accurately the children who make errors in spelling in their compositions?

The scores of the children in the list and dictation tests show a greater correspondence than do the scores from any other two of the tests (Table XXX, page 133). Eighty eight per cent. of the 42 eighth grade children present for all tests maintain their place in the two distributions within one unit of variability.
That is, 9 children out of 10 will be as much above or below the median of the group, relatively, in one test as in the other. The reader will, of course, note that the median deviation for these two tests was 24 per cent. and 20 per cent. respectively. If, however, the limit of variation be reduced to half that figure, 60 per cent. of the children will be found to maintain the same relative place in the distribution whether measured by one test or the other.

[Table XXIX: figures illegible in source.]

The correspondence between actual mistakes in spelling and the mistakes in the formal tests is such that three fourths of the children maintain their relative places in the two distributions. The correspondence is slightly greater between the misspellings in the compositions and the dictation test¹ than it is for the list test, but the differences are scarcely significant. If slips alone be considered, the correspondence with the scores in the dictation test is considerably closer than with the scores in the list tests, but if the total errors are to be considered, the relative merits of the two tests are exactly reversed.
That is, the children who make slips in their compositions are the children who misspell in the dictation test, rather than those who miss in the list test; but if the combined scores of slips and misspellings be counted, the correspondence between total errors in the composition test and scores in the list test is slightly greater than the correspondence between the total errors in the composition test and scores in the dictation test. If the limit of variation be restricted to half a unit of variability, these relations are altered very slightly. On the whole, therefore, judgments as to the way children will spell in their compositions, based upon either a list or dictation test, are fairly reliable.

¹The Pearson Coefficient of Correlation for the relation between misspellings and scores in the dictation test for these same 42 children is .67 (probable error .06).

Those unfamiliar with statistical methods will find the graphic record shown in Figure 22 on page 135 a more satisfactory basis for judging of the relation between the three sets of scores. While in general the correspondence

TABLE XXX
Coefficients of Correspondence in Different Trials of Spelling Tests¹

                           CLASS      MEDIAN INDIVIDUAL          TOTAL RANGE
                           AVERAGE    DEVIATION FROM AVERAGE     OF SCORE
  S = Slips                   5               5                    0-31
  M = Misspellings           22              14                    0-87
  T = Total Mistakes         27              15                    0-88
  L = List Test              51              24                    5-100
  D = Dictation Test         70              20                   10-100

¹Based on the scores of 42 eighth grade pupils.

Coefficients of Correspondence
Percentage of Total Cases Which Do Not Vary in Relative Position, in the Two Distributions Compared, More Than One (or One Half) Unit of Variability.
1 unit § UNIT s M T L D s M T L D S 48 45 43 62 S 24 19 20 31 M 48 — 83 74 78 M 24 — 74 45 38 T 45 83 — 67 64 T 19 74 — 34 40 L 43 74 67 — 88 L 20 45 34 — 60 D 62 78 64 88 — D 31 38 40 60 — This table is to be read as follows: If the relation of the scores of individual children in number of slips made in their compositions to the median number made by the class as a whole be compared with the relation of the same individual scores in misspellings to the median number of misspellings made by the class as a whole, 48 per cent, of the children will be found to have maintained the same relative position in the two sets of scores within one unit of variability; that is, within 5 slips, or 14 misspellings. 134 THE GARY SCHOOLS between scores in the composition test and the score in the formal spelling test is relatively close, there are among these 42 individuals some 5 or 6 who do better in their compositions than their scores in the list tests would warrant, and about the same number who do better in their formal tests than their spelKng in the compositions would warrant. That is, spelHng ability is specific, not general, and performance in any one test is dependent upon so many factors that performance in a single test is not a reliable basis from which to judge of an individual's performance in a related test. How- ever, if the two tests are closely aHke, as for instance, list and dictation tests, the inference will be more reliable than from performance in one test to performance in a test of totally different character, as from performance in a list test to performance in a composition test. The list test and the dictation test upon which these computations were based (Table XXX, page 133) were of very different degrees of difficulty, one being composed of words which were easy for the grade, and the other of words which were difficult for the grade. 
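The coefficient of correspondence reported in Table XXX can be illustrated by a short computation. What follows is a minimal sketch only, resting on two assumptions drawn from the table note and the legend of Figure 22: that a child's relative position in a test is his deviation from the class median divided by the median deviation, and that two positions correspond when they differ by no more than the chosen limit (one unit, or one half unit). The function names and the five sample scores are hypothetical and are not data from the survey; both sets of scores are taken to run in the same direction (higher is better), and an error count such as slips would need its sign reversed by the flag provided.

```python
from statistics import median


def relative_positions(scores, higher_is_better=True):
    """Express each score in units of variability above or below the class median.

    The unit of variability is assumed to be the median deviation of the
    individual scores from the class median.
    """
    mid = median(scores)
    unit = median([abs(s - mid) for s in scores])
    if unit == 0:
        raise ValueError("median deviation is zero; relative position is undefined")
    sign = 1 if higher_is_better else -1
    return [sign * (s - mid) / unit for s in scores]


def correspondence(scores_a, scores_b, limit=1.0,
                   a_higher_is_better=True, b_higher_is_better=True):
    """Percentage of pupils whose relative positions in two tests differ by at most `limit` units."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both tests must list the same pupils in the same order")
    pos_a = relative_positions(scores_a, a_higher_is_better)
    pos_b = relative_positions(scores_b, b_higher_is_better)
    agree = sum(1 for a, b in zip(pos_a, pos_b) if abs(a - b) <= limit)
    return 100.0 * agree / len(pos_a)


# Hypothetical per cent. scores of five pupils in a list test and a dictation test.
list_scores = [20, 40, 50, 60, 80]
dictation_scores = [30, 50, 70, 75, 90]

print(correspondence(list_scores, dictation_scores, limit=1.0))  # 80.0
print(correspondence(list_scores, dictation_scores, limit=0.5))  # 40.0
```

With these made-up scores four pupils in five hold the same relative place within one unit, and two in five within half a unit, which shows why the entries of the half unit table are everywhere smaller than those of the one unit table.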
Search through the records of various classes brought to light one class in which the records in the dictation test and the list test were closely the same. This was class No. 11, Emerson, rated as a fourth grade class in June, 1916. The average score in the list test was 55 per cent. and in the dictation test 61 per cent. The coefficients of correspondence based on the scores of this class were also computed (Table XXXI, page 138). In this case the relations shown in the previous table are reversed. It is the list test that corresponds more closely with the scores for slips in the spellings, while the dictation test corresponds more closely with the total number of errors made. As before, the correspondence between the list and dictation scores is greater than that for any other relation. If the limit of variation is reduced to half the median deviation, the magnitude of the correspondence of course decreases, but the general relations are not greatly changed.

FIGURE 22
Comparative Abilities in List, Dictation and Composition Tests

[Graph not reproduced: relative positions of 42 individuals in the list, dictation, and composition spelling tests.]

The scale along the base of the figure represents individuals, 42 eighth grade pupils in all. The scale along the vertical axis represents the position of the individuals in the distribution. The line marked zero represents the class median. The lines marked 1, 2, 3, 4, and 5 above or below the median line represent differences from the median equal to one unit of variability (in this case, the median deviation of individual scores from the class median). For each individual the score made in each of the three tests is indicated by the lines. Individuals are arranged in order of their performance in the composition test; that is, individuals 1 to 8 had no mistakes in spelling in their compositions, individual 42 had 87 mistakes in spelling per hundred words written in his composition. The broken line is based upon accuracy of spelling in the list test. The scores range from individual 20, who missed none of the 20 words, to that of individual 41, who missed 19 of the 20 words. The dotted line represents the scores made in the dictation test. The scores range from those of individual 16, who missed none of the words, to individual 39, who missed 19 of the 20 words. It is possible to tell from the figure for any one individual his relative position in the class for each of the three tests: thus, individual 11 stood very high in the list test, not quite so high in the dictation test, and still lower in the composition test, but in all three he was among the upper 25 per cent. of the class. Individual 7, however, stood at the top of the class in the composition test, a little above the median in the list test, and in the lowest 25 per cent. in the dictation test. The curves show that about 75 per cent. of the individuals maintain the same relative position in the class distributions for the list and dictation tests that they do in the composition test (within one unit of variability). The figure shows also that the correspondence between the list and dictation test is closer than between these two formal spelling tests and the spelling in the composition test. Eighty eight per cent. of the individuals maintain the same relative position in the list and dictation tests within one unit of variability.

The fact that a child misses certain words in the formal spelling test is not an indication that he will misspell them in his compositions if he uses them.
On the other hand, a child who misses words in either the list or dictation test is quite often the child who makes many slips in writing a composition and has many words improperly spelled. An attempt was made to check these conclusions by direct evidence from the three spelling tests. It was hoped that many of the words used in the formal tests would also be used spontaneously by the children in the composition test. Careful search, however, brought to light but five words which were so used (Table XXXII, page 139); therefore the results are too few to warrant any conclusions except the two previously stated, namely: (1) that measurement of spelling ability is a much more difficult thing than it is popularly supposed to be, and (2) that children, as a rule, use in compositions only those words which they know how to spell.

RELIABILITY OF CLASS RESULTS

The preceding discussions have had for their purpose the full statement in regard to the unreliability of the performance of an individual in a single test as a measure of his general ability in spelling. Teachers and principals unfamiliar with testing work are often amazed to see a child whom they have been accustomed to rate as their best speller make a low score in the formal test, while a child who has repeatedly failed in all school work in spelling makes a high score in the same test. Their amazement, due to the apparent contradiction between the test results and their judgments, has in many instances led [...]

TABLE XXXI
Relation Between Results of Different Spelling Tests^1

                        CLASS      MEDIAN INDIVIDUAL         TOTAL RANGE
                        AVERAGE    DEVIATION FROM AVERAGE    OF SCORE
S = Slips                 19            9                     0-55
M = Misspellings          31           16                     0-102
T = Total Errors          50           16                     0-157
L = List Test             55           20                    20-85
D = Dictation Test        61           14                    20-90

^1 Based on scores of 23 fourth grade pupils, Class No. 11, Emerson.

Coefficients of Correspondence
Percentage of Total Cases Which Do Not Vary in Relative Position More Than One (or One Half) Unit of Variability

        WITHIN 1 UNIT                    WITHIN 1/2 UNIT
       S    M    T    L    D            S    M    T    L    D
S      -   52   53   48   39     S      -   35   35   22   26
M     52    -   61   48   30     M     35    -   35   22    4
T     53   61    -   26   43     T     35   35    -    9   30
L     48   48   26    -   61     L     22   22    9    -   40
D     39   30   43   61    -     D     26    4   30   40    -

This table is to be read as follows: If the relation of the scores of the individual children in number of slips made in their compositions, to the median number made by the class as a whole, be compared with the relation of the same individual scores in misspellings to the median number of misspellings made by the class as a whole, 52 per cent. of the children will be found to have maintained the same relative position in the two sets of scores within one unit of variability; that is, within nine slips, or sixteen misspellings.

[Table XXXII, page 139, and the tables on pages 140 and 141: figures illegible in the source.]
[...] by successful business men (Table XXXV, page 152). Thus there is evidence that the tests measure skills of value in life and that rates of 12 examples or more, and accuracies of 80 per cent. and higher, are necessary in many forms of commercial activity. The Gary eighth grade product measured by such standards is very low in rate of work, and inadequate in accuracy.^

^See also Hanus & Gaylord, Educational Administration and Supervision, November, 1917.

[Table XXXV, page 152: figures illegible in the source.]

Much more significant are comparisons with the achievements of children in conventional schools. In the addition test, a score of 11.6 examples attempted (Gary 8.4) and accuracy 76 per cent. (Gary 57) may be taken as the average achievement of American schools in northern states (Tables XXXVI and XXXVI-A, pages 153 and 154). These "norms" are derived from tests given in May and June, 1916, in cities of every type, large and small, and in widely separated states. Comparison with scores from large cities (Boston 13.7 examples, accuracy 78 per cent.) would make the Gary results seem correspondingly lower, and with lower scores from the smaller cities, or from rural schools (a county in Pennsylvania, 7.7 examples, 52 per cent.) correspondingly higher. Judging from all comparative data available, however, it is quite plain that Gary is low both in rate of work and in accuracy.

[Tables XXXVI and XXXVI-A, pages 153 and 154: figures illegible in the source.]

TABLE XXXVI-B
Comparative Data, Series B
Scores of Gary Eighth Grade Classes^1

                           CLASS      1ST TRIAL           2D TRIAL
                SCHOOL     NUMBER   RATE  ACCURACY     RATE  ACCURACY
Addition        Froebel      45      8.7     50         9.0     55
                Froebel      46      8.0     58         8.7     64
                Emerson      14      8.0     50         8.0     54
                Emerson      15      7.0     53         7.2     68
                Jefferson    18     10.3     64         9.5     60
Subtraction     Froebel      45      9.0     75         8.5     60
                Froebel      46      9.7     67         9.6     72
                Emerson      14      9.2     72         9.5     64
                Emerson      15      8.5     80         8.5     68
                Jefferson    18     11.0     92        10.4     81
Multiplication  Froebel      45      7.3     60         8.6     71
                Froebel      46      7.3     68         9.5     82
                Emerson      14      8.2     71         8.7     71
                Emerson      15      7.0     66         7.2     68
                Jefferson    18      9.7     71         9.7     78
Division        Froebel      45      7.7     70         7.5     85
                Froebel      46      8.2     83         7.4     80
                Emerson      14      5.3     57         6.6     88
                Emerson      15      5.2     78         6.2     75
                Jefferson    18      9.2     81        10.0     95

^1 See page 194.

The Gary addition scores, plotted in relation to the median development curve based upon the scores adopted as representing average conditions throughout the United States, make evident both a difference in the general character of the development at Gary, and the fact that the scores made by the Gary children are relatively very low. The Gary curve is concave; the curve from the general results convex (Figure 25, page 157). This difference probably means that in the conventional schools most of the children have learned to add by the end of the fourth grade, and in the remaining grades there are small improvements in both rate and accuracy of work, due partly to increasing maturity, partly to elimination of the less able through non-promotion, dropping out of school, etc., and partly to the effects of training upon those who have not learned in previous grades. In Gary, however, progress in the lower grades is quite uniform in both rate and accuracy, being mainly in rate in the lower grades, and evenly balanced between rate and accuracy in the high school years. The level of work is, however, very low, so low that one is led to wonder how much of the progress is due to training, and how much merely due to the effects of maturity and elimination.^ For example, the twelfth grade scores in both rate and accuracy do not reach those of the seventh grade in the conventional school, and the eighth grade score at Gary is only slightly above the normal fourth grade level in rate and far below it in accuracy.
The median development curve in Figure 25 is based upon results from cities of every type, large and small, and it is hardly fair to compare the Gary results with those from larger cities which are known to do better in the fundamental subjects than small villages and towns. However, a comparison of the Gary scores with those from smaller cities does not alter the general character of the conclusions to be drawn (Figure 26, page 159).

^See Chapter VIII, page 357.

FIGURE 25
Development of Rate and Accuracy in Addition

[Graph not reproduced: diagnostic curve of median development in speed and accuracy in addition, grades 4 to 8 inclusive.]

[...]

[Diagrams not reproduced: Gary and Cleveland grade scores in the multiplication tests.]

In the diagrams the vertical lines represent the median grade scores; the horizontal spaces represent the multiplication tests. In the upper diagram the vertical lines are based upon the grade scores at Gary. In the lower diagram the vertical lines are based upon the grade scores in Cleveland. In the upper diagram the dotted lines are drawn in the proper relative position to represent the scores at Cleveland, while in the lower figure it is the Gary scores that are represented by the dotted lines. From either the upper or lower figure it will be seen that grades four, five, and six at Gary fall between the third and fourth grade curves for Cleveland.

[...] results; that for a complete understanding of the significance of the data they must be expressed in terms of rate and accuracy of work.
Accordingly, the results of the Cleveland Tests have been thus tabulated, and to make the data from different tests easily comparable (that they might be shown in one graph) each rate score has also been expressed as a percentage of the score made by the twelfth grade (Table XXXVIII, page 166). The results in this form show that development in the tables is practically completed by the sixth grade (accuracy 91 per cent.). The development of the ability to use a one place multiplier, however, increases more slowly, and does not approximate its maximum development until the seventh grade (accuracy 84 per cent.). Development of ability to use a two place multiplier is of still a different character. Increases are fairly regular and equal from grade to grade, up to the eighth grade, but from this point on progress in accuracy ceases and the curve indicates increase in rate only (Figure 29, page 167). In other words, in the simplest work the development is completed early in school life. The more complex is barely completed by the eighth grade. In the most difficult work of all, the development shows no signs of reaching a maximum, and progress is merely cut off at a low level by high school work in which no training in multiplication is provided.

In Figure 29 the dotted line is based upon the Gary results in the Series B tests, and attention is called to the exactness with which two sets (Series B, multiplication test No. 3, and Cleveland long multiplication, set L) confirm each other. The meaning of such comparisons is plain; the more complex the ability, the less well it is taught at Gary as measured by intercomparisons of the results at Gary themselves.
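The device of expressing each rate score as a percentage of the twelfth grade score, so that tests of unlike difficulty may be shown in one graph, amounts to a simple rescaling. The sketch below is offered only as an illustration; the function name and the rates given are hypothetical and are not the figures of Table XXXVIII.

```python
def percent_of_final(rate_by_grade, final_grade=12):
    """Express each grade's rate score as a percentage of the final grade's rate score."""
    final = rate_by_grade[final_grade]
    if final <= 0:
        raise ValueError("the final grade rate must be positive")
    return {grade: round(100.0 * rate / final, 1)
            for grade, rate in sorted(rate_by_grade.items())}


# Hypothetical rate scores (examples attempted) for one test, by grade.
hypothetical_rates = {4: 3.0, 6: 6.0, 8: 9.0, 12: 12.0}
percentages = percent_of_final(hypothetical_rates)

print(percentages)  # {4: 25.0, 6: 50.0, 8: 75.0, 12: 100.0}
```

Since every test is brought to the same scale, with the twelfth grade at 100, the curves of several tests may be laid upon one pair of axes, rate as percentage of final score along the base and accuracy along the side.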
[Table XXXVIII, page 166: rate and accuracy in the Cleveland multiplication tests, by grade; figures illegible in the source.]

FIGURE 29
Rates of Development in Three Multiplication Tests

[Graph not reproduced: development curves, Cleveland tests, multiplication. Horizontal axis: rate as percentage of final scores. Vertical axis: accuracy.]

The scale along the horizontal axis represents the percentage the rate scores of each grade are of the rate scores made by the twelfth grade. The scale along the left of the figure represents accuracy of work. The three solid lines represent the curves for the three multiplication tests which differ in complexity, ranging from a very simple test of the multiplication tables up to long multiplication (four place numbers multiplied by two place numbers). The dotted line represents the results from the multiplication test in Series B. In all, the circles show the position of the grade scores in both rate and accuracy. The curves differ markedly in their character; that for long multiplication showing no signs of reaching a maximum. The development is merely interrupted at the eighth grade. As a whole, the set of curves means that the more complex the ability the less well it is taught.

The evidence from the Cleveland Tests thus greatly strengthens the conclusions drawn from a comparison of the Gary results with those from other cities.
The clearness with which such diagnostic tests reveal the story of what is taking place within a school system is strikingly illustrated in the case of the fraction tests. Fractions represent, of course, a more complex type of development than multiplication. The examples in the tests called simple fractions were all of the type [illegible] or [illegible], in which the only response called for was the addition or subtraction of the numerators. In the test called complex fractions, however, the examples involved reduction to a common denominator and reduction to lowest terms. Further, the complex test included multiplication and division of fractions, as well as addition and subtraction. The examples in the test were of the type [illegible]. Twenty one was the largest denominator called for in any example and all the denominators were products of simple factors.

The character of the development revealed by the results (Table XXXIX, page 169) is a confirmation of the conclusions of the previous discussions. For the test in simple fractions the increases in accuracy from grade to grade are relatively large and continue up to the tenth grade. For complex fractions the period of rapid increase does not begin until grade seven, and from that point is in accuracy only (Figure 30, page 171). In other words, the results for fractions differ from those for multiplication in exactly the same general fashion as the multiplication results differ among themselves. The more complex the ability, the less the development.

[Table XXXIX, page 169: figures illegible in the source.]

SCHOOL TO SCHOOL COMPARISONS

A comparison of the scores from school to school reveals rather larger differences in arithmetical abilities than in those discussed in the previous chapters. The school which is least well equipped to carry out a modern program, the Beveridge, shows quite uniformly in all tests a larger number of scores above the city median than the other schools. Jefferson is second, Froebel third, and Emerson fourth (Table XL, page 172). That is, the Emerson school has, proportionately, a larger number of low scores than any other school. In all schools, however, there are individual classes which have scores above the city median, and others which fall below it.

A similar school to school comparison based upon results in the Cleveland tests gave very similar results (Table XLI, page 173, Figure 31, page 174). Nevertheless, the differences from school to school are relatively insignificant and probably mean merely that the Beveridge school gives more emphasis to the drill work. In Emerson, on the other hand, arithmetic is, in general, receiving less attention than in Jefferson and Froebel schools. In all the schools the very best classes have scores much below those made by children of corresponding grades in other cities.

FIGURE 30
Rates of Development in Two Fraction Tests

[Graph not reproduced: development curves, Cleveland tests, fractions, simple and complex.]

The scale along the horizontal axis represents the percentage the rate scores of each grade are of the rate scores made by the twelfth grade. The scale along the left of the figure represents accuracy of work.
The two solid lines represent the curves for the two fraction tests; one, the addition or subtraction of fractions having the same denominator, the other, four operations with fractions having unlike denominators. For illustration of the type of examples see text. In both curves the position of the figures indicates grade scores in both rate and accuracy. The curve for the simple fraction test shows a smaller rate of rise in accuracy than the curves for long multiplication in the previous figure, and for the reasons there explained. The curve for the complex test in fractions indicates mere growth in rate up to the seventh grade, and from that point on, growth in accuracy only. As a whole, the two curves show that the Gary children have very little ability to work with fractions.

[Table XL, page 172: figures illegible in the source.]

TABLE XLI
School to School Comparison, Series B, Two Trials

The results below show the number of class scores in each school which are more than one tenth of the corresponding city wide scores above or below the general results for the city as a whole. All four operations are combined.

             NUMBER OF        NUMBER OF CLASS SCORES ABOVE AND BELOW
             CLASS SCORES     CITY WIDE RESULTS
SCHOOL       COMPARED         RATE     RATE     ACCURACY   ACCURACY
                              ABOVE    BELOW    ABOVE      BELOW
Froebel         176            34       27        50         46
Emerson          78             2       27         6         17
Jefferson       104            37       27        30         26
Beveridge        96            37       28        47         13

This table is to be read as follows: Out of 176 class scores in Froebel compared with the corresponding city wide scores, there were 34 in rate and 50 in accuracy markedly above the median, and 27 in rate and 46 in accuracy markedly below the median. That is, the Froebel school ranks were slightly above the general results for the city. Similarly, the Jefferson school is slightly higher than the Froebel school, but lower than the Beveridge school, and much above the Emerson school in the abstract work.

FIGURE 31
School to School Comparison

[Graph not reproduced: Cleveland arithmetic test, school to school comparisons, highest and lowest fifth grade classes.]

Comparison of scores made by fifth grade classes in the Beveridge (highest) and Emerson (lowest) schools in Gary with the city wide medians. The horizontal spaces represent the Cleveland arithmetic tests in multiplication and fractions. The vertical line represents the Gary city wide grade medians. The line marked "E" represents the scores from the Emerson school; the line marked "B", the scores from the Beveridge school; the line marked "C", the Cleveland fifth grade scores. The fifth grade class in the Beveridge school falls about one year above the average for the fifth grade city wide scores, while the fifth grade in the Emerson school falls about an equal distance below. The curve for the average scores of the fifth grade in Cleveland was represented by the third curve except in the complex test in fractions, in which the Cleveland score is given as zero. The Cleveland fifth grade scores fall between those of the seventh and eighth grades at Gary.

MEASUREMENT OF REASONING POWER

It may be contended by some that in place of skill in computation the Gary children are receiving a type of training which develops reasoning power instead, and makes them better able to cope with arithmetical situations after they leave school. The answers to this claim are: First, no evidence of such superior ability to grapple with arithmetical
situations was discovered in the course of the survey, although the children were repeatedly required to score their papers, and to perform other work incidental to the testing which required an intelligent use of mathematical skills as a means to an end. Second, however well a person may be able to reason, his work in the world will be ineffective if he does not have the mechanical skill necessary to obtain correct results.

In educational circles there are some who claim that a child will develop such skill as need arises, provided he has a motive for doing his work correctly. The reader should be careful to note, however: First, that the scores made by the high school classes show very small increases in ability over those of the eighth grade in spite of the fact that these classes have more or less incidental training in arithmetic through its use in algebra, physics, chemistry, etc.; and, second, that there is no evidence of any beneficial transfer from the incidental use of arithmetic in the shop work and activities of the enriched curriculum.

There should, however, be no confusion in the mind of the reader in regard to this point. The tests used were not given to measure reasoning power.^ They prove merely that the Gary children do not possess the ability to add, subtract, multiply, and divide at a reasonable rate and with reasonable accuracy, defining reasonable as that rate and accuracy which is attained by the average child in the conventional school.

^See Chapter VII, page 328.

§2. Critical Discussion

TYPES OF PRODUCTS

Training in arithmetic falls sharply into two divisions, (1) arithmetical computations and (2) reasoning. The products of the first type of training are mechanical skills or habits. Training involves building up a set of responses to objective stimuli.
The stimuli themselves, the controlled associations called forth by them, and motor responses, are the elements out of which such skills are built. The products of training have two fundamental aspects: speed, or, better, rate, the amount of work done per unit of time; and accuracy, or the relation of the work that is correct to the total work done. Both of these are easily measurable in objective units.

The higher thought processes of the second type (reasoning) are much more complex, and the products of school training in them are much less clearly defined. Moreover, since in testing work all reasoning problems must be represented through printed symbols, the actual results obtained in a reasoning test are merely unanalyzed resultants of reasoning ability and ability in reading. In view of the many uncertainties and difficulties connected with testing such abilities, and interpreting the results, it was decided to limit the measurement of arithmetical products to the fundamental skills.

TESTS OF SKILLS

For the mechanical skills of arithmetic, well standardized tests and standards and a growing volume of comparative data are available for interpretative purposes. The Courtis Standard Research Tests, Series B, measure the end products of training. Certain of the arithmetic tests used at Cleveland, namely, those dealing with the various phases of multiplication and fractions, trace the relative development of these abilities. The tests, as a whole, therefore, show plainly the nature of the development and the character of the product of the classroom teaching of the fundamental skills.

The expression "end product" needs definition and explanation. In multiplication, for example, it is easy to show that the products of training in the various grades differ greatly in complexity. In most school systems the children begin development of skill in multiplication by learning the multiplication tables.
At a later grade they master the technique of carrying. Soon after that they are able to multiply a four or five place number by any single digit, and finally they are able to multiply any integral number by any other integer. This is the end of the development in multiplication itself, although further training in the use of this skill is necessary. A test which is designed to measure the most complex form in which a given skill is found is a test of the end product.

The significant points to be noted in the foregoing discussion are two: (1) that some children in every grade above the third complete their development in multiplication as far as their maturity permits; (2) that each type or partial phase of development is in reality a distinct ability. Each of these points will be discussed further.

The type of ability selected to represent the end product is, of course, a mere matter of convention. The convention adopted for the Courtis Tests is that in any operation the units selected shall be the smallest that cover all the essential elements.
These elements for the different operations are as follows:

Elements Covered by Type of Examples Used in the Arithmetic Tests (Series B)

ADDITION: 1 Knowledge of Combinations; 2 Bridging the Tens; 3 Carrying; 4 Attention Span; 5 Fatigue
SUBTRACTION: 1 Knowledge of Combinations; 2 Borrowing; 3 Fatigue
MULTIPLICATION: 1 Knowledge of Combinations; 2 Place Value; 3 Carrying; 4 Addition; 5 Fatigue
DIVISION: 1 Knowledge of Combinations; 2 Place Value; 3 Estimation of the Quotient; 4 Multiplication; 5 Subtraction; 6 Fatigue

The smallest types of examples¹ that can be selected to include these elements in their simple form are given below:

ADDITION (four examples, each a column of nine numbers)

  927   297   136   486
  379   925   340   765
  756   473   988   524
  837   983   386   140
  924   315   353   812
  110   661   904   466
  854   794   547   355
  965   177   192   834
  344   124   439   567

SUBTRACTION

  107,795,491 less 77,197,029
   75,088,824 less 57,406,394
  160,620,971 less 51,274,387
   80,361,837 less 25,842,708

MULTIPLICATION

  3,597 by 73     5,739 by 85     4,268 by 37     6,428 by 58

DIVISION

  94)85,352     37)9,990     73)53,765     49)31,409

¹Except in subtraction.

The figures in these examples are not determined by chance, but in accordance with a systematic plan. For instance, in the multiplication examples the reader should notice that in multiplying 3597 by 73 and 4268 by 37 a child is called upon to use every one of the combinations of the three and seven tables except the one and the zero combinations, for 2, 3, 4, 5, 6, 7, 8, and 9 are all represented in the multiplicands. In similar fashion, care is taken to test all combinations and situations throughout the tests, the combinations omitted being only those which appropriate tests have proved are of extreme simplicity. Equal care is taken in all of the tests to cover for each operation every factor mentioned above, and enough material is provided to keep even the brightest child busy for at least four minutes.
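The coverage claim made for the multiplication examples can be checked mechanically. The sketch below is a modern illustration and no part of the survey; the function name `combinations_used` is introduced here for the purpose, and the numbers are those printed in the examples above.

```python
def combinations_used(multiplicand: int, multiplier: int) -> set[tuple[int, int]]:
    """Return every (multiplier digit, multiplicand digit) pair that long
    multiplication of the two numbers calls for."""
    return {
        (int(m), int(d))
        for m in str(multiplier)
        for d in str(multiplicand)
    }

# The two examples named in the text.
used = combinations_used(3597, 73) | combinations_used(4268, 37)

# Every combination of the three and seven tables, leaving out the
# one and the zero combinations.
expected = {(table, digit) for table in (3, 7) for digit in range(2, 10)}

print(used == expected)
```

The same check applied to the other pair of printed examples, 5,739 by 85 and 6,428 by 58, covers the five and eight tables in the same way.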
Four minutes is long enough for the average child to reveal any marked tendency to make errors because of a lack of control of those forces which tend to divert attention after a few minutes of continuous activity of a single type, forces commonly described by the word fatigue.

The care taken in the construction of the examples for these tests makes possible the construction of other tests of equal difficulty but differing in every answer. At the time the Gary survey was begun, three such editions were in general use throughout the country. These were known respectively as Forms 1, 2, and 3. Form 3 was used at Gary for the first test, but in order to prevent any possible suggestion that there might have been direct preparation for the tests, a fourth edition, Form 4, was prepared and used for the first time at Gary. This was done before the tests of Form 3 had been scored.

The tests were given under precisely the same conditions from the fourth to the twelfth grades, inclusive. Even in the fourth grade, children were found who showed by their scores that they possessed to a greater or less degree all the abilities measured by the tests. For instance, the percentages of fourth grade children who equal or excel the eighth grade median score for examples correctly worked are:

Columns: ADDITION, SUBTRACTION, MULTIPLICATION, DIVISION, each for FORM 3 and FORM 4
First row: Per cent. of fourth grade children equaling or exceeding eighth grade median
Second row: Per cent. of fourth grade children getting one or more examples right

[The assignment of the entries to the columns is not recoverable. The eight pairs of entries, first row and second row, stand in the source in this order: 6.7 and 60; 6.3 and 60; 1.5 and 16; 6.0 and 69; 1.0 and 49; .01 and 55; .05 and 17; .08 and 21.]

These figures would make it evident that the tests measure very simple skills, the teaching of which is completed in most schools by the fourth grade, since, at the time the tests were given, from one sixth to two thirds of the children "knew how" to get at least one example right.
This means that were these children given time enough, they could complete every example in the test and get every example right. Therefore, as given, the tests measure skill, or ability to do, not mere knowledge of "how to do." In the lowest fourth grade class (4C) there are none, or very few children, who, at the beginning of the year, can do long division, but by the end of the 4A classes, a knowledge of this process has also been acquired. The tests are, therefore, measures of the end product of teaching effort, and the changes in scores from grade to grade are due to changes in skill, not to changes in knowledge.

Increases in skill from grade to grade are determined by three sets of causes. Part of the increase is due to increasing maturity of the children as they pass through the grades.¹ Eighth grade children are certain to show a higher rate of work than fourth grade children, simply because they are older and, consequently, have more highly developed nerves and muscles. Part of it is caused by daily use of the four processes in arithmetic in and out of school. A child is under a steady pressure from his teachers and his school work to perfect his skills, and both rate of work and accuracy are benefited. And part of it is caused by teaching effort. If a teacher of the sixth grade discovers that a particular child does not know certain fundamental combinations, or has faulty habits of carrying, or does not know how to control the critical pulses of his attention, she may make such explanations and provide such training that the individual will overcome his peculiar difficulties and rise at once to new levels of ability.
The second point to be noted is that each type of [...]

¹See Chapter VIII, page 357.

[Pages 182 to 189 of the source, comprising tabular matter and Figure 32 (the standard form of record sheet), are not recoverable.]

[...] meaning whether an accuracy score is 37 per cent. or 12 per cent., when both are so low as to be of no practical value, the errors in this part of the scale are not serious. The source of the error lies in the fact that scores of widely different values are grouped together. If the reader will refer to Figure 32, he will see that three children who made scores of 6 examples tried have scores for examples right which bring them into the lowest accuracy column. These scores may be either 2, 1, or 0 examples right; the record sheet does not discriminate between them. On the average, the scores will be 1 and the accuracy 16 per cent., but if in a particular case they happen to be either 2 or 0 the actual accuracy will be either 33 per cent. or 0.
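The arithmetic behind this grouping error can be written out. What follows is a minimal sketch in modern terms; the function name `accuracy_per_cent` is introduced here for illustration and is not part of the survey's method of tabulation.

```python
def accuracy_per_cent(examples_right: int, examples_tried: int) -> float:
    """Accuracy as defined in this chapter: the work that is correct
    in relation to the total work done, expressed in per cent."""
    if examples_tried <= 0:
        raise ValueError("examples_tried must be positive")
    return 100.0 * examples_right / examples_tried

# Three children each tried 6 examples. The record sheet places scores of
# 2, 1, and 0 examples right in the same (lowest) accuracy column.
for right in (2, 1, 0):
    print(right, "right of 6 tried:", round(accuracy_per_cent(right, 6), 1), "per cent.")
```

One example right in six comes to about 16.7 per cent., which the text gives as 16; two right in six comes to about 33 per cent. The spread between these values is what the record sheet conceals.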
In the case of the Gary scores, the effect of such lack of discrimination is to raise the accuracy scores. If the conventional distribution of the scores for examples right is made as shown in the tabulation in the table on the right of Figure 32, the median score for examples right will be 1.7 examples. The class accuracy based upon the median score of five examples attempted would accordingly be 34 per cent. instead of 52 per cent., as shown by the form of tabulation used. Such large differences, however, occur infrequently.

To check these results, the scores of all fourth and eighth grade classes were tabulated both ways. For instance, in addition, in 79 per cent. of the cases the differences are either zero, or show that the accuracy as determined by the standard record sheet is higher than when determined from the number of examples right (Table XLIV, page 191). The average amount of this difference is 6 per cent. The accuracy scores in the tables of this report, therefore, are either the true scores or scores too high by an amount which, on the average, will be 6 per cent.

TABLE XLIV
Differences in the Accuracies of Fourth and Eighth Grade Classes¹

                    NUMBER OF TIMES
OPERATIONS          STANDARD   RIGHT     TOTAL    SUM OF         AVERAGE
                    LARGER     LARGER    CASES    DIFFERENCES    DIFFERENCES
Addition               26         7        33        165            5.0
Subtraction            27         6        33        242            7.3
Multiplication         24         8        32        149            4.7
Division               23         6        29        183            6.3
Total                 100        27       127        739            5.8

¹As tabulated in the Standard Record Sheet and as computed from a tabulation of scores for examples right.

This table is to be read as follows: 33 class scores as to accuracy in addition were computed in two ways: one, using the standard form of record sheet shown in Figure 32, and the other, tabulating the number of examples correctly worked and computing the accuracy from the median number of examples right compared with the median number of examples attempted. In 26 cases the first method yielded larger results; in 7 cases the second method yielded larger results. The sum of the 33 differences in scores, without respect to sign, is 165 per cent., the average difference, 5 per cent. The average difference for 127 cases is 5.8 per cent.; that is, four times out of five the standard method yields scores which are, on the average, 6 per cent. higher than they would have been had they been computed from the number of examples worked correctly.

[Table XLV, pages 192 and 193: not recoverable.]

It may be contended that the median accuracy should have been computed from the accuracy of the individual papers.
That this is better is conceded, but the time cost is prohibitive, and the advantage small. In the case of the class shown in Figure 32, page 189, for instance, the median class accuracy computed from the individual accuracies is 50 per cent. as compared with 52 per cent. computed from the standard record sheet and 34 per cent. computed from the median number of examples right. When the median accuracy does not fall much below 50 per cent. the standard method is much to be preferred because it preserves the scores in their fundamental relationships, and the results will differ very little, if at all, from those obtained in the longer, but more accurate method. Moreover, to be comparable with the results from other cities, the Gary results must have been obtained by the same methods. For these reasons the standard method was used at Gary in both the Series B and the Cleveland Tests. In the case of the Cleveland Tests, however, to make comparisons possible, it was necessary to tabulate all classes in both ways.

RELIABILITY OF MEASUREMENTS

As the Series B tests were given twice, the data afford a basis for the discussion of the reliability of a single test. For instance, the difference in the scores of the 13 classes in the Jefferson School which were tested twice with each of the four tests, 52 differences in all, is, for the most part, one example or less in the number of examples attempted and 10 per cent. or less in accuracy. Only about one difference in five will exceed these limits (Table XLV, page 192), and 55 out of 102 differences are positive. That is, the scores tend to be slightly higher on the second trial.

FIGURE 33
Differences in Class Scores in Jefferson School for Two Trials of Series B¹

[Figure: four graphs, Addition, Subtraction, Multiplication, and Division, showing differences in class scores between the two trials of the Arithmetic Tests, Series B, in rate and in accuracy, plotted against class numbers, Jefferson School.]

Each quarter of the diagram is a graph of the results for one operation. The scale along the base of the figures shows the numbers of the classes. In each diagram the straight line marked 0 represents the score made in the first trial. The scale along the left hand vertical axis shows the number of examples the rate score in the second trial is greater or less than the corresponding score in the first trial. The scale along the right hand vertical axis shows the number of per cent. the accuracy score in the second trial is greater or less than the corresponding score in the first trial. The solid line represents differences in rate. The dotted line represents differences in accuracy. It should be noted (1) that there are no consistent differences in any of the diagrams, (2) that a gain in rate is often accompanied by a loss in accuracy, and vice versa, indicating that the changes in score are merely fluctuations in the methods of work, (3) that the differences are gross differences caused by changes in class membership, or changes in ability due to training, or changes caused by any other factors that may be operating. In classes 9, 10, 11, and 12 in multiplication, the curves probably indicate growth due to training, but throughout the remainder of the diagrams there is little evidence of either growth or of consistent differences in difficulty between the two editions of the tests.

¹The tests were given five weeks apart.

A careful study of these data, however, shows the variations are of two types. In some of the classes changes in rate and accuracy are in opposite directions, in others the two are in the same direction, and in still others there is practically no change (Figure 33). The results show plainly that a number of different factors are at work.

A factor that might cause change in scores is a change in difficulty from test to test.
In the first trial, Form 3 of the Series B test was used, while in the second trial Form 4 was used. Form 4 is constructed to be of equal difficulty with Form 3 on an objective basis; that is, the same combinations were employed throughout, and, as nearly as possible, in the same arrangement. The variation in difficulty from one form to the other should not be large, but it is quite impossible to check the relative difficulty of the tests except by very carefully conducted experiments.

In the results shown in the previous table it will be seen that the classroom scores do not differ in any characteristic way. Sometimes the scores from Form 3 are larger than those from Form 4; sometimes they are smaller. These are indications that the differences are not caused by any marked differences in the difficulty of the tests themselves.

As a further check upon the differences from test to test, the number of times each example was missed in Trial 1 and Trial 2 is tabulated. Some of the examples of Form 4 proved to be missed by a smaller number of children than in Form 3, and the others by a larger number, depending upon the operation (Table XLVI, page 198). The average difference per example was 2.6 per cent. In 60 per cent. of the cases the differences were positive.

There are also evidences of differences in difficulty from example to example. It must be remembered, however, that the children who complete the various examples are a different group, as only the most able children reach the later examples (Table XLVII, page 200). The results show that the units of which the tests are composed are fairly equal as measured by even the small number of scores at Gary and that the differences from test to test are not very great.
Those who feel inclined to question the equality of the units of which the tests are composed, or the equality of Forms 3 and 4, should remember first that each group of four examples¹ of the addition tests calls for the use of the same combinations, [...]

¹The number is different for each operation.

[Table XLVI, page 198 (number of times each example was missed in Trial 1 and Trial 2): not recoverable.]
[Table XLVII, page 200 (differences in difficulty from example to example): not recoverable.]

[Table XLVIII, page 204 (analysis of the individual scores of a class showing a large loss; each child's change is entered as a gain, a loss, an insignificant change, a change in method, a gain in accuracy, or a loss in accuracy): not recoverable.]
[Table XLIX, page 206 (analysis of the individual scores of a class showing a large gain in accuracy): not recoverable.]

[A second factor tending to produce change] in score is the difference in the reaction of the children to the test situation. The first time the test is taken most individuals tend to hold themselves in restraint. They proceed cautiously, on the lookout for difficulties. On the second trial, however, when they know what to expect, they work more freely. The result is an increased rate and decreased accuracy.¹

Many illustrations, both of increase in rate and decrease in accuracy, and increase in accuracy and decrease in rate were found (Tables XLV and XLVIII). All such differences will be said to be caused by a change in method of work. The amount of change in rate necessary to produce a given change in accuracy is, however, not known and apparently varies with the different classes and different individuals. Consequently in all cases of change of method it is impossible to tell whether or not there has been any real change of ability in the interval between tests.

A third factor which undoubtedly influences many scores is the change in class membership. Attention has already been called to the fact that attendance at Gary is exceedingly variable, and in the results previously given (Table XLV, page 192) no account has been taken of changes in membership. An analysis was made of the scores of the children from a class whose scores show a large loss to determine the effect of individual variation (Table XLVIII, page 204).
There were 3 children present at the first trial who were absent on the second, 10 children present the second trial who were not present the first trial. Of the 22 children present at both trials, the scores of 5 show a real gain, some of them in both rate and accuracy, 11 individuals show very little change, or else a change in method. That is, in this class, the large loss in accuracy is due almost entirely to the change of method of work, for the children present at both trials have increased their rate score one example, and lost in accuracy 24 per cent. This effect, however, is somewhat masked in the general class scores by the fact that children present only at the first trial were less able than the average of the class, and tended to reduce the first class score.

¹See X of Appendix A, page 452.

A similar analysis was made for a class in which there was a larger gain in accuracy (Table XLIX, page 206). The results show that this gain was due partly to seven children who show a real gain, and partly to the fact that a number of children worked more slowly in the second test with increased accuracy, that is, to eight children who show an apparent gain in accuracy due to change in method. Analysis of random selections of other classes showing similar large differences gave similar results. Nothing was found to indicate that there were any real differences in difficulty between the two addition tests.

The fourth factor tending to produce change in score is school training. This is particularly true in the Beveridge school in which systematic drill work was observed following the first test. In most of the classes, however, there is no evidence that any great amount of change of score can be attributed to such cause. Most of the variations shown in the table are due either to chance variations or to changes in method.
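The distinction drawn in these analyses between a real gain, a real loss, and a change in method can be put as a simple rule. The sketch below is an illustration only and is not the survey's own procedure: the limits of one example in rate and 10 per cent. in accuracy are borrowed from the range of ordinary variation reported for the Jefferson School classes, and the function name `classify_change`, its labels, and its parameters are assumptions made here.

```python
def classify_change(rate_change: float, accuracy_change: float,
                    rate_limit: float = 1.0, accuracy_limit: float = 10.0) -> str:
    """Label the difference between two trials of the same test.

    rate_change:     second-trial rate minus first-trial rate (examples)
    accuracy_change: second-trial accuracy minus first-trial accuracy (per cent.)
    The two limits are assumed thresholds; a change within both is treated
    as chance variation.
    """
    rate_moved = abs(rate_change) > rate_limit
    accuracy_moved = abs(accuracy_change) > accuracy_limit
    if not rate_moved and not accuracy_moved:
        return "insignificant change"
    # Rate and accuracy moving in opposite directions: faster but less
    # accurate work, or slower but more accurate work.
    if rate_change * accuracy_change < 0:
        return "change in method"
    if rate_change >= 0 and accuracy_change >= 0:
        return "gain"
    return "loss"

# The children present at both trials in the first class analyzed:
# rate up one example, accuracy down 24 per cent.
print(classify_change(1, -24))
```

Under these assumed limits the first class analyzed falls under change in method, as the text concludes. Other limits would move borderline cases, which agrees with the remark that the amount of change in rate needed to produce a given change in accuracy is not known.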
A tabulation of all the differences in the class scores in the two trials of the Series B tests in all schools showed (Table L, page 211) that about 10 per cent. of the classes made exactly the same score in the two tests, 78 per cent. of the classes made the same score in rate within one example, 62 per cent. made the same score in accuracy within 10 per cent. As the scores of the Gary children of the eighth grade average 8 examples, a change of one example worked correctly would mean a change in accuracy, if the rate scores were constant, of 12 per cent., so that about three fourths of the class scores do not vary more than one example in their rate or accuracy. The median of 211 differences between the class scores of the first and second trials of the tests is about one half an example in rate and 7.6 per cent. in accuracy. Of these differences, the positive differences have a ratio to the negative differences of about 2:1, indicating the general tendency of the classes to make a higher score on the second test. This is undoubtedly due to the practice effects of repeating the test, as has already been indicated.

The reliability of the rate scores of a single test is high, about 90 per cent.¹ That is, only about 10 per cent. of the children will be misrepresented by their rate scores in any one test. In accuracy, however, variations arise much more readily and the reliability of the tests is low. This is shown clearly by the coefficients of correspondence (Table LI, page 213). About three fourths of the children will maintain their relative positions in the distribution of rate scores through the two trials, and about 40 per cent. of the children in accuracy.²

¹Pearson's coefficient of reliability, 42 eighth grade children, rate scores addition, Trials 1 and 2, was +.90, P. E. ± .02.
²Pearson's coefficient of reliability, 42 eighth grade children, accuracy scores multiplication, Trials 1 and 2, was +.12, P. E. ± .09.

TABLE L
Variation in Class Scores in Two Trials of Series B, Four Operations Combined¹

AMOUNT OF         FREQUENCY, RATE       FREQUENCY, ACCURACY
VARIATION²        NUMBER      %         NUMBER      %
0                   19       10           24       12
1-4                 81       38           54       25
5-9                 64       30           54       25
10-14               30       14           44       21
15-19                8        4           19        9
20-24                9        4           16        8
Total              211³     100          211⁴     100
Median             .55 examples          7.6%

¹Differences in rate and accuracy are distributed for grades 4-12.
²For rate of work the amount of variation represents tenths of an example. For accuracy the amount of variation represents per cent. of accuracy.
³73 scores were lower on second trial, 119 higher.
⁴63 scores were lower on second trial, 124 higher.

This table is to be read as follows: Of 211 repeated tests 19, or 10 per cent. of the whole, made exactly the same median score in rate in both trials; 24, or 12 per cent., made exactly the same median score in accuracy. Half the classes varied .55 of an example or less in rate, and 7.6 per cent. or less in accuracy.

It cannot be too strongly emphasized that educational measurements differ from measurements in the physical sciences chiefly in the fact that the quantities measured in education vary enormously with slight changes in conditions. The length of a metal rod is changed so little by temperature, pressure, and other factors that we come to think of the length as a quantity independent of the conditions. Ability to add correctly, on the other hand, is so dependent upon the conditions under which the adding is done that it can scarcely be said to exist independently of them. In other words, as has been repeatedly pointed out, a test does not measure ability, it merely registers performance under the given conditions. The conditions revealed by the tests are not created by them. Whenever two separate measurements by the same test are possible, a greater range of individual variation than one would expect is always revealed.
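The reliability coefficients quoted in the footnotes to this section are Pearson correlations between the scores of the two trials, each given with its probable error (P. E.). The survey does not state how the probable error was computed. The sketch below assumes the formula conventional at the time, P. E. = 0.6745 (1 - r²) / √n, and the function names `pearson_r` and `probable_error` are introduced here for illustration.

```python
import math

def pearson_r(first_trial: list[float], second_trial: list[float]) -> float:
    """Product-moment correlation between paired scores from two trials."""
    n = len(first_trial)
    if n != len(second_trial) or n < 2:
        raise ValueError("need two equally long lists of at least two scores")
    mean_x = sum(first_trial) / n
    mean_y = sum(second_trial) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first_trial, second_trial))
    sxx = sum((x - mean_x) ** 2 for x in first_trial)
    syy = sum((y - mean_y) ** 2 for y in second_trial)
    if sxx == 0 or syy == 0:
        raise ValueError("all scores in one trial are equal; correlation undefined")
    return sxy / math.sqrt(sxx * syy)

def probable_error(r: float, n: int) -> float:
    """Probable error of r under the assumed formula 0.6745 (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

print(round(probable_error(0.90, 42), 2))  # rate scores, addition
print(round(probable_error(0.12, 42), 2))  # accuracy scores, multiplication
```

For the rate coefficient of +.90 with 42 children the assumed formula gives about .02, agreeing with the footnote. For the accuracy coefficient of +.12 it gives about .10 against the printed .09, so either the survey used a slightly different computation or the printed figure was rounded differently; the source does not say which.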
Many tests, however, cannot be repeated, and for very few are there available second editions of equal value to the first. The conclusion to be drawn from the repeated tests is, of course, that a series of tests is necessary to determine with any certainty the ability of an individual, but that variations in individual achievement take place in accordance with such fixed laws that class scores based on the scores of groups of children are very reliable. In other words, the scores made by the Gary children in the arithmetic tests are proved, by the repetition of the tests, to represent actual conditions. The scores made by the Gary children in all the tests of mechanical skills agree in showing that the Gary children work slowly and very inaccurately.

TABLE LI
Correspondence Between Results of Two Trials of Series B¹

                                  MEDIAN          MEDIAN DEVIATION     TOTAL RANGE
                                  RATE    ACCY.   RATE    ACCY.        RATE     ACCY.
Addition, 1st                      8      61       2      14           4-15     0-100
Addition, 2nd                      7.5    57       1.5    14           3-15     0-100
Subtraction, 1st                   9      83       1       8           6-16     14-100
Subtraction, 2nd                   9      79       1      12           5-17     0-100
Multiplication, 1st                8      74       2      15           2-12     0-100
Multiplication, 2nd                8      75       1      14           2-13     0-100
Division, 1st                      7      83       1      17           1-13     0-100
Division, 2nd                      7      84.5     1      13.5         3-14     20-100
Multiplication, Cleveland Tests    5      63.5     1      19.5         1-7      0-100

¹Based on scores of 42 eighth grade pupils, tested at intervals of four weeks.

Percentage of Total Cases Which Do Not Vary in Relative Position More Than One (or One Half) Unit of Variability

                                                    RATE               ACCURACY
COMPARISON                                          1 UNIT   ½ UNIT    1 UNIT   ½ UNIT
Series B, Trial 1 with Trial 2:
  Addition                                            78       55        36       21
  Subtraction                                         78       36        45       29
  Multiplication                                      76       50        55       24
  Division                                            60       24        31       12
Multiplication, Trial 1 Series B with Cleveland       86       62        40       26
Multiplication, Trial 2 Series B with Cleveland       81       29        38       14

This table is to be read as follows: If in the first trial of the addition test, the relations of the scores of individual children to the score of the class as a whole be compared with similar data based upon the scores of the same individuals in the second trial, 78 per cent. of the children will be found to have maintained the same relative position in the two sets of rate scores, and 36 per cent. in the two sets of accuracy scores, within one unit of variability. That is, within 2 examples for rate, and within 14 per cent. for accuracy.

EFFECT OF TEST CONDITIONS

The claim is sometimes made that such poor work in the tests is due largely to the fact that the children are working under artificial conditions; that if occasion were to arise for the use of these same skills in the achievement of some purpose which seemed worthy of effort, the children would respond to the motivated situation in a manner which would prove their ability to cope with it. This statement is both true and untrue. It is true that a slow, inaccurate worker will, under the spur of sufficient incentive, repeat his computations many times until he is finally able to arrive at the correct results.
As has already been pointed out, by far the greater number of children tested both at Gary and in other school systems would be able to solve correctly every example in every test (except certain of the fraction tests) if a sufficient incentive should lead them to attempt such an achievement, and if there were no time limit. So far the claim above is based upon fact.

The part of the statement that is not true is the implication that because two groups of children attain the same goal their achievements are equal, and the trainings which made the achievement possible are of equal value. For if the children of one school require less time than those of another school to accomplish the same task, they are more skilful. The sole purpose of the arithmetic tests given at Gary was to determine the degree of skill possessed by the Gary children under the given conditions. Such measurements are of value because many persons have already determined for themselves the degree of skill they think a child of a given age or grade should have under the test conditions. The one additional point to be noted is that in the judgment of the survey staff the test is a suitable measure for the type of instruction found in the classrooms at Gary. If, therefore, the reader will realize that no attempt is being made to draw inferences from the results, other than in regard to the degree of skill in the mechanical operations of arithmetic, the discussions of this chapter will serve to show the degree of dependence that may legitimately be placed upon the conclusions reached.

VI. ENGLISH COMPOSITION

§1. General Results

Language work at Gary is allotted 16 per cent. of the total time given to the fundamentals, and this corresponds almost exactly to the per cent. of time allowed this type of work in the average of fifty American cities.
The actual number of hours at Gary, 798, is somewhat less than the average, 864 hours, in the conventional schools, but the difference is so slight that one may fairly say work of this type receives the same emphasis at Gary as elsewhere. English composition is the one phase of language work which is measurable at present. Such surveys as have been made in other school systems show that in general the products of school training in written composition are far from satisfactory. It was felt, however, that here, if anywhere, the training peculiar to Gary should produce results. Accordingly, a test of ability in English composition was given.

TESTS

No attempt was made to measure oral composition, and, of the four recognized forms of written composition, the testing work was limited to the simplest, narration. Children were asked to write a story of some interesting or exciting experience¹ of their lives. Subjects were suggested, but the children were urged to choose for themselves, and, for the most part, they did. Children wrote freely in the presence of the examiners and were given ample time (fifteen to twenty minutes). The actual number of minutes and seconds taken by each child was noted, however.

¹The instructions and conditions of the Composition Test given in the Denver Survey were followed closely.

SCORING

In all but the lowest grades the children counted the number of words written and later these scores were verified by the examiners. The papers were also scored for quality, for number of errors, and in other ways, all such scoring being done by trained men under carefully controlled conditions. The quality of the compositions was measured by means of the Hillegas Scale.¹

RESULTS

A typical eighth grade composition is shown in Figure 35.
The handwriting in the sample is a little better than the eighth grade median in handwriting (quality Ayres' Scale 45, actual eighth grade median in composition test 39, generalized score 42), and the misspellings are less frequent than for the median spelling paper for the eighth grade (spelling coefficient of the illustration, 8; median eighth grade spelling coefficient, 19.7), but in style, subject, structure, and range of vocabulary it is a representative paper. So far as it is not

¹A discussion of the reliability of measurement of quality in composition by the Hillegas Scale will be found in §2, page 247.

[Figure 35: facsimile of a typical eighth grade composition. The handwritten text is not recoverable.]

[Tabular matter preceding Figure 37 is not recoverable.]

Figure 37
Development in Composition

[Chart: English Composition Development Curve, Speed-Quality. Curves labeled Composition and Reproduction.]

Scale along the base of the figure: rate in number of words written per minute. Scale along the vertical axis: quality on the Hillegas Scale. Solid line: composition; dotted line: reproduction of simple story after one reading. Small circles on composition curve indicate position of grade scores in both rate and quality. The small figures near circles show grades. The reproduction curve is based upon the actual rates of reproduction and the quality of the composition test. The eighth grade quality of reproduction was judged to be equal to that of composition, but the dotted line, as a whole, represents a theoretical line.
The curve was drawn to make evident the actual difference in rate of writing between reproduction and composition. The difference increases from grade to grade and the curves show that quality in the Gary schools is produced at the expense of rate. Note that the light dotted line which represents the actual scores for development of ability in English composition begins to rise rapidly in the sixth grade, and that the growth during high school years is almost entirely in quality. To conform to results in conventional schools the composition curve should have the same general form as the curve for reproduction and fall half way between the two curves in the figure.

between the fourth and sixth grade, but the change from the sixth to the eighth grade is 13 points. The eighth grade median score is nearly 46 Hillegas. The ninth grade is but slightly higher, but the tenth and eleventh grade scores raise the level 18 points. The twelfth grade score is lower than that of the eleventh grade. That is, of the 34.3 points of difference in quality between the fourth and eleventh grades, 30.2 points gain is made in four of the grades. The growth in high school grades is almost wholly in quality. (Tables LIII, LIV, pages 224, 227, Figure 37, page 225).

Teachers of English hold that in compositions there should be increasing freedom from error from grade to grade, and increasing power both to choose the words best adapted to the expression of a given thought and to organize the words chosen into coherent discourse. Accordingly, the eighth grade papers were subjected to a series of analyses to determine the number and character of the various errors made. Papers were marked for errors in capitalization, punctuation, spelling and grammar and were also analyzed as to range of vocabulary.
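The growth figures quoted above can be checked against the Gary medians printed in Table LIX (page 236). The sketch below is only an illustration of that arithmetic: the function names are invented for the purpose, and the medians are taken as printed, rounded to one decimal. On those rounded values the four largest yearly gains come to 30.3 points and the rise from the ninth to the eleventh grade to 17.3, close to the 30.2 and 18 given in the text; the small differences may reflect rounding in the report.

```python
# Gary median quality scores (Hillegas) for grades 4-12, as printed in Table LIX.
GARY_MEDIANS = {4: 29.9, 5: 32.6, 6: 32.8, 7: 39.7, 8: 45.8,
                9: 46.9, 10: 56.2, 11: 64.2, 12: 62.2}


def gain(scores, start, end):
    """Change in median quality from grade `start` to grade `end`, to one decimal."""
    return round(scores[end] - scores[start], 1)


def yearly_gains(scores):
    """Gain between each pair of consecutive grades, keyed by (lower, upper)."""
    grades = sorted(scores)
    return {(a, b): gain(scores, a, b) for a, b in zip(grades, grades[1:])}


def largest_gains(scores, count):
    """The `count` largest single-grade gains, biggest first."""
    ranked = sorted(yearly_gains(scores).items(),
                    key=lambda item: item[1], reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    print("Grade 4 to 11:", gain(GARY_MEDIANS, 4, 11))
    print("Grade 6 to 8:", gain(GARY_MEDIANS, 6, 8))
    print("Grade 9 to 11:", gain(GARY_MEDIANS, 9, 11))
    top_four = largest_gains(GARY_MEDIANS, 4)
    print("Four largest yearly gains:", top_four)
    print("Their sum:", round(sum(value for _, value in top_four), 1))
```

The four largest steps fall between grades six and eight and between grades nine and eleven, which agrees with the statement that growth is concentrated in four of the grades.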
On the average, a Gary eighth grade child makes .8 of an error in capitalization, 1.8 errors in punctuation, and 3.9 errors in grammar, or a total of 6.4 errors in writing an original composition of 214 words (Table LV, page

[Tabular matter from pages 227-229, including Table LIV, is not recoverable.]

per cent. of the children of a given grade. If the Gary vocabularies had been restricted to different words used at least three times, the percentage of the second grade words would have been seventy-five.¹

Again a careful study of the vocabulary² fails to show any clear effect of the special training at Gary. For instance, the word "carbon" is used three times, but a reading of the composition in which it occurs shows that

¹Comparative data from conventional schools are not available. However, a tabulation of a random sampling of all the words used in the eighth grade compositions gave 54 per cent. of the words five letters or less in length.
A similar random sampling of Jones' second grade words gave 45 per cent. five letters or less in length. That is, the Gary vocabulary contained a larger proportion of the simpler, smaller words. See also page 414.

²In Appendix A, VII, page 434, will be found a list of all the eighth grade words which are not either proper nouns, words in Jones' List, or derivations of those words. That is, it contains all words which might in any way be peculiar to Gary.

[Tabular matter from page 231 is not recoverable.]

the word was acquired at the time the boy was in a booth with his brother, a moving-picture operator. ("The red hot carbon fell out of the lantern and set fire to the film.") Similarly, in one case, "auditorium" refers not to school work, but to the name of a theatre in Chicago; "pottery" to Indian pottery seen on a trip to New Mexico. One of the few possible exceptions is a set of words, "mummy, art, beetles," etc., which are used in describing a trip to the Art Museum in Chicago. The trip was taken with nine other boys from Gary and may have grown out of school work, although no reference is made to teacher or school.

Figure 38
Vocabulary of Eighth Grade Compositions Based on Jones' Vocabulary Lists

[Diagram: Vocabulary, Eighth Grade Compositions. Areas show the different words rated by Jones as second grade words, together with 160 third grade words, 148 fourth grade words, 93 fifth grade words, 58 sixth grade words, 81 seventh grade words, eighth grade words, 386 unlisted words, and discarded words. The totals and the remaining counts are not legible.]
As far as it is possible to judge, the various words which do not appear in Jones may as well have been acquired from actual life experiences as from school activities.

The incidents chosen as subjects by the 127 children tested in the eighth grade classes were carefully studied and tabulated (Table LVII, page 233). The children followed instructions and wrote, for the most part, about simple, childish interests in striking occurrences of daily life. There is little in them to show that the interpretation placed upon these experiences by the children has been influenced in any way by school training.

TABLE LVII
Composition Subjects: Eighth Grade Classes

Total number of papers                                                127

Analysis A: relative to life of child
Events in the life of the writer (exciting)                            49
Descriptions of scenes or accounts of experiences (not exciting)       30
Accounts of incidents observed in the life of others (exciting)        19
Description of trips                                                   13
Accounts of experiences related by others (not seen)                   11
Dreams, ghost stories, and imaginary events                             4
Experiment in physics                                                   1
Total                                                                 127

Analysis B: types of experiences
Accidents, runaways, and collisions                                    23
Experiences, fishing, swimming, walking, and skating                   22
Trips                                                                  20
    By boat                                                  10
    By rail                                                   6
    By auto                                                   4
Hikes in woods or country                                              12
Storms                                                                 18
    Rain                                                     11
    Snow                                                      5
    Hail                                                      2
Fires                                                                   6
Errands                                                                 5
Miscellaneous                                                          21
Total                                                                 127

Analysis C: source
Farm, country, or woods                                                30
City                                                                   24
Rivers, lakes, or ocean                                                23
A trip of some kind                                                    15
Home                                                                   13
School                                                                  7
On way home after school                                                7
Miscellaneous                                                           9
Total                                                                 127

SCHOOL TO SCHOOL COMPARISON

In making comparisons from school to school, marked differences were found. Of the 13 classes tested in the Jefferson School 2 had composition scores markedly above the city scores and one below. Of the 12 classes in the Beveridge School none was above the city score and 7 below (Table LVIII, page 235).
Comparisons on the basis of composition rate, or rate of reproduction, yield results which are very similar. The schools in order of rank are: Jefferson, Emerson, Froebel, and Beveridge. It is probable, however, that these differences are not in any way due to special program features. If the differences observed were due to the enriched curriculum, the order of schools would probably be: Emerson, Froebel, Jefferson, and Beveridge; Froebel being put second instead of first to allow for the difficulty in language work with the children of foreign born parents. Under the circumstances, the differences shown in the table are probably not significant from the point of view of this investigation.

COMPARATIVE DATA

The question of the value of the Gary product as compared with similar work in other schools cannot be settled as definitely as for other subjects, because few comparative data are available, and studies of the reliability of scoring by the Hillegas Scale are conflicting. The Gary eighth grade score (45.8 Hillegas) is higher than the corresponding Butte score (41 Hillegas) and lower than those given in the Salt Lake City Survey (54 Hillegas) (Table LIX, page 236, Figure 39, page 237).

TABLE LIX
Comparative Data for Quality of Composition (Hillegas)¹

GRADE                        4     5     6     7     8     9     10    11    12
Gary                        29.9  32.6  32.8  39.7  45.8  46.9  56.2  64.2  62.2
Trabue standard             35    40    45    50    55    60    65    69    72
Starch standard²            26    31    36    41    46
Butte                       23    28    34    38    41
Salt Lake City              29*   31*   38*   44*   54*
Nassau County               28    34    38    42    46    50    53    57    59
South River, N. J.          23    26    38    48    56    52    50    59    63
Chatham, N. J.              29    28    41    40    53
Fifty four High Schools                                   50    59    64    67

Mobile, Alabama (grade alignment uncertain in the source): 33 38 46 50 67 69 72 75
Mobile County (grade alignment uncertain in the source): 32 39 43 42 56 64 60 68
One further printed value, 47, cannot be assigned to a column.

¹See page 38.
²Starch's standards are derived from Butte and Salt Lake City.
*The values of this column are those printed in the Salt Lake City Survey Report. It is probable, however, that to make these values comparable with Gary they should be raised 7 points as they were computed in a peculiar manner. See page 247.

This table should be read as follows: The median fourth grade score in quality of composition at Gary was 29.9 Hillegas. Trabue's fourth grade standard, 35; Starch standard, 26; fourth grade score at Butte, 23; Salt Lake City, 29; Nassau County, 28; Mobile, Alabama, 33; Mobile County, 32; South River, N. J., 23; Chatham, N. J., 29.

The scores of the eighth grade classes were as follows: Froebel, class 45, 40 Hillegas; class 46, 40. Emerson, class 14, 47.2; class 15, 44.0. Jefferson, class 18, 50.

Figure 39
Comparative Development

[Chart: quality of composition by grade for Gary, Butte, and Salt Lake City.]

The scale along the bottom of the figure represents grades, the scale along the vertical axis represents quality by the Hillegas Scale. The solid line represents Gary; broken line, Butte; dotted line, Salt Lake City. The Gary results are better than those from Butte and not so good as those from Salt Lake City. The Gary and Salt Lake City curves are the same to the fifth grade, but the value of the scores in Salt Lake City increases more rapidly than in Gary, so that by the eighth grade the Gary scores are about two years behind. Note the increased rate of growth in the high school grades. For comments on the reliability of scoring by the Hillegas Scale see Section 2 of this Chapter, page 239.

In making the Denver and Grand Rapids surveys the Willing Composition Scale was used. The values of this scale do not correspond to those of the Hillegas Scale, but through the kindness of Mr. Willing all but one of the eighth grade classes¹ were scored by him personally, so that the Gary results might be directly comparable with the Denver and Grand Rapids scores.
The median quality of the Denver eighth grade papers, written on the same subject and under the same conditions as the Gary papers, was 63.5; of the Grand Rapids papers, 65.0; of the Gary papers scored by Mr. Willing, 61.3.² It is extremely probable, therefore, that on the basis of such comparative data as are available at present they should be judged to be about equal to the products of composition training in conventional schools.³

¹The fifth one was omitted because of lack of time.
²The median of the Hillegas scores of the same 97 papers by the Gary judges was 46.3 as compared with 45.8 for the entire grade, so the omission of the one class did not greatly influence the result.
³The personal judgment of the author is that this conclusion will not stand when more comprehensive comparative data are available.

§2. Critical Discussion

The measurement of ability in English composition presents a problem of greater difficulty than the measurement of any of the abilities previously discussed. The factors which determine the merit of a composition and those which affect the judgment of the scorer are many. Little is known about their relative value. The ordinary marking of teachers varies enormously, both from teacher to teacher, or for any one teacher from day to day, and from sample to sample. Also the use of a composition scale has been attacked by certain teachers of English. It is important, therefore, that the method of marking the papers from the composition test in the survey be explained in detail.

SCALE USED

All marks reported are in terms of the Hillegas Scale.¹ Unfortunately, the purpose and value of the scale have been so little understood that the use of the scale for survey purposes must be justified. It must be admitted at once that the scale cannot be used effectively without training.
The effects of a first reading of the scale are likely to be irritation and a sense of the impossibility of attempting to judge of the quality of a composition by reference to other compositions of a totally different character. Yet it is easy to show that such judgments are not only possible, but are accurately and consistently made, once certain viewpoints and experiences are gained.

¹Teachers College Record, Vol. 13, No. 4, page 21.

The Hillegas Scale provides for marking of compositions on an absolute basis. That is, a mark given by the scale means just one thing, the degree of merit possessed by the composition as a composition, and entirely apart from any consideration of the age or grade of the author, the conditions under which it was written, or the purpose for which the mark is given. That is, 50 Hillegas units of composition are comparable in meaning to 50 inches or units of length, 50 pounds or units of weight, or 50 units of any other quantity which may be measured in absolute terms.

The reader unfamiliar with the use of scientific units for the measurement of educational products should see that the composition scale has for one of its purposes the bringing to light of the very items which teachers' marks conceal. The teacher of a fourth grade class recognizes a paper as an exceptional paper for the grade and marks it, let us say, 95 per cent. The teacher of the eighth grade class also assigns a mark of 95 per cent. to a paper from her class. Numerically, the two papers are equal; yet both teachers would agree that one paper is better than the other. The marks have served the teacher's purpose, but from the point of view of a survey they conceal the most important elements that the survey wishes to reveal, i.e., the real quality of the compositions and the amount of progress which has been made from the fourth to the eighth grades.
If, however, the two papers are marked in terms of the scale and one is found to be of quality 40, while the other is of quality 80, it is possible to say at once that the one has twice as much merit as the other.

Many persons will admit the desirability of absolute marking, but do not believe that the scale can be used for such a purpose. It is easy to prove, however, that everyone recognizes gross differences in general merit. If the reader will scan, superficially or carefully, the two samples below (taken from the papers written at Gary) he will have no difficulty in recognizing that as samples of English composition one represents a more advanced stage of development than the other.

SAMPLE 1

"One the day I was on a wonderful jourray. I travelled in the mountains and as I travelled about Two days and accident happen to me because I slipped of the rock and hurt my foot. And so I went on, and seen a bear behind me and I started to run away. And as I ren I fell into the water and then the bear disappeared. And so I went home very lately.

"Next morning I was traveling in the different parts. When I was about in the middle of the forest"

SAMPLE 2

"Ice," called George, a jolly, big, fat negro as he entered the yard with his wagon and mules.

Today the ice wagon, however, was to be the conveyance for a picnic party and they were to traverse the beautiful solitary roads in the mountains of North Carolina, There was a grand scramble, the picnicers climbed in, seated themselves on the hay, laughing and talking until the gloominess of the woods and the stony, socalled roads attracted their entire attention."

These two samples, however, constitute a rough scale by which any other samples may be measured; for if one reads Sample 3 (below) he will have no difficulty in recognizing that it is intermediate in value between the other two.

SAMPLE 3

"This is a real experience. It happened at Miller's beach in the summer of 1913.

"My brother, another young man, and myself went out in a row boat.

"When we had gotten about half a mile from the shore my brother dived off of the boat. He came up once and went down again, he came up again and went down again he came up again and went down again. We waited awhile but he did not come up,

"Our friend dived off after him and after some difficulty located him.

"He brought him up out of the water and after they had struggled awhile knocked him senseless."

Suppose that more and more samples were thus read and assigned a place in relation to the samples previously read. It is evident that the time would come when the difference from sample to sample would be so slight that it would be difficult to make judgments with certainty, just as it is impossible to tell with the eye alone whether a given bar is 3.65 inches long or 3.66 inches. The series of samples as a whole would give an illustration of every type of sample from the worst to the best. The Hillegas Scale provides a series of selected samples whose values have been determined by a statistical procedure which enables the results to be expressed in units according to a consistent plan. The scorer may compare a given specimen with the scale and determine its value from the values of the scale samples between which it is judged to fall, and this can be done accurately and consistently.¹

ANALYSIS OF SCALE

This, however, is the point at which many stumble, for misunderstanding easily arises. Many see in the scale only a series of distinct compositions differing in content and style. They do not generalize from these compositions and carry in mind a general concept of "progress-in-English-composition-as-a-whole" of which the samples are merely particular illustrations. Yet the fact of such progress is self-evident. The average child entering the first grade cannot express his thoughts in writing, and, naturally, when he begins to do so, his attempts are very imperfect.
After twelve years of training, however, a high level of ability may be reached, and the progress from the lower to the higher level follows certain general tendencies which are represented objectively in the samples of the scale. It becomes important, therefore, to express these gradations of development in their generalized form and to use the samples of the scale only as an aid to judgment in determining the precise value to be assigned a given composition.

¹Values of the samples above in terms of Hillegas Scale: Sample 1, one judge, 35. Sample 2, average of two judgments, 75. Sample 3, median of five judgments, 48. Average deviation 2.6.

The writer's generalization of the Hillegas Scale is as follows: The development of ability in English composition passes through three phases. At first there is the struggle to master the mere mechanics of expression, the spelling of words and their arrangement in the conventional order. Then follows a second stage in which the efforts of the person writing are expended mainly upon organization of subject matter, the selection of the details to be expressed and their organization into connected discourse. Finally there is the stage of development of literary merit in which choice of words, and the selection and organization of subject matter cease to be mechanical and become artistic (Figure 40, page 245). Of course, no hard and fast lines can be drawn between one phase of ability and another, but the main characteristics of each of the three stages of development are well marked and serve to fix the limits of value within which a given composition must fall. In Appendix A will be found illustrations of samples of each type, chosen from the Gary compositions on the basis of the judgments of the Gary judges.
Reference to this series of samples (which in themselves constitute a composition scale) will enable any reader to determine for himself exactly what a composition of a given value is like.

Figure 40
Analysis of the Hillegas Composition Scale

A. Mechanics (Difficult to Read)
    0-9     Meaning uncertain after study.
    10-19   Meaning decipherable but with difficulty.
    20-29   Meaning not apparent on first reading.

B. Organization (Tiresome to Read)
    30-39   Mere succession of sentences loosely joined.
    40-49   Disconnected sentences with much irrelevant matter.
    50-59   Connected sentences: few mistakes, uninteresting material.
    60-69   Well organized, but commonplace in content.

C. Literary Merit (Interesting to Read)
    70-79   Interesting material marred by imperfect choice of words.
    80-89   Well selected material expressed in well chosen words.
    90-99   Exceptional content and quality.

This analysis should be read as follows: If a composition because of its gross errors in mechanics proves difficult to read and understand, it falls in the first division of the scale (0-30). If the meaning is not clear after repeated attempts to decipher the composition, the value assigned it should be between 0 and 9 points, depending on the amount which can be read. If the meaning is decipherable, but with difficulty, its quality should be rated from 10 to 19 points, and so on.

TRAINING OF SCORERS

The five judges who scored the eighth grade samples at Gary had had some experience in the use of the scale, but only one (the writer) was convinced that scoring by means of a scale yields constant and reliable results. Accordingly, the first work of scoring was the training of these judges in the use of the scale. The samples of the Hillegas Scale were cut apart and given to the scorers, one at a time, to be arranged in order of merit, as was illustrated in the case of samples 1, 2, and 3, pages 241 and 242.
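The analysis of Figure 40 amounts to a lookup from a Hillegas value to a phase and a ten-point band. As a minimal sketch, assuming the bands exactly as printed in the figure (the names `HILLEGAS_BANDS` and `describe` are illustrative and not part of the survey), the lookup could be written as follows. It only classifies a value already assigned by a judge; it does not replace the judgment itself.

```python
# Ten-point bands of Figure 40: (lowest value of band, phase, description).
HILLEGAS_BANDS = [
    (0, "A. Mechanics", "Meaning uncertain after study."),
    (10, "A. Mechanics", "Meaning decipherable but with difficulty."),
    (20, "A. Mechanics", "Meaning not apparent on first reading."),
    (30, "B. Organization", "Mere succession of sentences loosely joined."),
    (40, "B. Organization", "Disconnected sentences with much irrelevant matter."),
    (50, "B. Organization", "Connected sentences: few mistakes, uninteresting material."),
    (60, "B. Organization", "Well organized, but commonplace in content."),
    (70, "C. Literary Merit", "Interesting material marred by imperfect choice of words."),
    (80, "C. Literary Merit", "Well selected material expressed in well chosen words."),
    (90, "C. Literary Merit", "Exceptional content and quality."),
]


def describe(score):
    """Return (band, phase, description) for a Hillegas value from 0 up to 99."""
    if not 0 <= score < 100:
        raise ValueError("score outside the range covered by Figure 40")
    low, phase, description = HILLEGAS_BANDS[int(score // 10)]
    return f"{low}-{low + 9}", phase, description


if __name__ == "__main__":
    # Values reported for Samples 1, 2, and 3 (35, 75, and 48).
    for sample, value in (("Sample 1", 35), ("Sample 2", 75), ("Sample 3", 48)):
        print(sample, value, describe(value))
```

On this reading Samples 1 and 3 both fall in the organization phase, in adjacent bands, while Sample 2 falls in the phase of literary merit.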
The differences between the first samples given out for comparison were made very large in order that the judgments of the scorers might agree. This established the fact that the judges were able to recognize gross differences in merit. Little by little, however, the amount of difference from sample to sample was decreased until the limits of discrimination of the judges were reached. A discussion of the reasons for the relative positions assigned the samples by each judge, and of the characteristics of the samples themselves, then followed until some agreement had been reached as to the basis on which judgment was to be made. Finally, a certain amount of practice scoring was done on samples whose values are known, using those given in the Butte and other surveys, and in the articles on measurement of English composition which have been published from time to time.¹ In this way the basis of making judgments was soon standardized.

¹At this time the standard samples issued by Professor Thorndike were not available.

SCORING OF PAPERS

When it had been determined that the judges could use the scale consistently, the scoring of the Gary compositions began. The eighth grade papers were scored by all the judges, but two judges did most of the scoring of the papers of other grades, each class being scored by a single judge. At intervals, however, these judges re-scored certain classes to make sure their standards were not changing. Also, as early as possible a set of compositions (mainly those given in V, Appendix A) was chosen from the Gary papers to form the Gary scale, and in cases of doubt papers were referred to both the Gary Scale and the Hillegas Scale. Most of the judges, however, preferred to use the Hillegas Scale rather than the scale of uniform material derived from it.

Each paper was assigned the mark which expressed the judgment of the scorer as to its true value, as 37, 39, 43, etc.
In this respect the practice at Gary differed from that sometimes followed of giving each paper the value of the scale sample it most resembled. Thus in the Butte survey, compositions were marked at "0, 1, 2, etc., according to which one of the printed compositions they thought it most like." A similar practice was followed in the Salt Lake City survey.

RELIABILITY OF SCORING

A study was made of the variations in judgments in scoring the eighth grade samples. For instance, in one

[Tabular matter from pages 248-249, apparently Table LX, is not recoverable.]

Figure 41
Variations in Quality, Based on Table LX

[Chart: English Composition, Variation in Scoring. Quality (Hillegas) and deviations from median, shown for judges A to E by class.]
[Figure and table residue not recoverable.]

In the Willing Composition Scale used in the Denver survey, not only is judgment based on general merit, but the attempt is made to grade the samples on the basis of the frequency of error as well; thus, compositions of quality 20 have, on the average, 30 mistakes in spelling, punctuation, and syntax per 100 words written. By quality 50 the number of mistakes has fallen to 14 and by quality 70 to 8. Unfortunately, however, no statement is made as to the type of mistakes which were counted as errors.

For the Gary work, after a careful study of representative papers, it was decided to limit the scoring to gross errors. For capitalization three mistakes only were counted:

1. Failure to begin a sentence with a capital letter.
2. Failure to capitalize a proper noun.
3. The capitalization of common nouns.

In punctuation, also, only three errors were counted:

1. Failure to place a period at the end of a sentence.
2. Failure to place a question mark at the end of a question.
3. Failure to enclose a direct quotation in quotation marks.

In some cases it was found that where the period had been omitted at the end of a sentence, the following sentence was not commenced with a capital. This was not counted as two errors, but recorded as the "period" error only.

Eight types of errors in syntax were recorded:

1. The use of the wrong case form, as "me and him went."
2.
The use of one word in place of another, as "it would of (have) been." 3. Lack of agreement of noun and pronoun, as "the pieces were about the size— and it break," 4. Lack of agreement between subject and verb, as "they was." 5. Use of the wrong tense form, as "seen" for "saw." 6. Use of the double negative. 7. Confusion of dependent and independent clauses, as ". . . away, but worst thing was that there were not light on the streets and no road but a Httle path through the wood but I dressed up and took my dog and started off we were not far from home when my dog his name was Rover began to chase after I was a fright to go myself and began. . . ." 8. Omission of words essential to the thought, or the addition of irrelevant words, as "began to chase after (a rabbit, omitted) I was a fright to go myself." A few gross errors were recorded which do not fall under any of the headings given above, but their number was so small (8 per cent, of the total) that it has not been thought necessary to list them in detail. Each paper was scored by two examiners independ- ently. There were marked disagreements between their 26o THE GARY SCHOOLS TABLE LXII COEFFICTENTS OF CORRESPONDENCE BETWEEN QUALITY AND A NuMBER OF Specific Characteristics^ Quality Total Length DifEerent Words Vocabulary Index Spelling Errors (Coefi&cient) Errors in Grammar (Coefficient) .... 
Errors in Spelling and Grammar (Total per paper) MEDIAN 43.5 209 106 22.5 22.5 10.5 MEDIAN DEVIATION 4 54 24.5 2.8 12.5 15.2 6.5 TOTAL R.4NGE 36- 67 107^26 65-161 35- 62 0- 88 0-129 0-43 Percentage of Total Cases Which Do Not Vary in Relative Position More Than One Unit of Variability, Where Quality of Composition Is Compared with total NUMBER OF WORDS NUMBER OF DIFFERENT WORDS VOCABU- LARY INDEX COEFFICIENT OF ERRORS IN SPELLING COEFFICIENT OF ERRORS IN GRAMMAR TOTAL MISTAKES PER PAPER 33 33 33 16* —38 —30* —52 —44* —48 —36* The coefficient of correspondence between errors in spelling and errors in grammar was 48 per cent. (+ .60)*. 'Based on papers of forty two eighth grade children present for all tests. 'Pearson coefficient of correlation. This table is to be read as follows: If each child's position in the class distribution for total length of composition be compared with his position in the class distribution for quality of composition, when both positions are expressed in terms of the median deviation of the class as a whole^ 33 per cent, of the children will be found to have maintained the same relative position within one unit of variability. ENGLISH COMPOSITION 261 FiGTJEE 42 Degree of CoiiRESPO>rDENCE Between Errors nsr Spelling and Errors in Grammar "^•fWuHiy tLaiio Cofrel.tion CorT..po„,i.„c.of ACTUAL MISSPELLINGS to ACTUAL ERRORS !N GRAMMAR C.T.4. V„. 2.5 C.T.5.5 Vor. 4.0 > I ■" > it :=^ V N«..Wl 3 5 7 9 II 13 15 ir 19 21 23 25 27 29 31 33 J5 37 '„' 41 Tot«l Numbtr of Cnt»AZ.^amUt wlihl- I-, ,l!„i.. Pq_ Fnctnlt^ af CotM«pBndw.M__iiflg. The numbers along the base of the figure represent the 42 individuals of an eighth grade group. The scale along the left hand axis represents units of variability (median deviation above and below the median of the class). The solid line represents variability ratios for errors in spelling. The broken line represents similar ratios for errors in grammar. 
The curves show that approximately 50 per cent, of the children maintain the salne position in the two distributions within one unit of variability. That is, individual No. i makes very few errors in either spelling or grammar. Individual No. 3, however, while at the top of the class in accuracy of spelling, is below the median for number of errors in grammar. Individuals Nos. 12,25, and 39 represent extreme deviations. Individual No. 37 represents a variant of the opposite type. In other words, he makes many errors in spelling, but is exactly at the median for errors in grammar. The reader should note that both distributions are badly skewed; the range of variation above the median being a little over i and below the median 8.8. 262 THE GARY SCHOOLS scores. The attempt was made to harmonize the various judgments, but it proved so costly in time, and the final decision seemed to rest upon such uncertain bases, that in view of the lack of comparative data it was decided the results were not worth the time and effort. Accordingly, the averages of the errors recorded by the two exam- iners were taken as the scores for each individual.^ FACTORS DETERMINING MERIT The tabulation of the different types of errors affords a chance to investigate the relation between general merit in compvosition and its various characteristics — spelling, punctuation, and grammar. It would appear that judg- ments as to quahty of composition are affected more by errors in spelHng (coefficient of correspondence 38 per cent.) and still more by errors in capitaHzation, punctua- tion, and grammar (52 per cent.) than by the number of different words used {;^;^ per cent.) (Table LXII, page 260) . The coefficient of correspondence based upon total mis- takes is intermediate between those for spelling and gram- mar (48 per cent.) (Figure 42, page 261). These coeffi- cients mean that in judging compositions one is influ- enced now by one factor and now by another. 
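The reading of Table LXII can be illustrated with a short computation. The sketch below is a minimal illustration and not the procedure actually used in the survey. It assumes that the "median deviation" is the median of the absolute deviations from the class median, that a child's position is his deviation from the median divided by that unit, and that two positions correspond when they differ by no more than one unit. The function names and the sample scores are hypothetical, and the sign convention behind the negative coefficients in the table is not reproduced.

```python
from statistics import median


def variability_ratios(scores):
    """Express each score as its deviation from the class median,
    measured in units of the class median deviation (assumed definition)."""
    mid = median(scores)
    unit = median(abs(s - mid) for s in scores)
    if unit == 0:
        raise ValueError("median deviation is zero; ratios are undefined")
    return [(s - mid) / unit for s in scores]


def coefficient_of_correspondence(first, second, tolerance=1.0):
    """Percentage of children whose positions in the two distributions
    differ by no more than `tolerance` units of variability."""
    if len(first) != len(second):
        raise ValueError("both lists must describe the same children")
    ratios_a = variability_ratios(first)
    ratios_b = variability_ratios(second)
    agreeing = sum(
        1 for a, b in zip(ratios_a, ratios_b) if abs(a - b) <= tolerance
    )
    return 100.0 * agreeing / len(first)


# Hypothetical scores for six children (not taken from the survey).
quality = [36, 40, 43, 44, 50, 67]
length = [107, 150, 209, 210, 300, 420]
print(round(coefficient_of_correspondence(quality, length), 1))
```

Because the ratios are computed from medians rather than means, one extreme paper shifts its own position without disturbing the unit of variability for the rest of the class.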
^1 A complete record of the scoring for Class No. 45 Froebel will be found in Table LXI, page 256.

VII. READING

§1. General Results

The Gary schools recognize the importance of reading by allotting to the subject annually 1,323 hours, as compared with 1,280 hours in the conventional school, or 26 per cent. of the time given to fundamentals, as compared with 24 per cent. in conventional schools.

READING ABILITY

In current practice the direct teaching of the mechanics of reading rarely extends beyond the third grade. Reading in the higher grades passes over into training in expression, in understanding, and in appreciation. Consequently ability in reading comes to have many meanings, each derived from the situation to which the term is applied. Ability in reading may mean:

(1) Ability to recognize silently the general meaning of words of a given range of difficulty. (Otis.)
(2) Ability to "sound" correctly a given set of words. (Jones.)
(3) Ability to read aloud smoothly and with proper expression (without regard to whether the meaning is understood or not). (Gray.)
(4) Ability to read either silently or orally and to understand the essential relations existing between the essential elements of what is read. (Courtis.)
(5) Ability to read either silently or orally and tell in one's own words the substance of what has been read. (Starch, Brown, Gray.)
(6) Ability to read instructions either silently or orally and be able to act in accordance with the instructions. (Kelly.)
(7) Ability to read again and again (study) until one has mastered the contents of a passage so that one can answer questions about it or use the information in solving problems. (Thorndike.)
(8) Ability to read a passage and interpret the allusions which it contains.
(9) Ability to read a selection and be stirred emotionally by its aesthetic elements.
(10) Ability to read a passage and interpret the mood, ideas, or ideals of the author.
(11) Ability to read a selection and make judgments as to its style and merits as a piece of "good English."

And there are doubtless many other possible variations of the senses in which "ability to read" may be understood.

Unfortunately, the makers of tests have not given much attention to this phase of the subject. They have labeled their productions "Tests of Reading," and have been content to say explicitly^1 or to imply that by reading ability is meant the ability to complete this test successfully. But when several such tests are given to the same children, as at Gary, one would seem to have the right to expect that the reading abilities of the children as revealed by one test will show some agreement with the reading abilities revealed by the next test, since both are tests of reading. This, however, is precisely what the results in general do not show.^2

The tests given at Gary simply reveal the character of the response made by the children to a number of specific situations in which ability in reading enters as one element. To aid the reader in appraising the value of the results and in deciding to what extent each test measures mainly reading or mainly some other abilities, some description of the tests used is necessary.

^1 Teachers College Record, January 1916, p. 40: An Improved Scale for Measuring Ability in Reading, Thorndike. "Call difficulty for paragraph reading a characteristic of a paragraph and question about the paragraph much of which produces a large percentage of wrong responses to the questions, and little of which produces few, the individuals concerned being the same.
"Call achievement in paragraph reading that thing much of which enables an individual to respond correctly to a paragraph and a question about the paragraph involving much difficulty for paragraph reading whereas an individual of less achievement could respond correctly only to a paragraph and a question about the paragraph of less difficulty."
^2 See Richards and Davidson, School and Society, Vol. IV, September 2, 1916.

MEASUREMENT OF ORAL READING

In oral reading, the measurable elements are the rate and accuracy of reading, and the quality of expression. Gray's Oral Reading Scale was used to measure the first two of these elements. This scale consists of a number of paragraphs each a little more difficult in content, vocabulary, and structure than the one before it. It is essentially a difficulty test; that is, each child begins with simple material well within his range of ability and progresses through the scale until he reaches material which is so difficult that he fails. The time required to read each paragraph and the mistakes made in reading them are noted. It is thus possible to report the results of training in objective terms.

The median ability of eighth grade Gary children in oral reading when thus measured may be inferred from Figure 43, page 267. Half of the eighth grade children are able to read satisfactorily the sample paragraph there shown. (Gray's Standard 4.)^3

The range of ability in the eighth grade may be comprehended from the samples shown in Figure 44. Approximately 10 per cent. of the children are able to read paragraph B under the standard conditions, while about 10 per cent. are not able to read paragraph C under the standard conditions. That is, the abilities of about 80 per cent. of the eighth grade children fall between these limits.

The abilities of the Gary children expressed in terms of the Gray Scale are as follows: The second grade child of median ability is just not able to read paragraph 1 under Gray's Standard 4, the median third grade child can read paragraph 2, but not paragraph 3, while the median eighth grade child can read paragraph 8, but not paragraph 9. If successful reading is taken as reading a paragraph in less than 40 seconds with not more than six errors (Gray's Standard 1), the median second grade child can read paragraph 3 but not 4, while the median eighth grade child can read paragraph 11 but not paragraph 12.

^3 A paragraph is accepted under Gray's Standard 4 if read without more than one mistake, or if read in less than 20 seconds without more than two mistakes.

Figure 43
Difficulty of Material Represented by Eighth Grade Score

SAMPLE A (paragraph 8)

The crown and glory of a useful life is character. It is the noblest possession of man. It forms a rank in itself, an estate in the general good will, dignifying every station and exalting every position in society. It exercises a greater power than wealth, and is a valuable means of securing honor.

Half of the eighth grade children are able to read orally the paragraph above in from 20 to 40 seconds without making more than one mistake, or to read it in less than 20 seconds without making more than two mistakes. (Gray's Standard 4.) The sample is paragraph 8 on Gray's Scale.

Figure 44
Variations in Eighth Grade Ability

SAMPLE B (paragraph 10)

Responding to the impulse of habit Josephus spoke as of old. The others listened attentively but in grim and contemptuous silence. He spoke at length, continuously, persistently, and ingratiatingly. Finally exhausted through loss of strength he hesitated. As always happens in such exigencies he was lost.

SAMPLE C (paragraph 3)

Once there were a cat and a mouse. They lived in the same house. The cat bit off the mouse's tail. "Pray, puss," said the mouse, "give me my long tail again." "No," said the cat, "I will not give you your tail till you bring me some milk."
Ten per cent. of the eighth grade children are able to read Sample B satisfactorily (under Gray's Standard 4),^1 and 10 per cent. are unable to read satisfactorily Sample C. That is, the ability of 80 per cent. of the eighth grade children ranges between ability to read paragraph B and paragraph C. Sample B is paragraph No. 10 on Gray's Scale. Sample C is paragraph No. 3 on Gray's Scale. Sample C represents the median development of children in the upper half of the third grade at Gary. (Table LXIII, page 270.)

That is, the development of ability in oral reading at Gary is quite uniform from grade to grade. (Figure 45, page 271.)

COMPARATIVE DATA^2

For many who are not directly engaged in teaching, the tables and illustrations given on pages 267 and 268 will have little meaning. Gray's Scale, however, affords a score in points based upon the difficulty of the paragraphs, the time taken to read them, and the number of errors made. In points the Gary second grade score was 27 (Cleveland 42, Grand Rapids 44, St. Louis 47, average of 23 Illinois cities 20) and the Gary eighth grade score 41 (Cleveland 48, Grand Rapids 48, St. Louis 51). (Table LXIV, page 272.) The development of ability in oral reading at Gary closely parallels the development of other cities, but at a level which is about a year below the average of other cities. (Figure 46, page 273.)

^1 To meet Gray's Standard 4, a paragraph must be read without making more than one mistake, or read in less than 20 seconds without making more than two mistakes.
^2 See page 38.

The differences in the performances of the Gary children as compared with those recorded for the average children in conventional schools are brought out by an analysis of the results. Thus, to read paragraph 4 the Gary eighth grade children required 18.7 seconds (Cleveland 16.62, St. Louis 17.85) and made 2.1 mistakes (Cleveland 1.24, St. Louis .73).
That is, the Gary children read more slowly and make more mistakes than the children in Cleveland and St. Louis.

Silent reading was tested at Gary by a Reading and Reproduction Test, by the Kansas Silent Reading Test, and by the Trabue Language Scale, but it should be recognized at the outset that, because of the difficulty of the problem and the lack of adequate tests, the measurements of silent reading at Gary are less conclusive than are the measurements previously discussed.

Silent reading is carried on primarily for the reader's benefit. Its most important aspect is the degree of comprehension of meaning, its second, the rate of reading.

TABLE LXIII
Median Paragraphs of Gray's Scale Read by the Different Grades

GRADE    STANDARD 1    STANDARD 4    DIFFERENCE
  2          3.3            .9           2.4
  3          5.4           2.1           3.3
  4          6.8           3.9           2.9
  5          7.6           4.4           3.2
  6          9.0           5.4           3.6
  7         10.7           6.8           3.9
  8         11.3           8.1           3.2

Standard 1. A paragraph is accepted under Gray's Standard 1 if read in forty or more seconds without more than four mistakes, or if read in less than forty seconds without more than six mistakes.
Standard 4. A paragraph is accepted under Gray's Standard 4 if read without more than one mistake, or if read in less than twenty seconds without more than two mistakes.

Figure 45
Development in Oral Reading Ability: Gray's Oral Reading Scale

[Figure: Development of ability in oral reading; vertical axis, paragraph; horizontal axis, grade; curves for Standard 1 and Standard 4.]

The scale along the base of the figure represents grades. The scale along the vertical axis represents the paragraphs of Gray's Scale. The solid line shows the median ability of each class under Standard 1 in terms of Gray's Scale. Broken line shows the median ability of each class in terms of Standard 4. The two curves show there is a steady and quite uniform development throughout the grades.
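Gray's two acceptance standards, as worded in the notes to Table LXIII, amount to simple threshold rules on time and mistakes. The sketch below is illustrative only: the function names and the sample readings are hypothetical, and counting paragraphs up to the first failure is an assumption drawn from the description of the scale as a difficulty test. It is not Gray's method of scoring in points.

```python
def meets_standard_1(seconds, mistakes):
    """Standard 1: forty or more seconds with at most four mistakes,
    or less than forty seconds with at most six mistakes."""
    if seconds < 40:
        return mistakes <= 6
    return mistakes <= 4


def meets_standard_4(seconds, mistakes):
    """Standard 4: at most one mistake at any speed,
    or less than twenty seconds with at most two mistakes."""
    if seconds < 20:
        return mistakes <= 2
    return mistakes <= 1


def paragraphs_read(readings, meets):
    """Count the paragraphs accepted before the first failure.

    `readings` holds one (seconds, mistakes) pair per paragraph, in the
    order of the scale. Stopping at the first failure is an assumption.
    """
    accepted = 0
    for seconds, mistakes in readings:
        if not meets(seconds, mistakes):
            break
        accepted += 1
    return accepted


# Hypothetical record of one child on the first four paragraphs.
record = [(10, 0), (15, 2), (25, 2), (18, 1)]
print(paragraphs_read(record, meets_standard_1))
print(paragraphs_read(record, meets_standard_4))
```

The same record passes more paragraphs under Standard 1 than under Standard 4, which is the direction of the "difference" column in Table LXIII.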
Unfortunately, however, silent reading, pure and simple, is limited to the perception and comprehension of the matter read, while any test of comprehension is necessarily based upon the response an individual makes to the test situation. This brings into play new factors. Accurate measurement of ability in silent reading, therefore, is exceedingly difficult, and at the time of the Gary survey there were no wholly satisfactory tests of silent reading. What is probably the best silent reading test of all, Thorndike's Scale Alpha 2, it was not practical to use owing to the conditions under which the survey was conducted. However, several of the conventional reading tests were given and while the results do not tell the whole story, they tell enough to show some of the important characteristics of the product.

TABLE LXIV
Ability in Oral Reading: Gray's Oral Reading Scale

Scores by points according to Gray's methods. From each class at Gary at least ten children were measured (thirty children or more from classes in grades three, five and seven). Selections were made on basis of teacher's judgment, the three best readers, four average readers, and the three worst readers being chosen.

GRADE                       2     3     4     5     6     7     8
Number of children        102   346   126   297   134   219    52
Gary, Actual Average       27    36    39    39    41    42    41
23 Illinois Cities^1       20    27    40    44    45    47
Cleveland^2                42    46    47    48    49    47    48
Grand Rapids^3             44    47    49    50    48    48    48
St. Louis^4                47    50    52    51    51    51    51

^1 Studies of Elementary School Reading through Standardized Tests, Gray. Page 130.
^2 Studies of Elementary School Reading through Standardized Tests, Gray. Page 131.
^3 Grand Rapids Survey. Page 66.
^4 Survey of St. Louis Public Schools, Vol. II, p. 1[?]6.

This table is to be read as follows: In Gary 102 second grade children, selected as representative children from the various classes tested, made an average score of 27 points when measured with Gray's Scale according to Gray's directions. Twenty-three cities in Illinois made an average score of 20 in the second grade; Cleveland, 42; Grand Rapids, 44; and St. Louis, 47.

Figure 46
City Wide Average Scores by Grades: Gray's Oral Reading Scale

[Figure: Oral reading, Gray; curves for Grand Rapids and Gary; horizontal axis, grades.]

The scale along the base of the figure represents grades. The scale along the vertical axis represents scores in terms of Gray's Scale. For methods of computing these scores and of drawing the graph, see §2. The solid line represents the Gary results; dotted line the scores made by the Grand Rapids pupils. The average difference between the Gary and Grand Rapids results is 7.6 points, while the average annual growth is 5 points. The Gary scores are thus approximately a year and a half lower than those from Grand Rapids.

REPRODUCTION TESTS

The simplest test of silent reading would seem to be the measurement of the rate at which a story is read, and the quality of a reproduction of the story. This method has been followed by Gray, Starch, Brown, and other investigators. However, a little reflection will show that reproduction is determined more largely by (1) memory and (2) ability in English composition than by ability to read and understand.^1 Hence, reading and reproduction tests were given at Gary to determine (1) the median rate of reading for the different grades, and (2) the stability of the reading habits of the Gary children as measured by individual fluctuations in rate in successive tests.^2

The test materials were interesting stories taken from children's magazines. A very simple story was used in grades two to five, a little more complex story in grades five to eight, and a portion of an adult biography in grades eight to twelve. Finally a fourth story (child's) was given to grades four to twelve under uniform conditions. Thus all grades from four to twelve were measured at least twice, and some three times.
From the repeated measurements, tabulations were made, both of the rate of reading and of the amount of individual variation in successive tests.

^1 For a critical discussion of what reproduction tests measure and their relation to silent reading, see §2 of this chapter, page 314, and IX of Appendix A, page 443.
^2 The tests were also scored for quality of reproduction according to the conventional plan, but as the writer does not regard the results as significant except as a confirmation of the conclusions of the chapter on English composition, they have been put in IX of Appendix A.

The median rates at which the eighth grade children read the children's stories were 201 and 207 words per minute respectively. The median rate for the more difficult story was 170 words per minute. A rate of 204 words per minute (the average of 201 and 207) was chosen as best representing the ordinary reading rate of eighth grade children on material suitable for their grade. The corresponding eighth grade rate of oral reading was 200 words per minute (based on paragraphs 2 and 3 of Gray's Oral Reading Scale). That is, the rates of oral and silent reading were nearly the same. (Table LXV, page 276.)

A comparison of the rates of oral and silent reading by grades shows that the rate of oral reading is at first greater than that for silent reading (second grade rate for oral reading 78 words per minute, rate for silent reading 54) but from the sixth grade on it is the rate of silent reading that is the greater (sixth grade oral 183, silent 185). The development of ability in silent reading is rapid in the lower grades (difference between oral and silent reading for grade two was 20 words per minute; grade six, 2 words per minute) but from the sixth grade on the two rates differ but little. (Figure 47, page 278.)
The curve for the development of rate of silent reading is of the conventional type and shows the usual well marked tendency to reach a maximum at the eighth [...]

[Table LXV (page 276) and the pages following it, through page 280: contents not recoverable.]

[...] first two paragraphs of the test for the third, fourth, and fifth grades (Test I), two paragraphs of a different sort (numbers six and eight) from the test for the sixth, seventh, and eighth grades (Test II), and two paragraphs of still a different kind from the test for the ninth, tenth, eleventh, and twelfth grades (Test III) are shown in Figure 48, page 283. From these it will be seen that the tests cover a very wide range of reading material and very different reading situations, yet in every case the response is simple and is easily judged to be right or wrong.

It will be apparent upon inspection that while reading enters as one element of the activities called for by the test, the other activities of observing, analyzing, judging, reasoning, etc., form such a large part of the total activity that the tests may legitimately be considered to be measures of general intelligence rather than measures of mechanical skill in reading.
In the judgment of the author of this report, the tests afford a valuable index of the degree of development attained in the ability "to read and think about what is read."

At Gary Test I was given to all grades from the third to the twelfth, but the time was reduced to three minutes in all grades above the eighth, and the scores that would have been made in five minutes, the standard time allowance, were computed by multiplying the actual scores by 5/3. Test II^1 was given in grades four to nine and Test III from grades eight to twelve. The conventional scores were found according to the standard instructions, and for comparative purposes only those results were taken which were derived from tests given to the grades and under the conditions provided in the standard directions. For the discussion of the conditions as they appear at Gary, however, the results from Test I (for all three tests in the eighth grade) were tabulated for rate of work (total number of points attempted) and for accuracy (ratio of points right to points attempted, expressed as rate per cent.).^2

^1 Scores in Test II have proved to be slightly lower than scores in Tests I and III.

TABLE LXVII
Comparative Data for Rates of Silent Reading

                                     SALT LAKE       GRAY^4                    COURTIS^6
GRADE    GARY^1    STARCH^2    CITY^3      ACTUAL    ADJUSTED    BROWN^5    NORMAL READING    CAREFUL READING
  2        54        108                     90         90
  3       109        126                    138        138         199
  4       140        144                    132        180         213          161               106
  5       166        168        189         152        204         269          180               133
  6       185        192        212         167        216         272          226               172
  7       198        216        219         161        228         279          256               178
  8       204        240        209         172        234         290          262               200
  9       235
 10       249
 11       262
 12       270

^1 On the basis of uniform material.
^2 Educational Measurements, p. 32. Material increases in difficulty from lower to higher grades.
^3 Report of Salt Lake City Survey, p. 160. Uniform material.
^4 Scale from diagram 17, Studies of Elementary School Reading through Standardized Tests, Gray. Adjusted rates are on basis of uniform material.
^5 Bulletin No. 1, Bureau of Research, Department of Public Instruction, New Hampshire, p. 57. Variable material.
^6 Fourteenth Yearbook of the National Society for the Study of Education, p. 50. Uniform material.

Rates in different cities are based on different materials, hence the results are of value only for general comparisons.

This table is to be read as follows: In Gary, the fifth grade children read silently a simple story at the rate of 166 words per minute. According to Starch the average rate at which fifth grade children should read stories silently is 168 words per minute. In Salt Lake City the average rate for the fifth grade was 189 words per minute. According to Gray the rate is 152 words per minute on difficult material or 204 words per minute if adjustment is made for difficulty. (Based on scores of 13 cities of Iowa, Minnesota, Tennessee, and Illinois.) His method of computing rate of silent reading is considered later. According to Brown, the best average made by any fifth grade class so far measured by him is 269 words per minute. According to Courtis, the average rate of normal reading is 180 words per minute; of careful reading 133 words per minute.

Figure 48
Sample Paragraphs from the Kansas Silent Reading Tests

PARAGRAPH A FROM TEST I (Value 1.2)
I have red, green and yellow papers in my hand. If I place the red and green papers on the chair, which color do I still have in my hand?

PARAGRAPH B FROM TEST I (Value 1.2)
Think of the thickness of the peelings of apples and oranges. Put a line around the name of the fruit having the thinner peeling. Apples Oranges.

PARAGRAPH C FROM TEST II (Value 2.3)
In going to school, James has to pass John's house, but does not pass Frank's. If Harry goes to school with James whose house will Harry pass, John's or Frank's?

PARAGRAPH D FROM TEST II (Value 2.6)
Here are two squares. Draw a line from the upper left-hand corner of the small square to the lower right-hand corner of the large square.
[Drawing of the two squares.]

PARAGRAPH E FROM TEST III (Value 4.3)
Bone is composed of animal matter and mineral matter. The former gives it toughness and the latter rigidity. Yesterday I placed a bone from a chicken's leg in a bottle of acid, and found this morning that I could wrap the bone around my finger like gristle. Which kind of matter was removed from the bone?

PARAGRAPH F FROM TEST III (Value 4.8)
There are three horizontal lines; the first is three inches in length, the second two inches, the third one inch. We know that if the second and third lines are joined end to end the resulting line will be as long as the first line. Suppose that the first and second lines are joined end to end. How many times as long as the third line will the resulting line be?

Paragraphs A and B from Test I (for grades three, four, and five). Paragraphs C and D from Test II (for grades five, six, seven, and eight). Paragraphs E and F from Test III (for high school grades). These paragraphs were selected as representative of the different types of activities called for in the test. It should be noted that while the reading activity is one element determining a correct response, it is but one. The Kansas Tests probably measure the ability to read and understand, and to think or reason about what is read.

The eighth grade at Gary attempted 23.9 points in Test I with an accuracy of 83 per cent. (Table LXVIII, page 287). Both the rate and the accuracy of work increase quite regularly throughout the grades (third grade 12.3 points attempted, 28 per cent. accuracy) showing that the development of ability in rate is of the conventional type (Figure 49, page 288). The ninth grade scores indicate a very great increase over those of the eighth grade (23.9 to 34.0 points attempted and 83 to 88 per cent. accuracy), but this may be due to difference in conditions under which the tests were given.
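The two quantities tabulated for the Kansas tests, and the adjustment of three-minute scores to the five-minute standard, reduce to short arithmetic. The sketch below is illustrative only and its function names are hypothetical. The proportional scaling is an assumption in the sense that it is inferred from the paired three-minute and five-minute values for the ninth grade in Table LXVIII (20.4 and 34.0 points attempted, 14.2 and 23.7 points right).

```python
def accuracy_per_cent(points_right, points_attempted):
    """Accuracy as the ratio of points right to points attempted,
    expressed as a rate per cent."""
    if points_attempted <= 0:
        raise ValueError("points attempted must be positive")
    return 100.0 * points_right / points_attempted


def project_to_standard_time(score, minutes_allowed=3.0, standard_minutes=5.0):
    """Scale a score made in a shortened test to the standard allowance,
    assuming work continues at the same rate (proportional scaling)."""
    if minutes_allowed <= 0:
        raise ValueError("minutes allowed must be positive")
    return score * standard_minutes / minutes_allowed


# Ninth grade, Test I, as reported in Table LXVIII.
attempted_in_three = 20.4
right_in_three = 14.2
print(round(project_to_standard_time(attempted_in_three), 1))
print(round(project_to_standard_time(right_in_three), 1))
```

Proportional scaling assumes that a pupil would keep the same pace for the remaining two minutes, although the paragraphs of the test grow harder as it proceeds; this may be part of what the text means by a difference in conditions.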
¹The reader should note that this method of tabulation is a departure from conventional methods. See also page 186.

The reader should note that the level of accuracy reached by the eighth grade is 83 per cent. That is, half the children in the eighth grade are able to read the simple paragraphs designed for the measurement of the ability of the third grade children only well enough to give correct answers to a little more than eight out of ten questions. This will furnish the reader with a basis of estimating directly the degree of ability in reading of the Gary eighth grade children, and of interpreting the results given in the table.² In terms of the conventional score for the Kansas Test, the ability of the eighth grade is about 21 points (Test I, 21.3 points; Test II, 18.7 points; Test III, 22.1 points). (Figure 50, page 289.)

²The reader should note that this statement stands by itself and has no significance from the comparative point of view. In the absence of comparative data, it is impossible to say whether this performance is better or worse than that of children in conventional schools.

TABLE LXVIII
City Wide Median Scores, Kansas Reading Tests

         GARY TEST I                     GARY TEST I      GARY      GARY
GRADE    RATE            ACCURACY†       SCORE            TEST II   TEST III
  3      12.3            28.2             2.5
  4      14.9            50.1             6.7              5.6
  5      18.1            59.3             9.8              9.9
  6      20.5            68.5            13.8             11.9       -
  7      28.8            75.2            18.3             15.3       -
  8      23.9            83.             21.3             18.7      22.1
  9      20.4  34.0*     87.8            14.2  23.7*      22.2      24.5
 10      20.1  33.5*     92.7            16.7  27.8*       -        29.2
 11      20.2  33.7*     97.7            17.7  29.5*       -        33.5
 12      18.5  30.8*     95.7            16.8  28.0*       -        29.6

*Smaller values, actual scores made in three minutes; larger values, scores computed on basis of five minutes, the standard time allowance.
†Disagreement due to difference in method of tabulation.

This table is to be read as follows: The third grade children in the Kansas Silent Reading Tests attempted 12.3 points in the time allowed, of which 28.2 per cent were correct. The score of this grade computed in the manner provided by Kelly is 2.5 points. The score of the ninth grade in Test I was 20.4 points attempted in three minutes, from which the amount that would have been done in five minutes, the time allowed the lower grades, was computed to be 34.0 points. The accuracy of the ninth grade was 87.8 per cent. The score of the ninth grade as determined in the manner provided in the instructions was 14.2 points right in three minutes, or 23.7 points right in five minutes. In Test II it was 22.2 points right in five minutes, and in Test III it was 24.5 points right in five minutes. The three scores 23.7, 22.2, and 24.5 show how correctly the three tests have been weighted to give scores of equal value.

Figure 49
Development in Rate and Accuracy, Kansas Silent Reading Tests

[Graph: Kansas Reading Tests, relation between rate and accuracy. Horizontal axis, rate; vertical axis, accuracy.]

The scale along the base of the figure shows the number of points tried. The scale along the vertical axis shows the per cent the points right are of the points tried. The position of the various grade median scores is shown by the figures on the curve. Ability to read the simple material in the test devised for the third grade develops quite regularly through the seventh grade. The eighth grade has a score which is greater in accuracy only. The high school grades are very much higher in rate and somewhat higher in accuracy. The break between the eighth and ninth grades is probably due in part to the selective action of promotion to high school upon ability and in part to the fact that in these grades it was necessary to stop the tests at the end of three minutes and compute the score that would have been made in five minutes. The curve for the development of reading ability in Gary is of quite the conventional type.

Figure 50
Development in Accuracy, Kansas Silent Reading Tests (Conventional Scores)

[Graph: scores in Tests 1, 2, and 3 plotted against grades.]

The scale along the base of the figure represents grades. The vertical scale represents the score in Kansas Silent Reading Tests as determined from the right answers according to the standard instructions. The solid line represents Gary; the dotted line, scores made by about 9,000 children in 19 cities in Kansas. The Gary results are lower than the Kansas results in grades three, four, and five, and equal to the Kansas results in grades six, seven, eight, and nine, and are very much above the Kansas results in other high school years.

COMPARATIVE DATA³

The scores of the Gary eighth grade classes are practically the same as the corresponding scores made by classes in other cities (Gary 18.7, Kansas 20.1, Iowa 20.6, Detroit 19.0, New Orleans 19.1, combined tabulations from all parts of the country 19.2). The scores of the third grade at Gary are very much below those made by other cities (Gary 2.5, median of country 5.3) (Table LXIX, page 290). That is, the abilities measured by the Kansas Reading Tests begin to develop later at Gary than in the conventional schools, but by the seventh and eighth grades the handicap of the late start has been overcome. The scores in high school years at Gary are equal to or much greater than the corresponding scores in other cities, although the data for this statement are less complete than for the comparisons in the other grades (twelfth grade 29.6, median of country 29.7). As measured by the Kansas Reading Tests, the eighth grade product of the instruction in reading in the grades at Gary is equal to that of the conventional school.

³See page 38.

[Table LXIX, page 290: not legible in this copy.]

TRABUE LANGUAGE SCALES

The Trabue Language Scales consist of ten sentences differing in complexity, in which one or more words are missing. Children are asked to supply the missing words. Three representative sentences are shown in Figure 51, page 292. From these it may be seen that the Trabue Test measures a complex of many abilities, of which reading is but one.

The scales are issued in the form of four tests, known respectively as B, C, D, and E. A child is given as much time as he needs and his score does not show the amount of work done in this time, but the difficulty of the most difficult sentence that he is able to complete. Thus, except for certain minor irregularities in scoring imperfect answers, a child who is able to complete correctly sentence 2 in Figure 51 would be given a score of 10, while a child who was able to complete correctly sentence 3 in Figure 51 would be given a score of 20. Sentence 3 is twice as hard as sentence 2 in terms of the units adopted for measurement of the difficulty of these sentences. The score of a class, therefore, is to be interpreted in terms not of amount of work, but of the degree of difficulty of sentence which the class is able to solve correctly without regard to the time required. It is perfectly possible, however, to use the Trabue Tests as rate tests and determine the rate and accuracy of work in order to secure a different type of information about the development of the abilities represented by these tests.
At Gary Scales B and C were given under standard conditions, and Scales D and E were given under a time limit (two minutes).

Figure 51
Illustrative Sentences, Trabue Language Scale C

                                                          TYPICAL CORRECT
SENTENCES                                                 ANSWERS
1. The sky ____ blue.                                     (is)
2. The ____ rises ____ the morning and ____ at night.     (sun, in, sets)
3. One ought to ____ great care to ____ the right         (take, form, kind,
   ____ of ____, for one who ____ bad habits ____         habits, forms,
   it ____ to get away from them.                         finds, difficult)

The median eighth grade child at Gary has an ability represented by a score of 14.0 in Scales B and C, or of 12.9 and 12.3 respectively in Scales D and E when given under limited time conditions. Considering the entire range of scores from 6.5 for the third grade to 16.0 at the twelfth grade, the value which best represents eighth grade ability is probably 13.6 (Table LXX, page 294). The development in the lower grades is rapid (from 6.5 at the third grade to 8.5, or 2.0 points increase at the fourth grade) and gradually decreases as the upper grades are reached (eighth to ninth grade difference, .7; eleventh to twelfth grade difference, .5). The development is, therefore, of the conventional type (Figure 52, page 295).

COMPARATIVE DATA¹

A comparison of the Gary scores with the standards published by Trabue shows almost perfect correspondence at all grades (third grade difference, Gary, +.5; eighth grade difference, +.3; twelfth grade difference, -.2). The differences are sometimes positive and sometimes negative and are insignificant in amounts. Comparisons with such other scores as are available show the Gary scores slightly lower (eighth grade comparisons, Gary minus Detroit, -2.4; Gary minus Chatham, -2.1; Gary minus Nassau County, -.4). (Table LXXI, page 296.) On this basis the Gary schools would be judged slightly² below conventional schools in the products measured by these scales.

SCHOOL TO SCHOOL COMPARISONS

Rather marked differences in reading ability from school to school are evident.
Thus, in oral reading the Emerson School is distinctly better than the other three schools. In silent reading the Froebel and Beveridge schools read more rapidly than the other two, but in the reproduction test the Beveridge School falls much below the other three. As measured by the Kansas Silent Reading Test, the Jefferson School makes the best showing, with the Emerson a close second. Both the Froebel and the Beveridge schools have many low scores. In the scores from the Trabue Language Scales the differences are less marked. Froebel is again distinctly below the level of the Emerson and Jefferson, but the Beveridge does not differ markedly from them. It is probable, therefore, that the differences summarized in Table LXXII, page 298, represent merely the effect of foreign parentage rather than any significant difference due to the extent to which the schools afford modern programs.

¹See page 38.
²See footnote, page 296.

TABLE LXX
Scores in Trabue Language Scales

Scales B and C given under standard conditions. Scales D and E given on a two minute time allowance instead of a seven minute allowance.

         SCALE    SCALE    SCALE    SCALE    GENERALIZED
GRADE      B        C        D        E      GARY SCORE
  3       6.3      6.7      5.5      5.1        6.5
  4       8.5      8.8      8.2      8.3        8.5
  5      10.2     10.2      9.7      9.9        9.9
  6      10.7     10.9     10.7     10.7       11.2
  7      12.9     12.5     12.6     11.7       12.5
  8      14.0     14.0     12.9     12.3       13.6
  9      14.4     13.9     13.8     12.7       14.3
 10      15.4     14.0     13.5     14.4       14.9
 11      15.3     16.0     14.7     14.4       15.5
 12      16.2     15.5     14.4     14.5       16.0

The table is to be read as follows: The median third grade score in Scale B was 6.3; in Scale C, 6.7; in Scales D and E, given under limited time conditions, 5.5 and 5.1 respectively. Considering the entire range of grade scores in Scales B and C, 6.5 has been set as best representing third grade development.

Figure 52
Development of Abilities, Trabue Language Scales

[Graph: Gary and Trabue Standard scores in Trabue units plotted against grades.]

The scale along the base of the figure represents grades. The scale along the vertical axis indicates the median degree of development attained in Trabue units. The solid line, Gary. The broken line, Trabue Standard. The dotted line, actual score of Tests B and C. The Gary children do quite as well as, or a little better than, children in other schools, according to Trabue's Standard.*

*See footnote on page 296.

[Table LXXI, page 296: not legible in this copy.]

TRABUE LANGUAGE SCALE B
Median Scores, by Grades¹

The rows of this table cannot be matched to its columns with certainty in this copy; the schools and the grade columns are therefore given separately, each in the order printed.

Schools: Trabue Standard; Omaha (January); Louisville (White); St. Paul; Rochester, N. Y.; Horace Mann School (Columbia University); Pacific School, Seattle; Nassau Co., N. Y.; Janesville, Wis.; Mobile, Ala.; Chatham, N. J.; Deadwood, S. Dak.; Muscatine, Ia. (January); Cherokee, Ia.; Washington, Ia. (June); Webster City, Ia. (March); Storm Lake, Ia. (Nov.); Sutherland, Ia.; Gary.

5th: 11.4; 5A-11.1, 5B-11.6; 11.0; 5A-10.9, 5B-11.0; 12.4; 11.1; 11.55; 10.9; 11.3; 11.7; 11.7; 11.0; 10.9; 10.9; 12.3; 11.8; 11.4; 11.3; 10.2.

6th: 12.4; 6A-12.2, 6B-12.6; 12.4; 6A-12.4, 6B-12.5; 13.5; 12.2; 12.57; 12.1; 12.4; 12.8; 12.7; 12.2; 12.8; 12.2; 12.7; 13.4; 12.6; 12.9; 13.4; 10.7.

7th: 13.4; 7A-13.1, 7B-13.6; 13.4; 7A-13.3, 7B-13.6; 14.4; 13.1; 14.37; 16.2; 13.3; 13.6; 13.9; 14.8; 13.0; 13.1; 14.0; 14.0; 13; 13; 13; 12 (the last four entries are cut off in this copy).

8th: 14.4; 8A-14.1, 8B-14.6; 14.7; 8A-14.7, 8B-14.8; 15.3; 14.0; 14.75; 17.1; 13.7; 14.0; 14.3; 14.4; 15.8; 13.3; 14.3; 14.4; 14.4; 14; 13; 14; 14 (the last four entries are cut off in this copy).

¹From Bulletin of Department of Educational Research, Omaha, Neb.

§2. Critical Discussion

GRAY'S ORAL READING SCALE

A consideration of each of the various reading tests is now in order, the first of which is Gray's Oral Reading Scale.
TABLE LXXII
School to School Comparisons
Number of Classes Whose Scores Vary from the City Wide Grade Median More Than One Tenth the Generalized Score

              ORAL READING         SILENT READING (STORY)
              GRAY'S SCALE         RATE OF SILENT READING    RATE OF REPRODUCTION
              TOTAL NO.            TOTAL NO.                 TOTAL NO.
SCHOOL        OF CLASSES  +   -    OF CLASSES  +   -         OF CLASSES  +   -
Froebel           28      7   7        32      9   3             32      7   5
Emerson           12      5   1        18      4   6             18      2   3
Jefferson         16      2   3        18      4   7             18      2   1
Beveridge         11      2   2        17      5   5             17      2   8

              KANSAS SILENT READING TEST    TRABUE LANGUAGE SCALE
              TOTAL NO.                     TOTAL NO.
SCHOOL        OF CLASSES  +   -             OF CLASSES  +   -
Froebel           30      6  15                 22      0   6
Emerson           15      6   1                 11      3   2
Jefferson         16     10   1                 16      4   2
Beveridge         15      3   9                 13      3   0

[In the Trabue columns one entry under "-" is blank in this copy. It is shown here as 0 for Beveridge, but its row is not certain.]

This table is to be read as follows: In the Froebel school, of 28 classes tested in oral reading, 7 classes had scores markedly above the city wide medians and 7 classes had scores markedly below. In rate of silent reading, of 32 classes tested, 9 had scores markedly above the city wide medians and 3 below. In rate of reproduction, of 32 classes tested, 7 had scores markedly above the city wide medians and 5 below. In the Kansas Silent Reading Tests, of 30 classes tested, 6 had scores markedly above the city wide medians and 15 below. In the Trabue Language Scale, of 22 classes tested, none had scores markedly above the city wide medians and 6 below.

No discussion of the methods by which the scale was derived or of the principles upon which it is based is necessary, as the same are available elsewhere.¹ There are, however, certain criticisms of the scale itself and certain points connected with its use at Gary which the reader should know. Gray's Oral Reading Scale as given at Gary measures the ability to "sound" connectedly and correctly the words in a given passage. It does not measure expression nor does it measure understanding.
The attempt was made to have the examiner record his judgment as to the quality of the expression in the reading, but it was found impossible to compare expression with a standard, except on the most intangible and subjective basis. The examiners soon decided their records were so unreliable as to be worthless, and the practice was discontinued. It is possible to ask a child to reproduce orally or in writing what has been read, or to answer questions in regard to the same. However, this was not attempted systematically at Gary. The scale thus affords merely a measure of the child's performance in oral reading. If he mispronounces words, omits words, or adds to the text, if he repeats, or otherwise misreads, the errors take away from his score. The giving of the test is well standardized and there are adequate and reliable comparative data. As a whole, therefore, the test is a satisfactory measure of skill in oral reading, so far as that skill is defined as ability to pronounce words correctly and in proper sequence.

¹Gray, W. S., Studies of Elementary School Reading through Standardized Tests. Supplementary Educational Monograph No. 1, University of Chicago Press.

One limitation of the scale is that the causes of the increasing difficulty from paragraph to paragraph are not known and may be due to factors not vitally connected with reading ability. A single illustration will make this plain. Paragraphs 2, 3, and 4 have for the first word "once": "Once there was," "Once there were," "Once there lived." Paragraph 5, however, begins "One of the most interesting birds." Child after child, influenced by the preceding paragraphs, begins: "Once of the." Thus in class No. 11 Jefferson, out of 40 children, 5 missed on this particular point. Tabulation of other classes yielded similar results.
In general, one child in 10 is so susceptible to the habit forming influence of the succession of "onces" that he will misread "one" in paragraph 5. In other words, in working with the scale one gains the impression that the difficulty of certain paragraphs of the scale is in part caused by the occurrence of certain traps or pitfalls, rather than by real increases in the difficulty of the reading. This is due, of course, to the empirical basis on which the selection of the paragraphs rests.

On the other hand, an inspection of the scale shows that there is in general a real increase in difficulty of vocabulary, in length of sentence, in difficulty of sentence structure, and in content of material.

The length of the sentences in the various paragraphs increases from 7 words to 23, but the increase is irregular (Table LXXIII, page 302). A word, however, is a poor unit to use in measuring length of sentence as words vary so in size. A better unit would seem to be "sound divisions" or syllables. A syllable in the conventional sense means merely a group of letters. Syllables are not always sounded. Thus "dressed," a two syllable word on the basis of spelling, is pronounced as though it were spelled "drest"; that is, as a single sound unit. The term "sound division" or "sound unit" will be used to indicate that words have been divided in accordance with their pronunciation, and not in accordance with their spelling. From the point of view of such sound division, the sentence length varies from an average of 8 to over 45 units, but again the increase is irregular. To say that a child can read one paragraph but cannot read the next higher one is to indicate but roughly the length of sentence which he can read. In future improvements of the scale attention will need to be given to a more careful gradation of sentence length from paragraph to paragraph.
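The averages in Table LXXIII are simple quotients: the words, or sound divisions, of a paragraph divided by its number of sentences. The sketch below recomputes them from the counts printed in that table; the names used are illustrative only. It may be noted that for paragraph 3 the quotient of the printed counts (61 sound divisions in 5 sentences) is 12.2, whereas the table as printed in this copy gives 10.2; the remaining rows agree with the printed averages to within a tenth.

```python
# (paragraph, sentences, words, sound divisions) as printed in Table LXXIII
COUNTS = [
    (1, 7, 48, 55), (2, 6, 49, 54), (3, 5, 49, 61), (4, 6, 61, 72),
    (5, 3, 60, 75), (6, 4, 62, 81), (7, 3, 53, 74), (8, 4, 54, 89),
    (9, 4, 52, 82), (10, 5, 46, 85), (11, 2, 47, 89), (12, 2, 38, 91),
]


def sentence_averages(counts):
    """Return {paragraph: (words per sentence, sounds per sentence)}."""
    return {
        paragraph: (words / sentences, sounds / sentences)
        for paragraph, sentences, words, sounds in counts
    }


averages = sentence_averages(COUNTS)
for paragraph, (words_avg, sounds_avg) in averages.items():
    print(f"{paragraph:>2}  {words_avg:5.1f}  {sounds_avg:5.1f}")
```

Read down either column of the output, the irregularity spoken of above is plain: paragraph 10, for example, has shorter sentences than paragraph 5 by either unit.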
In similar fashion it was found that a word of six sound divisions occurs in paragraphs 10 and 11, while paragraph 2 has but five words of more than a single sound. Roughly, therefore, the paragraphs increase in difficulty because of the increase in the length of the words as well as the length of the sentence, but for this factor also the increase is irregular (Table LXXIV, page 303).

TABLE LXXIII
Analysis of Gray's Oral Reading Scale

             NUMBER OF                  NUMBER OF    AVERAGE      AVERAGE
PARAGRAPH    SENTENCES IN   NUMBER OF   SOUND        WORDS PER    SOUNDS PER
NUMBER       PARAGRAPH      WORDS       DIVISIONS    SENTENCE     SENTENCE
    1            7             48          55           6.9          7.9
    2            6             49          54           8.1          9.0
    3            5             49          61           9.8         10.2
    4            6             61          72          10.1         12.0
    5            3             60          75          20.0         25.0
    6            4             62          81          15.5         20.2
    7            3             53          74          17.7         24.7
    8            4             54          89          13.5         22.2
    9            4             52          82          13.0         20.5
   10            5             46          85           9.2         17.0
   11            2             47          89          23.5         44.5
   12            2             38          91          19.0         45.5

This table is to be read as follows: In paragraph 1 of Gray's Oral Reading Scale there are 7 sentences containing a total of 48 words. In reading these words orally there are 55 separate sound syllables or divisions. (See Test.) That is, the average length of a sentence is 6.9 words or 7.9 sounds. As a whole, the table shows that the average length of the sentences in words increases irregularly from the beginning to the end of the scale; that in terms of the number of sound divisions the average length also increases irregularly.

[Table LXXIV, page 303: not legible in this copy.]

The vocabulary also increases in difficulty from paragraph to paragraph. The various words new to each paragraph are shown in Table LXXV, between pages 304 and 305. Unfortunately, there is no information as to the relative frequency of occurrence of words in children's reading vocabularies, so that it is impossible to evaluate the difficulty of the paragraphs from this point of view. However, in each paragraph the children are called upon to read 24 or 25 new words¹ and these words are, in general, increasingly longer and less common from paragraph to paragraph.

The analysis will not be pushed further. Enough has been said to show the incompleteness of our knowledge of the causes of the increase in difficulty from paragraph to paragraph. Nevertheless, in the opinion of the author, Gray's Scale is one of the most satisfactory of the various measuring scales and is probably as perfect as it can be made on the basis of present knowledge.

A drawback in the use of the Gray Scale is that children must be measured individually, so that the measurement of a school system requires a great deal of time. For this reason, in certain grades at Gary only selected children were measured. The method of sampling was as follows: For all grades from two to eight inclusive the teachers of reading were asked to fill out for each child in the class a judgment card like that shown in Figure 53, page 305. In grades three, five, and seven practically all the children were tested. In the other grades 10 children were chosen from each class: the 3 given the highest marks by the teacher, the 3 lowest, and the 4 nearest the center of the class. In all 1557 children were examined.

This method of measurement by sampling is open to question.
¹Median values.

TABLE LXXV
Words New to Each Paragraph, Gray's Oral Reading Scale

Each word is followed by its code where the code is legible beside it. Where the codes were printed apart from their words in this copy, they are listed separately in the order printed.

PARAGRAPH 1: a -C, after 2I, began 2L, boy 2F, cannot -J, cry 2-, dog 2-, had, he, home, I, into, little, my, ran, said, to, the, then, wanted, without 3K, woods 2-. Codes printed apart: 2B, 2G, 2E, 2H, 2H, 3F, 2E, 2E, 2-, 2J, 2H, 2D, 2H.

PARAGRAPH 2: and 2B, can 2C, day 2H, do 2A, feet 2I, four 2L, his 2H, in 2D, lived 2-, mother 2G, once 2L, one 2H, pen 2-, pig 2-, ran 2H, round 2K, run 2C, saw 2J, so 2D, them 2H, there 2N, was 2H, what 2I, with 2J, you 2E.

PARAGRAPH 3: again 2M, bit 2-, bring 2H, cat 2-, give 2I, house 2H, long 2H, me 2A, milk 2-, mouse 2-, mouse's 2-, no 2D, not 2E, off 2M, pray 2-, puss, same 2I, some 2H, tail 2-, they 2K, till 2-, were 2L, will 2E, your 2F.

PARAGRAPH 4: at, beautiful, but, children, door, found, garden, girl, happy, King, made, or, own, palace, poor, Queen 4-, their 2Q, took 2M. Codes printed apart: -B, 3P, 2F, 2-, 2H, 2J, 2-, 2J, 2J, 2-, 2J, 2J, 2I, 2L, 7-, 2K.

PARAGRAPH 5: as 2H, been 2N, before 2L, blue-jay, birds 2-, bird-room - -, business, could, ever, fly, from, full, given, interesting, Jackie, morning 2L, most 2J, named, nest, night, of, pet, reared - -, scarcely, still 2J, stolen - -, which 2P. Codes printed apart: 4T, 2K, -L, 2-, 2J, 2K, 3- (after "given"); 2-, 2K, 2F, 2- (after "which").

PARAGRAPH 6: almost 3M, appear -O, better 2K, blackberrying, by -G, does 2P, else 2N, enjoyed -P, farming, fishing 2-, good 2E, industrious 5-, liked 2F, making 2-, maple 6-, part 2J, reason 4N, someone, sort 5-, sugar 2-, than 2I, that 2H, this 2F, very 2I, why 2J, work 2J, yet 2I.

PARAGRAPH 7: against 2R, are 2G, behind 2K, brilliancy, contrast, dull 2-, embraced, evenings 4N, evident, glow, its 2J, light -K, magnificent, mountains 2M, masses, only 2K, pretty 2M, region 7Q, sky 2-, such 2L, sun 2-, sunk 2-, stood 2M, third 2L, those 2M, twilight 7-, white 2I, wonderful 6Q.

PARAGRAPH 8: an 2E, crown 3-, dignifying, every 2J, exalting, exercises 5-, forms 5I, general 6R, glory, goodwill, greater, honor 7R, itself -N, life -J, man 2D, means, noblest, position -Q, possession, power 8L, rank 7-, securing, society 7-, station 4O, useful 2-, valuable - -, wealth 8-.

PARAGRAPH 9: although -Q, apart, appearance 8-, approximately, blue 2J, body 3L, complexion 8-, covered 2J, dressed 2-, early 2L, eyes 2K, far 2I, florid - -, forehead 4-, habitually, hair 2-, inclined, left 2J, neat 2-, profusion, proportioned, remarkably - -, scrupulously, six 2F, tent, well 2H.

PARAGRAPH 10: [word illegible] 3U, always 2N, attentively, contemptuous, continuously, exhausted 8-, exigencies, finally, grim, habit 4-, happens 2M, hesitated, impulse, ingratiatingly - -, Josephus, length 4B, listened 2-, loss -B, lost 2J, old 2E, others -H, persistently, responding, silence, spoke 2-, strength 6-, through 2 [remainder of code illegible].

PARAGRAPH 11: alluvial, American, antique, antiquities, archaeological, architectural, architecture, attractions, azure, delight, deposits, Egypt, fanaticism, fondness, have, Italy, unto, overcome, prairies, Roman, skies, studies, verges. Codes printed apart: 2G, 3-, 2-.

PARAGRAPH 12: accurate 7-, applicable, arduously, be 2F, combine, concerning, established, forces, formulated - -, hypothesis, inconsistent, mathematicians, phenomena, philosophers, physical 8-, physicists, principles 6Y, proved 2N, relatively, statisticians, universally 8-.

The relative difficulty of the different words in the various paragraphs may be judged from the figures and letters which are placed after them. These are to be interpreted as follows: The figures refer to the grade to which the word is assigned by Jones; the letters refer to the division of the Ayres Spelling Scale in which the word is found. A dash means that the word does not occur in these lists. Sets A to L of the Ayres Scale are considered second grade words, although set L is spelled on the average with but fifty per cent accuracy by second grade children. Set A, on the other hand, is spelled with ninety-nine per cent accuracy. That is, the difficulty in spelling (and therefore probably also in reading) increases from set to set as the set letter is later in the alphabet. The table makes it evident that the increase in the difficulty of the words from paragraph to paragraph is irregular.

Figure 53
Teachers' Judgment Cards

EMERSON
TEACHERS' JUDGMENT CARD
Reading
Instructions: Indicate by a check mark the items on the opposite side of this card which express your judgment of the ability in reading of the pupil whose name appears below. Express the general mark in per cent.
Name of Pupil ____  Grade ____
Name of Teacher ____  Subject ____
The Gary Public Schools

[Reverse of card: "Check the items which apply," followed by a check list that is not legible in this copy, and "General Ability in Reading ____ %."]

Cards were filled out by the teachers for each individual child in grades 2 to 12. Selection of children to be measured by Gray's Oral Reading Scale was made on the basis of the marks shown on these cards. Ten children were chosen from each class, the group being composed of the three having the highest rating, the three having the lowest rating, and the four of average ability. In grades 3, 5, and 7, practically all the other children in the classes were measured as well.

If 10 children are chosen from a class of 40, the class will be misrepresented by the resulting score based upon the performances of the 10 children, unless the teacher's judgment is reliable. In grades three, five, and seven practically the full class membership (except for absence) was measured in each class, although the scores were tabulated in groups of 10 as in other grades. This makes it possible to determine the extent to which the method of sampling is valid. It was found that the differences between the scores made by all the children in a class and by the selected group were small, averaging 3 points. This is less than the usual error of measurement (Table LXXVI, page 308).

The examiners at Gary were the writer and his assistants and six graduate students from Professor Gray's own classes in the University of Chicago. These latter were trained in the use of the scale by Professor Gray himself. As a further precaution a large part of the first day's scoring was done in duplicate. That is, as a child read from the scale, two examiners made independent records, and after the child had left the room the two records were compared and doubtful entries discussed. A study of these duplicate scores makes it plain that the use of the scale leads to consistent records of children's performances (Tables LXXVII and LXXVIII, pages 309-310). In 21 per cent of the cases the two records agreed exactly; in 70 per cent of the cases the differences were 2.5 points or less (about half a year's growth), and in only 6 per cent was the difference greater than 5 points. There were differences in the closeness of the agreement between the records of the various pairs of judges, and, in general, those who had had the most experience in the use of the scale had the most consistent records. None of the five pairs of judges had an average difference of as much as 3 points.
The results may, therefore, be accepted as revealing the true abilities of the children within a half grade. In this test, as elsewhere, the class averages, as determined by two independent observers, differ by very small amounts.

In making the comparisons noted above it was found that there were often marked disagreements as to the actual mistakes made, particularly when the number of mistakes was large, but usually close agreement as to the number of mistakes. Two sets of independent records are given in Figure 54, pages 312-313. For paragraph 3, the two records agree; for paragraph 5 there is a disagreement of 2 seconds in the time records and of 1 error in the mistakes. In both will be found differences in the manner in which the mistakes were recorded. As, however, six is the maximum number of mistakes that can be made and have the performance count at all, the differences are not serious and, as has been shown, do not affect many scores. On the other hand, they invalidate the records so far as types of mistakes are concerned and no such analyses were attempted.

[Table LXXVI, page 308, and Tables LXXVII and LXXVIII, pages 309-310: not legible in this copy.]

The method of scoring adopted by Professor Gray yields approximately constant scores from grade to grade. Thus the fourth and fifth grades at Gary made an average score of 39. This does not mean that they have the same ability; it means merely that the fourth grade is as near the fourth grade standard as the fifth grade is near the fifth grade standard. To indicate the progress from grade to grade it is necessary to convert the actual scores into relative scores by adding 20 points to the second grade score and 5 points additional, the average yearly progress, for each grade above the second. In the form of graph adopted by Professor Gray this allowance is made by shifting the grade scales. If, however, the Gary grade averages were reduced to units on a single scale with an arbitrary zero point, the scores would have the values shown in Table LXXIX.
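The conversion just described can be checked with a short sketch. It applies the stated rule (20 points added to the second grade score, and 5 points more for each grade above the second) to the Gary averages of Table LXXIX. The function and constant names are illustrative only and are not part of Professor Gray's procedure.

```python
YEARLY_PROGRESS = 5        # average yearly progress, in points
SECOND_GRADE_OFFSET = 20   # points added to the second grade score


def uniform_scale_score(grade, conventional_score):
    """Convert a conventional Gray score to the single uniform scale."""
    if grade < 2:
        raise ValueError("the rule is stated only for grade two and above")
    return conventional_score + SECOND_GRADE_OFFSET + YEARLY_PROGRESS * (grade - 2)


# Gary averages by grade, from Table LXXIX.
gary_average = {2: 27, 3: 36, 4: 39, 5: 39, 6: 41, 7: 42, 8: 41}
uniform = {g: uniform_scale_score(g, s) for g, s in gary_average.items()}
print(uniform)
```

On the uniform scale the fourth and fifth grades, which share a conventional score of 39, are separated by exactly one year's allowance.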
TABLE LXXIX
Gary Results, Gray's Oral Reading Scale, Expressed in Terms of a Single Uniform Scale

GRADE                               2    3    4    5    6    7    8
Gary Average                       27   36   39   39   41   42   41
Gary Values on a uniform scale     47   61   69   74   81   87   91

In Figure 46, Section I, of this chapter the scale along the vertical axis enables the conventional graph to be read in terms of a single scale.

It will be noted that the increase in score from grade to grade is small (5 points after the second grade). The conventional method of scoring obscures this fact, as all grades make nearly the same score, and within each grade the range of scores is large. For instance, in Froebel class No. 44 (seventh grade) the lowest individual score was 20, the highest 52. There were 4 scores in the twenties, 6 in the thirties, 16 in the forties, and 2 in the fifties. In other words, the individual scores show a range of variation equal to more than six times the average yearly progress, yet the maximum difference between the scores of the 24 seventh grade groups (of approximately ten children each) examined was 10.4 points. In other words, differences between scores of classes and of cities are much more significant than differences between individuals. The differences between the Gary and Grand Rapids scores, for instance, may be transformed into years of difference by dividing by 5. Whether or not differences between individual scores may be similarly transformed must await the evaluation of the results of repeated measurement of the same individual children from grade to grade.

Figure 54
Agreement and Disagreement in Scoring of Two Judges

[The figure reproduces the record sheets of Judge A and Judge B for the two paragraphs below, each with the judge's handwritten marking of mistakes. The markings are not recoverable. Judge B's sheet for Paragraph 5 is marked 60 seconds, 5 mistakes; the remaining time and mistake entries are illegible.]

Paragraph 3
Once there were a cat and a mouse. They lived in the same house. The cat bit off the mouse's tail. "Pray, puss," said the mouse, "give me my long tail again." "No," said the cat, "I will not give you your tail till you bring me some milk."

Paragraph 5
One of the most interesting birds which ever lived in my bird-room was a blue jay named "Jack." He was full of business from morning till night, scarcely ever still. He had been stolen from a nest long before he could fly, and he had been reared in a house long before he had been given to me as a pet.

In paragraph 5, although the actual number of mistakes recorded differs but 1, the errors recorded are differently marked.

REPRODUCTION TESTS

The reading and reproduction tests, as already pointed out, measure, on the one hand, rate of reading, and, on the other, a complex ability made up mainly of three elements: ability to comprehend something of what was read, ability to remember the same for several minutes while engaged in reproducing it, and ability to organize the words and ideas remembered into a connected story. Ability to read must be credited with a minor part in determining a child's score.

The rate of reading in itself is probably not a significant measure of anything. The range of variable performance is so large that rate scores in such tests as were given at Gary are mere symptoms, that is, slight changes in the conditions or incentives to reading effort produce large variations.
Inferences from symptoms are reliable only when the conditions which cause variation in the symptom are known. For instance, the rate of silent reading at Gary appears low in comparison with results from similar tests in other cities. So far as is known, there is nothing in the tests and the testing conditions which would cause this variation, and the results from two or three trials of different tests are consistent. Yet the same tests, as has been pointed out, bring to light very large individual variations. The causes of these are not known. The reader must be careful, therefore, to remember that while every precaution has been taken to make the data reliable, there is so much to be learned about reading ability that later investigations may prove present conclusions to have been unwarranted.

Analysis of the reproductions, item by item, affords an opportunity for showing the manner in which the factor of memory operates. Certain subdivisions of the story were recalled by nearly all who read them. These elements form the gist of the story. Within each subdivision, however, there is marked decline of the frequency with which the various items were recalled, and, in general, the longer an item has to be remembered the less frequently it is recalled. That is, only children [...]

[Tables and intervening text, pages 316-326: contents not recoverable.]

[...] by the approximate method adopted for the arithmetic tests. The record sheet was so arranged that the mere entry of the rate and conventional scores indicated the accuracy score without computation. As no comparable data are available, it was not thought necessary to determine the size of the error in such short cut tabulations. It cannot be large and the method was used consistently. Small errors would have no influence in the general conclusions.

As affording a slight indication of the reliability of the Kansas Tests, the coefficients of correspondence were computed for scores of the 42 eighth grade children present in all tests. The correspondence is greater for the rate scores than for either the accuracy or the conventional scores. A little more than half the children maintain the same relative position in the group within about 5 points for rate of work, while only approximately a third of the group maintain the same relative positions within 10 per cent. for accuracy and 5 points for the conventional score. (Table LXXXII, page 325.)

These results mean that a single measurement of a child with the Kansas Tests does not yield very reliable information in regard to his abilities. He may do very much better or very much worse in a second measurement with a different test (Table LXXXIII, page 326). How much of this variation is an indication of the inefficiency of training and how much of it is due to the defects of the tests themselves cannot be told at present.

The Kansas Reading Tests are interesting as throwing some light on the ability of the Gary children to solve the conventional arithmetical problems. The exercises of this character are as follows:

TEST I, EXERCISE XI
"We planted three trees in a row.
The first one was nine feet tall and the last one was three feet shorter than the first one. The middle one was two feet taller than the last one. How tall was the middle one?"
(Assigned value, 2.2. Per cent. of eighth grade children missing, 28. Number of cases, 121.)

TEST I, EXERCISE XV
"Fred has eight marbles. Mary said to him: 'If you will give me four of your marbles, I will have three times as many as you will then have.' How many marbles do they both have?"
(Assigned value, 4.8. Per cent. of eighth grade children missing, 44. Number of cases, 48.)

TEST II, EXERCISE XIII
"If it takes a man an hour to walk around a square, each side of which is a mile in length, how long will it take him to walk eight miles?"
(Assigned value, 4.3. Per cent. of eighth grade children missing, 44. Number of cases, 16.)

TEST III, EXERCISE III
"I have five plums and Mary has four plums. Jane comes along and we see that she hasn't any. . . . We wanted to divide with Jane in such a way that we shall all three have the same number. I give Jane two plums. How many must Mary give her?"
(Assigned value, 3.5. Per cent. of eighth grade children missing, 9. Number of cases, 131.)

TEST III, EXERCISE V
"A, B, C and D in the straight line represent four places lying in a straight line. From A to B is four miles, from C to D is seven miles, from A to D is fourteen miles. How far is it from B to C?"
A B C D
(Assigned value, 3.8. Per cent. of eighth grade children missing, 34. Number of cases, 125.)

TEST III, EXERCISE VIII
"There are three horizontal lines; the first is three inches in length, the second two inches, the third one inch. We know that if the second and third lines are joined end to end the resulting line will be as long as the first line. Suppose that the first and second lines are joined end to end. How many times as long as the third line will the resulting line be?"
(Assigned value, 4.8. Per cent. of eighth grade children missing, 29. Number of cases, 82.)

It will be observed that all of these are simple problems as far as the numerical relations are concerned. Exercise VIII contains a great many words, but in the others the problems are clearly stated, and the situations are within the experiences of the children. In Tests I and II the arithmetical problems come so late in the test that but relatively few of the children get to them and these, of course, are either the more able members of the class or those who worked at high rate with low accuracy. In Test III, however, practically the full class membership attempted problems 3 and 5. (Table LXXXIV, page 332.)

For Test III two other exercises which are non-mathematical have been included for comparison. For instance, number 4 was: "In the following words, find one letter which is contained in only three of them, and then cross out the word which does not contain that letter":

ail thief live anvil

The results show conclusively that many of the Gary eighth grade children are unable to solve simple arithmetical problems when presented in printed form and under test conditions (average accuracy, 6 problems, 69 per cent.). On the other hand, it must be remembered that the Gary eighth grade scores in the Kansas Test are almost exactly at the Kansas standard, which is also the score made by many eighth grade classes in other cities. Comparative data for the number missing on the different exercises of the test are lacking. It may well be that the Gary children do as well with such problems as the children in the conventional schools. However, the results are presented for their absolute, not their comparative, value. The reader must judge for himself, therefore, whether the conditions revealed by the data above and in the tables are satisfactory. To the writer the results seem poor, but whether the difficulty is caused by poor training in reading or in arithmetic he cannot tell.
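The figure of 69 per cent. average accuracy quoted above can be checked against the six "per cent. missing" values listed with the exercises. The sketch below is an illustration under stated assumptions: the variable names are hypothetical, and the weighted variant is added only for comparison and is not described in the text. The simple unweighted mean is the one that reproduces the quoted figure, so that is presumably the average intended.

```python
# Per cent. of eighth grade children missing each arithmetical exercise,
# with the number of cases, as listed in the text.
exercises = [
    # (label, per cent. missing, number of cases)
    ("Test I, Exercise XI", 28, 121),
    ("Test I, Exercise XV", 44, 48),
    ("Test II, Exercise XIII", 44, 16),
    ("Test III, Exercise III", 9, 131),
    ("Test III, Exercise V", 34, 125),
    ("Test III, Exercise VIII", 29, 82),
]

# Simple (unweighted) mean of the six percentages.
mean_missing = sum(missing for _, missing, _ in exercises) / len(exercises)
simple_accuracy = 100 - mean_missing

# Alternative for comparison only: weight each exercise by its number of cases.
total_cases = sum(cases for _, _, cases in exercises)
weighted_missing = sum(missing * cases for _, missing, cases in exercises) / total_cases
weighted_accuracy = 100 - weighted_missing

print(f"simple average accuracy:   {simple_accuracy:.1f} per cent.")
print(f"weighted average accuracy: {weighted_accuracy:.1f} per cent.")
```

The gap between the two results arises because the exercises attempted by few children (Tests I and II) are also the ones most often missed; weighting by cases gives them less influence.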
He prefers, therefore, to take the position that reasoning ability in arithmetic has not been measured at Gary, and to terminate this discussion with the repetition of the statement that the eighth grade children do as well in the Kansas Reading Tests as the children in many other cities.

TRABUE LANGUAGE SCALES

The Trabue Language Scales measure a very complex ability. Their author makes no claim that they measure reading ability, but they are classed with the reading tests because reading ability is one factor in determining a child's score. However, for these tests as for the Kansas Tests, while a child who cannot read cannot make a high score, a low score does not necessarily mean inability to read.

If a child scans such a sentence as "The sky ____ blue," the word "is" rises to consciousness spontaneously. The test is, therefore, in one aspect at least, a measure [...]

[Tables and intervening text, pages 332-343: contents not recoverable.]

Put in other words, the conventional Trabue scores eliminate very large differences in degrees of skill, and differences in the efficiency with which the skills are used. They reveal only the utmost which a child is capable of achieving, without regard to the effort by which it is achieved. Under such conditions it is not surprising that the results correlate highly with the results from the Binet-Simon Tests and other tests of general intelligence.¹ But to a corresponding degree they are not measures of the effects of classroom training.

¹See Teachers College, Columbia University, Contributions to Education, No. 77, p. 77.

If the Kansas Reading Tests were given under conditions similar to those required by Trabue, very different scores would result. For each of the other tests used an equivalent statement can be made. It is important, therefore, that the reader recognize the difference in the conditions and make no comparisons between the results of the Trabue Scales and the results of other tests without keeping this fact in mind.

It is contended by many persons that both the Kansas Silent Reading Tests and the Trabue Scales should not be considered either measures of reading ability or of ability in language work, but measures of general intelligence. It should be evident from the discussions above that neither of these tests measures directly any single product of classroom teaching, and both call for the exercise of much initiative, judgment, and reasoning ability in addition to the ability to read and understand the material of the tests. The reader should note at this point that if the tests are considered as measures of reading ability, they confirm the previous results in showing that the Gary children do very nearly, if not quite, as well in reading as the children in conventional schools.
If the tests are considered as measures of general intelligence and reasoning ability (without attempting to define what those terms may mean), then they show that the Gary children are normal in general capacity and intelligence.

RELATIONS BETWEEN TESTS

Perhaps the most convincing proof of the fact that each of the various reading tests used at Gary measures a particular phase of reading ability is found in the relation between the individual scores in various tests. The coefficients of correspondence were computed from the scores of 33 eighth grade children, all who were measured in all of the reading tests (Table LXXXVII, pages 346 and 347).

[Table LXXXVII, pages 346 and 347: contents not recoverable.]

Teachers' marks refer to the marks (in per cent.) assigned by the teachers on the estimate card shown in Figure 53, page 305. Time in oral reading is the number of seconds taken to read a paragraph of Gray's Reading Scale (based on the average for paragraphs four, five, and six). Points in oral reading represent the conventional score in oral reading. Kansas scores, rate scores, and accuracy scores have the meanings previously indicated, but were determined by averaging the two out of three variability ratios that were nearest alike. For instance, if a child's score was +1.6 in one test, -2.7 in a second, and +1.8 in a third, his average would be taken as +1.7, the low score in the second test being rejected as a chance variation. This procedure was adopted after comparing its effects with results obtained from averaging the actual scores. The ratios were deemed a better basis from which to work because of the differences in the difficulty of the tests.

In the case of the Trabue Tests, however, the actual scores in Scales B and C were averaged for score and the actual rates and accuracies in tests with Scales D and E for rate and accuracy respectively. The rate of silent reading is based upon the average of the two nearest variability ratios out of the three in the reading tests. For the reproduction score, ideas reproduced represent the actual number of ideas reproduced, while accuracy of reproduction expresses the relation between the actual points and the possible points. Rate of reproduction refers to the number of words reproduced per minute. Finally, average position was found by averaging the thirteen variability ratios so far described.

The significance of this last measure may need comment. If a child stands very high in all tests he has a high average position. If he does well in some tests and poorly in others, his average position is lower. It is probable that a relatively constant position in all tests is an indication of his natural capacity, so that average position may be regarded as a measure expressing the general capacity of the individual so far as the capacity is revealed by the reading test (Figure 57, page 350).

For most of the reading tests the degrees of correspondence between any two tests are approximately constant. The distribution of coefficients is as follows:

TABLE LXXXVIII
Distribution of Coefficients of Correspondence of Fourteen Different Types of Scores from Reading Tests

RANGE OF COEFFICIENT    20-29    30    40    50    60    70    80    TOTAL
Frequency                   5    14    27    20    16     8     1       91

The median coefficient is 50. That is, about half the children will maintain the same relative position in any two sets of scores from reading tests.
A coefficient of from 40 to 60 means, therefore, only the degree of correspondence which is to be expected from the fact that all scores are, in general, predetermined by the major factors of heredity, maturity, and training. Where, however, the coefficient of correspondence falls to 30 or 20 per cent., it signifies that one or both of the tests measure peculiar or specific abilities.¹ When the coefficient rises to 60, 70, or 80 per cent., it signifies that scores in the last two tests are determined more nearly by the same factors. It may be that the factor is similarity in the abilities measured, or it may be that the factor is general intelligence, but whatever causes a high score in one test causes a high score in the other also.

¹Journal of Applied Psychology, March, 1917, Vol. I, p. 26.

Figure 57
Individual Records in Reading Tests

[Chart. Horizontal scale: units of variability above and below the median. Rows: the 14 measures, from Teachers' Mark and Oral Reading to Average Position.]

The horizontal lines represent 14 different measures of pupil's ability in reading. The vertical line marked "O" represents class median in each phase of reading ability. Distances to the left and right of the class median represent positions above and below in terms of the variability. (Median deviation.)

The solid line shows the position in each of the 14 sets of results of the member of the class who had the highest average position. The broken line represents a similar record for the member of the class who had the lowest average position. The dotted line represents the record of the member of the class whose average position was exactly median.

The curve shows that while for individual tests there is a large amount of individual variation, the position of each child shows a tendency to vary about a certain general level of ability. It is probable that this general level is determined more by capacity than by training.
Figure 58
Average Position in All Reading Tests

[Chart. Vertical scale: variability ratios. Horizontal scale: the 33 individuals. Legend entries: total number of cases; number within 1 unit, 28; percentage of correspondence.]

The scale along the base of the figure represents the individuals of a group of 33 eighth grade children. The solid line shows the relative position of each child in the class as determined by the average of his position in each of 13 sets of scores from Reading Tests.

The broken line shows the relative position of the same children as determined by the scores in Gray's Oral Reading Scale. Twenty-eight out of 33 children, or 85 per cent., maintain the same position within one unit of variability.

The dotted line shows the relative position of the same children based upon accuracy scores in the Trabue Test when given with a time allowance of but two minutes. The percentage of correspondence between the Trabue scores and the oral reading is 33 per cent.

The curves show that the scores in oral reading are probably determined more by capacity than by training, while accuracy scores in the Trabue Tests used as rate tests are measures of a specific ability.

The correspondence between teachers' marks and score for oral reading is high. The correspondence is equally great between teachers' marks and the Kansas Silent Reading Tests, or between teachers' marks and the measure of average position. The correspondence with Trabue is also high. All these, however, are measures of general intelligence. It is extremely probable, therefore, that at Gary teachers' marks reflect the general capacities of the children. On the other hand, rate of silent reading and accuracy of reproduction have very little correspondence with the teachers' marks.
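The captions of Figures 57 and 58 suggest how a percentage of correspondence might be computed: each score is expressed as a distance from the class median in units of the median deviation, and two sets of scores are said to correspond for a child when his two positions fall within one unit of each other. The sketch below follows that reading. It is an assumption based on the captions alone, not the author's documented formula, and the function names and sample data are hypothetical.

```python
from statistics import median


def variability_ratios(scores):
    """Express each score as its distance from the class median,
    in units of the median deviation (the 'unit of variability')."""
    centre = median(scores)
    median_deviation = median([abs(score - centre) for score in scores])
    if median_deviation == 0:
        raise ValueError("median deviation is zero; ratios are undefined")
    return [(score - centre) / median_deviation for score in scores]


def percent_correspondence(ratios_a, ratios_b, tolerance=1.0):
    """Per cent. of children whose two positions agree within `tolerance` units."""
    if len(ratios_a) != len(ratios_b) or not ratios_a:
        raise ValueError("need two equal-length, non-empty lists")
    agreeing = sum(
        1 for a, b in zip(ratios_a, ratios_b) if abs(a - b) <= tolerance
    )
    return 100 * agreeing / len(ratios_a)


# Hypothetical class of five children, to show the ratio computation.
sample_ratios = variability_ratios([10, 12, 14, 16, 18])
print(sample_ratios)

# Hypothetical positions arranged so that 28 of 33 children agree within
# one unit, the count reported in the caption of Figure 58.
positions_a = [0.0] * 33
positions_b = [0.5] * 28 + [2.0] * 5
print(round(percent_correspondence(positions_a, positions_b)))
```

With 28 agreements among 33 children the function returns about 85, the percentage quoted in the caption of Figure 58.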
A corollary of the foregoing conclusions is that at Gary scores in Gray's Oral Reading Tests are determined largely by the capacities of the children. The coefficients between the oral reading scores and the measures of general intelligence are all high. The correspondence between scores in oral reading and average position is 85¹ (Figure 58, page 351), a further confirmation of this conclusion. The coefficients also show that time required to read is a large factor in determining the oral reading score, so this in turn must be determined largely by the native capacities of the children.

¹Pearson's Coefficient of Correlation, +.73, P. E. ± .05.

The coefficients for the different types of scores for the Kansas Silent Reading Tests and for the Trabue Tests are interesting and significant. The conventional scores in these tests show considerable correspondence with measures of general ability but very little with specific abilities. Accuracy, or degree of understanding, is a specific phase of skill in reading, and like all other specific abilities shows a low degree of correspondence with other abilities which are equally specific.

The scores for rate of silent reading show considerable correspondence with the time scores, and with scores for average position, but with all accuracy scores the correspondence is low. Apparently at Gary the rapid readers are not those who read understandingly. It is probable that the rate of reading scores and all the reproduction scores are measures of general abilities, but here again the general correspondence is lower for accuracy of reproduction than for the other abilities.

It should be remembered that these results are based upon very few data and have significance only for Gary. Even for Gary the chief point to the foregoing discussion is that the scores of children in the various tests vary in every conceivable fashion.
That is, the relation between abilities or the dependence of one ability upon another varies from child to child. Each test, so far as it measures a specific ability, will yield scores which are significant for that test alone. Therefore, no general comparisons have been made. Only certain phases of reading work have been measured, and conclusions drawn are to be interpreted as applying to these particular phases alone.

CONCLUSION

The foregoing discussion must have rendered evident the truth of the statement previously made that the measurements of reading ability at Gary are much less satisfactory than for other subjects. But at least the conventional reading tests have been given carefully, and as much is known about the reading abilities of the Gary children as such tests reveal.

VIII. FACTORS AFFECTING PERFORMANCE

Measurement of classroom products is ordinarily accomplished by giving standard tests under controlled conditions. The testing activity results in a series of figures (scores) describing either quantitatively or qualitatively the way the children behaved under the test conditions. The question immediately arises: What relations do the scores made by the children in standard tests bear to their real abilities?

A little reflection will show that the results of tests are affected by many factors. For instance, nervousness may lower a child's score for accuracy of work in addition from 100 to 0 per cent., while recent study or practice on a particular test may lead to scores far above the normal level. Hence, the score of a child in a test is a reliable measure of just one thing, what he did in that test.
HEREDITY

The basic factor in the performance of each individual is his heredity, or capacity.¹ That children differ in capacity is the common experience of all, and these differences are observable from birth. So far as the Gary results are concerned, the effects of heredity are either negligible or unknown. In any large unselected group of children the percentages of individuals of exceptional ability, of average ability, and of small ability are probably constant. If, as seems probable, unusually large numbers of the Gary children are born of foreign parents,² they might form a selected group as to capacity, provided there were marked racial differences in capacity from country to country. However, the facts in regard to all such hypotheses are wanting. It is known that the children in the Gary schools come from a wide variety of racial stocks. Therefore, as a group, the Gary children probably do not differ greatly in their basic capacities from the average of children in other cities.

¹The three technical terms used in this discussion (performance, ability, and capacity) may be defined as follows: Performance is the specific achievement (actual score) made in a particular test. Ability is the general power to perform. It is best inferred from the median performance in a series of trials of the tests since the amount of variation shown by the series as a whole furnishes a measure of the reliability of the inference. Capacity is potential, or undeveloped ability, the possibilities of development inherent in a child's original nature. For an extended discussion of these definitions see Bulletin No. 4, Courtis Standard Research Tests.

²See The Gary Public Schools: A General Account.

For example, the Gary eighth grade children copy figures at the rate of 111 figures per minute as compared with the score of 108 figures per minute for children in other cities (Table LXXXIX, page 358, Figure 59, page 359). The activity involved here is almost entirely the motor activity in writing figures, and no direct training of this character is given in the schools. The differences between the Gary and the country wide results in this test are not significant and are, if anything, in favor of the Gary children. In the Kansas Silent Reading Test and in the Trabue Test, which are considered by many as measures of general intelligence, the differences between the Gary results and those from other cities are small.¹ In view of all the facts that are available, the author considers it probable that the Gary children represent a normal group as far as inheritance of average mental capacity is concerned.²

¹See Chapter VII, page 290.

²In Gary there is a formal organization of health, dental, and psychological clinics, and several classes are composed wholly of children whose mental condition is such that they are not the equal of normal children. The scores of such classes are not included in the tabulations of the preceding chapters. The Gary results would be lower if the children from the special classes had been distributed through the grades on the basis of their ages.

MATURITY

The second important general factor affecting performance is maturity. By maturity is meant that increase in ability from grade to grade caused not by additional purposive school training, but by the greater development, increased vigor, and riper experience due to added age.

For instance, in the first trial of the test of copying figures, 53 figures per minute were written by the third grade children. This proves that as a result of the training in the early grades the ability to write figures is well developed by the third grade. The grade scores [...]

[Tables, figure, and intervening text, pages 358-365: contents not recoverable.]

[...] direct training for the test in canceling triangles, as it was comparatively a new test, not in general use, and not related to conventional school work. Yet the rate of relative development in the cancellation test is almost the same as the rate of development in addition and in the various other tests of the group. For all tests, the average third grade score is 39 per cent. of the eighth grade score, the fourth grade 52 per cent., the fifth grade 64 per cent., and so on. Moreover, these percentages correspond (except in the lowest grades) with the rate of development of the strength of grip of boys and girls in the elementary grades (derived from Smedley's measurement of 6,000 children in the Chicago schools).
Under the circumstances, it is plainly to be seen that rate scores in the tests of class B are determined by maturity and general training rather than by the direct effects of school work.

In this connection the reader is referred to Figure 25 which is reproduced here as Figure 60. The difference between the two curves in the graph represents the difference between the Gary product and the conventional product, or the difference between incidental development and development under formal training. In other words, there is no evidence that the Gary children develop in ability any more rapidly because of the training received in school than they would develop if they left school at the fourth grade and were subject merely to the general training of life. The increase in score in addition would seem to be due almost wholly to

Figure 60
Development in Speed and Accuracy: Addition
[Graph: Addition, diagnostic curve of median development in speed and accuracy, grades 4 to 8 inclusive. Scale along the top: speed (number of examples attempted). Scale along the left: accuracy (per cent.). Legends on the graph: "Any class whose position falls on this side of the curve is high in accuracy. Do not neglect speed." "Any class whose position falls on this side of the curve is low in accuracy. Work to increase accuracy." "Positions to the right (or left) of a grade number in the median curve indicate greater (or less) speed than the median speed for that grade."]

Comparison of development in addition at Gary with median development in rate and accuracy in addition based upon a tabulation of the results from tests of thousands of children in cities of many different types. The scale along the top of the figure represents rate, or the number of examples attempted (speed).
The scale along the left hand side of the figure represents the ratio of examples right to examples attempted, or the accuracy of work expressed in per cent. Each point of the diagram, therefore, represents two scores, rate and accuracy. The portion of the circle marked "4" on the general curve (broken line) represents a rate of 7.4 examples attempted and 64 per cent. accuracy. The curve for the Gary results is shown by the heavy line. The circles indicate the position of the different grade scores. The twelfth grade score in rate falls between the sixth and seventh grade score on the general curve, and in accuracy is slightly below the fifth grade level. The eighth grade Gary results are not quite equal to the general fifth grade scores in rate, and very much lower than the fourth grade in accuracy. The position of the general curve below the fourth grade is not very reliable, but the difference between the general character of the Gary curve and that of the general curve is marked. The general curve indicates that the development of skill in addition is, in the conventional school, nearly completed by the end of grade five, while the Gary curve shows that there is a very small, but regular, increase in rate and accuracy from grade to grade up to the end of the high school years. In high school years there is no direct training for the development of skill in addition, so the progress from grade to grade must represent either incidental training or the effect of the elimination of the less able by non-promotion. Therefore, the Gary curve as a whole would seem to indicate that growth in skill in addition in all grades is due mainly to the same causes, and very little to direct training.

maturity and not to direct training. If this resulted in adequate rate and accuracy of work, no greater commendation of the Gary system could be given.
But the levels of ability developed are not adequate, and, in view of the attention given to formal drill, the figures show merely the extent to which the classroom training fails to function.¹

¹Perhaps it is well to point out that if any part of the low scores at Gary were due to the care with which the tests were given and scored, the effect of making the Gary scores comparable with those from other cities would be merely to shift the position of the Gary curve in the figure, not to change its character. In the opinion of the author, Figure 60 and the tables of this chapter are satisfactory evidence that the Gary scores reflect a real condition, and not merely the effect of some unusual element in the testing conditions.

Class C in Table XCI, page 363, includes the scores of a number of other tests which have very different rates of development. They are given to prove that the figures previously quoted are not the results of chance. A child cannot learn long division, for example, or how to multiply one fraction by another by any ordinary form of incidental training, and the rate of development for such activities is very different from those given in the other two parts of the table. For instance, in accuracy of work in the four operations with fractions, the fourth grade accuracy is 58 per cent. of the eighth grade development, but the increase in the next two grades is very slight, and in the following two grades the increase is very rapid (Table XCII, pages 370-1). In other words, in those grades in which there was practically no training, the accuracy scores are nearly stationary, but as soon as training begins they develop rapidly.¹

From the foregoing discussion it should be evident that in making comparisons from city to city the maturity of the children must be taken into consideration.
If the children in one city are much older for the grade than those of another city, the rate scores made in any test in the first city would be almost invariably higher than those in the second city, even though the two were really equal in educational efficiency. As has already been mentioned, in Gary tabulations of the ages of children show that in the Froebel school in some grades the children are, on the average, a year older than the children of the corresponding grades in the other schools

¹For the benefit of those who would like to check the conclusions above by reference to data from other surveys, such data, arranged in form similar to the Gary results, will be found in Appendix A, page 397.

[Table XCII, pages 370-371: entries not recoverable.]

[Test paper, handwriting: blanks for quality (under 9, 7, and 8), rate, median, and scored by.¹]

¹Under quality, the score under nine is the quality score for the Free Choice Test, under seven, for the handwriting in the Dictation Tests, under eight, for the handwriting in the Composition Test.

(The small figures at the end of any word show the total number of letters up to the end of that word.)

Fourscore 9 and 12 seven 17 years 22 ago 25 our 28 fathers 35 brought 42 forth 47 upon 51 this 55 continent 64 a 65 new 68 nation 74 conceived 83 in 85 liberty 92 and 95 dedicated 104 to 106 the 109 proposition 120 that 124 all 127 men 130 are 133 created 140 equal.
145 Now 148 we 150 are 153 engaged 160 in 162 a 163 great 168 civil 173 war 176 testing 183 whether 190 that 194 nation 200 or 202 any 205 nation 211 so 213 conceived 222 and 225 so 227 dedicated 236 can 239 long 243 endure. 249 We 251 are 254 met 257 on 259 a 260 great 265 battlefield 276 of 278 that 282 war. 285

[Tabulation Sheet, Gary Public Schools, handwriting: quality on the Ayres scale (20 to 90) across the top; speed down the side in groups from 160-169 down to 0-9; frequency column; blanks for school, class number, grade, and scores in speed, quality, and efficiency. Hand-entered tallies not recoverable.]

APPENDIX B

keep up. Now give me your best writing and spelling."

When the second hand of the watch reaches the sixty second mark, read all of the first sentence and tell the children to "write it." Wait until the second hand reaches the position shown by the figures in a parenthesis just before the second sentence, then read that sentence, and so on through the test, the children writing the sentences during the intervals between dictation.

At the conclusion of the test pass the answer cards and have the blanks in the first two lines filled out. Collect cards and papers.

In the fourth, sixth, and eighth grades give both the easy and the difficult words.

[Answer and Record Card, Froebel School, The Gary Public Schools, Grades 6, 7, 8: blanks for age, grade, class, total number of words missed, accuracy, and scored by.]

Score Card, Spelling, Test 6B
Word          No.  Check    Check  No.  Word
victim          1                   11  majority
ought           2                   12  organize
occupy          3                   13  minute
senate          4                   14  century
agreement       5                   15  piece
entitle         6                   16  assist
government      7                   17  suggest
responsible     8                   18  serious
Wednesday       9                   19  expense
pleasant       10                   20  business
Total Words Missed ....  Accuracy ....

[Tabulation Sheet, Class Record Sheet, Emerson School, Gary Public Schools, handwriting: two frequency distributions on the Ayres quality scale (90 down to 10), one for the dictation tests and one for the teacher's samples. Hand-entered tallies not recoverable.]

II. SPELLING LIST TEST

Instructions. On entering a room, give the teacher an answer card for the test and tell her that as soon as the children are ready, you are going to ask her to dictate the words. Give her no further instructions. If she asks any questions about the giving of the test, tell her to follow her usual practice.

Then say to the children: "We are going to have a spelling test to-day. I am going to ask your teacher to dictate twenty words to you, and you will write them on these slips of paper."

Distribute the test papers and have the blanks filled out. Next ask the teacher to dictate the test words. When she has finished, have the papers exchanged twice; then distribute the answer cards for the test and have the blanks filled out. Have the cards exchanged to correspond with the papers, and in the lower grades collect the cards and papers together. In the upper grades the papers are to be corrected by the children; each word being marked (x) or (c), (for wrong or right), the marks to be made on the cards, and not on the papers. Collect the cards and papers together.

[Test Paper, The Gary Public Schools, Test No. 6 A, Spelling: blanks for age, grade, and class; pupil's handwritten answers not recoverable.]
[Class record sheet, spelling: two tally columns with totals, class average, average accuracy, and per cent. perfect. Hand-entered figures not recoverable.]

III. ARITHMETIC

SERIES B (COURTIS)

Instructions. On entering a room, ask the teacher for permission to give a test, and make sure that the children are provided with pencils.

Say: "We are going to have arithmetic tests this week. To-day we are to have addition and subtraction; to-morrow multiplication and division. Please do not turn these papers over (showing the two sides of the test) until we are ready." Then distribute the papers.

Have the blanks filled in. Read the instructions out loud before giving the warning signal: "Get Ready. Hands up"; then when the bell rings, "Start." Observe the time intervals given in the instructions and have the timing checked by the teacher. Give the stopping signal by bell, but say also: "Stop. Hands up. Make a cross by the last example finished."

Distribute the score cards. Have the blanks filled in, writing the number of the class after "Grade." Have the papers and cards exchanged twice. Read the answers from the card, letting the children check the examples right (c) or wrong (x) on the score card in the column marked "Check." Part of an answer does not count. Find the total number of the examples tried and the total right, writing these scores on the face of the card also. Collect the cards by rows.

Have the papers exchanged again. Distribute new score cards, and again have the papers scored. Collect both cards and papers by rows.

Pass the papers for the subtraction test. Have the blanks filled in and read the instructions out loud.
Give the test as before. Pass the score cards and have the blanks filled in, then collect both cards and papers.

SERIES B (CLEVELAND)

Instructions. On entering a room say to the children: "We are to have an arithmetic test to-day. These papers contain five short tests. The first set of examples (point) is called set C; second, set H; the others, sets G, O, and L. Please do not look at any of these examples until we are ready. As soon as you get a copy of the tests, fill out blanks on cover." Distribute the papers. "When I say: 'Get ready,' raise your pencil hand, take hold of cover with the other so you can open test quickly when I say: 'Start.' When I say: 'Stop,' close your papers and stand beside your desk. The first test is a test in the multiplication tables. It is called set C and comes at the top of the page. Write the answers on the paper directly underneath the examples.

Test Paper
Illustration of form in which series B was used

The Gary Public Schools
Arithmetic Test No. 5 A. Addition. Series B. Form 3.

You will be given eight minutes to find the answers to as many of these Addition examples as possible. Write the answers on this paper directly underneath the examples. You are not expected to be able to do them all. You will be marked for both speed and accuracy, but it is more important to have your answers right than to try a great many examples.

Name .... Age .... Grade .... Hour ....

[Addition examples: columns of three-figure addends with the pupil's answers, and blanks for total tried and total right. The column arrangement is not recoverable.]
[Tabulation Sheet, Class Record Sheet, Gary Public Schools, Arithmetic Tests: speed (24 down to 1) down the side; per cent. of accuracy (100, 90, 80, 70, 60, 50, 49-0) across the top; frequency column; blanks for median scores in speed, accuracy, and efficiency. Hand-entered tallies not recoverable.]

Now lay your pencils down while we practice starting: 'Get ready, start, stop, close papers, stand.'"

When all understand, give the first test, warning the children to multiply just before giving the signal: "Get ready." Have children put a circle around the last problem completed when they stop.

While the children are standing, say: "The next set of examples, Set H, on the lower half of this page, contains easy examples in the addition and subtraction of fractions. Watch the signs and do what they tell you to do. If you cannot work these examples, close your papers and wait for the next test." Give the test.

For the next test, say: "Set G is another multiplication test. Write your answers directly underneath the examples." Give the test.

"The next test, Set O, on lower half of cover page, is made up of examples in addition, subtraction, multiplication, and division of fractions. Watch the signs and do what they tell you to do."
Give the test.

"Set L, the last test, is on the back of the paper, and is a test in long multiplication." Give the test.

Distribute cards, one set at a time, and have blanks filled out. Then place all cards inside folder. Collect in grades three and four. In other grades exchange twice and score in this order: L, G, and C as far as time permits.

[Test Paper, The Gary Public Schools, arithmetic: contents not recoverable.]

[Answer and Record Card, Test No. 11, C, The Gary Public Schools: blanks for name, age, grade, class, number tried, number right, and accuracy. Answers for Test 11, C, as far as legible, each followed by a check of c or x: 6, 28, 72, 30, 4, 18, 42, 45; 9, 10, 32, 30, 2, 9, 54, 28; 6 (doubtful), 16, 49, 24, 7, 24, 81, 12; 5, 16, 72, 20, 4, 12, 64, 27; 7, 12, 48, 27, 4, 18, 63, 25; 24. Blanks for total tried and total right.]

[Tabulation Sheet, Class Record Sheet, Gary Public Schools, Arithmetic: speed (24 down to 1) down the side; per cent. of accuracy (100, 90, 80, 70, 60, 50, 49-0) across the top; frequency column; blanks for scores in speed, accuracy, and efficiency. Hand-entered tallies not recoverable.]
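
The scoring described in the arithmetic instructions above, in which each example tried is checked right (c) or wrong (x) and the totals tried and right are found, may be sketched as follows. Accuracy is taken as the ratio of examples right to examples attempted, expressed in per cent., as defined under Figure 60. The function name, the rounding to a whole per cent., the treatment of a card with no examples tried, and the sample marks are assumptions made for illustration only.

```python
def score_card(checks):
    """Return (tried, right, accuracy in per cent) for one pupil's check column.

    `checks` holds one mark per example tried: "c" for right, "x" for wrong.
    Part of an answer does not count, so every mark is either "c" or "x".
    """
    marks = list(checks)
    unknown = [mark for mark in marks if mark not in ("c", "x")]
    if unknown:
        raise ValueError(f"unexpected marks: {unknown}")
    tried = len(marks)
    right = marks.count("c")
    # Assumption: a card with nothing tried is given an accuracy of 0.
    accuracy = round(100 * right / tried) if tried else 0
    return tried, right, accuracy


# Hypothetical check column for a single card (not taken from the survey).
example_checks = ["c", "c", "x", "c", "c", "x", "c", "c"]
print(score_card(example_checks))
```

The two figures returned first are those written on the face of the card; the third is the figure entered against "Accuracy."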
IV. COMPOSITION

Instructions. Give list of subjects to teacher to write on board. Say to the children: "I want to find out to-day what kind of a composition you can write. I am going to ask you to write a story about an interesting experience that you or your friend may have had sometime or other. These are to be your own stories; nothing that you have just read somewhere or seen at a moving picture theatre. A real story will probably be best, but if you cannot write a real story, you may make up one of your own. Make it as interesting or as exciting as you can. Your teacher has written some suggestions on the board to help if you cannot think of anything yourself." Read them.

An Interesting Experience.
A Storm.
An Accident.
A Runaway.
An Errand at Night.
An Unexpected Meeting.
On the Ice.
In the Woods.
On the Water.
In the Mountains.

"However, you do not have to use any of these subjects unless you want to. Will those in the front seats please distribute the papers for me?"

Fill in the blanks.

"I am going to ask you to start together. You may spend the first few minutes thinking of what you are going to say, if you wish. I shall give you about twenty minutes in which to write your story. If you need more paper than the sheet which you have, raise your hand and I will bring you another. If you should finish your story before the time is up, please let me know by raising your hand."

Start, allow twenty minutes, stop. Have children count words written.

[Answer and Record Card, Emerson School, with entries for: 1. Number of words written (rate). 2. Vocabulary (coefficient). 3. Spelling. 4. Capitalization. 5. Punctuation. 6. Grammar.]
[Answer and Record Card, continued: 7. Time. Vocabulary Analysis Sheet: total and different words under each initial letter, with totals.* Hand-entered figures not recoverable.]

*Summary of records on page 512.

[Tabulation Sheet, Class Record Sheet, Gary Public Schools, Composition: quality on the Hillegas scale (10 to 90) across the top; speed down the side; frequency column; blanks for school, class number, and scores in speed, quality, and efficiency. Hand-entered tallies not recoverable.]

V. READING

GRAY'S ORAL READING SCALE

Instructions. Find as quiet a place as possible to give the tests; arrange with the teacher to send one pupil to the door of this room every seven minutes. When you are ready to begin, hand the pupil a copy of the test and give the following directions:

"I have here some paragraphs which I should like to have you read for me. Read at your natural rate, as you read for your teacher in class. Begin at the first paragraph when I say 'Begin.' Stop at the end of each paragraph until I say: 'Next.' When you have finished a paragraph hold the paper at your side until I ask you to read the next. If you should find some hard words, read them as well as you can without help and continue reading."

SUGGESTIONS

A mispronunciation that is evidently due to a foreign accent is not to be counted as an error. However, indicate the foreign accent at the top of the page.

In general, there are six types of errors: gross errors, minor errors, substitutions, insertions, omissions, and repetitions. When two errors occur in a single word or phrase, only one error, the last, should be counted. (For instance, a mispronunciation and a repetition.)
A single word repeated once is not counted a repetition, but if it is repeated more than once it is a repetition. Watch for the substitution, insertion, or omission of letters as well as of words and syllables. Keep time to the nearest second only.

Place each pupil under each of the following categories:
Expression: No expression (N. E.), mechanical (M. E.), intelligent (I. E.).
Type of reading: Word (W), sentence (S), paragraph (P).

REPRODUCTION TEST

Instruction. The Reading and Reproduction Tests are to be given as follows: "Little Baby Bear" in grades two, three, four, and five; "Nothing on the Breakfast Table," in grades five, six, seven, and eight; "An Accident" in grades eight, nine, ten, eleven, and twelve; "Two Ways of Asking a Favor" in grades four, five, six, seven, and eight.

In second grade classes no attempt need be made to have the children reproduce the story, if in the judgment of the teacher they lack ability to write connected sentences.

Time allowance for "Little Baby Bear" is one minute for second and third grades, thirty seconds for fourth and fifth grades; for "Nothing on the Breakfast Table" one minute; for "An Accident," one minute; and for "Two Ways of Asking a Favor," forty-five seconds.

[Answer and Record Card, Emerson School, The Gary Public Schools, Gray's Oral Reading Test, Test No. 12: blanks for name, grade, class, standard, ability, and rate of reading; Gray's Reading Scale, with columns No., Time, Errors, Standard Value, and Product for paragraphs 1 to 12; grade values for paragraph 1, grades I to VIII; score. Hand-entered figures not recoverable.]

[Tabulation Sheet, Froebel School, The Gary Public Schools, Class Sheet, rate of work. This sheet was designed for test in copying figures, but was used for tabulating scores in oral reading.]
[Tabulation sheet, continued: scores from 140 down to 10 in steps of 10, with totals. Hand-entered tallies not recoverable.]

[Answer and Record Card, The Gary Public Schools, reading and reproduction: blanks for grade, class, number of words read and rate, number of words written and rate, and quality of reproduction.]

... to be three minutes. Set the clock at 12:00. Start with bell and stop with bell. Have teacher time test with stop-watch. Record the time interval reported by teacher.

For each test, give the warning signal: "Get ready. Hands up," and the stopping signal: "Stop. Put a cross in the right hand margin opposite the last question you have answered. Close your papers."

Distribute score cards. Have name, sex, age, and grade filled in. For grades three, four, five, and six have the score cards placed within the test paper. Collect the papers.

For grades seven to twelve distribute the score cards. Have name, sex, age, and grade filled in, then have the pupils exchange papers and cards (twice).

[Tabulation Sheet, The Gary Public Schools, Class Sheet: scores grouped in threes from 48, 49, 50 down to 0, 1, 2, for Form 1 and for Form 2, each with a frequency column; totals. Hand-entered tallies not recoverable.]
[Tabulation sheet, continued: blanks for possible and actual errors in scoring, tabulating, and miscellaneous.¹]

¹This sheet was designed for tabulating results of cancellation tests, but was used as shown.

Then say: "Here is a large score card like the small one you have. Watch me score a paper. The correct answer to the first one is 'yellow' (pointing). It is yellow on this paper so I will mark this score C; the second answer, etc.; this answer is wrong so I will mark out the score with a cross. This is the last answer on the paper, so I will mark out all the rest of the scores. Now I will add all the scores not crossed out and write the sum here. Score your paper in this same way. If you don't understand, study the instructions on the card. When you have found your total score, record it after the word 'Score' here (pointing)."

Collect the cards by rows. Have the papers exchanged again. Distribute new score cards. Have each child fill out a card from the paper he now has, following the same procedure as before. Collect the cards.

TRABUE LANGUAGE SCALE

Instructions. "I have another reading test to-day. This sheet contains some incomplete sentences, which form a scale. This scale is to measure how carefully and rapidly you can think, and especially how good you are in your language work. You are to write one word on each blank, in each case selecting the word which makes the most sensible statement. You may have just seven minutes in which to sign your name at the top of the page and write the words that are missing. The paper will be passed to you face downward. Do not

[Answer and Record Card, Kansas Reading Test: blanks for name, age, grade, and score.]

INSTRUCTIONS

The opposite side of this card gives the correct answers. Compare each answer on the test paper with the corresponding answer on the card. When an answer is wrong, draw a line through the value for that answer in the column headed Score.
In the same way mark out the scores of all exercises not tried. Find the total of all the scores not crossed out and record it at bottom of the column, and after the word score above.

[Score Card, Kansas Reading Test, The Gary Public Schools: answers to tests for Grades 5, 6, 7, 8, with the value of each exercise in the column headed Score. Printed answers and values not recoverable.]

Tabulation Sheets: Conventional Scores

[Froebel School.]

(This record sheet is to be returned to the Bureau of Educational Measurements and Standards, Kansas State Normal School, Emporia, Kan. A duplicate may be retained by the teacher. If needed, additional copies of this record sheet will be sent free.)

Kansas Silent-reading Test: Class Record Sheet

City .... School .... Grade .... Teacher .... Date ....

Instructions for Making the Distribution of Pupils' Scores, and for Finding the Median Score

1. The teacher must be careful that her papers are grouped correctly by classes. If she has but one grade of pupils, say 5th grade, or but two divisions of one grade, say 5th A and 5th B, then her papers are all grouped together and but one "distribution" made. If, however, she has parts of two or more grades, say part 5th and part 6th, she must make two or more piles of papers, one for each grade.

2. Arrange the children's papers for any class group in order of scores, the lowest score on top.

3. To make the distribution called for, count the number of papers whose scores fall within the successive groups listed. For instance, if the lowest score is 3.5, the next lowest 5.7, the next 7.1, 7.8, 8.3, and so on, you will put "1" in group marked "between 3 and 3.9," "1" in the group marked "between 5 and 6.9," "3" in the group marked "between 7 and 8.9," and so on until the whole number of scores are recorded. The sum of these numbers must equal the number of children taking the test.

4. The median score is the score on the middle paper in the pile of papers arranged according to size of scores. If there are 35 papers, the median score is the score on the 18th paper. If there are 36 papers, the median score is half way between the score on the 18th paper and the score on the 19th paper.

DISTRIBUTION OF PUPILS' SCORES
(Number of pupils to be entered against each group. Groups legible in this copy:)
between 0 and .9
between 1 and 1.9
between 2 and 2.9
between 3 and 3.9
between 5 and 6.9
between 9 and 10.9
between 11 and 12.9
between 13 and 14.9
between 15 and 17.9
between 18 and 20.9
between 24 and 26.9
between 27 and 29.9
between 30 and 34.9
between 35 and 39.9
between 45 and 49.9
between 60 and 69.9
between 70 and 79.9
80 and above
Total number of children ....
Median score ....
(OVER)

[Rate and Accuracy Scores, Tabulation Sheet, Gary Public Schools, Kansas Reading Test: speed (41 down to 0) down the side; per cent. of accuracy (1-19, 20-39, 40-59, 60-69, 70-79, 80-89, 90-99, 100) across the top; total column; blanks for school, class number, grade, standard, and scores in speed and accuracy. The printed cells in each row rise with accuracy up to the full speed; the row for speed 40, for example, reads 8.0, 16.0, 24.0, 28.0, 32.0, 36.0, 40.0. Hand-entered tallies not recoverable.]
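
The instructions on the Kansas class record sheet for making the distribution and finding the median may be sketched as follows. This is an illustration under stated assumptions, not the Bureau's own procedure in code: the function names are hypothetical, and the lower limits 4, 21, 40, and 50 are inferred from the sequence of groups, since those groups are not legible in this copy. The sample scores are the five given in instruction 3.

```python
from bisect import bisect_right

# Lower limits of the score groups on the class record sheet.
# Assumption: the limits 4, 21, 40, and 50 are inferred from the sequence;
# the corresponding groups are not legible in this copy.
GROUP_LOWER_LIMITS = [0, 1, 2, 3, 4, 5, 7, 9, 11, 13, 15, 18,
                      21, 24, 27, 30, 35, 40, 45, 50, 60, 70, 80]


def distribution(scores):
    """Count the papers falling in each group, keyed by the group's lower limit."""
    counts = {}
    for score in scores:
        if score < 0:
            raise ValueError("scores cannot be negative")
        # Find the last lower limit that does not exceed the score.
        lower = GROUP_LOWER_LIMITS[bisect_right(GROUP_LOWER_LIMITS, score) - 1]
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


def median_score(scores):
    """Score on the middle paper, or half way between the two middle papers."""
    ordered = sorted(scores)  # lowest score on top
    number = len(ordered)
    if number == 0:
        raise ValueError("no papers to score")
    middle = number // 2
    if number % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# The five scores named in instruction 3 of the record sheet.
sheet_example = [3.5, 5.7, 7.1, 7.8, 8.3]
print(distribution(sheet_example))
print(median_score(sheet_example))
```

With 35 papers the function returns the score on the 18th paper, and with 36 papers the value half way between the 18th and 19th, as the record sheet directs.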
turn it over until we are all ready. After the signal is given to start, remember that you are to write just one word on each blank and that your score depends on the number of perfect sentences you have at the end of seven minutes."

The papers are then distributed.

"After you have been working seven minutes, I shall say: 'Stop.' You will all please stop at once. Now if you are all ready, when the bell rings you may turn your papers over, sign your names and fill the blanks."
Five seconds before the signal from the automatic timer, the warning "Get ready" was given, and when the bell rang, the command "Start" was also given. Seven minutes later, when the bell rang the second time, the command "Stop" was given, and then the instructions: "Mark a cross just before the number of the sentence at which you stopped. Now write 'boy' or 'girl' after your name at the top of the sheet. Then just below your name, write your age, your grade, and the hour. Turn your papers face down. The children at the back of the room please collect the papers for me."

[Facsimiles of filled-in record forms: Trabue Language Scales, Answer and Record Card (Test 4, Trabue), with entries for name, boy or girl, grade, trials, weighted score, deviations from weighted score, and irregularities; and the Gary Public Schools Tabulation Sheet and Class Sheet,1 with score columns from "over 160" downward for boys and girls, possible and actual errors in scoring and tabulating, accuracy, and median.]

1 This sheet was designed for the test in copying figures, but was adapted to the needs of this test as above.

THE PUBLICATIONS OF THE GENERAL EDUCATION BOARD

REPORTS:
THE GENERAL EDUCATION BOARD: AN ACCOUNT OF ITS ACTIVITIES, 1902-1914. CLOTH, 240 PAGES, WITH 33 FULL-PAGE ILLUSTRATIONS AND 31 MAPS.
REPORT OF THE SECRETARY OF THE GENERAL EDUCATION BOARD, 1914-1915. CLOTH AND PAPER, 82 PAGES, WITH 8 MAPS.
REPORT OF THE SECRETARY OF THE GENERAL EDUCATION BOARD, 1915-1916. CLOTH AND PAPER, 86 PAGES, WITH 10 MAPS.
REPORT OF THE SECRETARY OF THE GENERAL EDUCATION BOARD, 1916-1917. CLOTH AND PAPER, 92 PAGES, WITH 3 FULL-PAGE ILLUSTRATIONS AND IJ MAPS.
ANNUAL REPORT OF THE GENERAL EDUCATION BOARD, 1917-1918.
STUDIES:
PUBLIC EDUCATION IN MARYLAND, BY ABRAHAM FLEXNER AND FRANK P. BACHMAN. 2ND EDITION. 176 PAGES, AND APPENDIX, WITH 25 FULL-PAGE ILLUSTRATIONS AND 34 CUTS.
THE JUNIOR HIGH SCHOOL, BY THOMAS H. BRIGGS.*
COLLEGE AND UNIVERSITY FINANCE, BY TREVOR ARNETT.*

OCCASIONAL PAPERS:
1. THE COUNTRY SCHOOL OF TO-MORROW, BY FREDERICK T. GATES. PAPER, 15 PAGES.
2. CHANGES NEEDED IN AMERICAN SECONDARY EDUCATION, BY CHARLES W. ELIOT. PAPER, 29 PAGES.
3. A MODERN SCHOOL, BY ABRAHAM FLEXNER. PAPER, 23 PAGES.
4. THE FUNCTION AND NEEDS OF SCHOOLS OF EDUCATION IN UNIVERSITIES AND COLLEGES, BY EDWIN A. ALDERMAN. PAPER, 29 PAGES, WITH APPENDIX.
5. LATIN AND THE A. B. DEGREE, BY CHARLES W. ELIOT. PAPER, 21 PAGES, WITH APPENDIX.
6. THE WORTH OF ANCIENT LITERATURE TO THE MODERN WORLD, BY VISCOUNT BRYCE. PAPER, 20 PAGES.
7. THE POSITIVE CASE FOR LATIN, BY PAUL SHOREY.*

* In Preparation.

The REPORTS issued by the Board are official accounts of its activities and expenditures. The STUDIES represent work in the field of educational investigation and research which the Board has made possible by appropriations defraying all or part of the expense involved. The OCCASIONAL PAPERS are essays on matters of current educational discussion, presenting topics of immediate interest from various points of view. In issuing the STUDIES and OCCASIONAL PAPERS, the Board acts simply as publisher, assuming no responsibility for the opinions of the authors.

The publications of the Board may be obtained on request.