UNIVERSITY OF ILLINOIS BULLETIN
Issued Weekly
Vol. XIX. January 30, 1922. No. 22

[Entered as second-class matter December 11, 1912, at the post office at Urbana, Illinois, under the Act of August 24, 1912. Acceptance for mailing at the special rate of postage provided for in section 1103, Act of October 3, 1917, authorized July 31, 1918.]

BULLETIN NO. 8
BUREAU OF EDUCATIONAL RESEARCH
COLLEGE OF EDUCATION

A CRITICAL STUDY OF CERTAIN SILENT READING TESTS

By Walter S. Monroe, Director

Price 50 Cents

PUBLISHED BY THE UNIVERSITY OF ILLINOIS
URBANA

TABLE OF CONTENTS

Preface
The Measurement of Silent Reading Ability
The Problem
The Data Collected
The Performances Required of a Pupil
Description of Pupils' Performances
Scoring Reproductions
The Idea-Counting Method
Brown's Method of Idea Counting
The Word-Counting Method
Subjectivity of Describing Reproductions
Constant Errors and Variable Errors
Summary for Describing Reproductions
Scoring Answers to Questions
Describing the Quality of Compositions
Time Required for Scoring Test Papers
Average Scores and Standard Deviations
Equivalence of Duplicate Forms
Relation of Vocabulary to Difficulty
Formation of Composite Scores
Reliability
Methods of Determining Reliability
Probable Error of r Due to Sampling
Reliability of Tests Studied
Discrimination
Comparison with Teachers' Ratings
Correlation of Comprehension with Memory
Corrected Coefficients of Correlation
Correlation of Comprehension with Vocabulary
Correlation of Cancellation Scores with Measures of Rate of Reading
Correlation of Comprehension with Written Composition
Inter-correlation between Tests
Correlation of Single Tests with Composites
Summary of Conclusions
Correlation with Composites

PREFACE

In the field of silent reading, as well as in the fields of other school subjects, the number of available educational tests has been increased so that one desiring to use a test is confronted with the necessity of making a choice. If such a choice is to be made intelligently it is necessary to have at hand experimental data with reference to the reliability and validity of the tests considered. The study which is reported in this monograph was undertaken for the purpose of securing such data with reference to certain silent reading tests.

The report is presented in hopes that users of silent reading tests will find the information that it contains helpful in making an intelligent selection of educational tests in this field. The monograph will doubtless also be of interest to students in the field of educational measurements.

Walter S. Monroe, Director,
Bureau of Educational Research.

A CRITICAL STUDY OF CERTAIN SILENT READING TESTS

The measurement of silent reading ability. The scores yielded by silent reading tests may fail to be true measures of silent reading ability for two reasons. First, the scores may not be reliable or accurate. A score is lacking in reliability when two applications of a test or of duplicate forms of it do not yield approximately the same score when administered to the same pupils, as far as possible, under the same conditions. Included in this is any lack of objectivity in the scoring of the test. Second, the performance which a pupil gives on a silent reading test may depend upon other factors in such a way that it is an index of these factors rather than of silent reading ability. For example, when a pupil answers questions from memory his answers may be influenced to such an extent by his ability to remember that his performance is not a truthful index of his ability to read silently.
Two aspects of the activity of silent reading may be recognized. First, the reading mechanism consists of perception, eye-movement habits, etc. The rate of silent reading is largely dependent upon this mechanism and hence any measure of rate is an index or symptom of the quality of the mechanism. Second, the thought-getting or comprehension aspect of silent reading involves the higher mental processes. The quality of this is indicated by the comprehension scores. Comprehension is not entirely independent of the mechanism of silent reading, but, if sufficient time is allowed, pupils who possess poor reading mechanism may stand high in thought-getting.

The problem. The problem of this study is to ascertain the reliability and, so far as possible, the function and validity of certain silent reading tests. These tests, as will be shown later, differ in the performances which are required of the pupils. They also differ in other respects. Their titles suggest that all of the silent reading tests included in this study are designed to measure silent reading ability. The fact that they differ widely in certain respects suggests the possibility that no two of them measure the same type of reading ability, or at least that they do this with different degrees of validity. The study has been restricted to tests which yield some measure of the rate of reading as well as a measure of comprehension in order that the measurement of both phases of silent reading activity might be studied. With one exception, the tests which have been used have duplicate forms. In addition to the silent reading tests, certain other tests were given to the same pupils, because it was thought that the scores yielded by them might assist in the analysis and interpretation of the scores yielded by the silent reading tests.

The data collected. Through the courtesy of Superintendent W. W.
Earnest and certain teachers of the Champaign Public Schools, the tests chosen for this study were given in the spring of 1920 to a number of pupils in the fourth and seventh grades. All of the tests were administered by Miss Dora Keen, at that time a research assistant in the Bureau of Educational Research. Care was exercised to secure as nearly uniform testing conditions as can be obtained in the ordinary schoolroom. The lapse of time between the giving of the different forms of the same test was made as nearly equal as possible for the different groups. Only in rare instances were tests given after recess in the afternoon or during the afternoon session on Friday. The tests were given to all pupils in four rooms in both the fourth and seventh grades. The total number of pupils tested in each grade was approximately 140. The study is, however, based upon the records of only those pupils who took all of the tests. The number of complete records in the fourth grade is 80 and in the seventh grade, 91.

The following tests were given in the fourth grade:

1. The Courtis Silent Reading Test No. 2,[1] Form 1, "The Kitten Who Played May Queen," and Form 3, "The Kitten Who Caught a Fish."
2. Brown's Silent Reading Test,[2] Form 1, "The Long Slide," and Form 2, "A Morning Adventure."
3. Monroe's Standardized Silent Reading Test I,[3] Forms 1, 2, and 3.

[1] Courtis Silent Reading Test No. 2. Forty-sixth Annual Report. Kansas City, Missouri: Board of Education, 1917. pp. 79-85.
[2] Brown, H. A. "The Measurement of Ability to Read." A Manual of Directions Concerning Giving and Scoring of Reading Tests, Statistical Treatment of the Data and Diagnosis of School Class and Individual Needs. Concord: New Hampshire Department of Public Instruction (in cooperation with the General Education Board). Bureau of Research Bulletin No. 1, Second Edition, 1916. Pp. 57.
[3] Monroe, W. S. "Monroe's Standardized Silent Reading Tests." Journal of Educational Psychology, 9:303-12, June, 1918.
4. Fordyce's Scale for Measuring Achievement in Reading,[4] Test No. 1, "Narcissus."
5. Experimental Reproduction Test I, Form 1, based on pages 84 and 85 of the supplementary reader, "The Strike at Shane's,"[5] and Form 2, based on pages 6 and 7 of the same publication. The passage for Form 1 contains 370 words and that for Form 2, 395 words. In administering these tests the pupils read from the supplementary reader. The exact place of beginning had been marked in each copy. Also the end of the passage to be read was indicated.
6. Cross-Out Silent Reading Test I, Form 1 and Form 2. This is an experimental silent reading test. In a passage of rather simple reading material, words were substituted which did not agree with the meaning of the preceding words in the sentence. A pupil is asked to cross out the words which do not fit. With the exception of the substituted words, the selection is a connected story.
7. Vocabulary Test. The words of this test are those used by Terman and Childs. The form of the test is that proposed by Whipple.[6]
8. Cancellation Test, "a-t" and "e-r."[7]
9. Memory, "How Mr. Lincoln Helped the Pig."[8]

The following tests were given in the seventh grade:

1. Starch's Silent Reading Test No. 6 and Test No. 7.[9]
2. Monroe's Standardized Silent Reading Test II, Forms 1, 2, and 3.
3. Fordyce's Scale for Measuring Achievement in Reading, Test No. 2, "Spirit of Spring."

[4] Fordyce, Charles. "A Scale for Measuring the Achievements in Reading." The University Publishing Company, Lincoln, Nebraska, and Chicago. 1916.
[5] "The Strike at Shane's." (Gold Mine Series, No. 2.) Boston: American Humane Education Society, 1908. pp. 91. (A supplementary reader for the fourth grade which has as its lesson kindness to domestic animals.)
[6] Whipple, G. M. Manual of Mental and Physical Tests, Complex Processes, Chapter 12. Baltimore: Warwick and York, 1914.
[7] This test is described by Whipple in the Manual of Mental and Physical Tests, Simpler Processes, p. 311.
[8] Whipple, G. M.
Manual of Mental and Physical Tests, Simpler Processes, pp. 207-10.
[9] Starch, Daniel. The Measurement of Efficiency in Reading. Journal of Educational Psychology, 4:1-24, 1915. These tests were used as duplicate forms.

4. Experimental Reproduction Test II, Form 1, based on pages 6 and 7 of the supplementary reader, "Old English Heroes,"[10] and Form 2, based upon pages 8 and 9 of the same publication. The passage for Form 1 contains 662 words, and that for Form 2, 611 words.
5. Cross-Out Silent Reading Test II, Form 1 and Form 2. This test is similar to the Cross-Out Silent Reading Test used in the fourth grade but is based upon more difficult material.
6. Pressey Silent Reading Test for Grades VI, VII, and VIII, Form 1 and Form 2. This is an experimental test.
7. Vocabulary Test. This is the same test as that used in the fourth grade.
8. Cancellation Test, "a-t" and "e-r." This is also the same as that used in the fourth grade.
9. Memory Test, "Marble Statue."[11]
10. Composition Test. The Willing Composition Scale[12] and the directions which accompany it were used.

[10] Bush, Bertha E. Old English Heroes. (Instructor Literature Series, No. 116.) Danville, N. Y., and Chicago: F. A. Owen Publishing Co., and Hall and McCreary, 1909. Pp. 31. This is a supplementary reader suitable for the upper elementary grades. It contains brief sketches of the lives of Alfred the Great, Richard the Lion-Hearted, and the Black Prince.
[11] Whipple, G. M. Manual of Mental and Physical Tests, Simpler Processes, pp. 107-10.
[12] Willing, M. H. Measurement of Written Composition in Grades IV to VIII, English Journal, 7:193-202, March, 1918.

In addition to the above tests a rating for ability in silent reading was secured from the teachers. To guide them in making this rating, the teachers were given the following directions:

Think of all the fourth (seventh) grade pupils with whose silent reading ability you have ever become acquainted from the best to the poorest. Compare each child in your present class with this distribution of pupils. Give a pupil a rating of 5 if he has very superior ability in silent reading equalled only by about seven out of every hundred, or 7 percent of fourth (seventh) grade pupils. Give him a rating of 4 if he has superior ability or ability above the average, yet is excelled by the very superior group. About 24 out of every hundred, or 24 percent of fourth (seventh) grade pupils, will fall in the superior group. Give him a rating of 3 if he possesses average ability, i.e., ability which lies somewhere close to the middle of the difference between the very best pupil and the very poorest. About 38 out of every hundred, or 38 percent of fourth (seventh) grade pupils, will fall in this average group. If the pupil is below the average in ability to read and yet does not equal the poorest you have ever known give him a rating of 2. This group is called inferior and will contain about 24 out of every hundred, or 24 percent of fourth (seventh) grade pupils. Give the pupil a rating of 1 if he is very inferior in ability to read so that he is as poor or very nearly as poor as the poorest pupil you have ever known. About 7 out of every hundred, or 7 percent of fourth (seventh) grade pupils, will fall in this very inferior group.

The above directions do not mean that you will necessarily be obliged to give 7 percent of your class a rating of 5; 24 percent, a rating of 4; 38 percent, a rating of 3; 24 percent, a rating of 2; and 7 percent, a rating of 1. They do mean, however, that a large number of pupils, a number running up into the hundreds, can be divided in exactly this manner, i.e., 7 percent, very superior; 24 percent, superior; 38 percent, average; 24 percent, inferior; and 7 percent, very inferior.
You are to think of all the pupils you have ever known from the best to the poorest and by comparison give each pupil in your present class the rating he would receive if he were included with all the pupils you have known and the entire number should be rated in the above manner.

The performances required of a pupil. All of the silent reading tests in the above list are designed to measure the ability to read silently. However, they require a variety of performances from the pupil. In the Courtis Silent Reading Test No. 2, the pupil is required to read a continuous selection for three minutes. At the end of this time he turns to another section of the test and answers questions based upon the selection he has just read. The questions are to be answered by either "yes" or "no." The selection read is repeated in connection with the questions so that the pupil may refer to it in case he does not remember the answer to any question. The Brown Silent Reading Test and the Starch Silent Reading Tests require the pupil to read a selection and then reproduce what he can remember. Starch allows thirty seconds reading time, while Brown allows one minute. The Monroe Standardized Silent Reading Tests consist of a series of exercises. Each exercise consists of one paragraph and a question based on it. Most of the answers are to be given by drawing a line under a word. Five minutes are allowed for the test. The Fordyce Scale for Measuring Achievement in Silent Reading[13] requires the pupil to read a selection and then answer from memory questions based on it. The selection for Test 1 contains 300 words. The time allowance is 125 seconds. The selection for Test 2 contains 512 words with a time allowance of 140 seconds. The time allowed for the reading is intended to be such that 50 percent of the pupils will finish before time is called.

[13] This test has only one form. Test 1 was given in the fourth grade and Test 2 in the seventh grade.

The directions which accompany the Fordyce Scale for Measuring Achievement in Silent Reading are stated in general terms. For this reason it was necessary to formulate the exact explanation to be given to the pupils. The following was used:

Do not turn over your paper until I tell you to begin. These papers have a story on them. You are to read the story at your ordinary rate of reading, carefully enough so that you will be able to reproduce the leading thoughts. When I say "mark," draw a line around the word at which you are looking at that time. If you have not finished go right on reading until you come to the end of the story. Then immediately turn your paper face down and sit quietly until all have finished. You are to read the story once and once only, and just as soon as you have finished, turn your paper down. Is there any one who does not understand exactly what to do? All right! Begin!

In the Experimental Reproduction Test the following directions were used:

Do not open your books until I tell you to begin. Write your name and school on the card.[14] This is a test to find out how rapidly and how well you can read. Read carefully; for you will be asked to write out what you have read. Put your finger in the book this way (illustrating). When I say "begin" open your books and begin to read at the first blue mark here (illustrating). When I say "mark," draw a line around the word at which you are looking (illustrate), then go right on reading until you come to the last blue mark. Then close your book and sit quietly until all have finished. Read over only once. Do not forget to draw a line around the word where you are reading when I say "mark." Is there anyone who does not understand just what he is to do? All right! Begin!

The time allowance was thirty seconds. After they had completed the reading, the pupils were asked to write, in as nearly the same words as possible, all that they had read.
This reproduction completed, they were asked to answer a list of questions based upon the selection read. They were not given an opportunity to consult the reproduction nor to add to it after answering the questions.

[14] A 3x5 card was fastened to the copy of the supplementary reader which was given to each pupil. Before the books were distributed to another class the rate scores were recorded on the cards and a new card attached.

The nature of the Cross-Out Silent Reading Test is illustrated by the directions given to the seventh grade pupils:

Below you will find a paragraph of a story. Certain words in this paragraph do not belong there, that is, they do not make sense and do not agree with what has gone before. Read this paragraph carefully and draw a line through all the words which do not belong there. Do not write anything. Do nothing except cross out the words which do not make sense with what has gone before. Is there anyone who does not understand what he is to do? Remember to cross out only the words which do not agree with what has gone before. All right! Go ahead!

"It happened in our country long ago, in those old days when only a few white people lived here and everything was rough and civilized. Strong men were at work among the hills, cutting down the brooks and planting corn in the new fields, and towns were springing up all along the walls, but still there were many miles of forest where Indians hunted and bears and wolves had their palaces."

In this paragraph the words to be crossed out are "civilized", "brooks", "walls", and "palaces". These answers were read to the pupils after they had marked the paragraph. In case any failed to understand the nature of the exercise it was explained to them. They were then directed as follows:

In the following pages you will find part of a story. It is not a fairy story. In this story, as in the paragraph above, there are words which do not agree with the meaning of what has gone before.
Cross them out just as you did in the above paragraph. Be sure to cross out all the words which do not belong, but cross out only those words; for if you cross out any word which should not be crossed out it will be counted as a mistake. You will be allowed four minutes to work. Many of you will be unable to finish during this time. It is more important, however, to do your work correctly than to cover a great deal of ground. Do all three pages. When I say "begin" turn the page and start to work. If anyone finishes before the time is up, close your paper and sit quietly. Is there anyone who does not understand just what he is to do? All right! Begin!

The directions to the fourth grade pupils differed from the above in only two respects. Two additional illustrative paragraphs were used and the time allowance was three minutes instead of four.

The nature of the Pressey Silent Reading Test for Grades six to eight may be illustrated by the directions:

Look at the first example given just below:

1. February is the longest month in the year.

The above statement is not true; but there is only one word that makes the sentence untrue. This one word is the word "longest"; if "longest" were changed to "shortest", the sentence would then read, "February is the shortest month in the year", which is true. "Longest" is wrong; so take your pencils and cross it out. Draw a line through it because it is wrong.

Look at the second example just below:

2. The day dawned bright and dreary; the clear morning light streamed in through the windows and filled the room with its cheery brightness.

In this paragraph, also, there is one, and only one, word that is wrong, the meaning of which does not fit in with the meaning of the rest of the paragraph. The word is "dreary". Cross it out.

Two additional illustrative exercises were given and the pupil directed as follows:

And now - everyone attention!
In each of the paragraphs on the other side of the page, there is one, and only one, word that is wrong, which makes the paragraph untrue, or whose meaning does not fit in with the meaning of the rest of the paragraph. Cross that word out. And remember, there is only one word in each paragraph that is wrong. Be sure to take the paragraphs in order. Never skip a paragraph without attempting it. Read rapidly and accurately. You will be given 10 minutes in which to work. Ask no questions. Now, turn over the page, and all start!

In the vocabulary tests the following directions, which are printed on the test papers, were read to the pupils:

Below are 100 words which are designed to measure the size of your vocabulary. Consider each one carefully, and place before it one of these four marks:

(1) the mark "D" if you could define it as exactly as words are ordinarily defined in the dictionary.
(2) the mark "E" if you could explain it well enough to give some idea of its meaning to one who is not familiar with it, though you could not give an exact definition that would satisfy an expert.
(3) the mark "F" if the word is merely roughly familiar, so that you have only an indefinite idea of its meaning and could not use it intelligently.
(4) the mark "N" if the word is entirely new and unknown to you.

When you have finished, count the marks and fill out these blanks, making sure that the numbers add to one hundred.

In the fourth grade these directions were modified somewhat in order to make certain that the pupils would understand them. Fifteen minutes were allowed for the test in both grades.

The Cancellation Tests consist of a page of Spanish text. For the "a-t" test the following directions were given to the pupils:

On this paper you will find a large number of words from a foreign language. Draw a line through each of these words which contain both an "a" and a "t." If the word has an "a" but not a "t" in it do not cross out the word.
If it has a "t" but not an "a" do not cross it out. Be sure to draw a line through all words which contain both an "a" and a "t," but only through these words; for if you cross out a word which does not have both an "a" and a "t" in it, it will count as a mistake. When I say "begin" turn over your paper and begin work. You will be allowed two minutes to work. Your score will depend on the number of words you cross out correctly.

In addition to this explanation of the test, four non-consecutive words were selected from the text and written on the blackboard in order to illustrate the kind of words to be crossed out. The explanation for the "e-r" test is identical with the above except that "e" and "r" are used in the place of "a" and "t."

In the Memory Tests the pupil was directed as follows:

This is to be a test to see how well you remember what you hear. I am going to read a little story, and I want every one to pay close attention; for as soon as I have finished I want you to write down, in as nearly the same words as possible, what I have just read to you. Listen carefully, and as soon as I stop reading write down all that I have just read. Your score will depend on how nearly you remember what has been read to you. Do not begin to write until I have finished reading. Is there anyone who does not understand just exactly what he is to do? All right! Attention!

In the composition test the following topics were written on the blackboard. Then the directions given below were read to the pupils:

AN EXCITING EXPERIENCE. A storm. An unexpected meeting. An accident. In the woods. An errand at night. In the mountains. A wonderful story. On the ice. A runaway. On the water.

I want you to write me a story. It is to be a story about some exciting experiences that you have had, or about something very interesting that has happened to you. If nothing of the sort has ever happened to you, then tell me of an exciting experience someone whom you know has had.
You may even make up a story of this kind, if you have to, though I believe you will do better, on the whole, with a real one. I am going to give you about twenty minutes in which to write. You are to write on both sides of the paper, to do all the work yourselves, and to ask no questions at all after you begin. You may make whatever corrections you wish between the lines. There will be no time to rewrite your story. I have written the general subject on the blackboard, together with some suggestions. You do not have to write on any of these topics unless you want to; they are merely to help out in case you cannot think of an exciting experience yourself. Is there anyone who does not understand just what he is to do? All right! Begin!

Twenty minutes were allowed for the actual writing. Then the pupils were directed as follows:

You are to have four or five minutes in which to finish your stories, make corrections, and count the number of words written. Write this number at the end of your story.

Description of pupils' performances. In order to eliminate or reduce accidental errors and subjective errors to a minimum, all test papers were scored independently by two persons working under careful supervision. In the case of those scores for which the subjective factor was negligible, any differences between the two scores were reconciled by a third person.[15] When a subjective error was involved the average of the two scores was taken unless the difference between them exceeded a fixed maximum. In this case the paper was scored by a third person in an attempt to reconcile the two scores.

[15] This third person was the same for all tests, and also was the one who supervised the scoring.

The description of a pupil's rate of reading is objective. Hence only accidental errors are involved. The rate was expressed in terms of words per minute. The scoring of comprehension in the
following tests was also highly objective: Monroe's Standardized Silent Reading Tests, Courtis' Silent Reading Test No. 2, Cross-Out Silent Reading Tests, Pressey's Silent Reading Test, and Cancellation Test.

Monroe's Standardized Silent Reading Tests were scored for comprehension according to the usual directions with a few slight changes with respect to the answers which were considered correct. The pupil's comprehension score is the sum of the comprehension values of the exercises which he does correctly.

The directions which accompany the Courtis Silent Reading Test No. 2 provide for two measures of comprehension, the index of comprehension and the number of questions answered. The index of comprehension is found by subtracting the number of wrong answers from the number of right answers and dividing the difference by the number of right answers. In addition to these two scores the number of right answers was recorded.

Two methods of scoring the Cross-Out Silent Reading Tests for comprehension were used. It was found that pupils made two types of errors. Some crossed out words which should not have been crossed out, and words which should have been crossed out were not marked. One description was obtained by taking the difference between the number of words correctly marked and the number of words wrongly marked. (This included only the first type of error.) This score is indicated by the symbols $c - w$. In the second score, the number of inconsistent words, which the pupil failed to mark in the part of the test read, was recognized. The score was obtained by evaluating the following fraction:

$$\frac{c - w}{c + o}$$

In this fraction $c$ and $w$ have the same meaning as above and $o$ stands for the number of words omitted.[16]

In the Pressey Silent Reading Test a pupil's comprehension score is the number of exercises which he does correctly within the time allowed.
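The Courtis index of comprehension and the two Cross-Out scores described above reduce to simple arithmetic, which the following Python sketch illustrates. It is only an illustration and not part of the original scoring procedure: the function names are hypothetical, and the second Cross-Out score is taken to be (c - w)/(c + o), the form of Whipple's index of accuracy, which is an assumption wherever the printed fraction is illegible.

```python
from typing import Tuple


def courtis_index(right: int, wrong: int) -> float:
    """Index of comprehension: (right - wrong) / right.

    Hypothetical helper; the index is undefined when no answer is right.
    """
    if right <= 0:
        raise ValueError("index is undefined when there are no right answers")
    return (right - wrong) / right


def cross_out_scores(c: int, w: int, o: int) -> Tuple[int, float]:
    """Return both Cross-Out scores for one paper.

    c: words correctly crossed out
    w: words wrongly crossed out
    o: inconsistent words left unmarked in the part of the test read
    The second score assumes the fraction (c - w) / (c + o).
    """
    if c + o == 0:
        raise ValueError("no inconsistent words in the part of the test read")
    return c - w, (c - w) / (c + o)


if __name__ == "__main__":
    # Invented figures, not data from the study.
    print(courtis_index(right=10, wrong=2))
    print(cross_out_scores(c=12, w=2, o=8))
```

Keeping the two Cross-Out scores in one function makes it plain that they differ only in whether omitted words enter the denominator.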
In order to have an exercise counted as right the correct word must be crossed out and no other word in the paragraph marked.

The Vocabulary Test was scored according to standard directions.[17] Each "D" and "E" was regarded as indicating one point and each "F" as indicating a half-point. (See page 12.) The total number of points represents a vocabulary-index. This index, taken as a percent and multiplied by 18,000, affords a measure of the size of the pupil's total vocabulary.

[16] Whipple, G. M. Manual of Mental and Physical Tests, Simpler Processes, p. 313.
[17] Whipple, G. M. Manual of Mental and Physical Tests, Part II, Complex Processes, pp. 310-11.

In the cancellation tests the score was obtained by converting rate and accuracy into a single index of efficiency (E).[18] This index was obtained by the following formulae:

$$A = \frac{c - w}{c + o}, \qquad E = eA$$

Here
A = the index of accuracy,
E = the index of net efficiency,
e = the number of words examined,
o = the number of words erroneously omitted,
c = the number of words crossed,
w = the number of words wrongly crossed.

After computing the index of accuracy the score in terms of the index of efficiency was obtained.

The scoring of answers to questions obtained from Fordyce's Scale for Measurement of Achievement in Silent Reading and from the Experimental Reproduction Tests is less objective than the scoring of the tests just described. Fordyce gives a list of correct answers. This, together with the nature of the questions, makes the scoring of his test highly objective for its type. In the course of scoring the answers to the questions of the Experimental Reproduction Tests, lists of correct answers were compiled and all scoring was done in accordance with them. The acceptable answers were chosen with care from the complete array of all answers given in each of the tests. Any word or group of words judged to give correctly the total idea called for by the question was counted as correct.

Scoring Reproductions.
The reproductions obtained from Brown's Silent Reading Test, Starch's Silent Reading Tests, the Experimental Reproduction Tests, and the Memory Tests were scored by both the "idea-counting method" and the "word-counting method." In addition, Brown's tests were scored according to the directions which he gives. The description of a reproduction is not highly objective. Pupils differ widely with respect to vocabulary and to sentence structure. In addition to incorrect statements, reproductions contain superfluous statements and repetitions. The order of ideas is frequently transposed so that their significance is modified. Ideas contained in the passage read are expressed with various degrees of completeness. These characteristics of reproductions create many opportunities for differences of opinion in their description.

[18] Whipple, G. M. Manual of Mental and Physical Tests, Part I, Simpler Processes, pp. 312-13.

1. The idea-counting method. The first step in using this method is to divide the selection read into ideas. In making this division one may adopt a relatively small unit, which is essentially a word or phrase, or a large unit, which approximates a sentence. After experimenting with these two plans of division the former was chosen. A portion of Brown's Silent Reading Test, "The Long Slide," with the divisions indicated, is reproduced below:

THE LONG SLIDE

The boys / and girls / who live / in a certain part / of a small / town / in the country / several miles / from any village / attend / school / in a little / red / schoolhouse / known as / the Long Hill / school. / It has / this name / because / it is situated / on the top / of a very long / steep / hill. / Ever since anyone / can remember, / the scholars / of the Long Hill / school / have always had / time / to slide / down the hill / just once / at recess / in winter / and get back / to the school house / before the bell / rings / to call them back again / into school.
/ They can go down / very rapidly, / but it takes / a long time / to walk back./ Last Monday / morning / Frank Lane / appeared / at school / with a fine / new/ sled. / It was a double-runner / which his uncle, / who owns / a carriage factory / in the city, /had given him. / He named / his new / sled / the Simoon / and almost had/ a fight /with Tom Smith, / who said / it was foolish / to put / such a name / on a sled, / but he kept on / calling it / the Simoon. / At recess / that day / Frank / invited / the whole / school / to go / for a coast/ and the twelve / boys / and girls / got onto / the sled / and away they went / down the steep hill. / When recess was over / Miss Black, / the teacher, / rang the bell / but not a scholar / appeared./ Thinking that / the children / had stopped / to play / on the way back / from their slide, / Miss Black / went / to the door / and looked / down the hill / and rang / the bell / again./ But not a scholar / was in sight./ Then she was greatly astonished / and began / to be very angry, / for nothing / like this / had ever happened / in all of her twenty-eight / years / as a teacher. / She waited / and waited / but still / no scholars / appeared. / She stopped / every team / that came / up the hill, / but no one / had seen / anything of them. / She stayed / at the schoolhouse / and wondered / what had become of / her children / until it was time / to let out / school / and then / she went / over to John Reed's / who lives / nearest to the school house / and whose son / and daughter / were among the missing / scholars. / Mr. Reed / was greatly frightened / at what Miss Black / told him / about the disappearance / of her school / and immediately/ hitched up / his horse / to go in search / of the lost / children. / Just / as he was driving / out of the dooryard / the scholars / appeared / far down the hill. / It was almost / dark / before / they got back / to the schoolhouse. 
The pupil's score is the number of ideas which he reproduces correctly. Thus, the scorer must determine what ideas, occurring in the passage read, appear in the pupil's reproduction. Two rules were adopted.

1. Misplaced clauses and phrases, that is, clauses and phrases which are tacked on to the wrong part of a sentence, are to be counted as incorrect.

2. Correct ideas found in a statement which, as a whole, is directly contrary to the meaning of the text read are to be counted as correct. The following example may be cited: John Shane was not cruel. Here, both the ideas, John Shane and cruel, are held to be correct, while was not is incorrect. In practically all cases coming under this rule the incorrectness of the statement was caused by the use of a wrong verb or a wrong adverbial modifier, as in this illustration.

The scorers were urged to keep in mind the general rule that they were to match up identical ideas in the passage read and in the pupil's reproduction, even though sometimes the ideas were not expressed in the same language.

In order to secure independent scorings, each selection, with the divisions into ideas indicated as shown above, was mimeographed. The scorer indicated on this mimeographed copy the ideas which in his judgment the pupil had reproduced. In this way no record of the scoring was made on the pupil's test paper, and complete independence of scoring was secured.

In putting together the results from two independent scorings, when the difference in the number of ideas was six or less, the average was taken. In the case of a difference of more than six, a third person went over both papers to change too lenient or too severe scoring. These changes were made until the difference was reduced to six or less. Then the average was taken.

Brown's method of idea-counting. Brown has given directions for describing the reproductions written by pupils in terms of "quantity of reproduction" and "quality of reproduction."
As a basis for his method of scoring, the selection is divided into sections each of which he considers to represent a unit of thought. A portion of "The Long Slide" is reproduced to show his plan of division:

THE LONG SLIDE

The boys and girls who live in a certain part of a small town in the country several miles away from any village attend school (1) in a little red schoolhouse known as the Long Hill School. (2) It has this name because it is situated on the top of a very long, steep hill. (3) Ever since anyone can remember, the scholars of the Long Hill school have always had time to slide down the hill just once at recess in winter and get back to the schoolhouse before the bell rings to call them back again into school. They can go down very rapidly, but it takes a long time to walk back. (4)

Last Monday morning Frank Lane appeared at school with a fine, new sled. (5) It was a double-runner which his uncle, who owns a carriage factory in the city, had given him. (6) He named his new sled the Simoon (7) and almost had a fight with Tom Smith, (8) who said it was foolish to put such a name on a sled, but he kept on calling it the Simoon. (9)

At recess that day Frank invited the whole school to go for a coast, and the twelve boys and girls got on to the sled and away they went down the steep hill. (10) When recess was over, Miss Black, the teacher, rang the bell but not a scholar appeared. Thinking that the children had stopped to play on the way back from their slide, Miss Black went to the door and looked down the hill and rang the bell again. But not a scholar was in sight. (11) Then she was greatly astonished and began to be very angry, (12) for nothing like this had ever happened in all of her twenty-eight years as a teacher. (13) She waited and waited, but still no scholars appeared. (14) She stopped every team that came up the hill, but no one had seen anything of her school. (15) She stayed at the schoolhouse and wondered what had become of her children until it was time to let out school (16) and then she went over to John Reed's, who lives nearest to the schoolhouse (17) and whose son and daughter were among the missing scholars. (18) Mr. Reed was greatly frightened at what Miss Black told him about the disappearance of her school (19) and immediately hitched up his horse to go in search of the lost children. (20) Just as he was driving out of the dooryard, the school appeared far down the hill. (21) It was almost dark before they got back to the schoolhouse. (22)

The idea which he considered expressed in each of these sections has been condensed in a short statement. These form a key for scoring. The statements corresponding to the sections in the portion of the test reproduced above are given below:

1. Some children in the country attend school.
2. The schoolhouse is known as the Long Hill School.
3. It is situated on top of a long hill.
4. The pupils slide down hill once at recess in winter.
5. One day a boy brought to school a new sled.
6. His uncle had given it to him.
7. He named it the Simoon.
8. He almost had a fight with another boy.
9. This boy said the name was foolish.
10. At recess the pupils went for a slide.
11. At the end of recess no pupils appeared.
12. The teacher was astonished and angry.
13. Nothing like this had ever happened before.
14. After a long wait no scholars appeared.
15. No one in passing teams had seen her school.
16. She stayed at school until closing time.
17. Then she went to the nearest neighbor.
18. His children were among the scholars.
19. He was greatly frightened.
20. He started to search for the children.
21. Just then they appeared down the hill.
22. They reached the schoolhouse just before dark.

For using this key he gives the following directions:

Footnote: Brown's statement of these directions has been modified in order to make their meaning clear.

1.
Each child's written reproduction should be carefully examined, and the number of points in the key which are reproduced by him should be determined and expressed as a percent of the total number in that portion of the selection read. For example, in the part read by a certain child, there may have been forty-eight points, and he may have reproduced twelve of these. The amount reproduced is, therefore, twenty-five percent of the amount read. This is called "quantity of reproduction". In arriving at a measure of quantity of comprehension, every idea reproduced by the child should be counted which, in most respects, is complete and which, in general, is correctly stated, even though some of the less important details are lacking. Credit for quantity of comprehension is given only when all elements of the idea expressed by the words in italics in the key are either expressed or plainly implied in the child's reproduction.

2. The reproductions should be examined a second time and only those ideas counted which are entirely correct in every respect and of which every detail is reproduced. This is called "quality of reproduction".

2. The word-counting method. In applying this method, a pupil's reproduction is examined and the words which do not correctly reproduce the selection read are crossed out. The pupil's score is the number of words remaining. The directions for crossing out words were essentially the same as those used by Starch in scoring his own silent reading tests. The scorers were directed to cross out the following classes of words:

(a) Words which incompletely reproduce the thought.
(b) Words which introduce new ideas.
(c) Words which represent ideas reproduced elsewhere.
(d) Superfluous connectives.

The scorers were, also, directed to bear constantly in mind that the aim of this method is to ascertain the number of words which actually reproduce the thought contained in the passage read.
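The arithmetic behind Brown's "quantity of reproduction" and behind the word-counting score can be illustrated with a minimal sketch. The function names and the sample sentence are hypothetical and chosen only for illustration; the judgment of which points are reproduced, or which words are to be crossed out, remains with the scorer and is simply supplied as input here.

```python
def quantity_of_reproduction(points_reproduced, points_read):
    """Percent of the key points in the portion read that were reproduced."""
    if points_read <= 0:
        raise ValueError("points_read must be positive")
    if not 0 <= points_reproduced <= points_read:
        raise ValueError("points_reproduced must lie between 0 and points_read")
    return 100.0 * points_reproduced / points_read


def word_count_score(reproduction, crossed_out_positions):
    """Number of words remaining after the scorer's crossed-out words are removed.

    crossed_out_positions holds the 0-based positions of the words the scorer
    crossed out (a hypothetical input format, not one given in the text).
    """
    words = reproduction.split()
    crossed = {i for i in crossed_out_positions if 0 <= i < len(words)}
    return len(words) - len(crossed)


# The example from the directions: twelve of forty-eight points reproduced.
print(quantity_of_reproduction(12, 48))

# Hypothetical reproduction in which the scorer crosses out one repeated word.
print(word_count_score("the boys went down the the hill", [5]))
```

The percent form makes pupils who read different amounts of the selection comparable, which a raw count of points would not do.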
In order to secure independence on the part of the scorers when using the word-counting method, the lines of the reproductions were numbered. Sheets of ruled paper were then prepared with numbered lines. In scoring the reproductions, the words to be omitted in a line, when computing the pupil's score, were written on the corresponding line of the sheet of ruled paper. The number of words remaining in the line of the reproduction was then recorded in the right hand margin. The sum of these entries constituted a pupil's score. No mark other than the numbers of the lines of the reproductions was made upon the pupil's test paper. Thus, the second scorer was not influenced in any way by the work of the first. The two independent scorings were reconciled by a third person, according to the rules given in the case of the idea-counting method, except that a difference of eight rather than of six was allowed before rescoring was undertaken. This exception does not apply to the Memory Test.

Subjectivity of describing reproductions. An examination of the records of scoring the reproductions shows many differences of opinion on the part of the scorers. One scorer gave credit for certain words or ideas which the other scorer rejected, while the second scorer gave credit for words and ideas rejected by the first scorer. These differences of opinion tend to balance each other in the resulting scores but not entirely. For some reproductions, two persons will give the same score. For others, the two scores will differ. In a few cases the difference will be marked. Whenever there is a difference, at least one score, and probably both, involve an error. Even when the two scores are identical both may involve an error.

Constant errors and variable errors. The scoring of reproductions even under favorable conditions, such as prevailed in this investigation, involves two types of errors: constant errors and variable errors.
A constant error results in a scorer assigning scores which, in general, are too high or too low. A liberal attitude toward the reproductions will result in high scores. On the other hand, a conservative procedure will result in low scores. An indication of the presence of a constant error may be secured by comparing the averages of the two sets of scores assigned independently by two scorers to the same set of papers. Any differences in their general policy will be reflected by a difference between the averages of the two sets of scores. However, this difference cannot be considered to be an index of the magnitude of the constant error because both persons may be inclined to be liberal in their scoring, or both may be conservative, or one may be conservative and the other liberal.

Footnote: A score is said to involve an error when it differs from the true score, which is defined as the average of a large number of scores assigned by different persons.

Variable errors are indicated by the fact that in scoring one reproduction Scorer A will assign a score of 90, and Scorer B a score of 75; but in scoring a second reproduction Scorer A may assign a score of 60, and Scorer B a score of 80. This may happen although Scorer B is, in general, more liberal than Scorer A. In studying the variable errors it is necessary to isolate them from the constant errors. Constant errors which affect the average of the scores assigned by either person do not affect the coefficient of correlation. Hence, it may be used as an index of the magnitude of the variable errors.

Tables I and II give data relative to both the constant and variable errors involved in the word-counting and in the idea-counting methods. Table I shows the facts for the first method and Table II for the second. The scorers are represented by letters.
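The statement that a constant error shifts the averages without affecting the coefficient of correlation can be checked with a small sketch. The scores below are hypothetical (only the pairs 90/75 and 60/80 echo the illustration in the text), and the helper names are assumptions made for this example rather than anything used in the study.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Product-moment coefficient of correlation between two lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def difference_of_averages(first, second):
    """Average of the first scorer's scores minus that of the second scorer."""
    return mean(first) - mean(second)


# Hypothetical scores assigned by two scorers to the same six reproductions.
scorer_a = [90, 60, 72, 81, 55, 68]
scorer_b = [75, 80, 70, 85, 50, 66]

# The same Scorer B made uniformly more liberal by ten points (a constant error).
scorer_b_liberal = [s + 10 for s in scorer_b]

print(difference_of_averages(scorer_a, scorer_b))
print(difference_of_averages(scorer_a, scorer_b_liberal))
print(pearson_r(scorer_a, scorer_b))
print(pearson_r(scorer_a, scorer_b_liberal))
```

The two differences of averages differ by exactly the ten points added, while the two coefficients agree, which is why the coefficient can serve as an index of the variable errors alone.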
The numbers in the column headed "Difference of Average Scores" were obtained by subtracting the average of the scores assigned by the second scorer from the average of the scores assigned by the first scorer. A positive difference means that the first scorer gave, on the average, higher scores than the second. A negative difference has the opposite meaning. In some cases the difference closely approximates zero, but in others it is relatively large. This indicates that, for some scorers, the constant error is relatively large. One is justified in asserting that, on the basis of the possible constant error in the scores assigned to reproductions by a single scorer, no reliable inferences can be made concerning the differences in reading ability of two groups of pupils unless the differences between their average scores are large.

TABLE I. SUBJECTIVITY OF SCORING REPRODUCTIONS BY THE WORD-COUNTING METHOD

Test              Grade  Number of  Scorers  Difference of   P.E. Est. /
                         scores              average scores  Average
Memory            IV        92      Y-C          -9.9          .06
Memory            IV        27      Y-K          -5.1          .04
Memory            IV       116      Y-C          -2.0          .05
Memory            VII      123      Y-K          -7.5          .05
Memory            VII      100      Y-C          -8.2          .04
Memory            VII       31      Y-K          +4.1          .03
Reproduction      IV        94      L-K          +6.8          .06
Reproduction      IV        31      L-C          -1.6          .06
Reproduction      IV        68      L-K          +4.7          .10
Reproduction      VII      117      M-F          -0.5          .06
Reproduction      VII      113      F-C          -6.0          .05
Brown             IV       111      T-Mj        +12.8          .15
Brown             IV       110      T-Mj         +6.9          .08
Starch (No. 7)    VII      119      M-C          -5.8          .07
Starch (No. 6)    VII      121      M-C          -2.0          .05

P.E. Est. column, in order of extraction: 4 -5 3 4 3-3 5-5 3-9 2.6 31 2-4 4-2 9.2 5-5 2.6 2.1

TABLE II. SUBJECTIVITY OF SCORING REPRODUCTIONS BY THE IDEA-COUNTING METHOD

Test              Grade  Number of  Scorers  Difference of   r12
                         scores              average scores
Memory            IV       121      Y-P          +0.1        .95
Memory            IV       116      Y-P          +0.6        .84
Memory            VII      122      Y-P          +1.0        .89
Memory            VII      128      Y-P          +0.6        .85
Reproduction      IV        94      F-P          -0.6        .94
Reproduction      IV       100      F-P          +0.7        .95
Reproduction      VII      116      F-P          -7.9        .91
Reproduction      VII      112      S-F          +0.7        .88
Brown I*          V         77      Cl-S         +0.4        .88
Brown II*         V         75      Cl-S         +1.5        .85
Brown, Quantity   IV       112      P-C          +8.7        .69
Brown, Quantity   IV       116      P-C          +7.8        .75
Brown, Quality    IV       113      P-C          -6.7        .68
Brown, Quality    IV       118      P-C          +0.1        .56
Starch (No. 7)    VII      122      S-Cl         -2.3        .92
Starch (No. 6)    VII      124      S-Cl         -1.0        .95

P.E. Est.1t column, in order of extraction: I .1 I .1 1.6 1 .0 1.6 14 5-6 4^5 2-5 2-4 8.4 6.1 5-2 1.6 1-3
P.E. Est. / Average column, in order of extraction: .04 .05 .04 .04 .07 .08 .08 .10 .10 .18 .16 .24 .30 .08 .08

*Brown I is The Long Slide; Brown II, A Morning Adventure.

It appears that a scorer is not always consistent with respect to his constant error. In Table I, Scorer Y and Scorer K show negative differences for two sets of papers and a positive difference for a third set. The same condition is exhibited by Scorer P and Scorer C in Table II. This reversal of policy may be due in part to differences in the character of the reproductions, but, doubtless, the instability of subjective judgment is also a factor.

In the column headed "r12", the coefficient of correlation between the two sets of scores is given. In the next column the probable error of estimate is given. This was calculated by the formula,

\[ \text{P.E. Est.}_{1t} = .6745\,\sigma_1 \sqrt{1 - r_{12}} \]

Footnote 21: The probable error of estimate for two sets of related data is given by the formula

\[ \text{P.E. Est.}_{1} = .6745\,\sigma_1 \sqrt{1 - r_{12}^2} \]

(See Yule, Introduction to the Theory of Statistics, page 177.) In this formula r12 is the coefficient of correlation between the two sets of data and σ1 is the standard deviation of the corresponding distribution. The probable error of estimate for the first set of scores (P.E. Est.1) is a measure of the amount of change which would be necessary to bring these scores into perfect correlation with the other set of scores. Professor T. L.
Kelley has shown that the correlation between one set of obtained scores and the corresponding true scores is given by the formula, \( r_{1t} = \sqrt{r_{12}} \). Therefore, the formula, \( \text{P.E. Est.}_{1t} = .6745\,\sigma_1 \sqrt{1 - r_{12}} \), gives the probable error of estimate of the first set of scores with respect to the corresponding set of true scores. A similar formula would give the probable error of estimate for the other set of scores. Since both sets of scores were assigned to the same set of reproductions, the best measure is the average of the two formulae.

Hence, the Courtis Silent Reading Test, No. 2, exhibits the least lack of discrimination. The Cross-Out, Pressey, Fordyce, and Form 3 of Monroe's tests exhibit such great departures from the normal distribution that they must, obviously, fail to discriminate properly with respect to the rate of reading for a considerable number of pupils.

In the case of comprehension, the distributions of scores for Monroe's Standardized Silent Reading Tests closely approximate the normal. The third form appears to have been a little too easy, but, in other respects, the irregularities exhibited by the distributions cannot be considered to indicate a serious lack of discrimination. The index of comprehension for the Courtis Silent Reading Test, No. 2, fails to discriminate properly between a number of pupils. Both the number of questions answered and the number of questions answered correctly approach more nearly the normal distribution. This is particularly true of the latter. The distributions for the Brown, Starch, and Experimental Reproduction Tests exhibit many irregularities; but there is in all cases a distinct resemblance to the normal distribution. A few of the distributions approach very closely the normal one. Others contain rather marked departures from it. In the case of Brown's test, the distributions for the quality scores exhibit greater departures than the distributions for the quantity scores.

TABLE X. CORRELATIONS WITH TEACHER RATING

Columns: Test; Rate (Grade IV, Grade VII); Comprehension (Grade IV, Grade VII).

Tests: Monroe I; Monroe II; Monroe III; Courtis I Index; Courtis I Questions; Courtis I Questions Correct; Courtis I Words per minute; Courtis III Index; Courtis III Questions; Courtis III Questions Correct; Courtis III Words per minute; Brown I Quantity; Brown I Quality; Brown I Average; Brown I Efficiency; Brown I Words; Brown I Ideas; Brown I Words per minute; Brown II Quantity; Brown II Quality; Brown II Average; Brown II Efficiency; Brown II Words; Brown II Ideas; Brown II Words per minute; Starch I Words; Starch I Ideas; Starch I Words per minute; Starch II Words; Starch II Ideas; Starch II Words per minute; Reproduction I Questions; Reproduction I Ideas; Reproduction I Words; Reproduction I Words per minute; Reproduction II Questions; Reproduction II Ideas; Reproduction II Words; Reproduction II Words per minute; Cross-Out I C-W; Cross-Out I (C-W)/(C+O); Cross-Out I Words per minute; Cross-Out II C-W; Cross-Out II (C-W)/(C+O); Cross-Out II Words per minute; Fordyce; Pressey I; Pressey II; Composite AI; Composite AII; Composite BI; Composite BII; Composite CI; Composite CII; Composite I; Composite II.

Coefficients, in order of extraction: .38 .34 .43 .51 .36 .32 .36 .19 .41 .26 .55 .51 .29 .08 .60 .64 .63 .29 .29 .41 .45 .38 .51 .58 .60 .44 .34 .59 .56 .32 .50 .39 .46 .46 .51 .49 .34 .47 .23 .51 .49 .46 .21 .27 .46 .37 .40 .55 .53 .58 .51 .63 .58 .58

Comparison with teachers' ratings. All scores, both rate and comprehension, were correlated with the ratings in silent reading given by the teacher. The coefficients of correlation were calculated, also, for certain composite scores. These coefficients of correlation are given in Table X.
With the exception of one coefficient for the second form of Brown's test, all coefficients are positive and in general sufficiently large to indicate a distinct positive relationship between the test scores and the teachers' ratings. Rate of reading correlates more highly with the teachers' rating in the fourth grade than in the seventh. For rate, the average of the coefficients, not including the composite scores, is .43 in the fourth grade and .26 in the seventh. The average of the coefficients for comprehension, not including the composite scores, is .40 in the fourth grade and .44 in the seventh.

In the fourth grade, comprehension, as measured by Monroe's Standardized Silent Reading Tests, correlates most highly with the teachers' ratings. In fact, the coefficients for the three forms of this test equal or exceed all of those for the composite scores. In the seventh grade this test does not exhibit as high correlations with teachers' ratings. Neither do its rate scores correlate as highly with teachers' ratings as the rate scores yielded by some other tests.

It is interesting to note that the correlation between the second form of Brown's Test for "quantity of reproduction" and "quality of reproduction" is essentially zero. For Form 1 the correlations for these two scores are lower than the correlations for any other scores. This suggests that Brown's method for scoring his test is undesirable. The correlations of the composite scores with teachers' ratings indicate that, in the fourth grade, teachers judge silent reading ability more on the basis of the pupils' ability to answer questions than of their ability to reproduce. In the seventh grade, the teachers give greater weight to the pupils' ability to reproduce or to tell what has been read.

Correlation of comprehension with memory. In those tests which require the pupil to answer questions from memory or to reproduce the passage read, it would seem that a pupil's ability to remember would materially affect his comprehension score. In order to ascertain the extent to which ability to remember does affect the comprehension score yielded by such tests, the pupils were given the memory test described on page 7. In this test a selection was read to the pupils and they were asked to reproduce the story from memory. The coefficients of correlation between the memory scores and the comprehension scores for silent reading tests are given in Table XI. It is significant that none of these coefficients are large. The first three tests listed in this table require the pupil to give his performances from memory. Monroe's Standardized Silent Reading Tests do not appear to make any considerable demand upon the pupil's memory; he has the passage before him and can read it and re-read it if he desires. If any memory is involved it is immediate in character. It is significant that the coefficients of correlation for this test closely approximate those for other tests.

TABLE XI. CORRELATION OF COMPREHENSION WITH MEMORY

Columns: Test; Grade IV (Ideas, Words); Grade VII (Ideas, Words).

Tests: Brown I Quantity; Brown II Quantity; Brown I Quality; Brown II Quality; Starch I Ideas; Starch II Ideas; Starch I Words; Starch II Words; Reproduction I Questions; Reproduction II Questions; Reproduction I Ideas; Reproduction II Ideas; Reproduction I Words; Reproduction II Words; Monroe I; Monroe II; Monroe III; Maximum; Minimum; Average.

Coefficients, in order of extraction: Grade IV Ideas .32 .27 .36 .19 II 29; Words .39 .23 .36 .14 II 28; Grade VII Ideas II, Words II .31 .25 .47 .34 .26 .20 .36 .35 .33 .39 .35 .24 .26 .47 .20 .32

TABLE XII. CORRECTED COEFFICIENTS OF CORRELATION OF COMPREHENSION WITH MEMORY

Columns: Test; Grade IV (Ideas, Words); Grade VII (Ideas, Words).

Tests: Brown Quantity; Brown Quality; Starch Ideas; Starch Words; Reproduction Questions; Reproduction Ideas; Reproduction Words; Monroe I-II; Monroe I-III; Monroe II-III.

Coefficients, in order of extraction: .67 .68 .66 .54

Corrected coefficients of correlation.
The measures yielded by these tests involve variable errors. It has been shown in our consideration of the reliability of these tests that these errors are relatively large for the reproduction tests. The presence of these variable errors tends to reduce the coefficients of correlation, and it is possible that the coefficients of correlation given in Table XI do not represent the true relation between comprehension and memory. When two forms of both tests have been given to the same pupils it is possible to compute a corrected coefficient of correlation which is free from the effect of the variable errors of measurement. This has been done by means of the following formula:

\[ r_{pq} = \frac{\sqrt{r_{p_1 q_2}\, r_{p_2 q_1}}}{\sqrt{r_{p_1 p_2}\, r_{q_1 q_2}}} \]

\( r_{pq} \) here indicates the true correlation between two series of measures, p and q, of the facts A and B.
\( p_1 \) and \( p_2 \) are two independent measures of A.
\( q_1 \) and \( q_2 \) are two independent measures of B.
\( r_{p_1 q_2} \) is the correlation obtained from the first measure of A and the second measure of B.
\( r_{p_2 q_1} \) is the correlation obtained from the second measure of A and the first measure of B.
\( r_{p_1 p_2} \) is the correlation between the two measures of A.
\( r_{q_1 q_2} \) is the correlation between the two measures of B.

Footnote: It is assumed that this test measures ability to remember.
Footnote: Thorndike, E. L. "An Introduction to Mental and Social Measurements." New York, Teachers College, Columbia University, 1916. Page 179.

In applying this formula the factors of the numerator are obtained from Table XI. For example, in calculating the corrected coefficient of correlation for Brown's Silent Reading Test with memory, \( r_{p_1 q_2} \) is the coefficient of correlation of Brown I with Memory II. This is given as .21. The coefficient of correlation of Brown II with Memory I is \( r_{p_2 q_1} \). This is given as .27. The factors of the denominator are the reliability coefficients of the two tests. These are to be found in Table VIII. They are .36 for Brown's Silent Reading Tests and .35 for the Memory Tests.
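As a rough check on the arithmetic, the corrected coefficient can be written as a small function. This is only a sketch: the function and argument names are hypothetical, while the four input values are the ones quoted in the text for Brown's Silent Reading Tests and the Memory Tests.

```python
import math


def corrected_correlation(r_p1q2, r_p2q1, r_p1p2, r_q1q2):
    """Corrected coefficient of correlation.

    The numerator uses the two cross-form correlations; the denominator uses
    the reliability coefficients of the two tests. All four inputs are assumed
    to be positive, as the square roots require.
    """
    numerator = math.sqrt(r_p1q2 * r_p2q1)
    denominator = math.sqrt(r_p1p2 * r_q1q2)
    return numerator / denominator


# Brown I with Memory II = .21; Brown II with Memory I = .27;
# reliability of Brown = .36; reliability of Memory = .35.
r_brown_memory = corrected_correlation(0.21, 0.27, 0.36, 0.35)
print(round(r_brown_memory, 2))
```

Because the reliabilities in the denominator are well below 1, the corrected value is considerably larger than either raw coefficient.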
Substituting these values in the formula,

\[ r_{pq} = \frac{\sqrt{.21 \times .27}}{\sqrt{.36 \times .35}} = \sqrt{.45} = .67 \]

This is the first entry of the first column of Table XII.

A study of the corrected coefficients given in Table XII indicates that, in the case of the Experimental Reproduction Tests in the fourth grade, the correlation between Memory and the scores based upon the pupil's reproduction is very high. For ideas it is .97. For words it is .88. For Brown's Silent Reading Tests the correlation is not as high. In fact, it closely approximates that for Monroe's Standardized Silent Reading Tests. In the seventh grade the correlation of Memory with Monroe's Standardized Silent Reading Tests is higher than that for either Starch or the Experimental Reproduction Tests, although the difference is not marked in the case of the latter. It, therefore, appears that in the seventh grade memory is not a major factor in determining the comprehension scores of tests which require reproduction unless it is also the determining factor in the case of tests which do not appear to involve memory. The statement which has been made with reference to reproduction tests, that they measure the ability to read and remember, does not appear to be justified by the facts which are presented here.

Correlation of comprehension with vocabulary. In Table XIII, we give the coefficients of correlation between the comprehension scores and the scores obtained from the vocabulary test. In the fourth grade most of the coefficients are negative, but all of them cluster closely around zero. This means that, measured by the tests used, there is no relation between a pupil's vocabulary and his ability to read. It is, of course, obvious that, in order to read, a pupil must be acquainted with words. It is, therefore, impossible to believe that vocabulary is not a factor in the reading process. The facts presented here probably mean that, in the fourth grade, vocabulary is not a determining factor and the pupil's ability to read depends primarily upon abilities other than the extent of his acquaintance with words. In the seventh grade the coefficients are all positive but none of them are large. This probably means that, in the seventh grade, vocabulary is a minor factor in determining the pupil's comprehension. It is, of course, possible that the vocabulary test used does not measure the extent of a pupil's acquaintance with words.

TABLE XIII. COEFFICIENTS OF CORRELATION BETWEEN VOCABULARY AND COMPREHENSION

Columns: Test; Grade IV; Grade VII.

Tests: Monroe I; Monroe II; Monroe III; Courtis I Index; Courtis I No. of Questions; Courtis I Questions Correct; Courtis III Index; Courtis III No. of Questions; Courtis III Questions Correct; Starch I Words; Starch I Ideas; Starch II Words; Starch II Ideas; Brown I Quantity; Brown I Quality; Brown I Average; Brown I Words; Brown I Ideas; Brown II Quantity; Brown II Quality; Brown II Average; Brown II Words; Brown II Ideas; Reproduction I Questions; Reproduction I Ideas; Reproduction I Words; Reproduction II Questions; Reproduction II Ideas; Reproduction II Words; Cross-Out I C-W; Cross-Out I (C-W)/(C+O); Cross-Out II C-W; Cross-Out II (C-W)/(C+O); Fordyce; Pressey I; Pressey II; Composite AI; Composite AII; Composite BI; Composite BII; Composite CI; Composite CII; Composite I; Composite II.

Coefficients, in order of extraction: .02 -.03 -.02 -.20 .19 .10 -.20 .06 .15 -.11 -.12 .14 .01 -.04 -.23 -.21 -.16 -.15 -.19 -.15 -.10 -.09 .12 -.04 -.04 -.07 -.05 .09 .02 .04 .22 .22 .13 .31 .31 .29 .22 .14 .17 .13 .19 .24 .26 .18 .08 .16 .01 .13 -.21 .00 -.02 .23 .01 .20 -.08 .32 -.20 .28 -.05 .26 -.13 .25 .30 .21

Correlation of cancellation scores with measures of rate of reading. In Table XIV, the coefficients of correlation for the scores yielded by the Cancellation Test with measures of rate of silent reading are given. With few exceptions, these coefficients are positive but small. In general, they are slightly smaller in the seventh grade than in the fourth grade.
In most cases, there does not seem to be any marked relationship between ability to do the Cancellation Test and the rate of silent reading. One might expect a distinct positive relationship between the Cross-Out Silent Reading Tests and the Pressey Silent Reading Tests. It does, however, appear that the relationship which exists with respect to these tests is greater than that which exists for Monroe's Silent Reading Tests. The table also includes coefficients of correlation for the scores yielded by the Cancellation Test with the comprehension scores yielded by the Cross-Out Tests. The coefficients are, likewise, small, two of them being slightly negative. It appears, therefore, that the ability to strike out letters from words is not related to the ability called for by the Cross-Out Tests.

TABLE XIV. CORRELATION OF CANCELLATION SCORES WITH MEASURES OF RATE OF READING AND WITH THE CROSS-OUT TESTS
Columns: Test; Grade IV (Cancellation I*, Cancellation II); Grade VII (Cancellation I, Cancellation II).
Rows, rate of reading: Monroe I; Monroe II; Monroe III; Courtis I; Courtis III; Brown I; Brown II; Starch I; Starch II; Reproduction I; Reproduction II; Cross-Out I; Cross-Out II; Fordyce, No. of Words; Pressey I; Pressey II.
Rows, Cross-Out comprehension: Cross-Out I C-W; Cross-Out I (C-W)/(C+O); Cross-Out II C-W; Cross-Out II (C-W)/(C+O).
Coefficients as printed: .28 .26 .23 .12 .20 .25 .07 .22 .13 .20 .23 .30 .08 .02 .17 .14 .14 .15 .20 .15 .14 .16 .08 .13 .15 .13 .07 .10 .03 .20 .18 .22 .01 .03 .06 .03 .06 -.01 .15 .03 .05 .10 .25 .22 .18 .14 .25 .33 .08 .11 .03 .06 .11 .15 .21 .18 .11 .05 .16 .17 .11 -.01
*In Cancellation Test I, the words containing both "a" and "t" were marked; in Test II, those containing both "e" and "r."

Correlation of comprehension with written composition. Another measure of a pupil's vocabulary is secured from his written composition. The pupils in the seventh grade were asked to write a composition on an exciting experience. (See page 10.) In Table XV, we give the coefficients of correlation between measures of comprehension and two measures of these written compositions, the number of words written and the story value. The number of words which a pupil writes in such an exercise is, undoubtedly, an index of his writing vocabulary. It is, of course, possible that his writing vocabulary and his reading vocabulary are not closely related. The coefficients of correlation, in Table XV, show that there is little or no relation existing between measures of comprehension and the number of words which were written in these compositions. Even in the case of comprehension scores based upon the number of words and the number of ideas contained in reproductions, the coefficients of correlation fail to indicate the existence of any marked relationship. In fact, the coefficients of correlation for measures of comprehension gained through reproduction are lower, in most cases, than the coefficients of correlation of the number of words written with the comprehension scores derived from Monroe's Standardized Silent Reading Tests.

A higher degree of correlation is indicated between the "story value" and the measures of comprehension. Some of the coefficients of correlation are sufficiently large to indicate a distinct positive relationship between these two traits. It is not unlikely that this relationship can be explained in terms of a common general factor, such as general intelligence.

Inter-correlation between tests. Since in each grade all of the tests were given to the same pupils, it is possible to calculate the coefficients of correlation between scores yielded by the different tests. These are given in the appendix. The magnitude of the coefficients of correlation is influenced by the reliability of the scores and, therefore, does not truthfully reflect the relationship which exists between the scores yielded by the different tests.
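
The way unreliable scores depress an observed coefficient can be illustrated with a short sketch. It assumes Spearman's correction for attenuation in the form that uses two applications (or duplicate forms) of each test; the bulletin's own formula is on page 41 and is not reproduced in this section, so the exact form, the function names, and the sample figures below are assumptions for illustration only.

```python
import math


def attenuated_r(true_r, rel_x, rel_y):
    """Observed correlation expected when the traits correlate `true_r`
    but the two sets of scores have reliabilities `rel_x` and `rel_y`."""
    return true_r * math.sqrt(rel_x * rel_y)


def corrected_r(r_x1y2, r_x2y1, r_x1x2, r_y1y2):
    """Spearman-style correction for attenuation (assumed form).

    r_x1y2, r_x2y1: raw correlations between form 1 of one test and
                    form 2 of the other, and vice versa.
    r_x1x2, r_y1y2: reliability coefficients of the two tests.
    Each factor is a square root, so the result is undefined when any
    raw coefficient is negative or a reliability is zero; None is
    returned in that case.
    """
    raws = (r_x1y2, r_x2y1, r_x1x2, r_y1y2)
    if any(r < 0 for r in raws) or r_x1x2 == 0 or r_y1y2 == 0:
        return None
    numerator = math.sqrt(r_x1y2) * math.sqrt(r_x2y1)
    denominator = math.sqrt(r_x1x2) * math.sqrt(r_y1y2)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical figures, not taken from the bulletin's tables.
    print(attenuated_r(0.80, 0.64, 1.00))        # lower than the true 0.80
    print(corrected_r(0.60, 0.60, 0.80, 0.80))   # raised above the raw 0.60
    print(corrected_r(0.70, 0.70, 0.60, 0.60))   # may exceed 1.00
    print(corrected_r(-0.10, 0.20, 0.80, 0.80))  # None: not computable
```

Under this assumed form, a corrected value above 1.00 arises whenever sampling error makes the raw cross-correlations larger than the reliability coefficients, and no value can be computed once a raw coefficient falls below zero.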
In order to secure more accurate indices of the relationship existing between traits measured by the different tests, the corrected coefficients of correlation have been calculated by means of the formula given on page 41. Since the factors of both numerator and denominator of the formula are square roots, it is impossible to calculate corrected coefficients when one of the raw coefficients is negative. This accounts for the fact that certain corrected coefficients are not given in Tables XVI and XVII. It will be noted in these tables that, occasionally, a coefficient greater than 1.00 is given. This is due to chance errors in the raw coefficients of correlation which, in turn, are due to the fact that a sample of the total population was used in calculating them. The corrected coefficients are, in general, larger than the corresponding raw coefficients.

TABLE XV. CORRELATION OF COMPREHENSION WITH WRITTEN COMPOSITION, SEVENTH GRADE, 90 PUPILS
Columns: Test; Number of words written; Story value.
Rows: Monroe I; Monroe II; Monroe III; Starch I Words; Starch I Ideas; Starch II Words; Starch II Ideas; Reproduction I Questions; Reproduction I Ideas; Reproduction I Words; Reproduction II Questions; Reproduction II Ideas; Reproduction II Words; Cross-Out I C-W; Cross-Out I (C-W)/(C+O); Cross-Out II C-W; Cross-Out II (C-W)/(C+O); Fordyce Percent; Pressey I; Pressey II.
Coefficients as printed: .18 .24 .29 .33 .31 .10 .07 .14 .09 .31 .28 .36 .33 .12 .11 .22 .24 .14 .18 -.07 .26 .28 .11 .37 .43 .13 .23 .09 .11 .16 .06 .04 .11 .12 .12 .10 .05 .29 .18

Table XVI gives the corrected coefficients for the comprehension scores. A significant characteristic of this table is the variation in the degree of intercorrelation between the tests. For example, Monroe's Standardized Silent Reading Test I correlates very highly with the number of questions answered correctly on the Courtis Silent Reading Test, No. 2. It correlates less highly with the other two scores of this test.
The degree of its correlation with the other tests is moderately low. It is significant that the corrected coefficients of correlation between the two tests requiring reproduction are not higher. For example, the highest coefficient of correlation between Brown's test and the Experimental Reproduction Test I is .79. The lowest is .26. The corrected coefficient of correlation between the scores obtained by the word-counting method is .33; for the idea-counting method the coefficient of correlation is .62. The highest correlation between Brown's test and the Experimental Reproduction Test I is for the number of questions answered correctly. In the seventh grade, the corrected coefficients of correlation between the question scores yielded by the Experimental Reproduction Test II and Starch's Silent Reading Test are as high as those obtained from the reproductions. Both Starch's test and the Experimental Reproduction Test correlate nearly as highly with Monroe's Standardized Silent Reading Test as with each other.

A number of the coefficients of correlation for the Cross-Out Test are relatively high. It correlates most highly with Monroe's Standardized Silent Reading Test. In general, the coefficients are higher for the scores obtained by C-W than for (C-W)/(C+O). The former is probably the better plan of scoring.

[Tables XVI and XVII. Corrected coefficients of correlation between the tests, and between each test and the composites, for comprehension (Table XVI) and for rate (Table XVII), Grades IV and VII: Monroe, Courtis, Brown, Starch, Reproduction, Cross-Out, Pressey, Composites. The tabular values are illegible in the source scan.]

Table XVI appears to bear out the usual assumption that different silent reading tests measure different phases of silent reading ability. It is very obvious, in a number of cases, that the same traits are not measured by different tests. However, it should be noted that these differences exist for tests that are similar in structure as well as for tests which possess marked differences in structure.
In fact, the variations in these corrected coefficients of correlation are so erratic that one is inclined to be skeptical of any conclusions which may be drawn from them with reference to the functions of the different tests.

The corrected coefficients for the rate scores are given in Table XVII. These are, in general, higher than those for comprehension. In general, the correlation between tests in which the pupil reads continuously is higher than between one test in which the pupil reads continuously and another in which his reading is not continuous. However, the correlation between Monroe's Standardized Silent Reading Test I and the Cross-Out Test, in the fourth grade, is as high as that for any of the other tests. The fact that some of the tests were too short and failed to discriminate between a considerable number of pupils probably accounts for the fact that a number of coefficients of correlation are not higher. An examination of this table indicates that the rate score secured by means of Monroe's Standardized Silent Reading Tests is a true measure of the pupil's rate of reading.

Correlation of single tests with composites. In Tables XVI and XVII, the corrected coefficients of correlation for each test with certain composite scores are given. These, in general, are larger than the coefficients of correlation between single tests. In the fourth grade, composite A for comprehension is the average of three scores: Monroe (comprehension), Courtis (answers correct), and Reproduction (answers to questions). In the seventh grade, the Courtis test was not given and this composite includes only the other two tests. Composite B for comprehension is the average of the comprehension scores derived from reproductions. In the case of Brown's Silent Reading Tests, both quality and quantity are used. In the other cases, the scores obtained by both the idea-counting method and the word-counting method are used. Composite C is the average of composite A and composite B.
The general composite is formed by combining all of the scores obtained.

Monroe's Standardized Silent Reading Tests are shown to correlate very highly with composite A. The correlation with composite B is very much less, as might be expected. The rate scores derived from this test also correlate very highly with the general composite scores. In fact, with the exception of Pressey's test, the correlation of single tests with the composite scores is very high. It appears, therefore, that each of the tests yields rate scores which may be accepted as correlating very highly with the true rate of silent reading. The scores derived from the Experimental Reproduction Tests in the fourth grade correlate more highly with composite B than those derived from Brown's Silent Reading Test. In the seventh grade, the correlations between Starch's test and composite B are slightly higher than those for the Experimental Reproduction Tests. It appears, however, that the Experimental Reproduction Tests yield approximately as valid measurements of ability to comprehend as are secured by means of the other tests which, presumably, have been devised with greater care.

SUMMARY OF CONCLUSIONS.

1. The scoring of reproductions is so highly subjective that a silent reading test requiring reproduction of material read cannot be considered satisfactory.

2. Brown's Silent Reading Test is very unreliable for both comprehension and rate. This is true, even when the average of two independent scores is used as a measure of comprehension.

3. The correlation between scores yielded by the memory test and comprehension scores based upon reproductions is only slightly higher than that existing between the scores derived from the memory test and the comprehension scores yielded by Monroe's Standardized Silent Reading Test. This makes doubtful the usual assumption that measures of comprehension based upon reproductions are affected by the pupil's ability to remember.

4. Correlation between extent of vocabulary and ability to read is surprisingly low. There is little, if any, relation between these two abilities.

5. The intercorrelations between tests indicate that different tests measure slightly different traits; but it is surprising to find, in a few instances, a high degree of correlation existing between scores yielded by tests which exhibit marked differences in structure.

6. There appears to be a higher degree of correlation between the story value of written compositions and comprehension than between the number of words written and the measures of comprehension. This is true even when the measures of comprehension are based upon reproductions and the reproductions are described in terms of the number of words or number of ideas reproduced.

7. In the measurement of rate of silent reading, the Courtis Silent Reading Test, No. 2, is shown to have the highest degree of reliability. Monroe's Standardized Silent Reading Tests, which were intended to yield only very crude measures of rate of silent reading, are shown to be among the most reliable tests.

8. In measuring comprehension, the Courtis Silent Reading Test, No. 2, is the most reliable.

9. The coefficient of reliability is shown not to be a satisfactory measure of reliability.

10. Comparisons with teachers' ratings indicate that, in the fourth grade, teachers tend to judge silent reading ability on the basis of the pupil's ability to answer questions. In the seventh grade, teachers give greater weight to the pupil's ability to reproduce or tell what they have read.

Correlation with composites. In Tables XVI and XVII, the corrected coefficients of correlation of each test with the composite scores are given. These, in general, are larger than the correlations between single tests. Monroe's Standardized Silent Reading Test correlates very highly with composite A.
This means that this test, which is very simple to administer, yields measures of essentially the same traits as are secured by means of this composite, which in the fourth grade involves three scores and in the seventh, two scores. The correlation with composite C and with the general composite is also high. In fact, with the partial exception of Starch's Test, no other correlations are as high as those of the Monroe Silent Reading Tests with these two composites. It, therefore, appears, as judged by composite scores, that this test yields measures of comprehension which agree more closely with the composite measures secured from this group of tests than any other single test. The correlations for rate are also high.
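
The composite scores described in this bulletin (composite A, composite B, and composite C as the average of A and B) can be sketched as follows. The sketch assumes each test's scores are put on a common scale (standard scores) before averaging; the bulletin's actual procedure is given in its section on the formation of composite scores and is not reproduced here, so that step, the function names, and the five pupils' scores are hypothetical.

```python
import math
from statistics import mean, pstdev


def z_scores(scores):
    """Standard scores; assumes the scores are not all identical."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


def composite(*score_lists):
    """Average, pupil by pupil, of the standardized scores of several tests."""
    standardized = [z_scores(s) for s in score_lists]
    return [mean(pupil) for pupil in zip(*standardized)]


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation between two score lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five fourth-grade pupils.
monroe_comp = [10, 12, 15, 9, 14]
courtis_correct = [20, 25, 31, 18, 27]
repro_questions = [4, 6, 7, 3, 5]
repro_ideas = [11, 9, 16, 8, 12]
repro_words = [60, 48, 85, 40, 70]

comp_a = composite(monroe_comp, courtis_correct, repro_questions)
comp_b = composite(repro_ideas, repro_words)
comp_c = [mean(pair) for pair in zip(comp_a, comp_b)]

if __name__ == "__main__":
    print(round(pearson_r(monroe_comp, comp_a), 2))
    print(round(pearson_r(monroe_comp, comp_b), 2))
    print(round(pearson_r(monroe_comp, comp_c), 2))
```

A single test that enters a composite shares its own errors with that composite, so its correlation with the composite tends to run higher than its correlation with a composite built only from other tests.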