... scores contain only 24 percent of the information retained in the C* scores. These data suggest that for maximum reproduction of the total criterion score-set, similarity measures should be used that treat all the information in the variate space.

Operationally, however, this⁷ does not mean that similarity scores of k - 1 or possibly even k - 2 dimensions must never be used. If the elevation dimension is clearly irrelevant to the similarity under investigation, it would be poor judgment to include this information just to increase the dimensions of the measuring space. The decision as to the relevancy of the elevation score is sometimes difficult to make. For example, the elevation score may, in some tests, be determined primarily by a positive response set, i.e., the subject tends to reply positively more often than negatively. This will create an artificial common factor that may or may not be relevant to the similarity under consideration. Many researchers would hold that such response sets are irrelevant and should be treated as error variance or, when possible, eliminated. On the other hand, it is possible to argue that this positive response set is an indication of an optimistic attitude and, as such, is very important in certain similarity measures.

It is much more difficult to build a case for eliminating scatter as well as elevation. The fact that some studies have reported positive results using test designs in which these variables have not been considered is not evidence of the value of this procedure. In view of the huge loss of information that our data indicate occurs when scatter is equalized, one would suspect that, in spite of the positive results reported, other more important differences were probably overlooked due to the inefficiency of the approach.

The implications of these extreme losses of information must be interpreted with some caution.
One must keep in mind that this study has specified that each of the twenty persons can have scores only on the factors in the C5 space, i.e., it does not permit specific factor scores. It is difficult to predict how far one could generalize if, for instance, each person had a score on one or more of five factors plus a score on a factor unique to him. Certainly the magnitude of effects in this 25-space would be reduced, but the same trends should be present. The effects under more complex conditions will need further study.

⁷ When referring to the general variate space of k dimensions, the k - 1 similarity score will be the similarity score with differences in elevation eliminated. Similarly, the k - 2 similarity score has differences due to both elevation and scatter eliminated.

Overview of the Total Data of this Study

In addition to the three different criterion similarity score-sets discussed above, two obtained similarity score-sets, one without error and one with error, were computed for each comparable score for each of the three test designs. To allow for more adequate comparison of these configurations of similarity scores, Table 2 lists these various score-sets and briefly summarizes the properties of each.

It will be recalled that each score-set consists of 190 similarity measures. Each of these similarity measures is computed using the Cronbach-Gleser D formula. The D measures that constitute the DQ3 score-set for the total Q-sort and the block Q-sort are computed directly from Q correlations.

In the computation of the DQ3 measures each subject's profile consists of the scores on the 60 items. The mean and variance of these profiles are held constant over all persons by the forcing requirements of the Q-sort design. Each subject, therefore, responds to the Q-sort items under two restrictive conditions and, as a consequence, loses two degrees of freedom in his selection of item responses.
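The relation among the k, k - 1, and k - 2 similarity scores, and the computation of D from a Q correlation, can be sketched as follows. This is a minimal illustration, not the computing procedure of the study: the function names and the two five-score profiles are hypothetical, D is taken as the Euclidean distance between profiles, and the conversion D^2 = 2k(1 - r) is the standard identity for profiles that share the same mean and a variance of one.

```python
import math

def d_measure(x, y):
    """D taken as the Euclidean distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def remove_elevation(profile):
    """Subtract the profile mean (k - 1 dimensions remain)."""
    mean = sum(profile) / len(profile)
    return [v - mean for v in profile]

def remove_elevation_and_scatter(profile):
    """Center, then scale to unit variance (k - 2 dimensions remain).
    Fails for a perfectly flat profile, whose variance is zero."""
    centered = remove_elevation(profile)
    sd = math.sqrt(sum(v * v for v in centered) / len(centered))
    return [v / sd for v in centered]

def q_correlation(x, y):
    """Product-moment correlation between two persons' profiles."""
    cx, cy = remove_elevation(x), remove_elevation(y)
    num = sum(a * b for a, b in zip(cx, cy))
    den = math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))
    return num / den

def d_from_q(r, k, variance=1.0):
    """D between two profiles of equal mean and equal variance."""
    return math.sqrt(2.0 * k * variance * (1.0 - r))

# Hypothetical five-score profiles for two persons.
person_a = [7.0, 5.0, 3.0, 6.0, 4.0]
person_b = [4.0, 6.0, 2.0, 8.0, 6.0]
k = len(person_a)

d_k = d_measure(person_a, person_b)
d_k1 = d_measure(remove_elevation(person_a), remove_elevation(person_b))
d_k2 = d_measure(remove_elevation_and_scatter(person_a),
                 remove_elevation_and_scatter(person_b))
d_q = d_from_q(q_correlation(person_a, person_b), k)
```

Under these assumptions d_k2 and d_q agree, which is why a D measure can be obtained directly from a Q correlation once elevation and scatter are held constant.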
Since the total Q-sort contained 60 items, each response distribution has 58 degrees of freedom. Similarly, in the block Q-sort design the 60 items are divided into 12 blocks of five items each. Each block represents a small Q-sort. Therefore, a subject loses two degrees of freedom in responding to each block and, hence, loses 24 degrees of freedom in responding to the entire 60-item test. His responses, therefore, carry 36 experimentally independent judgments.

It must be kept in mind, however, that the entire item vector space is defined to have five orthogonal dimensions, i.e., a five-factor space. The 58 degrees of freedom or dimensions of the total DQ3 similarity scores and the 36 dimensions of the block DQ3 scores represent subspaces that lie within the complete five-variate space. Thus, the specified dimensions of the DQ3 scores of the total and block Q-sorts cannot all be orthogonal dimensions. This is not unexpected since the ...

[Table 2, listing the score-sets and summarizing the properties of each; contents not legible in the source.]

... scores. Thus, these data indicate that the amount of information lost in the DQ3 scores due to forcing increases decidedly as the test items become less reliable.

Further research will be necessary to determine the losses of information due to forcing under other test situations. Also, an investigation using several rates of error would be valuable to describe more adequately the effect of error on the DQ3 similarity scores. This error effect was considered in this study only to a limited degree, since only two rates of error were considered. Moreover, one of these was zero error, i.e., the perfectly reliable case.

Information included in the DQ4 scores and the effect of cluster scoring for the total Q-sort. The characteristics of the DQ4 scores will reflect the advantages or disadvantages of cluster scoring. Briefly, the steps leading to the computation of a DQ4 score were as follows:

1. The profile composed of Q scores on sixty items for each individual was transformed into a profile composed of five cluster scores. Each cluster score consisted of the sum of the scores of the twelve items that had loadings of .70 or higher on a particular factor.
Thus, each cluster score represented the individual's obtained score on one of the five factors that define the test space.

2. The DQ4 similarity score was computed using the Cronbach-Gleser D measure for obtaining the similarity between pairs of these five-score cluster profiles.

Since all the item score profiles had identical means, the cluster profile means were also equal for all persons. The scatter or variance within each item profile was constant over all persons, but after clustering the scatter within the individual cluster profiles differed over persons. Cluster scoring had apparently made available certain individual differences in profile scatter that were not treated in the profiles consisting of the total Q-sort item scores.

It remains to be shown that there is a correspondence between the scatter within these cluster profiles and the scatter within criterion profiles. If the individual differences in cluster profile scatter have little or no relationship to individual differences in criterion profile scatter, then this cluster scatter is irrelevant information. On the other hand, suppose there is a relationship between the scatter within these cluster profiles and the scatter within the individual criterion profiles. This would indicate that cluster scoring tends to reproduce some of the individual differences in scatter that are included in the criterion score-set.

It is possible to examine this relationship directly by actually comparing the individual scatter within each criterion profile with the individual scatter within each obtained cluster profile. Since it was not possible to claim any a priori knowledge as to the distribution of these scatters within the individual criterion profiles and obtained cluster profiles, the method of rank order correlation was followed.
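The two steps above and the rank order comparison of scatters can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the item-to-cluster assignment (items 0-11 on the first factor, and so on) is hypothetical, scatter is taken as the variance within a profile, and the rank order correlation is computed as a product-moment correlation between ranks with tied values given their average rank.

```python
import math

def cluster_scores(item_scores, clusters):
    """Step 1: sum the item scores belonging to each factor cluster."""
    return [sum(item_scores[i] for i in members) for members in clusters]

def scatter(profile):
    """Scatter taken as the variance of the scores within one profile."""
    mean = sum(profile) / len(profile)
    return sum((v - mean) ** 2 for v in profile) / len(profile)

def ranks(values):
    """Rank from 1 upward; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            result[order[j]] = average
        pos = end + 1
    return result

def rank_correlation(a, b):
    """Rank order correlation: product-moment correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    num = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    den = math.sqrt(sum((x - ma) ** 2 for x in ra) * sum((y - mb) ** 2 for y in rb))
    return num / den

# Hypothetical layout: five clusters of twelve consecutive items each.
clusters = [list(range(f * 12, (f + 1) * 12)) for f in range(5)]
example_profile = list(range(60))  # placeholder item scores, not real data
example_clusters = cluster_scores(example_profile, clusters)
```

In use, scatter would be applied to each person's criterion profile and to each person's obtained cluster profile, and rank_correlation to the two resulting lists of twenty scatters.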
The ranks of the scatters within each criterion profile were correlated with the ranks of the scatters within each obtained cluster profile for the twenty persons for each test design under both presence and absence of error. The correlations are presented in Table 6. The error-free correlations indicate the magnitude of this relationship under ideal conditions. The correlations under the influence of error indicate the magnitude of this effect for the more operational test conditions represented by the rate of error used in this study.

TABLE 6
RELATIONSHIP OF INDIVIDUAL CRITERION PROFILE SCATTER AND INDIVIDUAL OBTAINED PROFILE SCATTER AFTER CLUSTER SCORING (N = 20 Persons)

                 Correlations* of Criterion Scatter and Obtained Profile
Test Design      Error-Free Data      Error Data
Unforced Test    .90                  .80
Total Q-sort     .92                  .76
Block Q-sort     .50  .72**           .64

*Relationships here reported are rank order correlations.
**This correlation was computed after eliminating one person from consideration. See text for explanation.

The error-free correlations will be considered first. The D5 similarity scores of the unforced test were computed between cluster profiles. Since the error-free D5 scores were highly efficient in reproducing the total criterion similarity scores, and permitted unrestricted individual scatter, the correlation of .90 between the criterion profile scatter and the cluster profile scatter for each person would be expected. The error D5 scores, also, were relatively efficient in reproducing the C5 scores; hence, the comparable relationship of .80 under error conditions is also consistent with the prior data for the unforced test.

The correlation of .92 between individual scatters within criterion profiles and individual scatters within cluster profiles from the error-free total Q-sort, however, is a different matter.
This high relationship indicates that most of the scatter that results from cluster scoring closely approximates the original true scatter for each of the twenty subjects. The comparable correlation of .76 for the total Q-sort under the influence of error is an even stronger argument for the use of cluster scoring to measure differences in scatter that would not be treated by the Q or DQ3 similarity scores. Correlations given in Table 6 for block Q-sorts will be discussed in a later section when the efficiency of the block Q-sort is being considered.

The efficiency of the DQ4 similarity scores. The correlations of the criterion score-sets and the DQ4 score-sets for the total Q-sort will now be considered. The error-free DQ4 correlations, as listed in Table 4, are C5-DQ4 = .68, C4-DQ4 = .74, and C3-DQ4 = .71. As would be expected from the preceding discussion concerning the value of cluster scoring, the DQ4 scores have a slightly higher correlation with the C4 scores than with either the C5 or C3 scores. These three criterion-DQ4 correlations are so nearly equivalent, however, that chance could account for any specific difference. These data indicate that the DQ4 scores include information due to criterion differences in elevation and scatter. The DQ4 scores, therefore, have some properties that are similar to the DQ3 scores for the total Q-sort.

It will be recalled that the DQ4 scores were computed between cluster profiles that had constant means over all persons but unrestricted individual scatter within profiles. These DQ4 scores represent similarity scores computed between persons' positions located in a k - 1 hyperplane. Likewise, the C4 scores represent similarity distances in a k - 1 hyperplane. As has been discussed, these two k - 1 hyperplanes will be identical if and only if the factor structure of the item sample is equivalent to the factor structure of the criterion space.
Since the factor weights of the item sample are not equivalent to the factor weights in the criterion space, the DQ4 scores of this study are computed in a k - 1 hyperplane that is oblique to the corresponding criterion k - 1 hyperplane. Thus, the information due to elevation that is eliminated in the C4 score-set is not entirely eliminated in the DQ4 score-set. This accounts for the fact that the DQ4 score-set correlates approximately the same with all three criterion score-sets. In other words, by changing the weights of the factor loadings in the item sample, the various criterion-DQ4 correlations could be altered. As the item sample weights become more equivalent to the criterion weights, the C4-DQ4 correlation should increase and the C5-DQ4 and C3-DQ4 correlations should decrease, since the criterion and obtained elevation dimensions will tend to become more similar.

A comparison of the error-free and error relationships of the DQ4 similarity scores for the total Q-sort provides further evidence of the value of the cluster scoring technique. The DQ4 correlations under the influence of error are C5-DQ4 = .65, C4-DQ4 = .66, and C3-DQ4 = .38. Cluster scoring apparently reduces the effect of error, since the C4-DQ4 and C5-DQ4 relationships remain relatively constant after the introduction of error. The nature of the clustering process suggests an explanation. The introduction of error into the system will tend to cause the Q scores of some items within any cluster to increase and some Q scores to decrease. The summing process involved in obtaining each cluster score tends to nullify much of the effect of error, since the sum of random errors over many items approaches zero. Consequently, each individual's cluster profiles are considerably alike under the presence and absence of error.
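The cancelling of random error in a cluster score can be illustrated with a small simulation. This is only a sketch under assumptions not taken from the study: errors are drawn independently from a normal distribution with a hypothetical standard deviation of 1.0, and the function name and trial count are illustrative. It compares the spread of the average error over a twelve-item cluster with the spread of the error on a single item.

```python
import random
import statistics

def mean_error_spread(items_per_cluster, n_trials, error_sd=1.0, seed=0):
    """Standard deviation, over trials, of the average random error
    across the items that make up one cluster score."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        errors = [rng.gauss(0.0, error_sd) for _ in range(items_per_cluster)]
        means.append(sum(errors) / items_per_cluster)
    return statistics.pstdev(means)

# Error attached to one item versus error averaged over a 12-item cluster.
single_item_spread = mean_error_spread(1, 2000)
cluster_spread = mean_error_spread(12, 2000)
ratio = cluster_spread / single_item_spread
```

With independent errors the ratio is expected to fall near one over the square root of twelve, roughly .29. In other words, the error component grows more slowly than the true-score component as items are summed, which is consistent with the stability of the cluster profiles noted above.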
These data suggest that cluster scoring methods may be of considerable value in improving the efficiency of Q-sort designs under operational error conditions. Further studies will be necessary before the clustering effect can be more adequately quantified. The implications of the data of this study, however, strongly support the use of cluster scoring in situations wherein the test design is comparable to the total Q-sort as here described and the item sample contains homogeneous clusters.

The efficiency of the D3 similarity scores for the total Q-sort. The five-score cluster profiles from which the DQ4 similarity scores were computed have been described above. Since the cluster profiles have unrestricted five-score variances, it is possible to standardize these profiles by eliminating the individual differences in this variance. The D3 similarity scores for the total Q-sort represent similarity distances between pairs of such standardized cluster profiles. The error-free correlations between the D3 scores and the criterion scores, as listed in Table 4, are as follows: C5-D3 = .36, C4-D3 = .40, and C3-D3 = .88. These data indicate that, as expected, the D3 measures contain little information due to criterion differences in elevation and scatter. The error-free D3 score-set, however, is quite efficient for reproducing the criterion information due to profile differences in shape.

As error is added, the efficiency of the D3 scores for reproducing the C3 scores is decidedly diminished. The D3 correlations under the influence of error are C5-D3 = .24, C4-D3 = .28, and C3-D3 = .38. As discussed above, the reduction of the information contained in the similarity measures by eliminating individual differences in scatter results in a magnification of the effect of error and, consequently, a reduction in the criterion-D3 correlation.

Empirical evidence of the effect of error on the total Q-sort similarity scores.
As discussed for the unforced test design, the effect of error can be empirically examined by comparing non-error similarity scores with corresponding similarity scores computed under the error condition. These relationships for the three obtained configurations of similarity scores for the total Q-sort are as follows: .45 for the DQ3 scores, .70 for the DQ4 scores, and .46 for the D3 scores.

It is interesting to note that the effect of error on the DQ3 and D3 scores was practically identical. It has been shown that the DQ3 scores have eliminated scatter differences in the sixty-score profile, with differences due to the five-score scatter being further eliminated in the D3 scores. These data, however, suggest that the aforementioned magnification of error upon reduction to the k - 2 similarity measures occurs regardless of the amount of scatter information that has been eliminated. Of course, the above equality of correlation may be specific to this study; hence, further research is necessary to verify this suggestion.

The correlation of .70 between error-free and error DQ4 scores indicates that the similarity scores computed between cluster profiles are relatively stable under the influence of error. This tendency reinforces the argument presented earlier that the summing process involved in cluster scoring tended to nullify the error effect. The DQ4 scores, thus, appear to be of moderately high reliability and validity for measuring Q criterion scores.

General Conclusions Concerning the Use of the Total Q-sort Design in Similarity Studies

The data of this study indicate that the total Q-sort can be moderately efficient for obtaining similarity scores under certain conditions. For perfectly reliable items the D3 scores have high efficiency for reproducing the C3 criterion score-set. When error is introduced this efficiency shows a decided decrease.

The DQ3 similarity scores are influenced by some specific conditions that are peculiar to this study.
As a result, these scores have approximately equal correlations with all three criterion scores and, hence, are not highly efficient for reproducing any one set of criterion information. Under the conditions of this study, some criterion information is apparently lost because of the forcing requirement of the Q-sort. As the amount of error in the system is increased, this information loss increases disproportionately.

The DQ4 score-set seems to be the most promising for reproducing criterion similarity relationships from item samples that contain clusters of highly intercorrelated items. These DQ4 scores have relatively high correlation with the C4 and C3 criterion scores under error-free conditions and also under the rate of error imposed in this study. It has been shown that certain characteristics of the factorial structure of this item sample tend to decrease the C4-DQ4 correlation and increase the C5-DQ4 relationship. Thus, as the factor structure of the item sample becomes more similar to the criterion factor structure, the efficiency of the DQ4 scores for reproducing the C4 criterion scores increases.

This same effect applies to the DQ3 scores in that, as the criterion and item factor weights become more similar, the C5-DQ3 correlation will tend to decrease and the C3-DQ3 correlation will tend to increase. In other words, the elimination or reduction of certain conditions specific to this study will cause an increase in the efficiency of the DQ3 scores for reproducing the C3 score-set under error-free conditions. However, the DQ3 scores are subject to the effects of magnification of error and, as a result, the C3-DQ3 relationship will be seriously reduced under practical error conditions.

The C4-DQ4 relationships suggest that total Q-sort item samples should be structured so that cluster scoring is feasible.
Operationally, this would involve a logical or empirical factor analysis of suggested items in order that homogeneous clusters could be established. The data of this study indicate that the additional work involved in establishing such an item sample would be well repaid by increased validities and reliabilities. Further research in this area should lead to a more adequate understanding of the effects of cluster scoring on similarity measures obtained under the total Q-sort design.

The Efficiency of the Block Q-Sort Design

The data relating to the efficiency of the block Q-sort design as specified in this study will now be considered. Information losses due to this particular design will be discussed. The effect of cluster scoring will be considered. The effect of the range of item popularity within the block structure will be examined. General conclusions concerning the use of the block Q-sort design in similarity studies will be developed as a consequence of the relations indicated in this study.

A brief review of the structure of this block Q-sort may be helpful prior to the discussion of the criterion correlations for this design. Each block contains five items of equal popularity, with each item having a loading of .70 or higher on a different factor. In cluster scoring, therefore, each factor cluster contains one item from each of the twelve blocks. Each block of five items is treated as a small, complete Q-sort. Each subject chooses the item most positive for him and the item most negative for him in each block. This design, therefore, is more comparable to a composite of twelve similar five-item Q-sorts than to a sixty-item Q-sort. The efficiency of this design will now be examined.

Test-Criterion Relationships

The correlations between the criterion and obtained similarity score-sets for the block Q-sort are presented in Table 7. The error-free correlations will be considered first.
These data indicate that for perfectly reliable data the block Q-sort is relatively inflexible. All three obtained similarity score-sets have moderately high correlations with the C3 score-set and comparatively low relationships with the C4 and C5 score-sets. These data indicate that the obtained scores of the block Q-sort have eliminated most of the criterion information due to differences in elevation and scatter. As a result, these obtained scores are quite efficient for reproducing criterion information due to profile shape differences under error-free conditions.

TABLE 7
CORRELATIONS OF CRITERION SCORE-SETS AND OBTAINED SIMILARITY SCORE-SETS FOR THE BLOCK Q-SORT DESIGN*

Obtained          Perfectly Reliable Items      Moderately Reliable Items
Similarity        Criterion Score-Sets          Criterion Score-Sets
Score-Sets        C5      C4      C3            C5      C4      C3
DQ3**             .47     .42     .78           .12     .10     .28
DQ4               .43     .53     .81           .51     .48     .19
D3                .36     .39     .84           .20     .20     illeg.

*The correlation between each criterion score-set and its most logically related, i.e., most relevant, obtained score-set is underlined. Correlations that represent semi-relevant relationships between criterion and obtained score-sets due to specific conditions of this study are underlined with dots.
**The D measures for these configurations were computed from Q correlations.

The criterion-obtained correlations of the block design reflect some of the specific conditions peculiar to this study that are discussed above. In particular, the DQ3 and DQ4 scores include some criterion elevation because of the unequal factor weights of the item sample. This tends to cause some increase in the correlation of these scores with the C5 scores. Since the DQ3 scores are computed between profiles that have thirty-six degrees of freedom, i.e., permit thirty-six independent judgments in specifying the profile, this score-set contains some scatter that is eliminated in the five-score profiles from which the C3 score-set is computed.
This causes a slight decrease in the C3-DQ3 correlation and would suggest that the DQ3 scores would be even more efficient for reproducing the C3 score-set if these specific conditions were eliminated or reduced.

The comparable correlations after error has been introduced, however, are considerably less encouraging. As was expected, the relatively high C3 correlations for the error-free case decrease to very low relationships due to the magnification of error in the k - 2 space. Only the DQ4 correlations with the C4 and C5 criterion scores are of sufficient magnitude to warrant further discussion.

The Effect of Cluster Scoring for the Block Q-sort

As has been discussed, the effect of cluster scoring can be empirically examined by correlating the criterion profile scatter of each individual with the corresponding cluster profile scatter obtained from the test design under consideration. These relationships for the test designs considered in this study were presented in Table 6 for the perfectly reliable and less reliable items.

The error-free correlation between the individual scatter within the criterion profiles and the individual scatter within the cluster profiles was .50 for the block Q-sort. This relationship was considerably lower than for the other test designs. In examining the computational processes, however, it was observed that this lower correlation was somewhat misleading, since most of the reduced relationship was caused by an extreme shift in the rank of one person. In view of the unusual variation of this person's obtained profile scatter, a second correlation was computed for nineteen persons with the atypical person eliminated. Under this condition the correlation between the criterion and cluster profile variances increased to .72.
Just how many other undiscovered specific effects exist in this block Q-sort one cannot say, but in any case, the individual differences in variance within the five-score cluster profiles are moderately related to the criterion variances.

Under the influence of error the comparable correlation between criterion profile scatter and cluster profile scatter for the block Q-sort, as listed in Table 6, is .64. This supports the above inference that there is some value in cluster scoring in the block design. Since this relationship is considerably smaller than for either the unforced or total Q-sort tests under the same rate of error, however, it is a further indication of the general inefficiency and inflexibility of the block design as compared with the other two experimental designs.

The Characteristics of the DQ4 Measures for the Block Q-Sort

In view of the above data indicating the value of cluster scoring in this block Q-sort design, further consideration of the DQ4 similarity measures is necessary. The C5-DQ4 correlations are .43 and .51 and the C4-DQ4 correlations are .53 and .48 under error-free and error conditions, respectively. The relative stability of these relationships under the absence and presence of error suggests that the effect of error is reduced by the clustering process.

A further discussion of the characteristics of the block DQ4 score-sets is necessary. The correlations of C5-DQ4 = .51 and C4-DQ4 = .48 under the influence of error indicate that these DQ4 scores contain some criterion information due to differences in elevation and scatter. Since the item sample has not been altered, the criterion elevation component in the DQ4 scores results from the unequal factor weightings in the factor space and the item sample, as already discussed. Consequently, cluster scoring seems of some value in this block design and would probably be of more value if some of these specific effects could be reduced.
Some additional information concerning the treatment of error in cluster scoring should be given by the correlation between the error-free and error block DQ4 score-sets. This correlation was computed as .28. Since the DQ4 scores do have some validity for the specified rate of error, this low reliability estimate is somewhat perplexing and probably misleading. However, since the error and non-error correlations of the DQ3 and D3 scores are also low, .34 and .45, respectively, the reliability of any block Q-sort similarity score is likely to be relatively low for any operational rate of error.

General Conclusions Concerning the Use of the Block Q-Sort Design in Similarity Studies

The data of this study indicate that the block design as here described is a relatively inflexible and inefficient type of instrument for measuring the similarity between persons under practical conditions. If the items are highly reliable, the k - 2 criterion scores can be reproduced with considerable accuracy. The DQ4 scores computed after cluster scoring offer some possibility for reproducing the C5 and C4 criterion scores, but the indicated relationships are rather low.

Apparently, the disadvantages of the Q-sort are magnified and multiplied by the small subtest Q-sort design. In particular, small idiosyncratic tendencies that appear in each subtest assume large proportions when combined several times. This demonstrates the need for further theory relating to the forced-choice design test, in particular, those types that are composites of many small forced-choice subtests.

This inference is supported by a study by Rudin and others (14). They investigated the reliability of a block Q-sort type instrument comparable to the one under discussion.
Their conclusion as to the reliability of their block Q-sort instrument for measuring the similarity between persons was as follows:

"Measures of Real Similarity on the present instrument were very unreliable and cannot be expected to correlate with criteria. The reliability of these measures increased slightly but not sufficiently by item selection. Profile scoring was of no help. However, it should be noted that the profile scoring procedure was not finally tested here, having been limited to five-item blocks and clusters of low internal consistency." (14, p. 11)

Comparison of the Efficiency of the Three Experimental Test Designs

The correlations of the criterion score-sets and the obtained similarity score-sets for the three experimental test designs are compared in Table 8. Since each criterion score-set should be reproduced most highly by those obtained score-sets that include comparable information, the correlations between criterion and logically related obtained score-sets are underlined. The largest correlation between each criterion score-set and a logically related obtained score-set is doubly underlined for both error-free and error conditions. Those semi-relevant criterion-obtained correlations, i.e., relationships between criterion and obtained score-sets that are logical because of specific conditions of this study, are indicated by underlining with dots.

Inspection of these data indicates certain trends or patterns of relationships between the criterion scores and the obtained scores for different test designs. For example, in every case, i.e., for all criterion score-sets under both the presence and absence of error, the largest efficiency correlation is produced by an obtained score-set of the unforced test design. These data indicate that under the conditions specified in this study the unforced test design has some superiority in both efficiency and flexibility over the two experimental Q-sort designs.
This superiority of the unforced design is greatest for reproducing the C5 score-set and reduces to practical insignificance for the C3 scores. As discussed earlier, these relationships for the unforced test must be interpreted with caution, since this experimental unforced design will be difficult to reproduce in practice. These data, however, do suggest that the unforced design has the greatest potential value for measuring similarity between persons. The degree to which this potential value is realized depends upon the extent to which response sets can be eliminated (or included, if they are valid measures) and the clarity with which the multi-scale positions can be presented.

Further inspection of the correlations of Table 8 indicates that the total Q-sort tends to be more efficient for reproducing the C. score-set than the block Q-sort. While some of the differences between these relationships are probably due to chance effects, the overall trend suggests some slight superiority for the total Q-sort.

TABLE 8

CORRELATIONS OF CRITERION SCORE-SETS AND OBTAINED SIMILARITY SCORE-SETS FOR THE THREE EXPERIMENTAL TEST DESIGNS*

Perfectly Reliable Items (Obtained Similarity Score-Sets)

Criterion  Test Design     D5           D4           DQ3          DQ4          D3
C5         Unforced        .92          [illegible]  ...          ...          .39
C5         Total Q-sort    ...          ...          [illegible]  [illegible]  .36
C5         Block Q-sort    ...          ...          .47          .43          .36
C4         Unforced        .58          .93          ...          ...          [illegible]
C4         Total Q-sort    ...          ...          [illegible]  [illegible]  .40
C4         Block Q-sort    ...          ...          .40          .53          .39
C3         Unforced        .27          .46          ...          ...          [illegible]
C3         Total Q-sort    ...          ...          .66          .71          .88
C3         Block Q-sort    ...          ...          .78          .81          [illegible]

Moderately Reliable Items (Obtained Similarity Score-Sets)

Criterion  Test Design     D5           D4           DQ3          DQ4          D3
C5         Unforced        .81          .71          ...          ...          .26
C5         Total Q-sort    ...          ...          .22          .65          [illegible]
C5         Block Q-sort    ...          ...          .12          .51          .20
C4         Unforced        [illegible]  .77          ...          ...          .31
C4         Total Q-sort    ...          ...          .18          .66          .28
C4         Block Q-sort    ...          ...          .10          .44          .20
C3         Unforced        .25          .31          ...          ...          .45
C3         Total Q-sort    ...          ...          .38          .38          .38
C3         Block Q-sort    ...          ...          .28          .19          [illegible]

*The correlation between each criterion score-set and its most relevant obtained score-set is singly underlined; the largest such relationship for each criterion score-set is doubly underlined. Semi-relevant relationships due to the specific conditions of this study are underlined with dots.

The Implications of the C3 Correlations

The matrix containing the correlations of C3 scores with the various obtained similarity scores under the rate of error specified in this study is of considerable interest. In particular, note the generally low magnitudes and limited range of correlations for the most related obtained score-sets, i.e., from .45 to .28. These data indicate that with this rate of error all three test types are uniformly inefficient in reproducing this criterion configuration of similarity scores. The maximum C3 correlation of .45 indicates that approximately 20 percent (.45 squared is about .20) of the C3 information is being reproduced by this obtained configuration of similarity scores. Considering that the C3 criterion scores contain approximately 15 percent of the total C5 criterion information, one realizes just how small a portion of the total criterion variance (roughly .20 x .15, or about 3 percent) is being reproduced by obtained measures that operate in k - 2 dimensions. These data indicate that C3 information is difficult to measure accurately under practical rates of error.

The general implications of this study have suggested the inadvisability of using similarity scores that ignore all individual differences in elevation and scatter. Some test situations may arise, however, in which elevation information is completely irrelevant and information due to scatter can be eliminated on theoretical or empirical grounds. In such a case, the obtained similarity scores should operate in k - 2 dimensions and should not measure individual differences in elevation and scatter. If the test conditions are similar to those of this study, however, it will be necessary to use an extremely long test or to construct highly reliable items in order to accurately reproduce the desired C3 criterion information.
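The distinction between similarity scores of k, k - 1, and k - 2 dimensions can be illustrated with a minimal sketch in Python. It assumes that the D measure is the Euclidean distance between two score profiles, that elevation is the profile mean, and that scatter is the square root of the sum of squared deviations about that mean. The function names and the sample profiles are hypothetical and are not taken from this study.

```python
import numpy as np


def d_full(x, y):
    """D over all dimensions: Euclidean distance between two raw profiles."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sqrt(np.sum((x - y) ** 2)))


def remove_elevation(x):
    """Express a profile as deviations about its own mean (elevation removed)."""
    x = np.asarray(x, dtype=float)
    return x - x.mean()


def remove_elevation_and_scatter(x):
    """Remove elevation, then rescale the deviations to unit scatter."""
    dev = remove_elevation(x)
    scatter = np.sqrt(np.sum(dev ** 2))  # assumed definition of scatter
    if scatter == 0.0:
        raise ValueError("a flat profile has no scatter to equalize")
    return dev / scatter


def d_no_elevation(x, y):
    """D with differences in elevation eliminated (the k - 1 case)."""
    return d_full(remove_elevation(x), remove_elevation(y))


def d_no_elevation_scatter(x, y):
    """D with elevation and scatter both eliminated (the k - 2 case)."""
    return d_full(remove_elevation_and_scatter(x),
                  remove_elevation_and_scatter(y))


# Hypothetical profiles, invented for illustration only.
a = [1, 2, 3, 4, 5]
b = [3, 4, 5, 6, 7]    # same shape and scatter as a, higher elevation
c = [2, 4, 6, 8, 10]   # same shape as a, larger elevation and scatter

for name, other in (("b", b), ("c", c)):
    print(name,
          d_full(a, other),
          d_no_elevation(a, other),
          d_no_elevation_scatter(a, other))
```

In this sketch, profiles a and b differ only in elevation, so their distance vanishes once elevation is removed; a and c still differ after elevation is removed and become identical only when scatter is also equalized. With both elevation and scatter removed, the squared distance equals 2(1 - r), where r is the ordinary correlation between the two profiles, which is consistent with computing such scores directly from Q correlations.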
Limitations of this Study

It is important that the restricted nature of this test situation be kept in mind in interpreting the results of this study. Furthermore, it must be remembered that only twenty persons are included in the sample of persons under consideration. Thus, some sampling effects will be included in the computed data that are specific to this study. Consequently, statements as to the superiority of one test design over another cannot be generalized to other test situations from this study alone. This study, however, tends to support the position that properly structured unforced designs will be most valid for measuring similarity between persons.

Many limiting assumptions concerning the number of factors, distribution of errors, assignment of obtained scores, etc., were made of necessity to keep this investigation operationally feasible. As a consequence, the implications of the empirical data presented here must be interpreted with caution. The major contribution of this study is seen as opening up a relatively new methodological approach to the study of similarity measures. This claim seems justified since the theoretical model and computational techniques developed in this study and available elsewhere (19) are quite general and can be used to further investigate many of the unanswered problems that have arisen due to the limitations of these test conditions. Therefore, most of the implications of this investigation should be valuable in the structuring of further research and must not be over-generalized as to their immediate application.

One important specific limitation of this item structure that has implications for generalizing the efficiency of the total Q-sort to other studies should be discussed briefly. This limitation concerns the a priori specification of a positive direction for all factors and the restriction that all item vectors have this same positive direction (see page 7).
Our present understanding of the problem is not sufficient to permit us to predict how the total Q-sort structured under these conditions would differ from a total Q-sort structured under the conditions originally specified by Stephenson (17). He suggested that items should be selected for the Q-sort sample in pairs, such that for every positively oriented item vector there should be a comparable item vector oriented in the opposite direction. For example, if a Q-sort item sample contains an item positively loaded on extroversion there should be a comparable item positively loaded on introversion. Since the results of this study were obtained under a specific design, additional studies must be completed before an exhaustive answer can be given as to the efficiency of the total Q-sort as a general technique for measuring similarities between persons.

Conclusions and Recommendations

With these limitations in mind, the following conclusions can be discussed as developing from this study.

1. Under the conditions of this study, the experimental unforced test design is more efficient than the two experimental Q-sort test designs for measuring the similarity between persons.
   a) Advantages of the unforced test design are:
      (1) All criterion score-sets are highly reproduced by logically related unforced score-sets under error-free conditions.
      (2) The k and k - 1 criterion score-sets are highly reproduced by logically related unforced score-sets under the rate of error introduced in this study.
      (3) Each unforced score-set has higher efficiency for reproducing the related criterion score-set under the error conditions of this study than the comparable score-set of either of the Q-sort designs.
   b) Disadvantages of the unforced test are:
      (1) It is subject to undesirable response-sets.
         (a) These response-sets can be reduced by carefully structuring the test items and the test instructions.
      (2) It is difficult to construct an unforced test that will give unforced, continuous responses.
         (a) A clearly defined multi-position scale for each item is probably a reasonable approximation.
            (1) Further research is needed to compare the efficiencies of an unforced test that permits continuous scoring with an unforced test that requires discontinuous scoring and, hence, permits tied scores.
2. Some criterion information is lost due to the forcing requirement when Q-sort designs are used to measure the similarity between persons.
   a) Test designs requiring smaller Q-sorts within the larger item sample, e.g., the block design, tend to increase this information loss by magnifying differences due to error or individual idiosyncrasies.
   b) The efficiency of the scores from the Q-sort design decreases as the factor structure of the item sample deviates from the factor structure of the specified criterion.
      (1) When the factor weights of the item sample differ from those of the criterion space, the obtained test scores include some criterion information due to individual differences in profile elevation, scatter, and shape. Under such conditions the Q-sort scores are not efficient for reproducing any one criterion score-set.
   c) Cluster scoring tends to increase the efficiency of the Q-sort test design for reproducing the criterion score-set that includes information due to differences in profile scatter and shape.
      (1) This suggests that Q-sort item samples should be designed to contain relatively independent clusters of homogeneous items in order to permit cluster scoring.
   d) Criterion scores that have eliminated information due to individual differences in elevation and scatter can be highly reproduced by the most related scores of the Q-sort design for highly reliable items.
      (1) When error is introduced into the system, this efficiency shows a decided decrease.
3. The reduction of the dimensions of similarity measures by eliminating individual differences in elevation, or elevation and scatter, has extremely important implications for studies of similarity between persons.
   a) Under the conditions of this study, a considerable portion of the total criterion information is lost in such reductions.
      (1) When elevation is eliminated, 33 percent of the total criterion information is lost.
      (2) When elevation and scatter are eliminated, 85 percent of the total criterion information is lost.
   b) The reduction of the dimensions of similarity scores from k - 1 to k - 2 by eliminating differences in scatter causes a definite magnification of the effect of any error present in the system.
      (1) As a result, similarity scores that eliminate differences in scatter are generally inefficient under practical error conditions regardless of the test design from which they were obtained.

These conclusions have many implications for researchers and practical testers who are interested in measuring the similarities between persons in a test structure as described in this study. While many of the above conclusions need additional support before wide generalization is possible, these recommendations seem reasonable at this time.

1. The results of this study suggest that similarity measures should be obtained using a carefully constructed unforced test design, since this type seems to be the most efficient for measuring similarity relationships between persons.
   a) Considerable care should be taken to avoid irrelevant response-sets in the unforced test, but for some similarity scores certain relevant response-sets may actually contribute to the validity of the measure.
2. If forced-choice test designs are used to measure similarity relationships, the greatest amount of criterion information can be consistently reproduced by cluster scoring the homogeneous items before computing similarity scores.
   a) The raw item scores should not be used to estimate similarity relationships unless the factor weights of the item sample have been carefully selected to be consistent with the factor weights of the criterion measures.
      (1) The use of such inefficient item scores, e.g., scores from inadequate item samples and/or from small forced-choice block arrangements, will consistently tend to obscure true relationships and may yield insignificant results where significant differences actually exist.
3. Criterion similarity relationships that ignore differences in elevation and scatter are extremely difficult to reproduce by any of the test designs used in this study.
   a) When theory or empirical evidence requires the measurement of such relationships, it will probably be necessary to use very long tests or to construct highly reliable items if efficient measurement is to be accomplished.

BIBLIOGRAPHY

1. Burt, C. Correlation between persons. Brit. J. Psychol., 1937, 28, 59-95.
2. Burt, C. The factors of the mind. London: Univ. of London Press, 1940.
3. Cronbach, L. J. Essentials of psychological testing. New York: Harper, 1949.
4. Cronbach, L. J. Further evidence on response-sets and test design. Educ. psychol. Measmt., 1950, 10, 3-31.
5. Cronbach, L. J., and Gleser, Goldine C. Similarity between persons and problems of profile analysis. Champaign-Urbana, Illinois: Univ. of Illinois, 1952. (Mimeographed Technical Report No. 2, Contract N6ori-07135 between the University of Illinois and the Office of Naval Research.)
6. Eberman, P. W. Personal relationships: one key to instructional improvement. Educ. Leadership, 1952, 9, 389-392.
7. Edelson, M., and Jones, A. E. The use of Q-technique and role-playing in an investigation of the self-concept. Unpublished manuscript, Univ. of Chicago, 1951. (Private circulation.)
8. Fiedler, F. E. A comparison of therapeutic relationships in psychoanalytic, non-directive, and Adlerian therapy. J. consult. Psychol., 1950, 14, 436-446.
9. Fiedler, F. E., Hartmann, W., and Rudin, S. A. The relationship of interpersonal perception to effectiveness in basketball teams. Champaign-Urbana, Illinois: Univ. of Illinois, 1952. (Mimeographed Technical Report No. 3, Contract N6ori-07135 between the University of Illinois and the Office of Naval Research.)
10. Fiedler, F. E., Blaisdell, F. J., and Warrington, W. G. Unconscious attitudes and the dynamics of sociometric choice in a social group. J. abnorm. soc. Psychol., 1952, 47, 790-796.
11. Gordon, L. V. Validities of the forced-choice and questionnaire methods of personality measurement. J. appl. Psychol., 1951, 35, 407-412.
12. McNemar, Q. Psychological statistics. New York: Wiley, 1949.
13. Osgood, C. E., and Suci, G. J. A measure of the relation determined by both mean difference and profile information. Psychol. Bull., 1952, 49, 251-262.
14. Rudin, S. A., Lazar, I., Ehart, Mary E., and Cronbach, L. J. Some empirical studies of the reliability of social perception scores. Champaign-Urbana, Illinois: Univ. of Illinois, 1952. (Mimeographed Technical Report No. 4, Contract N6ori-07135 between the University of Illinois and the Office of Naval Research.)
15. Stephenson, W. Correlating persons instead of tests. Character and Pers., 1935, 4, 17-24.
16. Stephenson, W. Methodological consideration of Jung's typology. J. ment. Sci., 1939, 85, 185-205.
17. Stephenson, W. Q-technique: variate-designs and propositional sets. Unpublished monograph, Univ. of Chicago, 1951.
18. Stephenson, W. A note on Professor R. B. Cattell's methodological adumbrations. J. clin. Psychol., 1952, 8, 206-207.
19. Warrington, W. G. The efficiency of the Q-sort and other test designs for measuring the similarity between persons. Unpublished doctoral thesis, University of Illinois, 1952.
20. Zuckerman, J. V. Interest item response arrangement as it affects discrimination between professional groups. J. appl. Psychol., 1952, 36, 79-85.