id: work_v42zogqmhrdlnmk3ml2fjj2sj4
author: Kent Johnson
title: A Lot of Data
date: 2011
pages: 13
extension: .pdf
mime: application/pdf
words: 5510
sentences: 583
flesch: 71
summary: estimate the size of a linguistic data set. a given linguistic data set. explicitly addressing this matter, by suggesting an operationalized characterization of an expression type. is similar to familiar cases involving multivariate data sets, with one exception: the irrelevance of negative associations. matter of estimating the size of a linguistic data set. some (explicit) means for estimating the size of a data set. a theory is that new, redundant data are all too easy to generate. Importantly, however, redundancy is a holistic affair, potentially involving most or all of the data set. be relative to both a given data set and the particular theory at hand. Instead, determining the relevant properties of expressions may be a matter of a "bootstrap" procedure (Glymour 1980) as the theory is developed linguist uses current theory to hypothesize some relevant structure thought set of n vectors could have the correlations of its nonnegative variant.
cache: ./cache/work_v42zogqmhrdlnmk3ml2fjj2sj4.pdf
txt: ./txt/work_v42zogqmhrdlnmk3ml2fjj2sj4.txt