id: work_a734rm3mxbf4pn5ghgxadmhjja
author: Quinn Dombrowski
title: Preparing Non-English Texts for Computational Analysis
date: 2020
pages: 9
extension: .pdf
mime: application/pdf
words: 4883
sentences: 351
flesch: 54
summary: Most methods for computational text analysis involve doing things with "words": counting ... Depending on the text analysis method, a sufficiently large corpus (on the scale of multiple millions of words) may sufficiently minimize issues caused by inflection, for instance ... at a different kind of understanding of a text using some form of word frequency analysis. ... application of these text analysis methods may not be as straightforward for students working in other languages. For languages with a non-Latin alphabet, text encoding problems will render ... Unicode (UTF-8) encoding is the best option when working with text in any language, but ... between "words", before you can use it for computational text analysis. Fortunately, there is a growing community of scholars working on computational text analysis, and other digital humanities methods, as applied to languages other than English.
cache: ./cache/work_a734rm3mxbf4pn5ghgxadmhjja.pdf
txt: ./txt/work_a734rm3mxbf4pn5ghgxadmhjja.txt