id: work_slf56v6zgnezzi3cayi4weg36y
author: Linnea Frangen
title: Digital Humanities Project: Comparing Language Complexity in Fact-Checked Fake and Real News
date: 2020
pages: 6
extension: .pdf
mime: application/pdf
words: 1799
sentences: 149
flesch: 72
summary: My research question is how language complexity differs between fact-checked fake although the result may be less distinct since both the real and fake news come from fact-checking The fact-checked news data come from a MisInfoText GitHub repository that contains Fake news dataset contains 33,712 total words and 6,526 unique word forms. Real news dataset contains 54,997 total words and 8,987 unique word forms. Average Words Per Sentence: fake news 21.8, real news 19.5. The normed rate of the 14 conjunctions in total is 5.76 per 100 words for real news The same code is repeated twice on different datasets (first for the real news data and then The analysis shows that the differences in language complexity between the real news dataset and of articles included in the data, the fake news dataset included, for example, an article from
cache: ./cache/work_slf56v6zgnezzi3cayi4weg36y.pdf
txt: ./txt/work_slf56v6zgnezzi3cayi4weg36y.txt