id: ruebot-net-7157
author:
title: Tweets to @realdonaldtrump; How many fucks are there to give? | Nick Ruest
date:
pages:
extension: .html
mime: text/html
words: 567
sentences: 109
flesch: 85
summary: The data is updated by running a query on the Standard Search API every five days. donald_search_2018_06_06.jsonl
$ twarc hydrate to_realdonaldtrump_20180606_ids.txt > 20180609.jsonl
Once we have our full dataset, the first thing we'll do is remove all of the retweets with noretweets.py, giving us just the original tweets at @realDonaldTrump. Let's use tweet_text.
$ tweet_text.py 20180612_no_retweets.jsonl >| 20180612_tweet_text.txt
Now that we have just the text, we can count how many fucks there are with grep and wc!
$ grep -i "fuck" 20180612_tweet_text.txt | wc -l
That's a fuck-to-tweet ratio of 2.73%. For some more fun, let's take the last 1000 lines of our new text file and make an animated gif out of it.
$ grep -i "fuck" 20180612_tweet_text.txt > fucks.txt
cat /path/to/1000_fucks.txt | while read line; do
cache: ./cache/ruebot-net-7157.html
txt: ./txt/ruebot-net-7157.txt
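
The summary reports a ratio but only shows the command that produces the numerator. As a rough sketch of how the percentage could be worked out, the helper below divides the number of matching lines by the total number of lines. The function name `match_ratio` is hypothetical and not part of the original post, and it assumes the text file holds one tweet per line.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print the share of lines
# in a file that match a pattern, as a percentage with two decimals.
# Assumes one tweet per line in the input file.
match_ratio() {
    pattern=$1
    file=$2
    # grep -c counts matching lines; -i ignores case, like the grep in the summary.
    matches=$(grep -ci -- "$pattern" "$file")
    # Redirecting into wc keeps the file name out of its output.
    total=$(wc -l < "$file")
    # awk handles the floating-point division that plain sh arithmetic cannot.
    awk -v m="$matches" -v t="$total" \
        'BEGIN { if (t == 0) { print "0.00%"; exit }; printf "%.2f%%\n", 100 * m / t }'
}
```

It could be called as `match_ratio "fuck" 20180612_tweet_text.txt`. Counting lines rather than occurrences matches what `grep | wc -l` does, so a tweet containing the word twice still counts once.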