archivesunleashed/twut
# Tweet Archives Unleashed Toolkit (twut)

An open-source toolkit for analyzing line-oriented JSON Twitter archives with Apache Spark.

## Dependencies

- Java 8 or 11
- Python 3
- Apache Spark

## Getting Started

### Packages

#### Spark Shell

    $ spark-shell --packages "io.archivesunleashed:twut:0.0.4"

### Jars

You can download the latest release files here and include them like so:

#### Spark Shell

    $ spark-shell --jars /path/to/twut-0.0.4-fatjar.jar

#### PySpark

    $ pyspark --py-files /path/to/twut-0.0.4.zip

You will need the `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` environment variables set.

## Documentation! Or, how do I use this?

Once built or downloaded, you can follow the basic set of recipes and tutorials here.

## License

Licensed under the Apache License, Version 2.0.

## Acknowledgments

This work is primarily supported by the Andrew W. Mellon Foundation. Other financial and in-kind support comes from the Social Sciences and Humanities Research Council, Compute Canada, the Ontario Ministry of Research, Innovation, and Science, York University Libraries, Start Smart Labs, and the Faculty of Arts and David R. Cheriton School of Computer Science at the University of Waterloo.

Any opinions, findings, and conclusions or recommendations expressed are those of the researchers and do not necessarily reflect the views of the sponsors.

Latest release: twut-0.0.4 (Dec 11, 2019).
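For readers unfamiliar with the input format named in the description, the following is a minimal sketch of what "line-oriented JSON" means in practice: one tweet object per line. It uses only the Python standard library and does not call twut itself. The sample records are made up for illustration, with field names (`id_str`, `text`, `user.screen_name`) assumed to follow the Twitter API v1.1 tweet object; `read_tweets` is a hypothetical helper, not part of the toolkit.

```python
import io
import json

# Made-up sample records shaped like Twitter API v1.1 tweet objects.
# Real archives carry many more fields per line.
SAMPLE = io.StringIO(
    '{"id_str": "1", "text": "hello", "user": {"screen_name": "alice"}}\n'
    '\n'
    '{"id_str": "2", "text": "world", "user": {"screen_name": "bob"}}\n'
)


def read_tweets(lines):
    """Yield one dict per non-blank line of a line-oriented JSON stream."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


tweets = list(read_tweets(SAMPLE))
ids = [t["id_str"] for t in tweets]
screen_names = [t["user"]["screen_name"] for t in tweets]
print(ids, screen_names)
```

In Spark, `spark.read.json(path)` reads this same one-object-per-line layout into a DataFrame by default, which is the kind of input the toolkit is described as analyzing.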