Analysing Twitter data

Twitter is a well-known and widely used data source for research. Academics and journalists use it to gauge what is on people's minds and what the general population thinks about certain topics.

For business and ethical reasons, Twitter limits (free) access to the live stream of all tweets and to the full archive of all tweets published since Twitter's start. However, other ways of getting tweets are available.

Get started

For a beginner's tutorial on collecting and analysing tweets, please see:

Brad Rittenhouse, Ximin Mi, and Courtney Allen, "Beginner's Guide to Twitter Data," The Programming Historian 8 (2019), https://doi.org/10.46430/phen0083.

Ethics and Terms of use

As a general note: please consider the ethics of using tweets, as well as Twitter's terms of use. Many Twitter users do not consider publishing a tweet to be the same as consenting to all forms of research. The University of Sheffield notes in Research Ethics Policy Note no. 14 that research using social media data is considered research on human subjects. For tweets by politicians, other considerations probably apply.

Also, the free Twitter APIs do not provide access to all tweets, so a data set collected through them will not give a complete view of the 'Twittersphere'. The blog post "Twitter’s Developer Policies for Researchers, Archivists, and Librarians" discusses the August 2018 changes to Twitter's terms and conditions for research and archiving.

Software

The SOLO department (solo@fsw.leidenuniv.nl) at the Faculty of Social and Behavioural Sciences has a bulk license for ATLAS.ti, which is used for discourse analysis and includes functionality to get data from Twitter. Other options include text mining software such as Orange 3 (with its text mining plugin) or KNIME.
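
For a first impression of what a very basic text mining step looks like without any of these tools, the following Python sketch counts hashtags in a handful of tweets that have already been collected. It is only an illustration: the sample tweets, the name `count_hashtags`, and the simple hashtag pattern are all invented for this example and are not taken from ATLAS.ti, Orange 3, or KNIME.

```python
import re
from collections import Counter

# Hypothetical sample tweets, invented for illustration only.
SAMPLE_TWEETS = [
    "Reading about #OpenScience and #ethics today",
    "New workshop on #ethics in social media research",
    "Is #openscience the future? #Ethics matters too",
]

# Simplifying assumption: a hashtag is '#' followed by word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def count_hashtags(tweets):
    """Count hashtags across a list of tweet texts, ignoring case."""
    counts = Counter()
    for text in tweets:
        for tag in HASHTAG_PATTERN.findall(text):
            counts[tag.lower()] += 1
    return counts


if __name__ == "__main__":
    # Print the hashtags from most to least frequent.
    for tag, n in count_hashtags(SAMPLE_TWEETS).most_common():
        print(f"#{tag}: {n}")
```

Real projects usually need more careful preprocessing (for example handling retweets, URLs, and different languages), which is where dedicated tools such as those mentioned above can help.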