Webscraping

Scraping, Visualizing, and Analyzing the Web

Twitter

Twitter is a micro-blogging site where users can broadcast status updates of 140 characters or fewer. If you aren't familiar with the site, you can explore it here.

While there are many social networking sites that hold rich information for research, Twitter is an ideal space because:

1. Most profiles are public: Other sites like Facebook and Instagram may have interesting data. However, depending on users' privacy settings, you will only be able to collect certain information, potentially skewing your findings.

2. Tweets are more than just text: Each time you mine data from Twitter, you can collect much more than the text itself. This may be tweet-based content such as photos, links, and geolocations, or user-based content such as profile picture, number of followers, and sign-up date.

3. Tweets can only be 140 characters: The fact that each tweet can only be 140 characters may seem limiting at first, but it is actually quite helpful in the analysis process. Further, users have learned to adapt to the smaller space, so you are not really missing out on any important data that 141+ characters would have allowed.

4. Twitter users can have both friends and followers: Unlike a site like Facebook, where friending is reciprocal, Twitter lets users gain followers without adding them to their own friends list (the accounts they follow). Because of this, audiences can be analyzed more precisely, and network maps can be more dynamic, revealing more information.
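
To illustrate the fourth point, the one-way friend/follower relationship can be thought of as a directed network. The following is only a minimal Python sketch under stated assumptions: the account names and the `follows` data are made up for illustration, and nothing here contacts Twitter.

```python
# Hypothetical follow data: each key follows every account in its set.
follows = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": set(),
}

def friends(user):
    """Accounts that `user` follows (Twitter calls these 'friends')."""
    return follows.get(user, set())

def followers(user):
    """Accounts that follow `user`."""
    return {other for other, followed in follows.items() if user in followed}

def mutuals(user):
    """Accounts that `user` follows and that also follow `user` back."""
    return friends(user) & followers(user)

# carol has a follower (alice) without following anyone herself.
print(sorted(followers("carol")))
# Only bob has a reciprocal relationship with alice.
print(sorted(mutuals("alice")))
```

On a site with reciprocal friending, `friends` and `followers` would always return the same set; here they can differ, which is what makes the direction of each tie informative.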

Why Scrape Twitter?

Before getting started, make sure that your research question is one that Twitter data can actually answer. Social media scraping, in general, is most useful when you are trying to understand some phenomenon that is taking place online.

In particular, Twitter data allows you to:

  • Understand your own Twitter network or the influence of your own tweets
  • Collect data about tweeters (followers, friends, signup date, favorites, profile picture, etc.)
  • Know who is mentioned through @usernames
  • See how information disseminates
  • See the influence/popularity of tweets and people
  • Examine networks and communities
  • Explore how trends develop and change over time
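
As a small illustration of the @username item in the list above, the sketch below pulls mentions out of tweet text that has already been collected and counts them. It is a rough approximation, not Twitter's own parsing logic: the sample tweets are invented, and the pattern simply assumes usernames are 1 to 15 letters, digits, or underscores.

```python
import re
from collections import Counter

# Approximate pattern for @mentions. The lookbehind skips e-mail
# addresses such as someone@example.com.
MENTION = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})")

# Invented sample data standing in for collected tweets.
sample_tweets = [
    "Great talk by @alice and @bob today!",
    "@Alice thanks for sharing",
    "Contact me at someone@example.com",
]

def extract_mentions(text):
    """Return the mentioned usernames in `text`, lowercased, without the @."""
    return [name.lower() for name in MENTION.findall(text)]

# Count how often each account is mentioned across all sample tweets.
mention_counts = Counter(
    name for tweet in sample_tweets for name in extract_mentions(tweet)
)
print(mention_counts.most_common())
```

Usernames are lowercased before counting so that `@Alice` and `@alice` are treated as the same account.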

Getting an API Key

To get your access tokens:

  • You must have a Twitter account.
  • Go to https://apps.twitter.com/ and log in to register an app.
  • Once you have registered the app, the site will generate the four keys required.
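
Once you have the four keys, it is generally safer to keep them out of your scripts. The sketch below shows one possible way to do that in Python by reading them from environment variables. The variable names are hypothetical (choose your own), and the assumption that the four keys are a consumer key, consumer secret, access token, and access token secret should be checked against what your app page actually shows.

```python
import os

# Hypothetical environment variable names, one per key.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)

def load_credentials(env=None):
    """Return the four keys as a dict, or raise if any is missing or empty.

    `env` defaults to the real environment; a plain dict can be passed
    instead, which is convenient for testing.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}

# Demonstration with placeholder values instead of real keys.
demo_env = {name: "placeholder-value" for name in REQUIRED_KEYS}
credentials = load_credentials(demo_env)
print(sorted(credentials))
```

In real use you would call `load_credentials()` with no argument after setting the variables in your shell, then pass the values to whichever Twitter client library you choose.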

Some Scraping Tools

There are a variety of tools for Twitter scraping that are easier to use than code-based scraping. These tools do not require any programming knowledge; instead, they have been set up as ready-to-use websites for easy collection of tweets. I include a few of the best ones below.

For visualizing Twitter, see our network and text analysis Temple guides.

Truthy

Indiana University's Observatory on Social Media (OSoMe) has developed several different tools for studying information diffusion on social media.

  • Track how different memes trend over time 
  • Hoaxy visualizes the spread of claims versus fact-checked articles
  • Botometer checks the likelihood of a Twitter account being a bot 
  • Networks employs network analysis to explore who is discussing a meme and how different memes are related

Twitter Archiver

  • Google add-on that uses Twitter's Advanced Search feature 
  • Automatically collects tweets into a spreadsheet
  • Only allows one search to run at a time, but you can change the search criteria as often as you want