2/22/2021
Get Twitter Data Using R and Tableau
During the 2020 Tableau Conference-ish this year, I published a viz called #DATA20 BY THE MINUTE, where I visualized tweets with the #data20 hashtag counted by the minute. To do that, I used R to collect the data from Twitter. I thought others might find this useful for tracking their own tweets, hashtags, or other people and topics, so here is a quick blog post on how to get this data and bring it into Tableau.
Initiate the Code
To use this code, you will need a Twitter handle and to set up a (free) Twitter Developer App here. After creating an app, you will get an API Key, API Secret Key, and Bearer Token. We need these three to execute the code that downloads the data. Note: scroll to the bottom of this blog post if you want to copy and paste all of the code at once.
## Basic information
vignette("auth", package = "rtweet")
## install dev version of rtweet from github
remotes::install_github("ropensci/rtweet")
## install httpuv if not already installed
if (!requireNamespace("httpuv", quietly = TRUE)) {
  install.packages("httpuv")
}
## Install these packages
install.packages("rtweet", dependencies = TRUE)
install.packages("jsonlite")
install.packages("stringr")
library(jsonlite)
library(rtweet)
## name of the twitter app and API settings
app_name <- "[Your Twitter App Name]"
consumer_key <- "[Your Consumer Key]"
consumer_secret <- "[Your Consumer Secret]"
bearer_token <- "[Your Bearer Token]"
## create token
token <- create_token(app_name, consumer_key, consumer_secret)
## print token (just to make sure it is working)
token
Note: In the R code above, replace the consumer_key, consumer_secret, and bearer_token with your own inside the quotes (without the brackets). Every request sent to Twitter must include a token, so you should store it as an environment variable.
## save token to home directory
path_to_token <- file.path(path.expand("~"), ".twitter_token.rds")
saveRDS(token, path_to_token)
## create env variable TWITTER_PAT (with path to saved token)
env_var <- paste0("TWITTER_PAT=", path_to_token)
## save as .Renviron file (or append if the file already exists)
cat(env_var, file = file.path(path.expand("~"), ".Renviron"),
    fill = TRUE, append = TRUE)
## refresh .Renviron variables
readRenviron("~/.Renviron")
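As an optional sanity check (my own addition, not part of the original setup), you can confirm that the saved token is being picked up by asking Twitter for your remaining rate limits or by looking up a single account:
## optional sanity check: confirm the stored token works
## (rate_limit() and lookup_users() are standard rtweet functions)
rate_limit()
lookup_users("HighVizAbility")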
Get the #data20 Hashtag
After setting up the initial code, the next step is the code that collects the data. The sample code below will search tweets for the hashtag #data20 and return up to 25,000 results, not including retweets.
## https://public.tableau.com/profile/jeffrey.shaffer#!/vizhome/DATA20BYTHEMINUTE/DATA20Hashtag
rt <- search_tweets(
  "#data20", n = 25000, include_rts = FALSE
)
location <- users_data(rt)
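The viz counted these tweets by the minute. That aggregation can be done directly in Tableau, but if you want a quick preview in R first, here is a minimal sketch of my own (not part of the original code), assuming the created_at and status_id columns that search_tweets returns in rtweet 0.7:
## optional: preview tweet counts per minute before building the viz in Tableau
rt$minute <- format(rt$created_at, "%Y-%m-%d %H:%M")
tweets_per_minute <- aggregate(list(tweets = rt$status_id),
                               by = list(minute = rt$minute), FUN = length)
head(tweets_per_minute)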
After the data is collected, the next bit of code will write the data to CSV files. Replace the path and file name below with your desired location.
## Change your file paths below as needed
library(data.table)
fwrite(rt, file = "D:\\Dropbox\\Data20 Tweets.csv")
fwrite(location, file = "D:\\Dropbox\\Data20 Locations.csv")
The final output is two CSV files, one with the tweets and the other with the user information (including location). You can create a relationship (noodle) or join them together in Tableau using the user_id field.
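If you want to verify that join key before loading the files into Tableau, here is a minimal sketch of my own (not in the original post), assuming both data frames include a user_id column as they do in rtweet 0.7:
## optional: check that the tweets and user data join cleanly on user_id
joined <- merge(rt, location, by = "user_id", suffixes = c("_tweet", "_user"))
nrow(joined)  ## roughly the number of tweets if every user was matched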
Read Twitter Status IDs and Look Up Tweets
Another useful technique is looking up specific Twitter Status IDs and their tweets. For example, I used this technique to track the activity of my Tableau Tips last year. I published 194 tips and wanted to see which tips were the most favorited at the end of the year. To do this, I used a Google Sheet that had a list of all of the Tweets, specifically the Tweet Status ID.
In the R code below, I read these Status IDs from a Google Sheet into R, then look up each of them to gather the information about each Tweet. In this case, the Status ID is at the end of the URL, so there is a line of code that parses the Status_ID from the URL link. If you had a simple list of just the Status IDs that you wanted to track, then you would not need to parse them out of the URL.
## Ex. https://docs.google.com/spreadsheets/d/15_ikKizR52ugsZW2G0pCcmVW1_2qWzOd6dkGg9bBJuU/edit?usp=sharing
#install.packages('gsheet')
library(gsheet)
library(stringr)
tipsheet <- gsheet2tbl('docs.google.com/spreadsheets/d/15_ikKizR52ugsZW2G0pCcmVW1_2qWzOd6dkGg9bBJuU')
## Parse the Status_ID from the URL link
tipsheet$StatusID <- substr(tipsheet$Link, 43, str_length(tipsheet$Link) - 5)
status_ids <- tipsheet$StatusID
## Look up Tweets by Status ID
twt <- lookup_tweets(status_ids, token = bearer_token())
## Save data table to CSV (change your file path below as needed)
library(data.table)
fwrite(twt, file = "D:\\Dropbox\\TableauTip.csv")
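Since the goal was to find the most favorited tips at the end of the year, the final ranking can also be done in R. This is a minimal sketch of my own (not in the original code), assuming lookup_tweets returns a favorite_count column as it does in rtweet 0.7:
## optional: rank the tips by favorite count, most favorited first
top_tips <- twt[order(-twt$favorite_count),
                c("status_id", "created_at", "favorite_count", "text")]
head(top_tips, 10)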
The rtweet Package in R
There are a number of other tools available in the rtweet package. For example, you can get the followers, mentions, favorites, and timeline of a user. You can download the members or subscribers of a list. You can retrieve the direct messages you have sent or received. You can download Twitter trends globally, by city name, or even by longitude and latitude. For more information and sample code for some of these other features, check out the documentation for the rtweet package here.
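To give a sense of a few of those, here is a short sketch of my own (not code from this project) using rtweet's get_followers(), get_timeline(), get_favorites(), and get_trends() functions; the handle and city are just examples:
## followers of an account
flw <- get_followers("HighVizAbility", n = 5000)
## the most recent tweets on a user's timeline
tml <- get_timeline("HighVizAbility", n = 200)
## tweets a user has favorited
fav <- get_favorites("HighVizAbility", n = 200)
## trending topics for a city
trends <- get_trends("chicago")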
Below is all of the code used for this project as a quick reference for you to copy and paste.
## install dev version of rtweet from github
remotes::install_github("ropensci/rtweet")
## install httpuv if not already installed
if (!requireNamespace("httpuv", quietly = TRUE)) {
  install.packages("httpuv")
}
## Install these packages
install.packages("rtweet", dependencies = TRUE)
install.packages("jsonlite")
install.packages("stringr")
library(jsonlite)
library(rtweet)
## name of the twitter app and API settings
app_name <- "[Your Twitter App Name]"
consumer_key <- "[Your Consumer Key]"
consumer_secret <- "[Your Consumer Secret]"
bearer_token <- "[Your Bearer Token]"
## create token
token <- create_token(app_name, consumer_key, consumer_secret)
## print token (just to make sure it is working)
token
## save token to home directory
path_to_token <- file.path(path.expand("~"), ".twitter_token.rds")
saveRDS(token, path_to_token)
## create env variable TWITTER_PAT (with path to saved token)
env_var <- paste0("TWITTER_PAT=", path_to_token)
## save as .Renviron file (or append if the file already exists)
cat(env_var, file = file.path(path.expand("~"), ".Renviron"),
    fill = TRUE, append = TRUE)
## refresh .Renviron variables
readRenviron("~/.Renviron")
## Ex. https://docs.google.com/spreadsheets/d/15_ikKizR52ugsZW2G0pCcmVW1_2qWzOd6dkGg9bBJuU/
#install.packages('gsheet')
library(gsheet)
library(stringr)
tipsheet <- gsheet2tbl('docs.google.com/spreadsheets/d/15_ikKizR52ugsZW2G0pCcmVW1_2qWzOd6dkGg9bBJuU')
## Parse the Status_ID from the URL link
tipsheet$StatusID <- substr(tipsheet$Link, 43, str_length(tipsheet$Link) - 5)
status_ids <- tipsheet$StatusID
## Look up Tweets by Status ID
twt <- lookup_tweets(status_ids, token = bearer_token())
## Save data table to CSV (change your file path below as needed)
library(data.table)
fwrite(twt, file = "D:\\Dropbox\\TableauTip.csv")
## This was the code to get the #data20 hashtag for my viz
## https://public.tableau.com/profile/jeffrey.shaffer#!/vizhome/DATA20BYTHEMINUTE/DATA20Hashtag
rt <- search_tweets(
  "#data20", n = 25000, include_rts = FALSE
)
location <- users_data(rt)
## Change your file paths below as needed
fwrite(rt, file = "D:\\Dropbox\\Data20 Tweets.csv")
fwrite(location, file = "D:\\Dropbox\\Data20 Locations.csv")
I hope you find this information useful. If you have any questions, feel free to email me at Jeff@DataPlusScience.com
Jeffrey A. Shaffer
Follow on Twitter @HighVizAbility