How can I create a loop which runs stock tickers through an API?

Hi, I have a data set of initial public offerings (IPOs) and their IPO dates. One column holds the stock tickers and the other holds the IPO date.
I want to find out how many tweets contain the ticker symbol as a hashtag 1, 2 and 3 days prior to the IPO date. The problem is that I have nearly 3000 tickers...
Does anybody know how I can get RStudio to run these 3000 tickers as the query for these 3 points in time through the count_all_tweets command? That would help me out massively :)
I'm sorry if I messed up the formatting of this post. I'm kind of new here.

**Code:**
```
library(academictwitteR)
library(devtools)
devtools::install_github("cjbarrie/academictwitteR", build_vignettes = TRUE)

set_bearer()
get_bearer()
# Load data into R
getwd()
setwd("C:/Users/Hidde/Documents/MASTER/Research Project")
dataFrame <- read.csv("IPO excel data.csv", header = TRUE, sep = ";")

ticker <- dataFrame$Ticker
IPO_date <- dataFrame$Offer.date

# counting the tweets
tweets <- count_all_tweets(
  query = '#Ticker',
  start_tweets = '2015-01-05T00:00:00Z',
  end_tweets = '2015-05-05T00:00:00Z',
  bearer_token = get_bearer())
```

Below is one potential solution. I don't have a token for academictwitteR, so I can't be certain of the outcome, but I think this general approach should work.

library(tidyverse)
library(academictwitteR)
library(lubridate)

# sample data
dataFrame = data.frame(
  Ticker = c('ticker1', 'ticker2', 'ticker3'),
  Offer.date = c('2020-01-01', '2022-03-20', '2021-05-15')
)

dataFrame
#>    Ticker Offer.date
#> 1 ticker1 2020-01-01
#> 2 ticker2 2022-03-20
#> 3 ticker3 2021-05-15

# reshape data; determine dates of prior 3 days
df = dataFrame %>%
  mutate(day1 = as.Date(Offer.date) - days(1),
         day2 = as.Date(Offer.date) - days(2),
         day3 = as.Date(Offer.date) - days(3)
         ) %>%
  pivot_longer(c(-'Ticker', -'Offer.date'), values_to = 'date') 

df
#> # A tibble: 9 × 4
#>   Ticker  Offer.date name  date      
#>   <chr>   <chr>      <chr> <date>    
#> 1 ticker1 2020-01-01 day1  2019-12-31
#> 2 ticker1 2020-01-01 day2  2019-12-30
#> 3 ticker1 2020-01-01 day3  2019-12-29
#> 4 ticker2 2022-03-20 day1  2022-03-19
#> 5 ticker2 2022-03-20 day2  2022-03-18
#> 6 ticker2 2022-03-20 day3  2022-03-17
#> 7 ticker3 2021-05-15 day1  2021-05-14
#> 8 ticker3 2021-05-15 day2  2021-05-13
#> 9 ticker3 2021-05-15 day3  2021-05-12

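# function to get tweet counts
get_tweet_counts = function(i) {
  
  # format query and build a one-day window: [date, date + 1 day)
  d = df[i,] %>%
    mutate(query = paste0('#', Ticker),
           start_date = paste0(date, 'T00:00:00Z'),
           end_date = paste0(date + days(1), 'T00:00:00Z'))
  
  # get tweet counts
  # (end_tweets is set one day after start_tweets, since an identical
  #  start and end would describe an empty window)
  tweet_count = count_all_tweets(
    query = d$query,
    start_tweets = d$start_date,
    end_tweets = d$end_date,
    bearer_token = get_bearer())
  
  # add the total count to d
  # (assumes count_all_tweets returns a data frame with a tweet_count column;
  #  untested, check the column names of the actual output)
  d %>% mutate(n_tweets = sum(tweet_count$tweet_count))
  
}

# gather tweet counts for each row of df
tweets = map(seq_len(nrow(df)), get_tweet_counts) %>%
  bind_rows()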

Created on 2023-01-16 with reprex v2.0.2
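
With nearly 3000 tickers and 3 days each, that is roughly 9000 requests, so a single failed request should probably not abort the whole run. Below is a minimal base R sketch of one way to handle that. The names `count_safely`, `count_fn`, `fake_count` and the `pause` argument are hypothetical and only for illustration; `fake_count` stands in for the real API call so the example runs without a token.

```r
# Hypothetical helper: calls `count_fn` once per row of `requests` and records
# NA instead of stopping when a single request fails.
# `requests` needs the columns Ticker (character) and date (class Date).
count_safely <- function(requests, count_fn, pause = 1) {
  requests$n_tweets <- NA_real_
  for (i in seq_len(nrow(requests))) {
    # one-day window: midnight of `date` up to midnight of the next day
    start <- paste0(format(requests$date[i]), "T00:00:00Z")
    end   <- paste0(format(requests$date[i] + 1), "T00:00:00Z")
    requests$n_tweets[i] <- tryCatch(
      count_fn(paste0("#", requests$Ticker[i]), start, end),
      error = function(e) {
        message("Row ", i, " failed: ", conditionMessage(e))
        NA_real_
      }
    )
    Sys.sleep(pause)  # assumed pause between requests; adjust to your rate limit
  }
  requests
}

# Stand-in for the real API call, used only so the sketch runs offline.
fake_count <- function(query, start, end) {
  if (query == "#BAD") stop("simulated API error")
  nchar(query)
}

requests <- data.frame(
  Ticker = c("AAA", "BAD", "CCCC"),
  date   = as.Date(c("2019-12-31", "2022-03-19", "2021-05-14"))
)

result <- count_safely(requests, fake_count, pause = 0)
```

To use it with the real package, `count_fn` could be a small wrapper such as `function(q, s, e) sum(count_all_tweets(query = q, start_tweets = s, end_tweets = e)$tweet_count)`. This assumes the returned data frame has a `tweet_count` column, which I could not verify without a token. Rows that end up as `NA` can be filtered out afterwards and retried.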
