For loop error because of different row lengths (rtweet)

statguy15 · December 21, 2020, 11:00am

I am trying to run the loop below, which applies the get_friends function to a dataset of Twitter users. When I run my_users2 <- find_f(my_users) (the call below the loop), it fails with: "Error in `$<-.data.frame`(`*tmp*`, "faktor1", value = c(1.35714285714286, : ... replacement has 15 rows, data has 25", so no data is saved or downloaded in the end. The data I am using consists of 25 Twitter users, so I don't understand why "the replacement has 15 rows". The loop code is shown below. The Sys.sleep() call seems to be doing its job, and the loop works fine with 14 or fewer users (since 15 is the rate limit for the get_friends function).

Nevertheless, the loop won't work with larger datasets. Does anyone know what to do about these different lengths? I can post the traceback if necessary.

find_f <- function(df){
  F1_mean <- c()
  F2_mean <- c()
  anyFriends <- c()
  
  for(i in 1:nrow(dataset)){
    #pick user1 of dataset
    user_friends <- get_friends(dataset$screen_name[i])
    
    #pause for 15 minutes after every 15th request (rate limit)
    if (i %% 15 == 0){
      Sys.sleep(15*60) # must enter time in seconds
    }
    
    #intersect with candidate data
    ids_intersect = intersect(user_friends$user_id, kandidatdata$user_id)
    if(length(ids_intersect)==0){
      anyFriends <- c(anyFriends, FALSE) # User has no friends in the politicians df
    } else {
      
      #assign values to user based on intersect
      kandidat_friends = kandidatdata[kandidatdata$user_id %in% ids_intersect,]
      F1_mean <- c(F1_mean, mean(kandidat_friends$faktor1, na.rm=TRUE))
      F2_mean <- c(F2_mean, mean(kandidat_friends$faktor2, na.rm=TRUE))
      anyFriends <- c(anyFriends, TRUE) # user has friends in the politicians dataset
    }
  }
  dataset$faktor1 <- F1_mean
  dataset$faktor2 <- F2_mean
  dataset$anyFriends <- anyFriends
  return(dataset[dataset$anyFriends,])
  
}

my_users2 <- find_f(my_users)
View(my_users2)
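
One plausible reading of the error, sketched under assumptions: in the loop above, F1_mean and F2_mean only grow in the else branch, so any user with no overlap (or for whom get_friends returned nothing) leaves them shorter than the number of rows. The sketch below preallocates one slot per user and fills by index, so all three vectors always have nrow(df) entries. The names find_f_sketch, fetch_friends, fake_friends, kandidat_toy and users_toy are hypothetical and made up for illustration; fetch_friends stands in for get_friends so the example runs offline.

```r
# Sketch only: `fetch_friends` is a hypothetical stand-in for get_friends.
# It must return a data frame with a `user_id` column.
find_f_sketch <- function(df, kandidatdata, fetch_friends) {
  n <- nrow(df)
  F1_mean <- rep(NA_real_, n)   # one slot per user, NA until filled
  F2_mean <- rep(NA_real_, n)
  anyFriends <- logical(n)      # starts as all FALSE

  for (i in seq_len(n)) {
    user_friends <- fetch_friends(df$screen_name[i])
    ids_intersect <- intersect(user_friends$user_id, kandidatdata$user_id)
    if (length(ids_intersect) > 0) {
      hits <- kandidatdata[kandidatdata$user_id %in% ids_intersect, ]
      F1_mean[i] <- mean(hits$faktor1, na.rm = TRUE)
      F2_mean[i] <- mean(hits$faktor2, na.rm = TRUE)
      anyFriends[i] <- TRUE
    }
  }

  df$faktor1 <- F1_mean
  df$faktor2 <- F2_mean
  df$anyFriends <- anyFriends
  df[df$anyFriends, ]           # keep only users following a politician
}

# Toy data, invented for this example
kandidat_toy <- data.frame(user_id = c("p1", "p2"),
                           faktor1 = c(3.5, 2.0),
                           faktor2 = c(1.0, 4.0),
                           stringsAsFactors = FALSE)
users_toy <- data.frame(screen_name = c("a", "b", "c"),
                        stringsAsFactors = FALSE)
fake_friends <- function(name) {
  ids <- switch(name,
                a = c("p1", "p2"),   # follows both politicians
                b = character(0),    # follows nobody in the list
                c = c("p2", "x9"))   # follows one politician
  data.frame(user_id = ids, stringsAsFactors = FALSE)
}

result <- find_f_sketch(users_toy, kandidat_toy, fake_friends)
print(result)
```

With the real data, the same idea would mean calling get_friends inside the loop and keeping the Sys.sleep() pause. The function also uses its df argument throughout instead of the global dataset, which the original mixes.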

If you need the context: I am trying to measure political ideology on Twitter (using rtweet). I now have a data frame containing the user_ids of more than 100 politicians, along with two ideal point scores, 'factor 1' and 'factor 2' (both factors range from 1 to 4). It looks like this (called kandidat):

Navne	Faktor 1	Faktor 2
"Politician1"	3.5	1.0
"Politician2"	2.0	4.0
Etc...	X	X

I would then like to detect whether random Twitter users follow one or more of the politicians in my dataset. If they follow, e.g., two of the politicians in my dataset, "Politician1" and "Politician2", I will assign the user the mean of those two politicians' ideal point scores on each factor. A Twitter user following these two politicians would then get factor 1 = (3.5+2.0)/2 = 2.75 and factor 2 = (1.0+4.0)/2 = 2.50.
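
As a small check of that averaging step, here is a sketch using only the two example rows from the table above. The object names kandidat_example and user_scores are made up for illustration, and the column names faktor1 and faktor2 are assumed to match the ones used in the loop.

```r
# Example rows copied from the table above (Politician1 and Politician2)
kandidat_example <- data.frame(
  Navne   = c("Politician1", "Politician2"),
  faktor1 = c(3.5, 2.0),
  faktor2 = c(1.0, 4.0),
  stringsAsFactors = FALSE
)

# Column-wise mean: one score per factor for a user following both
user_scores <- colMeans(kandidat_example[, c("faktor1", "faktor2")])
print(user_scores)
```

Averaging column by column gives 2.75 for factor 1 and 2.50 for factor 2 for this user.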
