Is there a more efficient way in dplyr or the wider tidyverse to filter out rows that match any of several text patterns, as in the example below, or do I just need to compile a character vector and use a loop? The use case is removing political tweets that are not relevant to my analysis. Here, text is the column holding the tweet message in a data frame created with the twitteR library.
df <- df %>%
  filter(!str_detect(text, fixed("squat when #Putin annexed Crimea", ignore_case = TRUE)),
         !str_detect(text, fixed("environmental pol", ignore_case = TRUE)),
         !str_detect(text, fixed("Republicans have meeting with the Russians", ignore_case = TRUE)),
         !str_detect(text, fixed("impeach", ignore_case = TRUE)),
         !str_detect(text, fixed("AboutStrzok", ignore_case = TRUE)),
         !str_detect(text, fixed("Clinton", ignore_case = TRUE)),
         !str_detect(text, fixed("Obama", ignore_case = TRUE)),
         !str_detect(text, fixed("UraniumOne", ignore_case = TRUE)),
         !str_detect(text, fixed("Mueller", ignore_case = TRUE)),
         !str_detect(text, fixed("Hillary", ignore_case = TRUE)),
         !str_detect(text, fixed("Brennan", ignore_case = TRUE)),
         !str_detect(text, fixed("BUNDY", ignore_case = TRUE)),
         !str_detect(text, fixed("MAGA", ignore_case = TRUE)),
         !str_detect(text, fixed("realDonaldTrump", ignore_case = TRUE)),
         !str_detect(text, fixed("Obame", ignore_case = TRUE)),
         !str_detect(text, fixed("uranium 1", ignore_case = TRUE)),
         !str_detect(text, fixed("contaminate", ignore_case = TRUE)),
         !str_detect(text, fixed("munitions", ignore_case = TRUE)),
         !str_detect(text, fixed("AngelaMerk", ignore_case = TRUE)),
         !str_detect(text, fixed("TheMighty200", ignore_case = TRUE)),
         !str_detect(text, fixed("uranium-free water", ignore_case = TRUE)))
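For comparison, the character-vector idea mentioned above could look roughly like the sketch below, with no explicit loop. It keeps the same literal, case-insensitive matching as the `fixed()` calls in the example. The sample data frame, the shortened `exclude` vector, and the helper name `matches_any` are assumptions made up for illustration, not part of the real data or of any package.

```r
library(dplyr)
library(stringr)

# Hypothetical sample data standing in for the real tweet data frame
df <- data.frame(
  text = c(
    "Time to IMPEACH, says senator",
    "New trail opens near the river",
    "Uranium-free water now available",
    "Local bakery wins award"
  ),
  stringsAsFactors = FALSE
)

# Literal phrases to exclude (a shortened version of the list above)
exclude <- c("impeach", "Clinton", "uranium 1", "uranium-free water")

# Hypothetical helper: TRUE where x contains at least one of the phrases.
# Each phrase is matched literally and case-insensitively via fixed(),
# and the per-phrase logical vectors are combined element-wise with `|`.
matches_any <- function(x, phrases) {
  hits <- lapply(phrases, function(p) str_detect(x, fixed(p, ignore_case = TRUE)))
  Reduce(`|`, hits, rep(FALSE, length(x)))
}

# Keep only the rows that match none of the phrases
df_filtered <- df %>%
  filter(!matches_any(text, exclude))
```

If none of the phrases contain regex metacharacters, another option might be to collapse them into one alternation pattern, for example `regex(paste(exclude, collapse = "|"), ignore_case = TRUE)`, and pass that to a single `str_detect()` call.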