help
September 23, 2022, 4:54pm
1
Hi! I'd like to check whether a variable contains any words other than "cats", "dogs", "cat", and "dog" by creating a new column called "check" that tests this. Below is the output I want. Thank you! <3
number           check
8 cats 9 dogs    FALSE
8 cats 11 dogs   FALSE
9 rats 0 dogs    TRUE
8 cats 1 toy     TRUE
1 cat 1 dog      FALSE
Below is one approach. I added a mix of upper case letters and punctuation to the sample data to make it more generalizable.
library(dplyr)
library(stringr)
d = data.frame(
  number = c('8 cats 9 dogs',
             '8 Cats 11 DOgs',
             '9 RATS, 0 dogs',
             '8 cats - 1 toy!!',
             '1 cat...1 dog9')
)
# words to remove, all in lowercase
vector_of_words_to_remove = c('cats',
                              'dogs',
                              'cat',
                              'dog')
# collapse into a single regex alternation ('cats|dogs|cat|dog') for use below
words_to_remove = paste0(vector_of_words_to_remove, collapse = '|')
d %>%
  # make everything lowercase
  mutate(number = tolower(number)) %>%
  # remove the words listed in the vector
  mutate(check = str_replace_all(number, words_to_remove, '')) %>%
  # replace everything that is not a letter (e.g. digits, punctuation) with a space
  mutate(check = str_replace_all(check, '[^a-z]', ' ')) %>%
  # collapse all whitespace
  mutate(check = str_squish(check)) %>%
  # TRUE if any letters remain, FALSE otherwise
  mutate(check = check != '')
#>             number check
#> 1    8 cats 9 dogs FALSE
#> 2   8 cats 11 dogs FALSE
#> 3   9 rats, 0 dogs  TRUE
#> 4 8 cats - 1 toy!!  TRUE
#> 5   1 cat...1 dog9 FALSE
Created on 2022-09-23 with reprex v2.0.2.9000
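A possible variation, offered as a sketch rather than a replacement for the approach above: instead of deleting the allowed words as substrings, extract each run of letters as a whole word and test it against the allowed list. This avoids an edge case where a made-up word such as "catdog" is removed entirely and reported as FALSE. The helper name `has_other_words` and the `allowed` vector are hypothetical names chosen for this illustration.

```r
library(stringr)

# allowed words, all lowercase (hypothetical name for this sketch)
allowed <- c("cats", "dogs", "cat", "dog")

# TRUE if a string contains any whole word that is not in `allowed`
has_other_words <- function(x, allowed) {
  # every run of letters counts as one word; digits and punctuation are ignored
  words <- str_extract_all(tolower(x), "[a-z]+")
  vapply(words, function(w) any(!w %in% allowed), logical(1))
}

d <- data.frame(
  number = c("8 cats 9 dogs",
             "8 Cats 11 DOgs",
             "9 RATS, 0 dogs",
             "8 cats - 1 toy!!",
             "1 cat...1 dog9")
)
d$check <- has_other_words(d$number, allowed)
d
```

On the sample data this gives the same check column as above (FALSE, FALSE, TRUE, TRUE, FALSE), and it leaves the original number column untouched instead of lowercasing it.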
help
September 27, 2022, 1:37am
3
Hi, thank you so much. I just ran this code, and I'm not entirely sure I understand what this line does:
mutate(check = str_replace_all(check, '[^a-z]', ' ')) %>%
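It replaces every character that is not a lowercase letter (digits, punctuation, and spaces) with a space. Here is the result before that line is applied: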
d %>%
  mutate(number = tolower(number)) %>%
  mutate(check = str_replace_all(number, words_to_remove, ''))
#>             number        check
#> 1    8 cats 9 dogs        8  9
#> 2   8 cats 11 dogs       8  11
#> 3   9 rats, 0 dogs   9 rats, 0
#> 4 8 cats - 1 toy!! 8  - 1 toy!!
#> 5   1 cat...1 dog9     1 ...1 9
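One detail of the removal step is worth noting: the regex engine tries the alternatives from left to right and takes the first one that matches, so the longer words ("cats", "dogs") need to come before their shorter prefixes ("cat", "dog"). The sketch below illustrates this; `shorter_first` and `pattern_sorted` are hypothetical names, and sorting by length is just one way to make the order safe.

```r
library(stringr)

longer_first  <- "cats|dogs|cat|dog"   # the order used in this thread
shorter_first <- "cat|dog|cats|dogs"   # hypothetical reordering

str_replace_all("8 cats 9 dogs", longer_first, "")   # "8  9 "
str_replace_all("8 cats 9 dogs", shorter_first, "")  # "8 s 9 s"

# one way to guard against this: put the longest words first
words <- c("cat", "dog", "cats", "dogs")
pattern_sorted <- paste0(words[order(nchar(words), decreasing = TRUE)],
                         collapse = "|")
str_replace_all("8 cats 9 dogs", pattern_sorted, "")
```

With the shorter words first, a stray "s" survives the removal, and the row would be flagged TRUE even though it only contains allowed words.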
And here is the result after adding the following line:
  # replace everything that is not a letter (e.g. digits, punctuation) with a space
  mutate(check = str_replace_all(check, '[^a-z]', ' '))
#>             number        check
#> 1    8 cats 9 dogs
#> 2   8 cats 11 dogs
#> 3   9 rats, 0 dogs     rats
#> 4 8 cats - 1 toy!!        toy
#> 5   1 cat...1 dog9
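To see the whole pipeline on a single value, the steps can be traced one at a time on plain strings. This is only an illustrative sketch; the helper `trace_check` is a hypothetical name, and it assumes the input is already lowercase.

```r
library(stringr)

# run each step of the pipeline on one string and keep the intermediate results
trace_check <- function(x) {
  removed      <- str_replace_all(x, "cats|dogs|cat|dog", "")  # drop allowed words
  letters_only <- str_replace_all(removed, "[^a-z]", " ")      # non-letters become spaces
  squished     <- str_squish(letters_only)                     # trim and collapse spaces
  list(removed = removed, squished = squished, check = squished != "")
}

with_toy <- trace_check("8 cats - 1 toy!!")
with_toy$removed   # "8  - 1 toy!!"
with_toy$squished  # "toy"
with_toy$check     # TRUE

only_allowed <- trace_check("8 cats 9 dogs")
only_allowed$squished  # ""
only_allowed$check     # FALSE
```

After the non-letter replacement, a row that contained only allowed words is nothing but spaces, so `str_squish()` reduces it to an empty string and the final comparison returns FALSE.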