Hi everyone. I'd like to filter instances where respondents have answered an open-ended text question and display those responses along with their case id. I'm able to do this on a column-by-column basis, but since I need to perform this operation on several text variables, I'd like to create a function to minimize repetition. My attempt isn't working: it also shows cases that are "missing" on the text question. How can I write a function along the lines of example 2 (below) that produces the results of example 1? Thanks.
# a reproducible dataset
library(dplyr) # provides tibble(), filter(), select() and %>%
case_id <- c('case1', 'case2', 'case3', 'case4', 'case5')
text_q <- c('words', "", "", 'more words', 'other words')
other_q <- rep(1:5)
data <- tibble(case_id, text_q, other_q)
data
# example 1: the result I'm aiming for, keeping only cases with text responses
data %>%
  filter(text_q != "") %>%
  select(case_id, text_q)
# example 2: the code below lists all of the cases, but I'd like to filter out
# non-responses, as in the above example.
text_detect <- function(df, x) {
  df %>%
    filter(x != "") %>%
    select(case_id, x)
}
text_detect(data, "text_q")
filter() and select() use non-standard evaluation: you pass column names to them without quotation marks, as if the names were variables. In your function, x holds the string "text_q", so filter(x != "") compares that string with "", which is TRUE for every row, and nothing gets filtered out. To pass the desired column name into a function, use the {{ }} (embrace) syntax from rlang and call your function with an unquoted column name.
library(rlang)
library(dplyr, warn.conflicts = FALSE)
case_id <- c('case1', 'case2', 'case3', 'case4', 'case5')
text_q <- c('words', "", "", 'more words', 'other words')
other_q <- rep(1:5)
data <- tibble::tibble(case_id, text_q, other_q)
text_detect <- function(df, x) {
  df %>%
    filter({{ x }} != "") %>%
    select(case_id, {{ x }})
}
text_detect(data, text_q)
#> # A tibble: 3 x 2
#> case_id text_q
#> <chr> <chr>
#> 1 case1 words
#> 2 case4 more words
#> 3 case5 other words
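Since you need this for several text variables, it may be handy to keep passing the column name as a string, as in your original call. Here is a minimal sketch of that variant, assuming dplyr's .data pronoun and all_of(). The function name text_detect_chr and the extra column text_q2 are made up for illustration and are not part of your data.

```r
library(dplyr, warn.conflicts = FALSE)

# Same toy data as above, plus a second text column (text_q2) invented for illustration
data <- tibble(
  case_id = c("case1", "case2", "case3", "case4", "case5"),
  text_q  = c("words", "", "", "more words", "other words"),
  text_q2 = c("", "a comment", "", "", "another comment"),
  other_q = 1:5
)

# Hypothetical string-based variant: `x` is a column name given as a character string
text_detect_chr <- function(df, x) {
  df %>%
    filter(.data[[x]] != "") %>%  # .data[[x]] looks up the column named by the string
    select(case_id, all_of(x))    # all_of() selects columns named in a character vector
}

text_detect_chr(data, "text_q")

# Apply the function to several text columns; returns a named list of tibbles
text_cols <- c("text_q", "text_q2")
results <- lapply(setNames(text_cols, text_cols),
                  function(col) text_detect_chr(data, col))
```

With the string version you can loop over a character vector of column names, which is awkward to do with unquoted names. Note that filter() also drops rows where the condition evaluates to NA, so responses stored as NA rather than "" would be excluded as well.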