Help adapting a function

Hello,
I took some code I saw on this forum.
It lists the data frames that contain a variable whose name equals a given string:

library(tidyverse)  # tibble(), map_df(), pivot_longer(), filter(), str_*()

df1 <- tibble(score_a=1:20,
              sex_a=rep(c("M", "F"), 10))
df2 <- tibble(score_b=1:20,
              sex_b=rep(c("M", "F"), 10))
df3 <- tibble(score_c=1:20,
              sex_c=rep(c("M", "F"), 10))
df4 <- tibble(score_c=1:20,
              sex_c=rep(c("M", "F"), 10))

data_list<-list(df1,df2,df3,df4)
names(data_list)<-c("df1","df2","df3","df4")
names(data_list)

mifx <- function(data, text) {
  data %>%
    map_df(~names(.x)) %>%
    pivot_longer(cols = everything(), names_to = "dataframe", values_to = "variables") %>%
    filter(variables == text)
}

mifx(data_list,"score_a")

mifx lists the data frame and the variable for each match.
I rewrote the code to perform a partial match, but it fails.

mifx2 <- function(data, text) {
  data %>%
    map_df(~names(.x)) %>%
    pivot_longer(cols = everything(), names_to = "dataframe", values_to = "variables") %>%
    filter(str_match(variables,text))
}

mifx2(data_list,"score")

What did I do wrong?
Thanks for your time and interest.
Have a nice day.

Nice reprex.

Brought a full set of golf clubs to a miniature golf course?

df1 <- data.frame(score_a=1:20,
              sex_a=rep(c("M", "F"), 10))
df2 <- data.frame(score_b=1:20,
              sex_b=rep(c("M", "F"), 10))
df3 <- data.frame(score_c=1:20,
              sex_c=rep(c("M", "F"), 10))
df4 <- data.frame(score_c=1:20,
              sex_c=rep(c("M", "F"), 10))

data_list<-list(df1,df2,df3,df4)
names(data_list)<-c("df1","df2","df3","df4")

# interest in names of variables
sapply(data_list,colnames)
#>      df1       df2       df3       df4      
#> [1,] "score_a" "score_b" "score_c" "score_c"
#> [2,] "sex_a"   "sex_b"   "sex_c"   "sex_c"
# interest in only the names of the first variable
sapply(data_list,colnames)[1,]
#>       df1       df2       df3       df4 
#> "score_a" "score_b" "score_c" "score_c"
# interest in only names of first variable ending in a or b
# ("[ab]$" anchors both letters to the end; "a|b$" would also match an "a" anywhere)
grep("[ab]$",sapply(data_list,colnames)[1,], value = TRUE)
#>       df1       df2 
#> "score_a" "score_b"
# interest in sizes
sapply(data_list,dim)
#>      df1 df2 df3 df4
#> [1,]  20  20  20  20
#> [2,]   2   2   2   2

Created on 2023-07-02 with reprex v2.0.2
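
For the partial match asked about in the question, the same {base} approach might look roughly like the sketch below. It is only an illustration under assumptions: find_vars() is a hypothetical helper name, not something from the thread, and the pattern is treated as a regular expression by grepl().

```r
# Same toy data as in the thread, built with base data.frame()
data_list <- list(
  df1 = data.frame(score_a = 1:20, sex_a = rep(c("M", "F"), 10)),
  df2 = data.frame(score_b = 1:20, sex_b = rep(c("M", "F"), 10)),
  df3 = data.frame(score_c = 1:20, sex_c = rep(c("M", "F"), 10)),
  df4 = data.frame(score_c = 1:20, sex_c = rep(c("M", "F"), 10))
)

# One row per (data frame, column name) pair.
# stack() calls the columns `values` (the column name)
# and `ind` (the list element it came from).
vars <- stack(lapply(data_list, colnames))

# Hypothetical helper: keep the rows whose column name matches `pattern`
find_vars <- function(vars, pattern) {
  vars[grepl(pattern, vars$values), ]
}

hits <- find_vars(vars, "score")
hits
```

grepl() returns TRUE/FALSE for each column name, so it can be used directly as a row index.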

Most questions that I see here appear to come from people who shared my experience of using {ggplot2} as a gateway drug and then went on to collect the whole set of packages designed along "tidy" principles. Over the years, as more and more rolled out, I found myself with cheatsheets to my collection of cheatsheets, because it began to seem that every problem had its own unique combination of tidy tools. I began to suffer severe syntax fatigue.

So, I went back to the parts I had skipped over originally, and took a spell of doing everything in {base}. What I found is that, once over the residual school trauma of punctuation, it is easier to remember how to do things, because {base} keeps me tightly focused on exactly what I'm trying to do at every step of the way, rather than trying to invoke some sequence of steps that works in the right order but not always for obvious reasons.

str_match() returns the strings that were found (as a character matrix), and filter() doesn't know what to do with that: filter() expects logical values (TRUE/FALSE), not strings. So simply replace str_match() with str_detect(), which returns a logical vector rather than characters.
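
A minimal sketch of that change, reusing the data and pipeline from the question. The name mifx_partial is made up here for illustration, and the individual packages are loaded instead of library(tidyverse) only to keep the example small.

```r
library(dplyr)    # filter(), %>%, tibble()
library(purrr)    # map_df()
library(tidyr)    # pivot_longer()
library(stringr)  # str_detect()

data_list <- list(
  df1 = tibble(score_a = 1:20, sex_a = rep(c("M", "F"), 10)),
  df2 = tibble(score_b = 1:20, sex_b = rep(c("M", "F"), 10)),
  df3 = tibble(score_c = 1:20, sex_c = rep(c("M", "F"), 10)),
  df4 = tibble(score_c = 1:20, sex_c = rep(c("M", "F"), 10))
)

# Same pipeline as mifx2 in the question; the only change is
# str_detect(), which returns TRUE/FALSE for each variable name
mifx_partial <- function(data, text) {
  data %>%
    map_df(~names(.x)) %>%
    pivot_longer(cols = everything(), names_to = "dataframe", values_to = "variables") %>%
    filter(str_detect(variables, text))
}

res <- mifx_partial(data_list, "score")
res
```

Note that str_detect() treats text as a regular expression by default; wrapping it in fixed(), as in str_detect(variables, fixed(text)), would match it literally instead.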

Thanks for your help.
