How to use grep to select column indexes containing phrase from another data frame?

How do I select the indexes of all columns whose names contain a date from another data frame? I am trying to find the column indexes in dataset_rep whose names contain one of the dates in dates_data$dates. The code below isn't working: only index 1 is returned, and the remaining list elements are NULL. I would appreciate help, and an explanation of why this isn't working would be great! I'd like to keep the output in the same format as it is now (a list), just with the correct indexes. I think something is wrong with the loop.

I expected indexes 1, 3, 4, and 5, not just 1.

Reproducible example:

library(tidyverse)
library(lubridate)


dataset_rep <-
  data.frame(`2022-01-01_adoptions` = c(122, 122,133), 
             `2021-12-11_adoptions` = c(111,111,133), 
             `2022-03-08_adoptions` = c(444,222,111), 
             `2022-04-01_adoptions` = c(444,389,122), 
             `2022-04-11_adoptions` = c(333,542,123), 
             `2022-05-01_adoptions` = c(421,555,211))

dataset_rep <-
  dataset_rep %>%
  rename(`2022-01-01_adoptions` = `X2022.01.01_adoptions`)

dataset_rep <-
  dataset_rep %>%
  rename(`2021-12-11_adoptions` = `X2021.12.11_adoptions`)

dataset_rep <-
  dataset_rep %>%
  rename(`2022-03-08_adoptions` = `X2022.03.08_adoptions`)

dataset_rep <-
  dataset_rep %>%
  rename(`2022-04-01_adoptions` = `X2022.04.01_adoptions`)

dataset_rep <-
  dataset_rep %>%
  rename(`2022-04-11_adoptions` = `X2022.04.11_adoptions`)

dataset_rep <-
  dataset_rep %>%
  rename(`2022-05-01_adoptions` = `X2022.05.01_adoptions`)

dates_data <-
  data.frame(dates = c('2022-01-01', '2022-03-08', '2022-04-01', '2022-04-11'))


cols_relevant <- vector("list", nrow(dates_data))
for (i in seq_along(dates_data)) {
  cols_relevant[[i]] <- grep(dates_data$dates[[i]], colnames(dataset_rep))
} # expected indexes: 1, 3, 4, 5



Thank you

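When you call seq_along(dates_data), you are iterating over the columns of dates_data, not its rows, because a data frame is a list of columns. There is only one column, so the for loop runs once: only cols_relevant[[1]] gets filled in and the other elements stay NULL. You want seq_along(dates_data$dates) instead.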
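
To see the difference concretely, here is a minimal sketch using the same dates_data as in the question. The names cols_idx and rows_idx are made up for illustration only.

```r
dates_data <-
  data.frame(dates = c('2022-01-01', '2022-03-08', '2022-04-01', '2022-04-11'))

# A data frame is a list of columns, so its length is the number of columns.
length(dates_data)   # number of columns: 1
nrow(dates_data)     # number of rows: 4

# seq_along() on the whole data frame therefore yields a single index ...
cols_idx <- seq_along(dates_data)

# ... while seq_along() on the column yields one index per date.
rows_idx <- seq_along(dates_data$dates)

cols_idx
rows_idx
```

seq_len(nrow(dates_data)) would give the same indexes as seq_along(dates_data$dates) here, if you prefer to think in terms of rows.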

dataset_rep <-
  data.frame(`2022-01-01_adoptions` = c(122, 122,133), 
             `2021-12-11_adoptions` = c(111,111,133), 
             `2022-03-08_adoptions` = c(444,222,111), 
             `2022-04-01_adoptions` = c(444,389,122), 
             `2022-04-11_adoptions` = c(333,542,123), 
             `2022-05-01_adoptions` = c(421,555,211),
             check.names = FALSE)

dates_data <-
  data.frame(dates = c('2022-01-01', '2022-03-08', '2022-04-01', '2022-04-11'))

cols_relevant <- vector("list", nrow(dates_data))

for (i in seq_along(dates_data$dates)) {
  cols_relevant[[i]] <- grep(dates_data$dates[[i]], colnames(dataset_rep))
}
cols_relevant
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 3
#> 
#> [[3]]
#> [1] 4
#> 
#> [[4]]
#> [1] 5

Created on 2022-05-27 by the reprex package (v2.0.1)
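
A note on the check.names = FALSE argument in the code above: by default, data.frame() converts column names into syntactically valid R names, which is why the question needed all those rename() calls. The sketch below shows the effect on a single column; with_check and without_check are hypothetical names used only for this illustration.

```r
# Default behaviour: names are passed through make.names(),
# so an "X" is prepended and the hyphens become dots.
with_check <- data.frame(`2022-01-01_adoptions` = c(122, 122, 133))
colnames(with_check)

# check.names = FALSE keeps the name exactly as written.
without_check <- data.frame(`2022-01-01_adoptions` = c(122, 122, 133),
                            check.names = FALSE)
colnames(without_check)

# The date pattern only matches the unmodified name.
grep('2022-01-01', colnames(with_check))      # no match
grep('2022-01-01', colnames(without_check))   # matches column 1
```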

By the way, you can get the same result more compactly with lapply.

dataset_rep <-
  data.frame(`2022-01-01_adoptions` = c(122, 122,133), 
             `2021-12-11_adoptions` = c(111,111,133), 
             `2022-03-08_adoptions` = c(444,222,111), 
             `2022-04-01_adoptions` = c(444,389,122), 
             `2022-04-11_adoptions` = c(333,542,123), 
             `2022-05-01_adoptions` = c(421,555,211),
             check.names = FALSE)

dates_data <-
  data.frame(dates = c('2022-01-01', '2022-03-08', '2022-04-01', '2022-04-11'))

cols_relevant <- lapply(dates_data$dates, function(DATE) grep(DATE, colnames(dataset_rep)))
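
If at some point you want a plain vector of indexes rather than a list, here are two possible ways to get one. This is only a sketch: cols_list, cols_flat, pattern and cols_flat_alt are names I made up, and fixed = TRUE is my own addition so the dates are matched as literal text rather than as regular expressions.

```r
dataset_rep <-
  data.frame(`2022-01-01_adoptions` = c(122, 122, 133),
             `2021-12-11_adoptions` = c(111, 111, 133),
             `2022-03-08_adoptions` = c(444, 222, 111),
             `2022-04-01_adoptions` = c(444, 389, 122),
             `2022-04-11_adoptions` = c(333, 542, 123),
             `2022-05-01_adoptions` = c(421, 555, 211),
             check.names = FALSE)

dates_data <-
  data.frame(dates = c('2022-01-01', '2022-03-08', '2022-04-01', '2022-04-11'))

# Option 1: keep the per-date list, then flatten it.
cols_list <- lapply(dates_data$dates,
                    function(d) grep(d, colnames(dataset_rep), fixed = TRUE))
cols_flat <- unlist(cols_list)

# Option 2: combine the dates into one alternation pattern and call grep() once.
pattern <- paste(dates_data$dates, collapse = "|")
cols_flat_alt <- grep(pattern, colnames(dataset_rep))

cols_flat
cols_flat_alt
```

Both give indexes 1, 3, 4 and 5 for this data. Note that option 2 returns the indexes in column order, whereas option 1 follows the order of the dates.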

Thank you! I will look into lapply. Appreciate you explaining what I was doing wrong and the help!
