Read multiple files and track file names using vroom::vroom_lines()

library(dplyr)
library(vroom)
library(fs)

# Create Toy Data--------
fs::dir_create("data")  # make sure the target directory exists before writing
writeLines("FirstLine\nAnotherLine\nThirdLine", "data/file_01.txt")
writeLines("1st\n2nd\n3rd", "data/file_02.txt")

# Read--------
df <- vroom_lines(fs::dir_ls(path = "data", glob = "data/file_*txt")) %>% 
  as_tibble()
> df
# A tibble: 6 × 1
  value      
  <chr>      
1 FirstLine  
2 AnotherLine
3 ThirdLine  
4 1st        
5 2nd        
6 3rd  

How can I read the files as above but, in addition, include a column that will track the file names? Something like this:

> df_wanted
# A tibble: 6 × 2
  value       file_name  
  <chr>       <chr>      
1 FirstLine   file_01.txt
2 AnotherLine file_01.txt
3 ThirdLine   file_01.txt
4 1st         file_02.txt
5 2nd         file_02.txt
6 3rd         file_02.txt

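This is one option. fs::dir_ls() returns a vector named by path, so .id = "file" adds a column called file that holds each file's path (for example data/file_01.txt):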

df <- fs::dir_ls(path = "data", glob = "data/file_*txt") %>%
    purrr::map_dfr(~ vroom_lines(.x) %>% as_tibble, .id = "file")
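If the result should match df_wanted exactly (a file_name column holding only the base name, placed after value), a variation along these lines might work. It is only a sketch: it assumes the same toy files as in the question, and it swaps map_dfr() for purrr::map() plus dplyr::bind_rows(), with base basename() to strip the directory.

```r
library(dplyr)
library(purrr)
library(vroom)
library(fs)

# Recreate the toy data from the question (dir_create() does nothing if "data" exists)
fs::dir_create("data")
writeLines("FirstLine\nAnotherLine\nThirdLine", "data/file_01.txt")
writeLines("1st\n2nd\n3rd", "data/file_02.txt")

# dir_ls() returns a character vector named by path, so map() yields a named list
files <- fs::dir_ls(path = "data", glob = "data/file_*txt")

df_wanted <- files %>%
  purrr::map(~ tibble(value = vroom_lines(.x))) %>%  # one tibble per file
  dplyr::bind_rows(.id = "file_name") %>%            # list names become the id column
  mutate(file_name = basename(file_name)) %>%        # drop the "data/" prefix
  select(value, file_name)                           # match the requested column order
```

Because the id column is filled from the names of the list, this relies on dir_ls() naming its output by path; an unnamed vector of paths would need purrr::set_names() first.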

@andresrcs Many thanks for the elegant solution!