Loop through all dfs in environment

Hi all. I have written a function that labels a df column based on the df name, like so:

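library(dplyr)

# Add a `source` column holding the name of the object passed in
label_source <- function(object){
  col_entries <- substitute(object)  # capture the unevaluated argument
  object %>%
    mutate(source = as.character(col_entries))
}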

I have 30+ csvs loaded and I would like to write a loop that goes through each one and adds the new column using my function, but I don't know how to. I guess I need to create a list from the dataframes first, then loop through it, overwriting the original objects to create the new dfs with the extra column.

You can make lists of files with the dir() and list.files() functions, if it's the csvs you want.
Otherwise, the ls() function lists the named objects in your environment (the loaded csvs, i.e. data frames).
You can iterate nicely with the map or walk functions from the purrr package, which is part of the tidyverse.
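To make the ls() route a bit more concrete, here is a minimal sketch in base R. The environment `env` and the objects `sales`, `costs` and `note` are made up for illustration; in a real session you would call ls() on your own workspace instead.

```r
# Hypothetical objects standing in for a session with loaded csvs
env <- new.env()
assign("sales", data.frame(x = 1:2), envir = env)
assign("costs", data.frame(x = 3:4), envir = env)
assign("note", "not a data frame", envir = env)

# ls() returns the names of the objects as a character vector
all_names <- ls(envir = env)

# keep only the names whose object is a data.frame
is_df <- vapply(all_names,
                function(nm) is.data.frame(get(nm, envir = env)),
                logical(1))
df_names <- all_names[is_df]
df_names

# for files on disk, something like list.files(pattern = "\\.csv$")
# would give the file names instead
```

Note that `df_names` still only holds names, not the data frames themselves.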

Thanks for your quick reply

So when I create the list and try to loop through I get "Error in eval(lhs, parent, parent) : object 'i' not found"

list_of_dfs <- dir()

for i in list_of_dfs{
  label_source(i)
}

dir() lists your csv files rather than your data frames, so it would be better to call the result list_of_csvs, or to switch to the ls() function and filter the result to objects whose class includes data.frame.
I also note that the syntax of the for loop isn't right: it's missing the round brackets, like
for (i in list_of_dfs){
but I would also recommend purrr functions over for loops, something to think about.

A further note: functions such as dir() and ls(), whether they list files on your drive or data frames in your R environment, only return the names of the objects (as character vectors). So if you want to treat them as symbols, or get() the contents, you need an extra step.
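A small sketch of that extra step, using a made-up data frame called `sales`: the name is just a string until get() (or mget() for several names at once) looks up the object behind it.

```r
# Hypothetical data frame standing in for one of the loaded csvs
sales <- data.frame(x = 1:3)

nm <- "sales"            # what ls() hands back: just a name
class(nm)                # "character"

df <- get(nm)            # look the object up by its name
class(df)                # "data.frame"

# mget() does the same for a vector of names and returns a named list
dfs <- mget(c("sales"))
names(dfs)
```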

Thanks again :)

I'll have to look into purrr more

Right now I'm getting

Error in UseMethod("mutate_") :
no applicable method for 'mutate_' applied to an object of class "character"

list_of_csv <- dir()

for (i in list_of_csv){
  label_source(i)
}

Yes, so before you can mutate a df, you need to get() the df object that the name refers to, for example:

  label_source(get(i))
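Two things are worth watching when putting this into the loop. As far as I can tell, substitute() inside label_source() would then capture the call `get(i)` rather than the df name, and the loop also needs to assign the result back to overwrite the original object. A minimal sketch that handles both, assuming made-up data frames `sales` and `costs` and a hypothetical variant `label_source_named()` that takes the name as an explicit argument:

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for the loaded data frames
sales <- data.frame(x = 1:2)
costs <- data.frame(x = 3:5)
df_names <- c("sales", "costs")

# Hypothetical variant of label_source(): the name is passed in explicitly
# instead of being recovered with substitute()
label_source_named <- function(object, name) {
  object %>%
    mutate(source = name)
}

# for loop: fetch each df by name, label it, and overwrite the original
for (i in df_names) {
  assign(i, label_source_named(get(i), i))
}

# purrr alternative: collect the dfs in a named list, then label each
# element with its own name (imap passes the value and the name)
labelled <- imap(mget(df_names), label_source_named)
```

The for loop overwrites the objects in place, while the imap() version keeps everything in one named list, which tends to be easier to work with once there are 30+ data frames.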

thank you again! that's perfect
