Looping over various lists of dataframes to create a variable in each dataframe

I have four lists, each of which contains 12 dataframes. Something like this: `

for (i in 1:12) {
assign(paste0("df", i), data.frame(c=c(1,2,3), d=c(1,2,3), 
                                   e=c(1,2,3), e=c(1,2,3),
                                   g=c(1,2,3), h=c(1,2,3), 
                                   i=c(1,2,3), j=c(1,2,3),
                                   k=c(1,2,3), l=c(1,2,3)))
}

for (i in 1:4) {
  assign(paste0("list_", i), lapply(ls(pattern="df"), get))
}


rm(list=ls(pattern="df"))

`

Each list corresponds to a year, and each of its elements (the dataframes) corresponds to a month. Conveniently, the position of each dataframe in its list is equal to its month. So, in the first list, the first dataframe corresponds to January 2020, the second to February 2020, and so on. In the second list, the first dataframe corresponds to January 2021, the second to February 2021, and so on.

What I need to do is create a new variable in each dataframe that indicates its month.

I have been trying different things, including this:

for(j in 20:21) {
for(i in 1:12) {
assign(get(paste0("df_20", j))[[i]], 
       get(paste0("df_20", j))[[i]] %>% 
       mutate(month=i)) ## 1 is added to the month because it starts from February
}
}

But nothing works. The problem seems to be the left-hand side of the assignment. When I use the get() function, R returns an error ("Error in assign(get(paste0("mies_20", j))[[1]], get(paste0("mies_20", : invalid first argument"). If I don't include this function, paste0("df_20", j)[[i]] does not recognize the "[[i]]".

Any ideas?

Welcome to the community @svaldivieso! Is this what you are trying to do?

library(tidyverse)

# setup
for (i in 1:12) {
  assign(paste0("df", i), data.frame(c=c(1,2,3), d=c(1,2,3), 
                                     e=c(1,2,3), e=c(1,2,3),
                                     g=c(1,2,3), h=c(1,2,3), 
                                     i=c(1,2,3), j=c(1,2,3),
                                     k=c(1,2,3), l=c(1,2,3)))
  }

for (i in 1:4) {
  assign(paste0("list_", i), lapply(ls(pattern="df"), get))
}

rm(list=ls(pattern="df")); rm(i)

lists = ls()

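# function to walk through; "i" is the name of a list, passed as a string
walk_this = function(i) {
  d = get(i, envir = globalenv())  # fetch the list by its name
  
  # set month = position j; <<- modifies d in walk_this's environment
  add_month = function(j) {
    d[[j]]$month <<- j
  }
  
  walk(seq_along(d), add_month)
  assign(i, d, globalenv())  # write the modified list back under the same name
}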

# execute over all lists
walk(lists, walk_this)

# first 4 data frames in list_1 (with month added)
list_1[1:4]
#> [[1]]
#>   c d e e.1 g h i j k l month
#> 1 1 1 1   1 1 1 1 1 1 1     1
#> 2 2 2 2   2 2 2 2 2 2 2     1
#> 3 3 3 3   3 3 3 3 3 3 3     1
#> 
#> [[2]]
#>   c d e e.1 g h i j k l month
#> 1 1 1 1   1 1 1 1 1 1 1     2
#> 2 2 2 2   2 2 2 2 2 2 2     2
#> 3 3 3 3   3 3 3 3 3 3 3     2
#> 
#> [[3]]
#>   c d e e.1 g h i j k l month
#> 1 1 1 1   1 1 1 1 1 1 1     3
#> 2 2 2 2   2 2 2 2 2 2 2     3
#> 3 3 3 3   3 3 3 3 3 3 3     3
#> 
#> [[4]]
#>   c d e e.1 g h i j k l month
#> 1 1 1 1   1 1 1 1 1 1 1     4
#> 2 2 2 2   2 2 2 2 2 2 2     4
#> 3 3 3 3   3 3 3 3 3 3 3     4
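
On why the original attempt fails: the first argument of assign() has to be a single character string naming a variable, and get(...)[[i]] returns a data frame instead, hence "invalid first argument". assign() also cannot target one element of a list, so the usual pattern is to fetch the whole list, modify its elements, and assign the list back. Below is a minimal base R sketch of that pattern, followed by a variant that avoids assign()/get() by keeping the years in one list. The helper make_year() and the object name years are made up for illustration, and the toy data frames only stand in for the real ones.

```r
# Hypothetical helper: one "year" = a list of 12 small stand-in data frames.
make_year <- function() {
  lapply(1:12, function(m) data.frame(c = 1:3, d = 1:3))
}

# Recreate objects named list_1 ... list_4, as in the question's setup.
for (k in 1:4) {
  assign(paste0("list_", k), make_year())
}

# Pattern 1: fetch the list by name, modify it, assign it back.
for (nm in paste0("list_", 1:4)) {
  lst <- get(nm)
  for (i in seq_along(lst)) {
    lst[[i]]$month <- i  # month = position of the data frame in its list
  }
  assign(nm, lst)        # first argument is a string, as assign() requires
}

# Pattern 2 (assumed restructuring): keep all years in a single list,
# so no assign()/get() is needed at all.
years <- replicate(4, make_year(), simplify = FALSE)
years <- lapply(years, function(lst) {
  Map(function(df, m) {
    df$month <- m
    df
  }, lst, seq_along(lst))
})
```

With the second layout, years[[2]][[3]] would be the March data frame of the second year, and a year column could be added the same way if needed.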