Using a for loop in R to loop through the names of data frames

I have data on mergers for 20 years for various firms. I have used a "for" loop in R to separate data for each year which gives me 20 data frames in the global environment. Each data frame is identified by its year: Merger2000 to Merger2019 for 20 years. Now I want to write another for loop to find the unique companies in each data frame (that is, unique firms in each year). Each company is identified by a unique company code (co_code). I know how to do this for each year separately. For example, for the year 2000, I would do something like:

uniquemerger2000 <- Merger2000 %>% distinct(co_code, .keep_all = TRUE)

How do I run a for loop to enable this operation for all years (that is from 2000-2019)? There is some indexing required in the code but I am not sure how to operationalise this in a loop.

Any help would be appreciated. Thanks!

Hi, welcome!

There is nothing fundamentally wrong with using a for loop, but it is rarely the best way of doing things in R. If you could ask this with a minimal REPRoducible EXample (reprex) illustrating your issue, someone will very likely come up with a better (or at least more idiomatic) solution.

If you've never heard of a reprex before, you might want to start by reading this FAQ:

Thanks. I will keep that in mind when I post a question next time.


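library(dplyr)
# making example data ...
(example_data <- dplyr::storms %>% filter(day == 28) %>%
   select(name, year, month, day, wind, pressure) %>%
   # assumed missing step: the code below expects a company_name column
   transmute(company_name = name, year,
             stat_1 = wind,
             stat_2 = pressure) %>%
   distinct())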

# write a for loop to split the example data by year, 
# and then write more for loops to process those?

# or query the data directly
# looking for unique name year combinations

example_data %>% 
  group_by(company_name, year) %>% 
  count(name = "entries_per_year") %>% 
  filter(entries_per_year == 1)
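
If the yearly data frames already exist and you do want to process them one by one, here is a minimal sketch of how that could look. The two small data frames and the deal_value column are made up for illustration; only the Merger<year> naming and the co_code column come from the question. The idea is to build the object names with paste0(), collect the objects into a named list with mget(), and apply distinct() to each element.

```r
library(dplyr)

# Hypothetical stand-ins for Merger2000 ... Merger2019 (deal_value is invented)
Merger2000 <- data.frame(co_code = c(1, 1, 2), deal_value = c(10, 20, 30))
Merger2001 <- data.frame(co_code = c(2, 3, 3), deal_value = c(5, 15, 25))

years <- 2000:2001  # would be 2000:2019 for the full data

# Build the object names and fetch the data frames into one named list
merger_list <- mget(paste0("Merger", years))

# Keep the first row per co_code in every yearly data frame
unique_list <- lapply(merger_list, function(df) {
  distinct(df, co_code, .keep_all = TRUE)
})

# The same thing written as an explicit for loop
unique_list_loop <- list()
for (yr in years) {
  nm <- paste0("Merger", yr)
  unique_list_loop[[nm]] <- distinct(get(nm), co_code, .keep_all = TRUE)
}
```

Each result is then available as, for example, unique_list$Merger2000. If the data are still in a single data frame with a year column, distinct(year, co_code, .keep_all = TRUE) on that data frame should give the same rows without any splitting or looping.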