Custom function to split data frame, apply a function then recombine

I want to create a custom function that breaks a large data set into chunks, applies a function to each chunk, then recombines the results. The data sets will all be different lengths, so I'm trying to use a loop, but I can't figure out how to create the temporary objects when their number varies with the input data set.

For example, if I want to slice the data into chunks of 25 rows, how do I get it to iterate over the entire data set however many times are needed, and then recombine? Below is an example of where I'm at. I can split and recombine the iris data set by hand, but I want it to programmatically apply the function 6 times (for 150 observations) and then adjust if there are 78 or 1200 observations.

my_function <- function (df){

  library(tidyverse)

  x <- seq(1:nrow(df))
  b <- seq (0,max (x), by = 25)
  
  # Quick, silly function
  cust_function <- function (df){
    df |>
      mutate (new_col = Sepal.Length + Sepal.Width)
  }

  for (i in b){
    temp_1 <- df |>
      slice (1:b[2]) |>
      cust_function()

    temp_2 <- df |>
          slice ((b[2]+1):b[3]) |>
      cust_function()

    temp <- temp_1 |>
      bind_rows(temp_2)
  }
  temp
}

my_function(iris)

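You can try this sort of approach: give each row a group number by integer-dividing its row number by 25, split the data frame on that group, apply the function to each piece, and row-bind the results. No loop or numbered temporary objects are needed.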
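As a quick check of the arithmetic behind the grouping, here is a minimal sketch in base R. The names `chunk_size`, `n_rows` and `ids` are just illustrative and not part of the answer's code.

```r
# Integer division of the zero-based row index by the chunk size
# gives a chunk id for every row.
chunk_size <- 25
n_rows <- c(150, 78, 1200)

# Chunk ids for a 78-row data set: three full chunks and one short chunk
ids <- (seq_len(78) - 1) %/% chunk_size
table(ids)

# Number of chunks needed for each of the row counts in the question
ceiling(n_rows / chunk_size)
```

The last chunk simply ends up shorter when the row count is not a multiple of 25.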

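my_function <- function (df){
  
  library(tidyverse)
  
  # Assign each row to a chunk of 25: rows 1-25 get group 0, rows 26-50 get group 1, ...
  dfx <- mutate(df,
                group = (row_number() - 1) %/% 25)
  
  # Quick, silly function
  cust_function <- function (df){
    df |>
      mutate (new_col = Sepal.Length + Sepal.Width)
  }
  
  # Split into a list of data frames, one per chunk, dropping the helper column
  split_df <- group_by(dfx, group) |>
    group_split(.keep = FALSE)
  
  # Apply the function to each chunk and row-bind the results
  map_dfr(split_df, cust_function)
}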

my_function(iris)

# also works when the row count is not a multiple of 25
shiris <- slice_head(iris, n = 66)
my_function(shiris)
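If you want the chunk size and the function to be arguments rather than hard-coded, something along these lines might work. This is only a sketch in base R (swapped in here so it runs without the tidyverse), and `apply_in_chunks` and `add_sepal_sum` are made-up names for illustration.

```r
# Hypothetical helper: split `df` into chunks of `chunk_size` rows,
# apply `f` to each chunk, and row-bind the results.
apply_in_chunks <- function(df, f, chunk_size = 25) {
  # Nothing to split for an empty data frame
  if (nrow(df) == 0) return(f(df))

  # Chunk id per row: 0 for rows 1..chunk_size, 1 for the next block, ...
  chunk_id <- (seq_len(nrow(df)) - 1) %/% chunk_size

  # split() returns a list of data frames, one per chunk id
  pieces <- split(df, chunk_id)

  # Apply f to every chunk and stack the results back together
  out <- do.call(rbind, lapply(pieces, f))
  rownames(out) <- NULL
  out
}

# Same quick, silly function as above, written without dplyr
add_sepal_sum <- function(d) {
  d$new_col <- d$Sepal.Length + d$Sepal.Width
  d
}

res <- apply_in_chunks(iris, add_sepal_sum)
res_short <- apply_in_chunks(head(iris, 66), add_sepal_sum, chunk_size = 25)
nrow(res)
nrow(res_short)
```

Because the chunk ids are computed from `nrow(df)`, the number of chunks adjusts on its own for 78, 150 or 1200 rows.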
