What should I do if the length of the list used in furrr::future_pmap is different?

I call my own function inside nested for() loops, as in the code below.

I want to combine the outputs, but the number of records in the data does not match and I cannot join them.


library(tidyverse)

my_iris <- iris %>% head()


iris_df_make <- function(x=NULL,empty_df=NULL,new_col=NULL,new_col2=NULL,chr_col=NULL){
  
  inside_df <- x %>% 
    mutate(new_col=new_col,
           new_col2=new_col2,
           chr_col=ifelse(chr_col=="to","TO",chr_col))
  
  empty_df <- bind_rows(empty_df,inside_df)
  
}
  
df_1 = NULL
df_2 = NULL
df_3 = NULL
    
for(i in 1:3){
    for(o in 1:2){
      df_1 <- iris_df_make(x=my_iris,empty_df=df_1,new_col=i,new_col2=o,chr_col = "how")
      df_2 <- iris_df_make(x=my_iris,empty_df=df_2,new_col=i,new_col2=o,chr_col = "to")
      df_3 <- iris_df_make(x=my_iris,empty_df=df_3,new_col=i,new_col2=o,chr_col = "make")
    }
}

My real function is not as simple as the one above; it is a complex and time-consuming process.
For this reason, I decided to use furrr's future_pmap, which I learned about from the following page.

future_pmap(df, gsub)

I rewrote the code based on the above.


library(furrr)
plan(multisession, workers = 2)

df_1 = NULL
df_2 = NULL
df_3 = NULL

for(i in 1:3){
  for(o in 1:2){

    param <- data.frame(
      x=list(my_iris,my_iris,my_iris),
      empty_df=c(df_1,df_2,df_3),
      new_col=c(i,i,i),
      new_col2=c(o,o,o),
      chr_col=c("how","to","make")
    )
      
    pmap_res <- future_pmap(param,iris_df_make)
  }
}

However, I get an error.
I would like to hear ideas on how to fix it.

If this approach is inefficient, I would also like to hear about other implementations, not only ones that use future_pmap.

The error message is as follows (I translated it with Google):

   The argument contains a data frame with different number of columns: 6, 0, 3 

Thank you!

As I understand it, your function is awkward to parallelise because each call relies on its own previous output (the accumulated empty_df) to produce the required result. It can still be done if you split the function in two, so that each parallel task runs the whole accumulation for one chr_col value on its own. Here is an example:


library(tidyverse)

my_iris <- iris %>% head()


iris_df_make_inner <- function(x=NULL,empty_df=NULL,new_col=NULL,new_col2=NULL,chr_col=NULL){
  
  inside_df <- x %>% 
    mutate(new_col=new_col,
           new_col2=new_col2,
           chr_col=ifelse(chr_col=="to","TO",chr_col))
  
  empty_df <- bind_rows(empty_df,inside_df)
  
}

iris_df_make_outer <- function(chr_col){
  df_x = NULL
  for(i in 1:3){
    for(o in 1:2){
      df_x <- iris_df_make_inner(x=my_iris,empty_df=df_x,new_col=i,new_col2=o,chr_col = chr_col)
    }
  }
  df_x
}

library(furrr)
plan(multisession, workers = 3)
chr_col <- c("how","to","make")
future_map(chr_col,
           iris_df_make_outer)

You could further parametrise the iris_df_make_outer function to take set values to loop over (imax, omax) if needed.
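
A minimal sketch of what that parametrisation might look like, in case it helps. The name iris_df_make_outer2 and the arguments label, x, imax and omax are my own placeholders, not anything from furrr. Instead of growing a data frame inside the loops, it builds one piece per (i, o) pair and binds them at the end, which should give the same rows in the same order as the nested for loops.

```r
library(dplyr)
library(tidyr)
library(furrr)

plan(multisession, workers = 2)

my_iris <- head(iris)

# Hypothetical variant of iris_df_make_outer: the data and the loop bounds
# are passed in as arguments instead of being hard-coded.
iris_df_make_outer2 <- function(label, x, imax = 3, omax = 2) {
  # Same recoding as the original function: "to" becomes "TO"
  label_out <- if (identical(label, "to")) "TO" else label

  # Every (i, o) pair; o varies fastest, like the inner for loop
  grid <- tidyr::expand_grid(i = seq_len(imax), o = seq_len(omax))

  # One copy of x per pair, tagged with its i and o values
  pieces <- Map(function(i, o) {
    dplyr::mutate(x, new_col = i, new_col2 = o, chr_col = label_out)
  }, grid$i, grid$o)

  dplyr::bind_rows(pieces)
}

chr_col <- c("how", "to", "make")

# One parallel task per label; x, imax and omax are forwarded to each call
res <- future_map(chr_col, iris_df_make_outer2,
                  x = my_iris, imax = 3, omax = 2)
names(res) <- chr_col

# Optional: stack the three results into a single data frame
all_df <- bind_rows(res)

# Return to sequential processing when finished
plan(sequential)
```

Here res[["how"]], res[["to"]] and res[["make"]] play the role of df_1, df_2 and df_3 from the question. If your real function truly needs the previous output at each step, keep the accumulating loop inside the outer function as in my example above.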

Your code solved my problem!
I've also learned a lot from your approach.