Help with workflow: function to facilitate adding new columns from a list of names (or equivalent)

Hello. I have a workflow that currently looks as follows:

*extract posterior draws from a model into a tibble that gets a unique name (basically just make a tibble)
*do some wrangling of the tibble w/ dplyr; last step in wrangling is to add a mutate() with a unique identifier ("model_1, model_2, etc) to indicate which model it came from
*bind_rows() of all the tibbles in anticipation of ggplot
*ggplot with facets mapped to the unique identifier.

My question is: in reality the wrangling takes up quite a few lines of code and with many models I"m copying an pasting a lot. I would like to use purrr or an equivalent for loop to turn the wrangling into a function. I cannot figure out how, within the function, to do a mutate with a unique identifier.

Is there a way to use purr to operate on many tibbles and add a unique name with a mutate statement for each one? I'm open to other options for achieving the same endpoint.

Example (this does what I want but has a lot of copy/paste)

library(tidyverse)

# get a tibble of observations from a model
model_b154_posterior_draws_tbl <- tibble(
  param_draw = rnorm(15, 1, 1),  
  model = as_factor("model_b154")) %>%
  mutate(adjusted_draws = param_draw * 2)

# another tibble with unique name  
model_65kj_posterior_draws_tbl <- tibble(
  param_draw = rnorm(15, 5, 2),
  model = as_factor("model_65kj")) %>%
  mutate(adjusted_draws = param_draw * 2)

#another tibble with unique name
model_4hwe_posterior_draws_tbl <- tibble(
  param_draw = rnorm(15, 10, 3),
  model = as_factor("model_4hwe")) %>%
  mutate(adjusted_draws = param_draw * 2)

#bind together
bind_rows(
  model_b154_posterior_draws_tbl,
  model_65kj_posterior_draws_tbl,
  model_4hwe_posterior_draws_tbl
) %>%
  select(-param_draw) %>%
  
#visualize and facet on model unique identifier
  ggplot(aes(x = 0, y = adjusted_draws)) +
  geom_point() +
  facet_wrap(~model)

Thank you!

1 Like

Perhaps your example is too simplified compared to what you actually want to be able to do. What you showed above can probably be simplified to:

library(tidyverse)
  
# get a tibble of observations from a model
model_b154_posterior_draws_tbl <- 
  tibble(
    param_draw = rnorm(15, 1, 1)
  )

# another tibble with unique name  
model_65kj_posterior_draws_tbl <- 
  tibble(
    param_draw = rnorm(15, 5, 2)
  )

#another tibble with unique name
model_4hwe_posterior_draws_tbl <- 
  tibble(
    param_draw = rnorm(15, 10, 3)
  )

bind_rows(
  model_b154 = model_b154_posterior_draws_tbl,
  model_65kj = model_65kj_posterior_draws_tbl,
  model_4hwe = model_4hwe_posterior_draws_tbl,
  .id = 'model'
) %>% 
  mutate(adjusted_draws = param_draw * 2)
#> # A tibble: 45 x 3
#>    model      param_draw adjusted_draws
#>    <chr>           <dbl>          <dbl>
#>  1 model_b154      2.12           4.25 
#>  2 model_b154      1.62           3.23 
#>  3 model_b154      0.357          0.713
#>  4 model_b154      1.50           2.99 
#>  5 model_b154      0.497          0.995
#>  6 model_b154      0.905          1.81 
#>  7 model_b154     -0.118         -0.235
#>  8 model_b154      1.48           2.97 
#>  9 model_b154      0.806          1.61 
#> 10 model_b154      1.50           3.00 
#> # … with 35 more rows

Created on 2020-01-17 by the reprex package (v0.3.0)

1 Like

Great reprex

Using @mattwarkentin's suggestion, I suggest creating a function

p_draw <- function(x,y) {
  x  <- 
tibble(
  param_draw = rnorm(15, 10, 3),
  model = as_factor(y)) %>%
  mutate(adjusted_draws = param_draw * 2) )
}

and another function for the rowbinds,

Then purrr:map the functions to a vector of model names and tibble names (which you can probably construct easily from the model name with stringr).

1 Like

Thank you, this is elegant and should work nicely

1 Like

If it does, please post your working code and mark as the solution for the benefit of those to follow (no false modesty!).

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.