Hello. I created a function that returns multiple values. I then used the function in a mutate with map - it returns a column that is of type list. I want to mutate another new column that does a simple function on all the values stored in the cell to the left but I'm getting errors. Could you help me to understand how to access the values stored in the list for additional mutates? I have tried various versions of unnest() and indexing but haven't found the proper answer. Thank you!
library(tidyverse)
example_seq_tbl <- tibble(var_a = seq(from = 1, to = 5, by = 1))
# a function that returns 10 values
x_function <- function(x) {
  rnorm(n = 10, mean = x)
}
# map function over var_a, now each observation in new_col has 10 values
another_tbl <- example_seq_tbl %>%
  mutate(new_col = map(var_a, x_function))
# this code below doesn't work but shows what I am trying to do
# I want to add a new column that operates on the 10 values in new_col, like max, mean, quantile, etc.
another_tbl %>%
  mutate(another_col = max(new_col))
library(purrr)
library(tibble)
library(dplyr)
library(tidyr)
example_seq_tbl <- tibble(var_a = seq(from = 1, to = 5, by = 1))
# a function that returns 10 values
x_function <- function(x) {
  rnorm(n = 10, mean = x)
}
# map function over var_a, now each observation in new_col has 10 values
another_tbl <- example_seq_tbl %>%
  mutate(new_col = map(var_a, x_function))
# option 1: map over the list column and return one number per row
third_tbl <- another_tbl %>%
  mutate(another_col = map_dbl(new_col, ~max(.)))
third_tbl
#> # A tibble: 5 x 3
#>   var_a new_col    another_col
#>   <dbl> <list>           <dbl>
#> 1     1 <dbl [10]>        2.09
#> 2     2 <dbl [10]>        3.19
#> 3     3 <dbl [10]>        3.84
#> 4     4 <dbl [10]>        5.52
#> 5     5 <dbl [10]>        5.96
# option 2: unnest the list column, then summarise within each var_a group
Again <- another_tbl %>%
  unnest(cols = new_col) %>%
  group_by(var_a) %>%
  summarize(Max = max(new_col))
Again
#> # A tibble: 5 x 2
#>   var_a   Max
#>   <dbl> <dbl>
#> 1     1  2.09
#> 2     2  3.19
#> 3     3  3.84
#> 4     4  5.52
#> 5     5  5.96
Created on 2019-12-13 by the reprex package (v0.3.0.9000)
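The original question also mentioned mean and quantile. As a minimal sketch, the same map_dbl() pattern from the reprex above should extend to those summaries. The seed, the column names col_max, col_mean and col_q90, and the choice of the 90th percentile are assumptions made only for illustration.

```r
library(dplyr)
library(purrr)
library(tibble)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible

# same helper as in the thread: returns 10 draws centred on x
x_function <- function(x) {
  rnorm(n = 10, mean = x)
}

another_tbl <- tibble(var_a = seq(from = 1, to = 5, by = 1)) %>%
  mutate(new_col = map(var_a, x_function))

# one map_dbl() call per summary; each returns a single number per row
summary_tbl <- another_tbl %>%
  mutate(
    col_max  = map_dbl(new_col, ~ max(.)),
    col_mean = map_dbl(new_col, ~ mean(.)),
    # names = FALSE drops the "90%" label from quantile()'s result
    col_q90  = map_dbl(new_col, ~ quantile(., probs = 0.9, names = FALSE))
  )

summary_tbl
```

map_dbl() fits here because each summary yields exactly one number per row. A summary that returns several values (for example, several quantiles at once) would need map() and another list column instead.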
@ramirabal In purrr, ~ is shorthand for defining an anonymous function, and . refers to the current element of the iterable (much like the loop variable i in a for loop).
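To make the shorthand concrete, here is a small sketch comparing three ways of writing the same call. The list `values` and the result names are made up for illustration and are not part of the reprex above.

```r
library(purrr)

# hypothetical input: a list of numeric vectors, like the cells of a list column
values <- list(c(1, 5, 3), c(10, 2), c(-4, -7, -1))

# formula shorthand: `.` stands for the current element
with_formula <- map_dbl(values, ~ max(.))

# the same thing written as an explicit anonymous function
with_function <- map_dbl(values, function(v) max(v))

# when no extra arguments are needed, the function can be passed by name
with_name <- map_dbl(values, max)

identical(with_formula, with_function)  # TRUE
identical(with_formula, with_name)      # TRUE
```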
Hi,
Can we also use .x to refer to the current element of the iterable (the current subset of the dataset)?
Here it gives the same result, but will that always be the case?
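As far as the purrr documentation describes it, in a one-argument formula both `.` and `.x` refer to the first argument, so the two spellings should behave the same. The sketch below checks this on a made-up list (`values` is hypothetical) and shows the two-argument case, where `.x` pairs with `.y`.

```r
library(purrr)

# hypothetical input, not taken from the reprex above
values <- list(c(1, 5, 3), c(10, 2))

# one-argument formulas: `.` and `.x` name the same element
max_dot <- map_dbl(values, ~ max(.))
max_x   <- map_dbl(values, ~ max(.x))
identical(max_dot, max_x)  # TRUE

# two-argument formulas: `.x` is the first input and `.y` the second
sums <- map2_dbl(c(1, 2), c(10, 20), ~ .x + .y)
sums  # 11 22
```

Many people seem to prefer `.x` in practice, since `.` is also the placeholder used by the magrittr pipe and can be harder to read inside a long pipeline.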