Transferring values from unnamed vectors to result using purrr

dkane · October 4, 2019, 4:36pm

I understand how to use named vectors with purrr, as in this example:

library(tidyverse)
plain_vector <- c(2, 3)
named_vector <- set_names(plain_vector, plain_vector)

map_df(named_vector, ~ sample_n(mtcars[, 1:2], size = .), .id = "size")
#>   size  mpg cyl
#> 1    2 17.8   6
#> 2    2 19.7   6
#> 3    3 26.0   4
#> 4    3 13.3   8
#> 5    3 21.4   4

Created on 2019-10-04 by the reprex package (v0.3.0)

But that seems a bit of a hack. I often just have a simple unnamed vector, as in plain_vector above. How can I ensure that the values in the input vector get transferred to the resulting object, even in situations in which the vector has no names? The obvious approach gives me:

library(tidyverse)
plain_vector <- c(2, 3)
map_df(plain_vector, ~ sample_n(mtcars[, 1:2], size = .), .id = "size")
#>   size  mpg cyl
#> 1    1 10.4   8
#> 2    1 15.2   8
#> 3    2 13.3   8
#> 4    2 15.2   8
#> 5    2 15.5   8

Created on 2019-10-04 by the reprex package (v0.3.0)

Note that, instead of keeping the values of size (2 and 3), I get the index of their position in the plain_vector input (1 and 2). This is not a bug. The documentation for .id states:

Either a string or NULL. If a string, the output will contain a variable with that name, storing either the name (if .x is named) or the index (if .x is unnamed) of the input.

But it still seems like there should be a better way than just giving every vector names which correspond to the values of that vector. Am I missing something obvious?

pieterjanvc · October 6, 2019, 1:52am

Hi,

I know this is probably not the best way of solving it, but I can get the same output without having to use a named vector with this implementation:

library(tidyverse)

plain_vector <- c(2, 3)

map_df(plain_vector, function(x) {
  sample_n(mtcars[, 1:2], size = x) %>% mutate(size = x)
})

   mpg cyl size
1 18.1   6    2
2 21.4   4    2
3 30.4   4    3
4 32.4   4    3
5 15.0   8    3
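
The same idea can also be written with purrr's formula shorthand, if that reads better. This is only a sketch: the .before argument is my own addition to put size first, as in the original output, and it assumes dplyr 1.0.0 or later.

```r
library(purrr)
library(dplyr)

plain_vector <- c(2, 3)

# Same approach as above, using the formula shorthand (.x is the current element).
# .before = 1 moves the new size column to the front (assumes dplyr >= 1.0.0).
result <- map_dfr(
  plain_vector,
  ~ sample_n(mtcars[, 1:2], size = .x) %>% mutate(size = .x, .before = 1)
)

result
```

Because size is copied straight from the input here, it stays numeric rather than being turned into a character label.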

Kind regards,
PJ

cderv · October 6, 2019, 7:32am

set_names() seems to me a very good solution. Note that by default, when no names are supplied, it uses the values of the vector as its names. This is useful in your case.

library(tidyverse)
plain_vector <- c(2, 3)
map_df(set_names(plain_vector), ~ sample_n(mtcars[, 1:2], size = .), .id = "size")
#>   size  mpg cyl
#> 1    2 15.8   8
#> 2    2 26.0   4
#> 3    3 15.2   8
#> 4    3 22.8   4
#> 5    3 30.4   4

Created on 2019-10-06 by the reprex package (v0.3.0)
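
One thing to keep in mind with this approach: names are always character strings, so the size column produced by .id comes out as character. If you need it as a number again, a conversion step like the one below should work. This is a minimal sketch; the as.numeric() step is an assumption about what you want downstream, not something .id does for you.

```r
library(purrr)
library(dplyr)

plain_vector <- c(2, 3)

# With no names supplied, set_names() names the vector after its own values
named <- set_names(plain_vector)
names(named)

# .id stores the names (character strings), so convert back to numeric if needed
result <- map_df(named, ~ sample_n(mtcars[, 1:2], size = .x), .id = "size") %>%
  mutate(size = as.numeric(size))

str(result)
```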

Creating the size column explicitly afterwards, without .id, as @pieterjanvc suggested, is also a very good solution.

dkane · October 7, 2019, 1:03pm

Thanks to pieterjanvc and cderv for the help! Especially appreciate the tip about the default behavior of set_names().
