# A tibble: 560 x 6
cut color clarity table data new
<ord> <ord> <ord> <dbl> <list> <list>
1 Ideal E SI2 55 <tibble [3 x 6]> <dbl [560]>
2 Premium E SI1 61 <tibble [4 x 6]> <dbl [560]>
3 Good E VS1 65 <tibble [1 x 6]> <dbl [560]>
4 Premium I VS2 58 <tibble [2 x 6]> <dbl [560]>
5 Good J SI2 58 <tibble [1 x 6]> <dbl [560]>
6 Very Good J VVS2 57 <tibble [1 x 6]> <dbl [560]>
7 Very Good I VVS1 57 <tibble [1 x 6]> <dbl [560]>
8 Very Good H SI1 55 <tibble [2 x 6]> <dbl [560]>
9 Fair E VS2 61 <tibble [1 x 6]> <dbl [560]>
10 Very Good H VS1 61 <tibble [1 x 6]> <dbl [560]>
# ... with 550 more rows
Each value of sum(.x$x) has been added to the entire table column instead of just the table value for that row. I expected the following lines to be equivalent:
but they are not. How do I specify just the appropriate row values of table in map()?
Note this is a toy example for a custom function that requires this approach. I can't just extract the mean(.x$x) into a new column and add it to table. I've also tried nesting table within data but that doesn't work with my function either.
A prose description plus code snippets isn't enough. You also need to make a simple reprex that:
Builds the input data you are using.
Includes the function you are trying to write, even if it doesn't work.
Shows usage of the function you are trying to write, even if it doesn't work.
Builds the output data you want the function to produce.
You can learn more about reprexes here:
Right now there is an issue with the version of reprex that is on CRAN, so you should download it directly from GitHub.
Until CRAN catches up with the latest version, install reprex with:
devtools::install_github("tidyverse/reprex")
The reason we ask for a reprex is that it is the easiest and quickest way for someone to understand the issue you are running into, see the results you are seeing, and answer your question.
Nearly everyone here who is answering questions is doing it on their own time and really appreciates anything you can do to minimize that time.
In any case, the issue you are running into is that table, as you are using it in the map function, is the whole column, i.e. a vector. It looks like map should just pass in the value of table for each row, but that isn't how map interprets table: map only iterates over its first argument (data here), and any other name used inside the formula is evaluated as-is on every call.
I think the reprex below will help you see what is happening and how to fix it.
suppressPackageStartupMessages(library(tidyverse))
df <- diamonds %>%
head(n=1000) %>%
nest(-cut, -color, -clarity, -table)
# map returns a list, not
# an atomic vector. map_dbl might be a better fit for what
# you are trying to do because it prints the values directly
# and can be easier to work with
df %>%
mutate(new = map_dbl(data, ~ sum(c(.x$x, .x$y, .x$z))))
#> # A tibble: 560 x 6
#> cut color clarity table data new
#> <ord> <ord> <ord> <dbl> <list> <dbl>
#> 1 Ideal E SI2 55. <tibble [3 × 6]> 41.2
#> 2 Premium E SI1 61. <tibble [4 × 6]> 55.9
#> 3 Good E VS1 65. <tibble [1 × 6]> 10.4
#> 4 Premium I VS2 58. <tibble [2 × 6]> 27.2
#> 5 Good J SI2 58. <tibble [1 × 6]> 11.4
#> 6 Very Good J VVS2 57. <tibble [1 × 6]> 10.4
#> 7 Very Good I VVS1 57. <tibble [1 × 6]> 10.4
#> 8 Very Good H SI1 55. <tibble [2 × 6]> 26.2
#> 9 Fair E VS2 61. <tibble [1 × 6]> 10.1
#> 10 Very Good H VS1 61. <tibble [1 × 6]> 10.4
#> # ... with 550 more rows
# when you pass in table here you are passing in
# the whole column, i.e. a vector
df %>%
mutate(new = map(data, ~ mean(.x$x) + table ))
#> # A tibble: 560 x 6
#> cut color clarity table data new
#> <ord> <ord> <ord> <dbl> <list> <list>
#> 1 Ideal E SI2 55. <tibble [3 × 6]> <dbl [560]>
#> 2 Premium E SI1 61. <tibble [4 × 6]> <dbl [560]>
#> 3 Good E VS1 65. <tibble [1 × 6]> <dbl [560]>
#> 4 Premium I VS2 58. <tibble [2 × 6]> <dbl [560]>
#> 5 Good J SI2 58. <tibble [1 × 6]> <dbl [560]>
#> 6 Very Good J VVS2 57. <tibble [1 × 6]> <dbl [560]>
#> 7 Very Good I VVS1 57. <tibble [1 × 6]> <dbl [560]>
#> 8 Very Good H SI1 55. <tibble [2 × 6]> <dbl [560]>
#> 9 Fair E VS2 61. <tibble [1 × 6]> <dbl [560]>
#> 10 Very Good H VS1 61. <tibble [1 × 6]> <dbl [560]>
#> # ... with 550 more rows
# you can see that more clearly by just passing in table
df %>%
mutate(new = map(data, ~ table ))
#> # A tibble: 560 x 6
#> cut color clarity table data new
#> <ord> <ord> <ord> <dbl> <list> <list>
#> 1 Ideal E SI2 55. <tibble [3 × 6]> <dbl [560]>
#> 2 Premium E SI1 61. <tibble [4 × 6]> <dbl [560]>
#> 3 Good E VS1 65. <tibble [1 × 6]> <dbl [560]>
#> 4 Premium I VS2 58. <tibble [2 × 6]> <dbl [560]>
#> 5 Good J SI2 58. <tibble [1 × 6]> <dbl [560]>
#> 6 Very Good J VVS2 57. <tibble [1 × 6]> <dbl [560]>
#> 7 Very Good I VVS1 57. <tibble [1 × 6]> <dbl [560]>
#> 8 Very Good H SI1 55. <tibble [2 × 6]> <dbl [560]>
#> 9 Fair E VS2 61. <tibble [1 × 6]> <dbl [560]>
#> 10 Very Good H VS1 61. <tibble [1 × 6]> <dbl [560]>
#> # ... with 550 more rows
# what you need to do is pass the table column to map as
# the first argument so that it iterates over each row in that column
df %>% mutate(new = map_dbl(.$table, ~ .))
#> # A tibble: 560 x 6
#> cut color clarity table data new
#> <ord> <ord> <ord> <dbl> <list> <dbl>
#> 1 Ideal E SI2 55. <tibble [3 × 6]> 55.
#> 2 Premium E SI1 61. <tibble [4 × 6]> 61.
#> 3 Good E VS1 65. <tibble [1 × 6]> 65.
#> 4 Premium I VS2 58. <tibble [2 × 6]> 58.
#> 5 Good J SI2 58. <tibble [1 × 6]> 58.
#> 6 Very Good J VVS2 57. <tibble [1 × 6]> 57.
#> 7 Very Good I VVS1 57. <tibble [1 × 6]> 57.
#> 8 Very Good H SI1 55. <tibble [2 × 6]> 55.
#> 9 Fair E VS2 61. <tibble [1 × 6]> 61.
#> 10 Very Good H VS1 61. <tibble [1 × 6]> 61.
#> # ... with 550 more rows
# but you actually need two columns, so use map2
# (or pmap if there are more than two columns)
# note that the `.` in `.$table` is the data frame piped into mutate,
# which is a different variable than the `.` inside a purrr formula
df %>% mutate(new = map2_dbl(data, .$table, ~ sum(c(.x$x, .x$y, .x$z) + .y)))
#> # A tibble: 560 x 6
#> cut color clarity table data new
#> <ord> <ord> <ord> <dbl> <list> <dbl>
#> 1 Ideal E SI2 55. <tibble [3 × 6]> 536.
#> 2 Premium E SI1 61. <tibble [4 × 6]> 788.
#> 3 Good E VS1 65. <tibble [1 × 6]> 205.
#> 4 Premium I VS2 58. <tibble [2 × 6]> 375.
#> 5 Good J SI2 58. <tibble [1 × 6]> 185.
#> 6 Very Good J VVS2 57. <tibble [1 × 6]> 181.
#> 7 Very Good I VVS1 57. <tibble [1 × 6]> 181.
#> 8 Very Good H SI1 55. <tibble [2 × 6]> 356.
#> 9 Fair E VS2 61. <tibble [1 × 6]> 193.
#> 10 Very Good H VS1 61. <tibble [1 × 6]> 193.
#> # ... with 550 more rows
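The same behaviour can be reproduced on a much smaller scale. The sketch below uses a hypothetical three-row tibble instead of diamonds (the names toy, offset, and vals are made up for illustration) and assumes that, inside mutate(), a bare column name passed as an argument to map2_dbl() is iterated element by element, so `.$` should not be required.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical toy data: one list column plus an ordinary numeric column
toy <- tibble(
  id     = c("a", "b", "c"),
  offset = c(10, 20, 30),
  vals   = list(c(1, 2), c(3, 4, 5), 6)
)

# offset is only referenced inside the formula, so every call sees the
# full column and each result is a length-3 vector
whole <- toy %>% mutate(new = map(vals, ~ mean(.x) + offset))

# offset is passed as an argument to map2_dbl, so each call receives
# a single element of it as .y
paired <- toy %>% mutate(new = map2_dbl(vals, offset, ~ mean(.x) + .y))

lengths(whole$new)  # 3 3 3
paired$new          # 11.5 24.0 36.0
```

Note that mean(.x) + .y adds the row's value once, whereas sum(c(...) + .y) in the reprex above adds it to every element before summing, which is why those totals are so much larger than the plain sums.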
I was looking for a generalisable solution for more than one column and your suggestion of pmap() fits the bill.
library(tidyverse)
#data
df <- diamonds %>%
head(n=1000) %>%
nest(-cut, -color, -clarity, -table)
#function
df %>%
mutate(new = pmap_dbl(list(data, .$table), ~ sum(c(..1$x, ..1$y, ..1$z) + ..2)))
#> # A tibble: 560 x 6
#> cut color clarity table data new
#> <ord> <ord> <ord> <dbl> <list> <dbl>
#> 1 Ideal E SI2 55 <tibble [3 x 6]> 536.22
#> 2 Premium E SI1 61 <tibble [4 x 6]> 787.91
#> 3 Good E VS1 65 <tibble [1 x 6]> 205.43
#> 4 Premium I VS2 58 <tibble [2 x 6]> 375.21
#> 5 Good J SI2 58 <tibble [1 x 6]> 185.44
#> 6 Very Good J VVS2 57 <tibble [1 x 6]> 181.38
#> 7 Very Good I VVS1 57 <tibble [1 x 6]> 181.40
#> 8 Very Good H SI1 55 <tibble [2 x 6]> 356.18
#> 9 Fair E VS2 61 <tibble [1 x 6]> 193.14
#> 10 Very Good H VS1 61 <tibble [1 x 6]> 193.44
#> # ... with 550 more rows
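For more than two columns, one possible variation is to give pmap_dbl() a named list and a named function, so arguments are matched by name rather than by ..1, ..2 position. This is only a sketch on hypothetical toy data: combine() stands in for the real custom function, and offset, scale, and vals are made-up column names.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical toy data: two ordinary columns and one list column
toy <- tibble(
  offset = c(10, 20, 30),
  scale  = c(1, 2, 3),
  vals   = list(c(1, 2), c(3, 4, 5), 6)
)

# Hypothetical stand-in for the custom function; its argument names
# match the names used in the list passed to pmap_dbl below
combine <- function(vals, offset, scale) {
  mean(vals) * scale + offset
}

# Each call to combine() receives one element from each list entry
result <- toy %>%
  mutate(new = pmap_dbl(list(vals = vals, offset = offset, scale = scale), combine))

result$new  # 11.5 28.0 48.0
```

Naming the list entries means the order of the columns in the list no longer matters, which may be easier to maintain than positional ..1, ..2 references as the number of columns grows.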