Adding .id for pluck

@jennybryan Is there a native way in purrr to add an .id argument to pluck, so that the elements of the output list get named by some other column of the input?

I put together this solution, which changes how pluck works a bit but adds the functionality that seems to be lacking (imo).

In this solution I am assuming the .id columns hold single values per row that can be pasted together into one name.

library(purrr)
library(tidyr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# Create a copy of pluck() to modify
# (note: this relies on purrr 0.2.4 internals such as extract_impl and dots_splice)

  mypluck <- pluck

# Add a formal argument for the .id columns

  fpluck <- formals(mypluck)
  
  fpluck <- append(fpluck, list(.id = NULL))
  
  formals(mypluck) <- fpluck

# Rewrite the body: extract as before, then name the result from the .id columns

  bpluck <- body(mypluck)
  
  bpluck[[2]] <- quote(out <- .Call(extract_impl, .x, dots_splice(...), .default))
  
  bpluck[[3]] <- quote(if(!is.null(.id)) names(out) <- .x[.id]%>%tidyr::unite_('key',.id)%>%pull(key))
  
  bpluck[[4]] <- quote(return(out))

# Assign the new body to the function

  body(mypluck) <- bpluck

  
  
test_data <- data_frame(id=c(1,2,3),method=c('a','b','c'),x=list(1,2,3))

test_data%>%pluck('x')
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 2
#> 
#> [[3]]
#> [1] 3

test_data%>%mypluck('x',.id = c('method'))
#> $a
#> [1] 1
#> 
#> $b
#> [1] 2
#> 
#> $c
#> [1] 3

test_data%>%mypluck('x',.id = c('method','id'))
#> $a_1
#> [1] 1
#> 
#> $b_2
#> [1] 2
#> 
#> $c_3
#> [1] 3

Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.3.3 (2017-03-06)
#>  system   x86_64, darwin13.4.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2018-03-08
#> Packages -----------------------------------------------------------------
#>  package    * version    date       source                      
#>  assertthat   0.2.0      2017-04-11 CRAN (R 3.3.2)              
#>  backports    1.1.2      2017-12-13 CRAN (R 3.3.2)              
#>  base       * 3.3.3      2017-03-07 local                       
#>  bindr        0.1        2016-11-13 CRAN (R 3.3.2)              
#>  bindrcpp     0.2        2017-06-17 CRAN (R 3.3.2)              
#>  datasets   * 3.3.3      2017-03-07 local                       
#>  devtools     1.13.5     2018-02-18 CRAN (R 3.3.3)              
#>  digest       0.6.13     2017-12-14 CRAN (R 3.3.2)              
#>  dplyr      * 0.7.4      2017-09-28 CRAN (R 3.3.2)              
#>  evaluate     0.10.1     2017-06-24 CRAN (R 3.3.2)              
#>  glue         1.2.0      2017-10-29 CRAN (R 3.3.2)              
#>  graphics   * 3.3.3      2017-03-07 local                       
#>  grDevices  * 3.3.3      2017-03-07 local                       
#>  htmltools    0.3.6      2017-04-28 CRAN (R 3.3.2)              
#>  knitr        1.18       2017-12-27 CRAN (R 3.3.2)              
#>  magrittr     1.5        2014-11-22 CRAN (R 3.3.0)              
#>  memoise      1.1.0      2017-04-21 CRAN (R 3.3.2)              
#>  methods    * 3.3.3      2017-03-07 local                       
#>  pkgconfig    2.0.1      2017-03-21 CRAN (R 3.3.2)              
#>  purrr      * 0.2.4      2017-10-18 CRAN (R 3.3.2)              
#>  R6           2.2.2      2017-06-17 CRAN (R 3.3.2)              
#>  Rcpp         0.12.14    2017-11-23 CRAN (R 3.3.2)              
#>  rlang        0.2.0      2018-02-20 CRAN (R 3.3.3)              
#>  rmarkdown    1.8        2017-11-17 CRAN (R 3.3.2)              
#>  rprojroot    1.3-2      2018-01-03 CRAN (R 3.3.3)              
#>  stats      * 3.3.3      2017-03-07 local                       
#>  stringi      1.1.6      2017-11-17 CRAN (R 3.3.2)              
#>  stringr      1.2.0      2017-02-18 CRAN (R 3.3.2)              
#>  tibble       1.3.4      2017-08-22 CRAN (R 3.3.2)              
#>  tidyr      * 0.7.2      2017-10-16 CRAN (R 3.3.2)              
#>  tidyselect   0.2.3      2017-11-06 CRAN (R 3.3.2)              
#>  tools        3.3.3      2017-03-07 local                       
#>  utils      * 3.3.3      2017-03-07 local                       
#>  withr        2.1.1.9000 2018-03-03 Github (r-lib/withr@5d05571)
#>  yaml         2.1.16     2017-12-12 CRAN (R 3.3.2)

Please look at the FAQ about name mentioning.

In general asking a particular person to answer your question will not get you an answer any sooner.

So you don't have an answer to the actual question in the post?

One unintended consequence of the output having no names is that you cannot bind it after a pluck:

test_data <- data_frame(id=c(1,2,3),method=c('a','b','c'),x=list(1,2,3))

test_data%>%pluck('x')
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 2
#> 
#> [[3]]
#> [1] 3

test_data%>%pluck('x')%>%bind_cols()
#> Error in cbind_all(x): Not compatible with STRSXP: [type=NULL].

test_data%>%mypluck('x',.id = c('method'))
#> $a
#> [1] 1
#> 
#> $b
#> [1] 2
#> 
#> $c
#> [1] 3

test_data%>%mypluck('x',.id = c('method'))%>%bind_cols()
#> # A tibble: 1 x 3
#>       a     b     c
#>   <dbl> <dbl> <dbl>
#> 1     1     2     3

test_data%>%mypluck('x',.id = c('method','id'))
#> $a_1
#> [1] 1
#> 
#> $b_2
#> [1] 2
#> 
#> $c_3
#> [1] 3

test_data%>%mypluck('x',.id = c('method','id'))%>%bind_cols()
#> # A tibble: 1 x 3
#>     a_1   b_2   c_3
#>   <dbl> <dbl> <dbl>
#> 1     1     2     3

Using set_names is more readable:

library(tidyverse)

test_data <- data_frame(id = c(1, 2, 3), 
                        method = c('a', 'b', 'c'), 
                        x = list(1, 2, 3)) 

test_data %>% 
    mutate(x = set_names(x, method)) %>% 
    pull(x)
#> $a
#> [1] 1
#> 
#> $b
#> [1] 2
#> 
#> $c
#> [1] 3

Instead of using bind_cols, use spread:

test_data %>% 
    unite(key, method, id) %>% 
    spread(key, x) %>% 
    unnest()
#> # A tibble: 1 x 3
#>     a_1   b_2   c_3
#>   <dbl> <dbl> <dbl>
#> 1    1.    2.    3.
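
If you are on a newer tidyr (1.0.0 or later, which is an assumption about your setup and not what the session info above shows), spread has been superseded by pivot_wider. A rough sketch of the same reshaping, written without the pipe so it only needs tidyr and tibble:

```r
library(tidyr)
library(tibble)

test_data <- tibble(id = c(1, 2, 3),
                    method = c("a", "b", "c"),
                    x = list(1, 2, 3))

# Paste method and id into a single key column ("a_1", "b_2", "c_3")
keyed <- unite(test_data, "key", method, id)

# One column per key; each cell is still a length-one list at this point
wide <- pivot_wider(keyed, names_from = key, values_from = x)

# Flatten the list-columns into ordinary columns
wide <- unnest(wide, cols = everything())

names(wide)
```

The column names come from the united key, so this gives the same naming as the spread version without needing bind_cols.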

I haven't digested everything that's above, but note I think I've requested something very related to this.

Therefore this issue thread should be interesting:


Both look great (I started with the first option before posting). I noted that I would need to mutate every column I would have liked to pluck, which gets old after a while.

Imo it would make my life easier to just add an .id param to pluck instead of having to call 2 or 3 functions.
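
In the meantime, something like the following wrapper might be a less invasive stopgap than editing the body of pluck. It is only a sketch: pluck_named and its .sep argument are names I made up, not part of purrr, and it keeps the assumption that the .id columns can be pasted together.

```r
library(purrr)
library(tibble)

# Hypothetical helper (not a purrr function): pluck as usual, then name the
# result by pasting together the columns listed in .id.
pluck_named <- function(.x, ..., .id = NULL, .sep = "_") {
  out <- pluck(.x, ...)
  if (!is.null(.id)) {
    # unname() so the columns are passed to paste() positionally
    keys <- do.call(paste, c(unname(as.list(.x[.id])), sep = .sep))
    out <- set_names(out, keys)
  }
  out
}

test_data <- tibble(id = c(1, 2, 3),
                    method = c("a", "b", "c"),
                    x = list(1, 2, 3))

res <- pluck_named(test_data, "x", .id = c("method", "id"))
names(res)
#> [1] "a_1" "b_2" "c_3"
```

Because it only calls exported functions, it should not depend on purrr internals the way mypluck does.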

Interesting thread, thank you for sharing. Did anything come of it in the end?