Adding .id for pluck


#1

@jennybryan Is there a native way in purrr to add an .id to the pluck output, so that the elements of the returned list are named by another column of the input?

I put together this solution, which changes how pluck works a bit but adds the functionality that seems to be lacking (imo).

In this solution I am assuming the .id columns are singletons that can be pasted together.

library(purrr)
library(tidyr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# create a copy of the function

mypluck <- pluck

# add a formal argument for the .id columns

fpluck <- formals(mypluck)

fpluck <- append(fpluck, list(.id = NULL))

formals(mypluck) <- fpluck

# add the new .id functionality to the body
# (this relies on extract_impl, which purrr does not export, so it may break with other purrr versions)

bpluck <- body(mypluck)

bpluck[[2]] <- quote(out <- .Call(extract_impl, .x, dots_splice(...), .default))

bpluck[[3]] <- quote(if (!is.null(.id)) names(out) <- .x[.id] %>% tidyr::unite_('key', .id) %>% pull(key))

bpluck[[4]] <- quote(return(out))

# install the updated function body

body(mypluck) <- bpluck

  
  
test_data <- data_frame(id=c(1,2,3),method=c('a','b','c'),x=list(1,2,3))

test_data%>%pluck('x')
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 2
#> 
#> [[3]]
#> [1] 3

test_data%>%mypluck('x',.id = c('method'))
#> $a
#> [1] 1
#> 
#> $b
#> [1] 2
#> 
#> $c
#> [1] 3

test_data%>%mypluck('x',.id = c('method','id'))
#> $a_1
#> [1] 1
#> 
#> $b_2
#> [1] 2
#> 
#> $c_3
#> [1] 3

Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.3.3 (2017-03-06)
#>  system   x86_64, darwin13.4.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2018-03-08
#> Packages -----------------------------------------------------------------
#>  package    * version    date       source                      
#>  assertthat   0.2.0      2017-04-11 CRAN (R 3.3.2)              
#>  backports    1.1.2      2017-12-13 CRAN (R 3.3.2)              
#>  base       * 3.3.3      2017-03-07 local                       
#>  bindr        0.1        2016-11-13 CRAN (R 3.3.2)              
#>  bindrcpp     0.2        2017-06-17 CRAN (R 3.3.2)              
#>  datasets   * 3.3.3      2017-03-07 local                       
#>  devtools     1.13.5     2018-02-18 CRAN (R 3.3.3)              
#>  digest       0.6.13     2017-12-14 CRAN (R 3.3.2)              
#>  dplyr      * 0.7.4      2017-09-28 CRAN (R 3.3.2)              
#>  evaluate     0.10.1     2017-06-24 CRAN (R 3.3.2)              
#>  glue         1.2.0      2017-10-29 CRAN (R 3.3.2)              
#>  graphics   * 3.3.3      2017-03-07 local                       
#>  grDevices  * 3.3.3      2017-03-07 local                       
#>  htmltools    0.3.6      2017-04-28 CRAN (R 3.3.2)              
#>  knitr        1.18       2017-12-27 CRAN (R 3.3.2)              
#>  magrittr     1.5        2014-11-22 CRAN (R 3.3.0)              
#>  memoise      1.1.0      2017-04-21 CRAN (R 3.3.2)              
#>  methods    * 3.3.3      2017-03-07 local                       
#>  pkgconfig    2.0.1      2017-03-21 CRAN (R 3.3.2)              
#>  purrr      * 0.2.4      2017-10-18 CRAN (R 3.3.2)              
#>  R6           2.2.2      2017-06-17 CRAN (R 3.3.2)              
#>  Rcpp         0.12.14    2017-11-23 CRAN (R 3.3.2)              
#>  rlang        0.2.0      2018-02-20 CRAN (R 3.3.3)              
#>  rmarkdown    1.8        2017-11-17 CRAN (R 3.3.2)              
#>  rprojroot    1.3-2      2018-01-03 CRAN (R 3.3.3)              
#>  stats      * 3.3.3      2017-03-07 local                       
#>  stringi      1.1.6      2017-11-17 CRAN (R 3.3.2)              
#>  stringr      1.2.0      2017-02-18 CRAN (R 3.3.2)              
#>  tibble       1.3.4      2017-08-22 CRAN (R 3.3.2)              
#>  tidyr      * 0.7.2      2017-10-16 CRAN (R 3.3.2)              
#>  tidyselect   0.2.3      2017-11-06 CRAN (R 3.3.2)              
#>  tools        3.3.3      2017-03-07 local                       
#>  utils      * 3.3.3      2017-03-07 local                       
#>  withr        2.1.1.9000 2018-03-03 Github (r-lib/withr@5d05571)
#>  yaml         2.1.16     2017-12-12 CRAN (R 3.3.2)

#2

Please look at the FAQ about name mentioning.

In general, asking a particular person to answer your question will not get you an answer any sooner.


#3

So you don’t have an answer to the actual question in the post?


#4

One unintended consequence of the missing names is that you cannot bind the result after a pluck:

test_data <- data_frame(id=c(1,2,3),method=c('a','b','c'),x=list(1,2,3))

test_data%>%pluck('x')
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 2
#> 
#> [[3]]
#> [1] 3

test_data%>%pluck('x')%>%bind_cols()
#> Error in cbind_all(x): Not compatible with STRSXP: [type=NULL].

test_data%>%mypluck('x',.id = c('method'))
#> $a
#> [1] 1
#> 
#> $b
#> [1] 2
#> 
#> $c
#> [1] 3

test_data%>%mypluck('x',.id = c('method'))%>%bind_cols()
#> # A tibble: 1 x 3
#>       a     b     c
#>   <dbl> <dbl> <dbl>
#> 1     1     2     3

test_data%>%mypluck('x',.id = c('method','id'))
#> $a_1
#> [1] 1
#> 
#> $b_2
#> [1] 2
#> 
#> $c_3
#> [1] 3

test_data%>%mypluck('x',.id = c('method','id'))%>%bind_cols()
#> # A tibble: 1 x 3
#>     a_1   b_2   c_3
#>   <dbl> <dbl> <dbl>
#> 1     1     2     3

#5

Using set_names is more readable:

library(tidyverse)

test_data <- data_frame(id = c(1, 2, 3), 
                        method = c('a', 'b', 'c'), 
                        x = list(1, 2, 3)) 

test_data %>% 
    mutate(x = set_names(x, method)) %>% 
    pull(x)
#> $a
#> [1] 1
#> 
#> $b
#> [1] 2
#> 
#> $c
#> [1] 3

Instead of using bind_cols, use spread:

test_data %>% 
    unite(key, method, id) %>% 
    spread(key, x) %>% 
    unnest()
#> # A tibble: 1 x 3
#>     a_1   b_2   c_3
#>   <dbl> <dbl> <dbl>
#> 1    1.    2.    3.
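
For anyone reading this with current package versions: spread() has since been superseded by pivot_wider(), and data_frame() has been deprecated in favor of tibble(). A minimal sketch of the same reshaping, assuming tidyr 1.0.0 or later (the name wide is only used for illustration here):

```r
library(dplyr)
library(tidyr)

test_data <- tibble(id = c(1, 2, 3),
                    method = c("a", "b", "c"),
                    x = list(1, 2, 3))

wide <- test_data %>%
    unite(key, method, id) %>%                          # key becomes "a_1", "b_2", "c_3"
    pivot_wider(names_from = key, values_from = x) %>%  # one list-column per key
    unnest(cols = everything())                         # flatten the list-columns

names(wide)
```

Passing cols explicitly to unnest() avoids the warning that newer tidyr versions give when it is left out.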

#6

I haven’t digested everything above, but I think I’ve requested something closely related to this.

Therefore this issue thread should be interesting:


#7

Both look great (I started with the first option before posting). I noticed, though, that I would need to mutate every column I would have liked to pluck, which gets old after a while.

Imo it would make my life easier to just add an .id param to pluck instead of having to call 2 or 3 functions.
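
Something like the following might be a middle ground: a small wrapper around the exported pluck() that does not touch the function's body. This is only a sketch. pluck_named and its .sep argument are hypothetical names, not part of purrr, and it assumes the .id columns can be pasted into one key per element.

```r
library(purrr)

# Hypothetical helper (not part of purrr): pluck, then name the result
# by pasting together the columns listed in .id.
pluck_named <- function(.x, ..., .id = NULL, .sep = "_") {
    out <- pluck(.x, ...)
    if (!is.null(.id)) {
        # Build one key per element from the .id columns, row by row
        keys <- do.call(paste, c(unname(as.list(.x[.id])), sep = .sep))
        out <- set_names(out, keys)
    }
    out
}

test_data <- tibble::tibble(id = c(1, 2, 3),
                            method = c("a", "b", "c"),
                            x = list(1, 2, 3))

named <- pluck_named(test_data, "x", .id = c("method", "id"))
names(named)
```

Because the wrapper only calls exported functions, it should be less sensitive to purrr version changes than editing the body of pluck directly.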


#8

Interesting thread, thank you for sharing. Did anything come of it in the end?