Using select with a list


#1

I have a collection of data sets that all share a common set of variables, plus some variables specific to each data set, and I need both sets of variables for the analysis. The data sets have a large number of variables, so writing out every variable is not feasible. What I am trying to achieve is something like this (obviously non-functioning) code:

library(tidyverse)

common <- c("mpg", "disp:wt", starts_with("a"))

base <- mtcars %>% 
  select(common, gear)

common would then be included in each selection, together with whatever dataset-specific variables I need. Any suggestions on how I can get there?


#2

I started on this track at one point, but I couldn't get it working: the method select.list() would be registered correctly as an S3 method, but at run time it would be masked by utils::select.list() (meaning that when you called select(mylist, ...), R would call utils::select.list() instead of your own select.list()). In any case, a working version of this function can be found here, and you can just call it directly as select.list(...) or give it another name that doesn't conflict, like pick() or take().
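To make the "give it another name" idea concrete, here is a minimal sketch of what a non-conflicting pick() for named lists could look like. This is not the linked implementation; the name pick(), the example list, and the use of tidyselect::eval_select() (which assumes tidyselect 1.0.0 or later) are all assumptions made for illustration.

```r
library(tidyselect)
library(rlang)

# Hypothetical helper (NOT the function linked above): select elements of a
# named list using the same selection syntax as dplyr::select().
pick <- function(.x, ...) {
  # eval_select() returns a named integer vector of positions in .x
  pos <- eval_select(expr(c(...)), .x)
  # Subset by position and apply any names produced by the selection
  set_names(.x[pos], names(pos))
}

# Made-up example data
mylist <- list(alpha = 1:3, beta = letters, gamma = TRUE, apple = "x")
picked <- pick(mylist, starts_with("a"), gamma)
names(picked)
```

Because the function is not called select.list(), no S3 dispatch is involved and nothing collides with utils::select.list().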


#3

In your specific example, something like this will work. This uses tidyeval, i.e. you capture the instructions for getting the common variables in a couple of expressions, then unquote them with !!! inside select().

library(tidyverse)

common <- exprs(mpg, disp:wt, starts_with("a"))

mtcars %>% 
  select(!!!common, gear) %>% 
  head(6)
#>                    mpg disp  hp drat    wt am gear
#> Mazda RX4         21.0  160 110 3.90 2.620  1    4
#> Mazda RX4 Wag     21.0  160 110 3.90 2.875  1    4
#> Datsun 710        22.8  108  93 3.85 2.320  1    4
#> Hornet 4 Drive    21.4  258 110 3.08 3.215  0    3
#> Hornet Sportabout 18.7  360 175 3.15 3.440  0    3
#> Valiant           18.1  225 105 2.76 3.460  0    3

Created on 2018-04-08 by the reprex package (v0.2.0).
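Since the goal is to reuse the same common variables across several data sets, the same idea can be wrapped in a small function. This is only a sketch: select_common() is a hypothetical name, and it assumes every data set actually contains the columns that common refers to.

```r
library(dplyr)
library(rlang)

# Captured once, reused for every data set
common <- exprs(mpg, disp:wt, starts_with("a"))

# Hypothetical helper: the common variables plus any dataset-specific extras
select_common <- function(data, ...) {
  select(data, !!!common, ...)
}

base <- select_common(mtcars, gear)
names(base)
```

Any dataset-specific variables are passed through `...`, so they can use the usual select() syntax as well.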


#4

6 is the default for head() :slight_smile:
So head() does the same


#5

Thank you, Jenny. That is perfect.

One quick additional question: Why is it "!!!" rather than "!!"?


#6

Thank you for the suggestion, but I think Jenny's answer is closer to what I was looking for. And easier to implement.


#7

expr() and !! anticipate the capture and unquoting of an expression. Singular. Although it could ultimately produce a select() statement with multiple variables.

exprs() and !!! anticipate the capture and unquoting of one or more expressions. Plural.

library(tidyverse)

common_many <- exprs(mpg, disp:wt, starts_with("a"))
common_one_sw <- expr(starts_with("a"))
common_one_dw <- expr(disp:wt)

mtcars %>% 
  select(!!!common_many, gear) %>% 
  head(2)
#>               mpg disp  hp drat    wt am gear
#> Mazda RX4      21  160 110  3.9 2.620  1    4
#> Mazda RX4 Wag  21  160 110  3.9 2.875  1    4

mtcars %>% 
  select(!!!common_one_sw, gear) %>% 
  head(2)
#>               am gear
#> Mazda RX4      1    4
#> Mazda RX4 Wag  1    4

mtcars %>% 
  select(!!!common_one_dw, gear) %>% 
  head(2)
#>               disp  hp drat    wt gear
#> Mazda RX4      160 110  3.9 2.620    4
#> Mazda RX4 Wag  160 110  3.9 2.875    4

mtcars %>% 
  select(!!common_one_sw, gear) %>% 
  head(2)
#>               am gear
#> Mazda RX4      1    4
#> Mazda RX4 Wag  1    4

mtcars %>% 
  select(!!common_one_dw, gear) %>% 
  head(2)
#>               disp  hp drat    wt gear
#> Mazda RX4      160 110  3.9 2.620    4
#> Mazda RX4 Wag  160 110  3.9 2.875    4

Created on 2018-04-09 by the reprex package (v0.2.0).

I used the "multiple" form (!!! and exprs()) on purpose, based on your example. You showed two literal expressions re: variables to include, plus one use of a select() helper. But as you see, !!! can also be used to unquote a single expression. I would use the most restrictive form that "should work", so your thing works, but is not too forgiving of peculiar inputs.
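One way to see the difference without running select() at all is to build the call with rlang::expr() and look at it. This is just an illustrative sketch; the calls are constructed but never evaluated, and the variable names spliced and single are made up for the example.

```r
library(rlang)

common_many <- exprs(mpg, disp:wt, starts_with("a"))
common_one_dw <- expr(disp:wt)

# !!! splices each element of the list in as its own argument
spliced <- expr(select(mtcars, !!!common_many, gear))
print(spliced)
#> select(mtcars, mpg, disp:wt, starts_with("a"), gear)

# !! inserts one expression as a single argument
single <- expr(select(mtcars, !!common_one_dw, gear))
print(single)
#> select(mtcars, disp:wt, gear)
```

In both cases, the resulting call looks the same as if the variables had been typed directly into select().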


#8

But do you know why? :smirk:


#9

Oh man, I get an F here for reading comprehension. I skimmed your post and walked away thinking you were asking about how to do select( my_list_of_data_not_a_dataframe, ...), instead of select( my_dataframe, my_list_of_column_names). Sorry for answering the wrong question, and I'm glad Jenny was able to help you.


#10

:smile: Yes, that is one of the first threads I read when I joined the community. It is a cool piece of history!

Speaking of history: when reading the CRAN documentation of packages, I often wish there would be some trace of their history in it. If nothing else, at least the date at which a package was first launched. But all there is is the date of the last update. Maybe I should open a thread about this actually :thinking:


#11

There has been discussion, elsewhere, of a select method for lists, so I thought the same thing at first. It would be nice!


#12

Ohh, please do let me know if there is any progress on a select.list() S3 method. As I mentioned, I coded up a working prototype, but couldn't get the S3 dispatch working properly because of the conflicting existence of utils::select.list(). There is probably some fancy esoteric thing you can do with callbacks or hooks or shims to get S3 dispatch to work as desired here, but I didn't go down that path.


#13

Just in case someone has the same problem as me: it turns out that you also need
library(rlang)
for the code to work.