Using select with a list

I have a collection of data sets that all consist of a common set of variables plus some variables specific to each data set, and I need both sets of variables for the analysis. The data sets have a large number of variables, so writing them all out is not feasible. What I am trying to achieve is something like this (obviously non-functioning) code:

library(tidyverse)

common <- c("mpg", "disp:wt", starts_with("a"))

base <- mtcars %>% 
  select(common, gear)

Common would then be included in each selection together with whatever dataset-specific variable I need. Any suggestions how I can get there?

I started down this track at one point, but I couldn't get it working: the method select.list() would be registered correctly as an S3 method, but at run time it would be masked by utils::select.list(). That is, when you called select(mylist, ...), R would call utils::select.list() instead of your own select.list(). In any case, a working version of this function can be found here, and you can just call it directly as select.list(...) or give it another name that doesn't conflict, like pick() or take().
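For anyone who wants select-style behaviour on a plain list without fighting S3 dispatch, here is a minimal sketch under a non-conflicting name. It is not the function linked above: pick_list() is a hypothetical name chosen for illustration, and the body follows the pattern the tidyselect package documents for building select-like interfaces on top of eval_select().

```r
library(tidyselect)

# Hypothetical helper (not the function linked above): select elements of a
# named list with tidyselect syntax, under a name that avoids the clash
# with utils::select.list().
pick_list <- function(.x, ...) {
  # eval_select() resolves the selection against the names of .x and
  # returns a named integer vector of positions.
  pos <- eval_select(rlang::expr(c(...)), .x)
  # Subset by position and apply any renaming requested in the selection.
  rlang::set_names(.x[pos], names(pos))
}

my_list <- list(alpha = 1:3, beta = "x", gamma = TRUE, apple = 2.5)

names(pick_list(my_list, starts_with("a"), gamma))
#> [1] "alpha" "apple" "gamma"
```

Because this is an ordinary function rather than an S3 method, nothing depends on which select.list() R finds first.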

In your specific example, something like this will work. This uses tidyeval, i.e. you capture the instructions for getting the common variables in a couple of expressions, then unquote them with !!! inside select().

library(tidyverse)

common <- exprs(mpg, disp:wt, starts_with("a"))

mtcars %>% 
  select(!!!common, gear) %>% 
  head(6)
#>                    mpg disp  hp drat    wt am gear
#> Mazda RX4         21.0  160 110 3.90 2.620  1    4
#> Mazda RX4 Wag     21.0  160 110 3.90 2.875  1    4
#> Datsun 710        22.8  108  93 3.85 2.320  1    4
#> Hornet 4 Drive    21.4  258 110 3.08 3.215  0    3
#> Hornet Sportabout 18.7  360 175 3.15 3.440  0    3
#> Valiant           18.1  225 105 2.76 3.460  0    3

Created on 2018-04-08 by the reprex package (v0.2.0).
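Since the goal is to reuse the same common selection across several data sets, the splice can also be wrapped in a small function. This is only a sketch: select_with_common() is a hypothetical name, and it assumes the common expressions are valid for every data set it is applied to. rlang is attached explicitly so that exprs() is available regardless of which other packages are loaded.

```r
library(dplyr)
library(rlang)

# Capture the shared selection once, as a list of unevaluated expressions.
common <- exprs(mpg, disp:wt, starts_with("a"))

# Hypothetical wrapper: splice in the common selection, then add whatever
# dataset-specific columns the caller passes through `...`.
select_with_common <- function(data, ...) {
  select(data, !!!common, ...)
}

names(select_with_common(mtcars, gear))
#> [1] "mpg"  "disp" "hp"   "drat" "wt"   "am"   "gear"

names(select_with_common(mtcars, gear, carb))
#> [1] "mpg"  "disp" "hp"   "drat" "wt"   "am"   "gear" "carb"
```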

6 is the default for head() :slight_smile:
So head() does the same

Thank you, Jenny. That is perfect.

One quick additional question: Why is it "!!!" rather than "!!"?

Thank you for the suggestion, but I think Jenny's answer is closer to what I was looking for. And easier to implement.

expr() and !! anticipate the capture and unquoting of a single expression. Singular. That one expression can still ultimately produce a select() statement with multiple variables, though.

exprs() and !!! anticipate the capture and unquoting of one or more expressions. Plural.

library(tidyverse)

common_many <- exprs(mpg, disp:wt, starts_with("a"))
common_one_sw <- expr(starts_with("a"))
common_one_dw <- expr(disp:wt)

mtcars %>% 
  select(!!!common_many, gear) %>% 
  head(2)
#>               mpg disp  hp drat    wt am gear
#> Mazda RX4      21  160 110  3.9 2.620  1    4
#> Mazda RX4 Wag  21  160 110  3.9 2.875  1    4

mtcars %>% 
  select(!!!common_one_sw, gear) %>% 
  head(2)
#>               am gear
#> Mazda RX4      1    4
#> Mazda RX4 Wag  1    4

mtcars %>% 
  select(!!!common_one_dw, gear) %>% 
  head(2)
#>               disp  hp drat    wt gear
#> Mazda RX4      160 110  3.9 2.620    4
#> Mazda RX4 Wag  160 110  3.9 2.875    4

mtcars %>% 
  select(!!common_one_sw, gear) %>% 
  head(2)
#>               am gear
#> Mazda RX4      1    4
#> Mazda RX4 Wag  1    4

mtcars %>% 
  select(!!common_one_dw, gear) %>% 
  head(2)
#>               disp  hp drat    wt gear
#> Mazda RX4      160 110  3.9 2.620    4
#> Mazda RX4 Wag  160 110  3.9 2.875    4

Created on 2018-04-09 by the reprex package (v0.2.0).

I used the "multiple" form (!!! and exprs()) on purpose, based on your example, which showed two literal expressions for variables to include plus one use of a select() helper. But as you can see, !!! can also be used to unquote a single expression. I would use the most restrictive form that "should work", so that your code works but is not too forgiving of peculiar inputs.
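One way to see the singular/plural distinction is to inspect what each capture function returns. The sketch below is an illustration using base R and rlang predicates, not part of the original answer: exprs() gives back a list of expressions (the thing !!! splices), while expr() gives back one bare expression (the thing !! unquotes).

```r
library(rlang)

common_many   <- exprs(mpg, disp:wt, starts_with("a"))
common_one_dw <- expr(disp:wt)

# exprs() returns a list with one element per captured expression.
is.list(common_many)         # TRUE
length(common_many)          # 3
is_symbol(common_many[[1]])  # TRUE: mpg is a bare symbol
is_call(common_many[[2]])    # TRUE: disp:wt is a call to `:`

# expr() returns the expression itself, not a list wrapping it.
is_call(common_one_dw)       # TRUE
is.list(common_one_dw)       # FALSE
```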

But do you know why? :smirk:

Oh man, I get an F here for reading comprehension. I skimmed your post and walked away thinking you were asking about how to do select( my_list_of_data_not_a_dataframe, ...), instead of select( my_dataframe, my_list_of_column_names). Sorry for answering the wrong question, and I'm glad Jenny was able to help you.

:smile: Yes, that is one of the first threads I read when I joined the community. It is a cool piece of history!

Speaking of history: when reading the CRAN documentation of packages, I often wish there would be some trace of their history in it. If nothing else, at least the date at which a package was first launched. But all there is is the date of the last update. Maybe I should open a thread about this actually :thinking:

There has been discussion, elsewhere, of a select method for lists, so I thought the same thing at first. It would be nice!

Ohh, please do let me know if there is any progress on a select.list() S3 method. As I mentioned, I coded up a working prototype, but couldn't get the S3 dispatch working properly because of the conflicting existence of utils::select.list(). There is probably some fancy esoteric thing you can do with callbacks or hooks or shims to get S3 dispatch to work as desired here, but I didn't go down that path.

Just in case somebody has the same problem as me. Turns out that you also need:
library(rlang)
for the code to work.