left_join: add only selected variables from y, but still natural join on all shared variables

I would like to add only one variable (or maybe a list of variables based on some pattern) from a data frame 'y' to the data frame 'x' (both containing information on the same individuals), without having to explicitly specify 'by =' (hence, I want a natural join based on all variables that x and y have in common). I currently do it this way, but I think it would be reasonable to build this into the '*_join()' functions:

library(dplyr)
x <- starwars %>% select(height, mass, hair_color, birth_year)
y <- starwars %>% select(height, mass, hair_color, birth_year, homeworld, species)
vars <- c(names(x), "species")
x <- x %>% left_join(y, na_matches = "never") %>% select(all_of(vars))
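One way to get the same result without a trailing select() is to trim y down to the shared columns plus the wanted variable(s) before joining. The following is only a sketch under the assumptions of the example above; the names keys, joined and joined_pattern are made up for illustration, and the pattern "^spec" is just an example.

```r
library(dplyr)

x <- starwars %>% select(height, mass, hair_color, birth_year)
y <- starwars %>% select(height, mass, hair_color, birth_year, homeworld, species)

# Columns shared by x and y: these are the keys a natural join would use.
keys <- intersect(names(x), names(y))

# Keep only the keys plus the wanted column from y, then let left_join()
# do its usual natural join (it still prints the "Joining" message).
joined <- x %>%
  left_join(y %>% select(all_of(keys), species), na_matches = "never")

# Pattern-based variant: pick the extra columns with a tidyselect helper.
joined_pattern <- x %>%
  left_join(y %>% select(all_of(keys), matches("^spec")), na_matches = "never")
```

Because y only carries the keys and the selected columns into the join, nothing else from y can end up in the result.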

I am happy to discuss whether there is already a better solution for this, or whether there is really a need to add it. Thank you.

Hi,

Welcome to the RStudio community!

I'm a bit confused about what you would like to be different. When you call any join function without specifying by = ..., dplyr automatically performs a natural join on all columns the two data frames share and prints a message listing the columns it used. In your example the message is:

Joining, by = c("height", "mass", "hair_color", "birth_year")
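For illustration, the columns named in that message are simply the ones both data frames have in common, so the same join can be written with an explicit by, which also silences the message. This is a minimal sketch reusing the x and y from the question; common and joined are names chosen here for the example.

```r
library(dplyr)

x <- starwars %>% select(height, mass, hair_color, birth_year)
y <- starwars %>% select(height, mass, hair_color, birth_year, homeworld, species)

# The natural-join keys are the column names present in both data frames.
common <- intersect(names(x), names(y))

# Passing them explicitly gives the same join without the "Joining" message.
joined <- left_join(x, y, by = common, na_matches = "never")
```

Note that this still brings in every non-key column of y (here homeworld and species); it only makes the key choice explicit.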

Kind regards,
PJ
