I believe that pull and select from the dplyr package do basically the same thing. When or why would I use pull instead of select?
pull() returns a single column as a vector; select() returns one or more columns as a data.frame and can also rename columns (think
select something as something_else from wherever, in SQL speak).
Both have their uses, so base your decision on the output format you need.
To illustrate the point:
library(dplyr)

animals <- data.frame(cats = 1:10, dogs = 10:1)

animals %>% select(cats) %>% str()
#> 'data.frame': 10 obs. of 1 variable:
#>  $ cats: int 1 2 3 4 5 6 7 8 9 10

animals %>% pull(cats) %>% str()
#>  int [1:10] 1 2 3 4 5 6 7 8 9 10
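The renaming use of select() mentioned above can be sketched in the same way. This is only an illustration: the names felines and cat_counts are made up for the example, and the animals data frame is rebuilt so the snippet runs on its own.

```r
library(dplyr)

animals <- data.frame(cats = 1:10, dogs = 10:1)

# select() keeps the data.frame structure and can rename while selecting
# (roughly "SELECT cats AS felines FROM animals" in SQL terms).
felines <- animals %>% select(felines = cats)
names(felines)
is.data.frame(felines)

# pull() drops the data.frame structure and hands back the bare vector,
# which is what functions such as mean() expect.
cat_counts <- animals %>% pull(cats)
is.data.frame(cat_counts)
mean(cat_counts)
```

So select() is the one to reach for when the next step still needs a data.frame, and pull() when the next step needs a plain vector.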
I understand why pull was introduced, but why can't I just use the following?
animals %>% .$cats %>% str()
You can, but one of the goals of the tidyverse is to make code human-readable, and
pull() is more readable and conveys more meaning than .$cats
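As a rough illustration of the readability point, here is the same extraction written both ways at the end of a slightly longer pipeline. The filter step and the variable names are assumptions added for the example; they are not from the original posts.

```r
library(dplyr)

animals <- data.frame(cats = 1:10, dogs = 10:1)

# With pull(): every step is a named verb, so the pipeline reads top to bottom.
mean_with_pull <- animals %>%
  filter(dogs > 5) %>%
  pull(cats) %>%
  mean()

# With .$cats: same result, but the extraction step is a bare operator.
mean_with_dollar <- animals %>%
  filter(dogs > 5) %>%
  .$cats %>%
  mean()

identical(mean_with_pull, mean_with_dollar)
```

Both versions compute the same number; the difference is only in how the extraction step reads.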
In R there are many ways to skin a cat...
animals %>% .[ , "cats"] %>% str()
is also legitimate syntax, giving the same outcome as the two approaches above.
The question about preferring one over the other is more about your (or your team's) coding style than about one approach being right and the other wrong.
So do what you will, but please, please, be consistent.
library(dplyr)

animals <- data.frame(cats = 1:10, dogs = 10:1)

base1 <- animals %>% .[, "cats"]
base2 <- animals %>% .$cats
dplyr <- animals %>% pull(cats)
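A quick way to check that the three approaches really agree is identical(). The sketch below also adds one caveat not raised in the thread: on a tibble, single-bracket indexing keeps the tibble structure rather than dropping to a vector, so the base-R bracket form and pull() stop being interchangeable. The names via_pull and animals_tbl are made up for the example.

```r
library(dplyr)

animals <- data.frame(cats = 1:10, dogs = 10:1)

base1    <- animals %>% .[, "cats"]
base2    <- animals %>% .$cats
via_pull <- animals %>% pull(cats)

# On a plain data.frame all three return the same integer vector.
identical(base1, base2)
identical(base2, via_pull)

# Caveat: a tibble does not drop to a vector with [ , "cats"].
animals_tbl <- as_tibble(animals)
bracket_result <- animals_tbl %>% .[, "cats"]
pull_result    <- animals_tbl %>% pull(cats)

is.data.frame(bracket_result)  # still a tibble
is.data.frame(pull_result)     # a plain vector
```

If your code may receive either a data.frame or a tibble, pull() or $ gives the more predictable result.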
Wow, I didn't know about the pull function! I also didn't know that this works:
```
base2 <- animals %>% .$cats
```
So far I have modified the data frame first and then extracted the column I wanted using the $ notation, but this doesn't seem necessary anymore. Awesome!