It would be good to know the chronology of development of tidyverse packages (not versions within packages). This would help everyone, especially those who haven’t worked with the older packages, use current packages rather than older ones. Just a simple mapping would do.
Like, I believe:
tidyr < reshape2 < reshape <?
dplyr < plyr <?
The two “clumps” of packages above and their order reflect how I see things, where leftmost – tidyr and dplyr – are where current development is happening.
Thanks for confirming that the order is correct. Also, it would be nice to know the chronology of development, e.g. which package(s) plyr superseded, what the package before that was, and so on. I actually wanted this information for all tidyverse packages.
I am listing down all the tidyverse packages so that everyone can contribute where possible.
I’d like help understanding the chronology of development, e.g. in the case of dplyr: which package(s) did dplyr supersede, which did plyr supersede, what was the package before that, and so on.
[@Site Admin: if you feel that this post is irrelevant or does not belong on this Community site, please delete it. If that is the case, I apologise in advance.]
dplyr < plyr <?
tidyr < reshape2 < reshape <?
magrittr is not a tidyverse package (or at least was not unless it’s been adopted).
Apart from ggplot2 < ggplot I don’t think there are many other predecessors.
Sorry, my mistake. I hadn’t realised that magrittr was part of the 'verse, now, too.
Just a small comment: Newer tidyverse packages are a bit more focused than old ones (that means they do less).
So I would rather say plyr split into dplyr and purrr (and purrrlyr).
tidyr does not do everything reshape2 does (but it does some other things). reshape2 still has some uses, in my opinion.
@hoelk Care to give an example for reshape2? I’ve never used reshape2 for anything other than the exact cast/spreading I’ve been using tidyr for now.
plyr, I had been looking for ages to find a replacement for dplyr, but then purrr::map_df came along and all was well.
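For anyone who hasn’t used it, here is a minimal sketch of the purrr::map_df pattern mentioned above: apply a function to each element of a list and row-bind the results into one data frame, which is roughly the job plyr’s list-to-data-frame helpers did. The `pieces` list and the summary columns are made up for illustration, and map_df needs dplyr installed for the row-binding.

```r
library(purrr)

# Hypothetical input: a named list of numeric vectors.
pieces <- list(a = 1:3, b = 4:6)

# Summarise each element as a one-row data frame, then row-bind the rows.
# .id = "name" stores the list names in a column called "name".
result <- map_df(
  pieces,
  function(x) data.frame(n = length(x), total = sum(x)),
  .id = "name"
)

result
```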
With regards to the OP: I too would be curious about a “history lesson”, with a package evolution, timeline, and seeing where influences come from.
Maybe tidyverse authors have some spare time and can put together a blogpost / some viz for curious folk like us?
EDIT: Just thinking, but is rlang related to lazyeval? I never quite wrapped my head around lazyeval, but it seems the new way of doing things is the rlang stuff with the quo() and the !!! and whatnot.
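To make the “quo() and !!!” part a bit more concrete, here is a small sketch of the rlang-style quoting and unquoting used inside dplyr verbs. It uses the built-in `cars` data set; the variable names `speed_col`, `cols`, `avg_tbl` and `selected` are just for illustration.

```r
library(dplyr)
library(rlang)

# quo() captures an expression (plus its environment) without evaluating it.
speed_col <- quo(speed)

# !! unquotes a single captured expression inside a dplyr verb.
avg_tbl <- summarise(cars, avg = mean(!!speed_col))

# quos() captures several expressions; !!! splices them in as separate arguments.
cols <- quos(speed, dist)
selected <- select(cars, !!!cols)
```

Here `avg_tbl` holds the mean of `cars$speed`, and `selected` contains the `speed` and `dist` columns.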
I use tidyr very little since I like melting/casting a lot, so I might not be the best authority on this.
As far as I know you cannot do aggregation with tidyr, which you can do with
reshape2::dcast(cars, dist ~ speed, fun.aggregate = sum)
Huh, interesting. And yes, I don’t think tidyr can do that. If I had to do the same it would probably be a significantly longer chain of tidyr::spread. I’ll keep that in mind, seems super handy, thanks
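For comparison, the longer chain would probably look something like the sketch below: aggregate with dplyr first, then spread with tidyr. The data frame `df` and its column names are hypothetical, chosen so that one group/measure pair has duplicate rows that need aggregating.

```r
library(dplyr)
library(tidyr)

# Hypothetical long data with a duplicated (group, measure) pair: a/x appears twice.
df <- data.frame(
  group   = c("a", "a", "a", "b", "b"),
  measure = c("x", "x", "y", "x", "y"),
  amount  = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Step 1: aggregate duplicates; step 2: reshape from long to wide.
wide <- df %>%
  group_by(group, measure) %>%
  summarise(amount = sum(amount)) %>%
  ungroup() %>%
  spread(key = measure, value = amount)

# The reshape2 one-liner for the same data would be:
# reshape2::dcast(df, group ~ measure, value.var = "amount", fun.aggregate = sum)
wide
```

The result has one row per `group` and one column per `measure`, with the two a/x rows summed into a single cell.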
data.table::dcast() can use several aggregation functions which is even nicer
Here’s a few additions, based largely on my fallible memory:
ggplot2 < ggplot < lattice < graphics
One could also do ggplot2 < grid, as ggplot2 is built on grid, but is not a complete replacement
dplyr < plyr < base (apply family), though plyr has some tricks that dplyr does not, largely because dplyr takes and returns data frames, while plyr works with other data structures.
tidyr < reshape2 < reshape < base; here, again, reshape2 has some tricks that tidyr does not
readr < base
tibble < dplyr (< 0.5.0; tibble was split out of dplyr)
readxl < partial overlap with various packages, including XLConnect, xlsx and xlsReadWrite
haven < foreign (partially)
rvest < XML and RCurl (usually)
stringr < base
lubridate < base (mostly strptime())
forcats < base
Here’s a fun little exercise: for each package that’s on CRAN you can access its archive (e.g. the ggplot2 archive). The first record in the archive is (AFAIK) the date on which the package was first put on CRAN. The URLs all share a common schema:
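A minimal sketch for building such URLs, assuming the schema in question is CRAN’s standard archive directory layout (`src/contrib/Archive/<package>/`). `archive_url()` is a hypothetical helper, not part of any package.

```r
# Assumption: CRAN archive directories live under src/contrib/Archive/<package>/.
archive_url <- function(pkg) {
  paste0("https://cran.r-project.org/src/contrib/Archive/", pkg, "/")
}

# Build the archive URLs for a few tidyverse packages.
urls <- vapply(c("ggplot2", "dplyr", "tidyr"), archive_url, character(1))
urls
```

Opening one of these in a browser (e.g. with `utils::browseURL(urls[["ggplot2"]])`) lists the archived source tarballs with their dates.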