Tidyverse alternative to reshape2::dcast

dplyr
tidyr

#1

The usual tidyverse alternative to dcast is spread. However, I want to combine two columns at the same time, and I understand that tidyr has moved away from general reshaping. How would I achieve something like the call below without using the reshape2 package?

iris %>% reshape2::dcast(Sepal.Length ~ Petal.Width + Species)

#2

You would probably use tidyr::unite to join the two columns together and then tidyr::spread to widen the result.
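As a rough sketch of that idea (not necessarily what the poster had in mind): iris has repeated combinations of Sepal.Length, Petal.Width and Species, so spread on its own would fail with duplicate identifiers. The version below counts rows first, which mirrors dcast falling back to length as its default aggregation. The column name Petal.Width_Species and the fill value of 0 are choices made for this illustration.

```r
library(dplyr)
library(tidyr)

wide <- iris %>%
  # one row per combination, with the number of matching rows in `n`
  count(Sepal.Length, Petal.Width, Species) %>%
  # join the two key columns into one (default separator is "_")
  unite("Petal.Width_Species", Petal.Width, Species) %>%
  # one output column per joined key; absent combinations become 0
  spread(Petal.Width_Species, n, fill = 0)
```

Each row of wide corresponds to one distinct Sepal.Length, and the remaining columns hold the counts.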


#3

I am not sure what you mean by this. Both gather and spread are exported by tidyr and perform the same reshaping operations that dcast does. In fact, these two functions are in the main description of tidyr: "Easily Tidy Data with 'spread()' and 'gather()' Functions".


#4

My wording was inspired by this blog post comparing reshape2 and tidyr (http://www.milanor.net/blog/reshape-data-r-tidyr-vs-reshape2/):

"There is no function in the tidyr package that allows us to perform a similar operation, the reason is that tidyr is designed only for data tidying and not for data reshaping."

dcast allows you to apply an aggregation function while reshaping; spread, I believe, does not.


#5

This may work, will give it a go, thanks for the idea!


#6

From the tidyr GitHub page:

"tidyr replaces reshape2 (2010-2014) and reshape (2005-2010). Somewhat counterintuitively, each iteration of the package has done less. tidyr is designed specifically for tidying data, not general reshaping (reshape2), or the general aggregation (reshape)."

So the idea is to keep each step doing one thing. Applying a function while spreading may be useful, but I think it puts too many eggs in one basket. Depending on the use case, you may want to apply the function before or after spreading instead.
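To illustrate "aggregate first, then spread", here is a minimal sketch. The choice of mean(Petal.Length) per Sepal.Length and Species is an assumption made purely for the example, as is the name mean_petal; it is roughly what reshape2::dcast would do in one call with fun.aggregate = mean.

```r
library(dplyr)
library(tidyr)

mean_wide <- iris %>%
  # step 1: aggregate, so each (Sepal.Length, Species) pair appears once
  group_by(Sepal.Length, Species) %>%
  summarise(mean_petal = mean(Petal.Length)) %>%
  ungroup() %>%
  # step 2: reshape only, with no aggregation left to do
  spread(Species, mean_petal)
```

Keeping the two steps apart makes it easy to swap the summary function or inspect the aggregated data before widening it.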


#7

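Solved by a colleague on Slack; posting the answer here.

library(dplyr)
library(tidyr)

iris %>%
  # build the combined key, e.g. "setosa_0.2"
  mutate(Species.Petal.Width = paste(Species, Petal.Width, sep = "_")) %>%
  # count rows per key and Sepal.Length (dcast's default aggregation is length)
  count(Species.Petal.Width, Sepal.Length) %>%
  spread(Species.Petal.Width, n)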
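For readers on a newer tidyr, a possible one-step variant uses pivot_wider, which can take several names_from columns and an aggregation function. This is a sketch that assumes tidyr 1.1.0 or later (where values_fn accepts a single function); using Petal.Length as the value column is an arbitrary choice here, since only the row count is kept.

```r
library(tidyr)

wide2 <- pivot_wider(
  iris,
  id_cols     = Sepal.Length,             # one output row per Sepal.Length
  names_from  = c(Petal.Width, Species),  # combined into names like "0.2_setosa"
  values_from = Petal.Length,             # any column works, only its count is used
  values_fn   = length                    # count the rows falling in each cell
)
```

Combinations that never occur are left as NA, as in the spread version above.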