Tidyverse successor to adply()?


#1

I’m sure this has been asked before in other forums, but is anyone working on a successor to plyr::adply()? I’m just starting to work with some array data structures, and was looking for examples of how people currently handle this kind of data. My initial thought is that grouped data frames cover most of the tasks that arrays are used for, but I’m not sure that’s correct.

Thanks!


#2

Not officially part of the tidyverse, but the arrayhelpers package is very fast and has helped me a lot when converting between arrays and data frames.
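
For comparison, the same kind of round trip can be sketched in base R without any extra package. This is not the arrayhelpers API, just an illustration built on as.data.frame.table() and xtabs(); the names arr, long, and back are made up for the example.

```r
# A small 3-d array with named dimensions (hypothetical example data)
arr <- array(
  1:24,
  dim = c(2, 3, 4),
  dimnames = list(
    a = c("a1", "a2"),
    b = c("b1", "b2", "b3"),
    c = paste0("c", 1:4)
  )
)

# Array -> long data frame: one row per cell, one column per dimension
long <- as.data.frame(
  as.table(arr),
  responseName = "value",
  stringsAsFactors = FALSE
)

# Long data frame -> array-like table: xtabs() sums `value` within each cell
back <- xtabs(value ~ a + b + c, data = long)
```

Note that xtabs() orders each dimension by sorted level, so the cell order only matches the original array when the dimnames are already sorted, as they are here.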


#3

tbl_cube can act as an intermediary, e.g.

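library(dplyr)
# since dplyr 1.0.0, tbl_cube lives in the separate cubelyr package,
# so newer installations also need library(cubelyr)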

cubetanic <- Titanic %>%    # high-dimensional table
    as.array() %>%    # not necessary, but to show it works with pure arrays
    as.tbl_cube() 

cubetanic
#> Source: local array [32 x 4]
#> D: Class [chr, 4]
#> D: Sex [chr, 2]
#> D: Age [chr, 2]
#> D: Survived [chr, 2]
#> M: Freq [dbl]

cubetanic %>% glimpse()
#> List of 2
#>  $ dims:List of 4
#>   ..$ Class   : chr [1:4] "1st" "2nd" "3rd" "Crew"
#>   ..$ Sex     : chr [1:2] "Male" "Female"
#>   ..$ Age     : chr [1:2] "Child" "Adult"
#>   ..$ Survived: chr [1:2] "No" "Yes"
#>  $ mets:List of 1
#>   ..$ Freq: num [1:4, 1:2, 1:2, 1:2] 0 0 35 0 0 0 17 0 118 154 ...
#>  - attr(*, "class")= chr "tbl_cube"

# works pretty much like as.data.frame.table
cubetanic %>% as_data_frame()
#> # A tibble: 32 x 5
#>    Class Sex    Age   Survived  Freq
#>    <chr> <chr>  <chr> <chr>    <dbl>
#>  1 1st   Male   Child No          0.
#>  2 2nd   Male   Child No          0.
#>  3 3rd   Male   Child No         35.
#>  4 Crew  Male   Child No          0.
#>  5 1st   Female Child No          0.
#>  6 2nd   Female Child No          0.
#>  7 3rd   Female Child No         17.
#>  8 Crew  Female Child No          0.
#>  9 1st   Male   Adult No        118.
#> 10 2nd   Male   Adult No        154.
#> # ... with 22 more rows

# or aggregate first
class_survival <- cubetanic %>% 
    group_by(Class, Survived) %>% 
    summarise(Freq = sum(Freq)) 

class_survival
#> Source: local array [8 x 2]
#> D: Class [chr, 4]
#> D: Survived [chr, 2]
#> M: Freq [dbl]

class_survival %>% as_data_frame()
#> # A tibble: 8 x 3
#>   Class Survived  Freq
#>   <chr> <chr>    <dbl>
#> 1 1st   No        122.
#> 2 2nd   No        167.
#> 3 3rd   No        528.
#> 4 Crew  No        673.
#> 5 1st   Yes       203.
#> 6 2nd   Yes       118.
#> 7 3rd   Yes       178.
#> 8 Crew  Yes       212.

Documentation is sparse, but it’s potentially a powerful tool if you can figure out how to use it.
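
As a rough cross-check of the aggregation above, the same class-by-survival totals can be computed in base R by applying sum() over the array margins and flattening the result with as.data.frame.table(), which is close to what plyr::adply() was often used for. This is a minimal sketch, not a tbl_cube replacement; totals and totals_df are names chosen for illustration.

```r
# Sum the Titanic counts over Sex and Age, keeping Class and Survived.
# MARGIN can be given by name because Titanic has named dimnames.
totals <- apply(Titanic, c("Class", "Survived"), sum)

# Flatten the 4 x 2 matrix into a long data frame with a Freq column
totals_df <- as.data.frame(as.table(totals), responseName = "Freq")
```

The Freq column of totals_df should line up with the class_survival values printed above, with Class and Survived stored as factors rather than character columns.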