mlevy
January 27, 2018, 10:24pm
1
Is there a reason some `dplyr` functions strip classes from data frames and others preserve them? I could see stripping for functions that are more "create a new data frame" than "change an existing data frame", but it's tough to nail down exactly where that line is, and it really doesn't seem to be between `select` and `mutate`.
d <- data.frame(x = 1:5) %>% structure(., class = c("my_df", class(.)))
select(d, x) %>% class()
#> [1] "my_df" "data.frame"
mutate(d, x = x) %>% class()
#> [1] "data.frame"
mlevy
January 27, 2018, 10:37pm
2
Same question for attributes. This seems crazy to me:
`select`: preserves class and attributes
`mutate`: preserves neither
`slice`: strips class, coerces to `tbl_df`, preserves attributes
d <- data.frame(x = 1:5) %>% structure(., class = c("my_df", class(.)), my_attr = "does it persist?")
select(d, x) %>% attributes()
#> $row.names
#> [1] 1 2 3 4 5
#>
#> $class
#> [1] "my_df" "data.frame"
#>
#> $my_attr
#> [1] "does it persist?"
#>
#> $names
#> [1] "x"
mutate(d, x = x) %>% attributes()
#> $class
#> [1] "data.frame"
#>
#> $names
#> [1] "x"
#>
#> $row.names
#> [1] 1 2 3 4 5
slice(d, 1:2) %>% attributes()
#> $row.names
#> [1] 1 2
#>
#> $class
#> [1] "tbl_df" "tbl" "data.frame"
#>
#> $my_attr
#> [1] "does it persist?"
#>
#> $names
#> [1] "x"
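To check more verbs than these three, a small helper can report what each one drops. This is only a sketch: `lost_by()` is a hypothetical name, not part of dplyr, and the results will depend on the installed dplyr version, so no output is shown.

```r
# Hypothetical helper (not a dplyr function): apply `f` to `d` and report
# which classes and attribute names are missing from the result.
lost_by <- function(d, f, ...) {
  out <- f(d, ...)
  list(
    classes    = setdiff(class(d), class(out)),
    attributes = setdiff(names(attributes(d)), names(attributes(out)))
  )
}

d <- structure(
  data.frame(x = 1:5),
  class = c("my_df", "data.frame"),
  my_attr = "does it persist?"
)

# Only run the dplyr comparisons if dplyr is installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  print(lost_by(d, dplyr::select, x))
  print(lost_by(d, dplyr::mutate, x = x))
  print(lost_by(d, dplyr::slice, 1:2))
}
```

An empty `classes` or `attributes` entry means that verb kept everything of that kind.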
mara
January 29, 2018, 1:08pm
3
Re. attributes: this is a known issue and is being worked on right now. There are a few issues/threads in the dplyr GitHub repo that you can peruse for more detail. And there's ~related discussion in this thread here:
dat1 <- mtcars
dat2 <- mtcars
attr(dat2, "description") <- "# MTCARS\nData description ..."
all.equal(dat1, dat2) ## not TRUE
Returns:
[1] "Attributes: < Names: 1 string mismatch >"
[2] "Attributes: < Length mismatch: comparison on first 2 components >"
[3] "Attributes: < Component 2: Lengths (32, 1) differ (string compare on first 1) >"
[4] "Attributes: < Component 2: 1 string mismatch >"
dat1 <- tibble::as_tibble(mtcars)
dat2 <- tibble::as_tibble(mtcars)
attr(dat2, "description") <- "# MTCARS\nData description ..."
all.equal(dat1, dat2) ## TRUE, ignores attribute
dplyr::all_equal(dat1, dat2) ## TRUE, ignores attribute
Bo…
Re. classes, I don't have a general answer, but the discussion between Hadley and Kiril re. S4 (linked to below) might be useful:
opened 03:46PM - 27 Oct 17 UTC
closed 02:30PM - 11 Mar 20 UTC
vctrs ↗️
Would be great to have formal tested support. Some initial explanations below:
…
```R
library(methods)
# install_github("tidyverse/tibble")
library(tibble)
# Silly minimal vector example --------------------------------------------
# Shouldn't need to contain a vector class, but tibble currently
# checks - need some way to determine if we have a "vector" S4
# class. Maybe just see if it has [ and length methods?
.rando <- setClass("rando", contains = "integer", slots = list(n = "integer"))
rando <- function(n) {
  .rando(n = as.integer(n))
}
setMethod("[", "rando", function(x, i, j, ..., drop = TRUE) {
  # A logical index selects as many elements as it has TRUE values
  if (is.logical(i)) {
    new_n <- sum(i, na.rm = TRUE)
  } else {
    new_n <- length(i)
  }
  rando(new_n)
})
setMethod("length", "rando", function(x) x@n)
setMethod("show", "rando", function(object) {
  print(runif(object@n))
  invisible(object)
})
x <- rando(10)
x
x[1]
length(x)
# Can we put in a tibble? -------------------------------------------------
tb <- tibble(x = x)
tb$x
tb[1:5, ]$x
# doesn't print right
tb
# from colformat code I thought this would be sufficient
# but I'm missing something from a very quick glance
setMethod("as.character", "rando", function(x, ...) {
  format(runif(x@n), digits = 2)
})
tb
```
davis
January 29, 2018, 2:11pm
4
Additional reference for those who want to formally extend `tibble`s. Especially useful will be `sloop::reconstruct()` for retaining custom attributes and classes.
opened 02:00PM - 05 Jul 17 UTC
closed 03:33PM - 23 Feb 23 UTC
documentation
By following the principles defined at http://adv-r.hadley.nz/s3.html#inheritanc… e
* Provide a constructor that can also be used to create subclassed objects
```R
new_tibble <- function(data, ..., subclass = NULL) {
  stopifnot(is.data.frame(data))
  structure(
    data,
    ...,
    class = c(subclass, "tibble", "tbl_df", "data.frame")
  )
}
```
* Provide a reconstruct method
```R
reconstruct.tibble <- function(new, old) {
  new_tibble(new)
}
```
* Call `S3::reconstruct()` in all methods that return a tibble
This supersedes #155, #211, #218.
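For the `my_df` class from the original question, the same idea could look roughly like the following. This is a base R sketch under stated assumptions: the `reconstruct()` generic here is defined locally for illustration and is not the sloop or tibble implementation, and `reconstruct.my_df()` is a hypothetical method.

```r
# Locally defined generic for illustration only; dispatches on the OLD object
# so the original class decides how the result is rebuilt.
reconstruct <- function(new, old) UseMethod("reconstruct", old)

# Fallback: leave the new object untouched.
reconstruct.default <- function(new, old) new

# Hypothetical method for the `my_df` class from the question.
reconstruct.my_df <- function(new, old) {
  # Copy custom attributes, but not the ones that describe the data's shape.
  keep <- setdiff(names(attributes(old)), c("names", "row.names", "class"))
  for (a in keep) attr(new, a) <- attr(old, a)
  class(new) <- class(old)
  new
}

d <- structure(
  data.frame(x = 1:5),
  class = c("my_df", "data.frame"),
  my_attr = "does it persist?"
)

# Simulate a verb that returns a plain data frame, then restore the class.
stripped <- data.frame(x = d$x[1:2])
restored <- reconstruct(stripped, d)

stopifnot(inherits(restored, "my_df"))
stopifnot(identical(attr(restored, "my_attr"), "does it persist?"))
```

Dispatching on `old` rather than `new` is the design choice that matters here: by the time a verb has stripped the class, `new` no longer carries enough information to pick the right method.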