Using `vctrs` with subclasses: attribute handling?

We're in the process of updating an existing package to work with vctrs. This package implements a new data class and was written using pre-vctrs conventions; many of these worked with tidyverse tools until recent releases, and we're trying to understand which features will and won't work moving forward.

We're struggling with a few points, which we think are related to the use of subclasses -- although they might come from somewhere else.

The code below introduces a new class (tfd). The underlying data is a list of numeric vectors, and there's an arg attribute. When two tfd vectors are concatenated, the result is a list of all the vectors, and the arg attribute of the output vector combines the arg attributes of the input vectors.

library(vctrs)
library(purrr)

new_tfd <- function(x = list(), arg = numeric()) {
  
  new_vctr(
    x, 
    arg = arg,
    class = "tfd")
  
}

tfd <- function(x = list(), arg = numeric()) {
  new_tfd(x, arg)
}

vec_cast.tfd.tfd <- function(x, to, ...) { x }

vec_ptype2.tfd <- function(x, y, ...) UseMethod("vec_ptype2.tfd")

vec_ptype2.tfd.tfd = function(x, y, ...) {
  
  new_list = list(x, y)
  ret = new_tfd(flatten(new_list), arg = c(attr(x, "arg"), attr(y, "arg")))
  ret
  
}

x = tfd(rerun(3, rnorm(7)), arg = seq(0, 1, l = 7))
y = tfd(list(rnorm(7)), arg = seq(0, 1, l = 5))

c(x, y)
#> <tfd[4]>
#> [1] 0.05250473, 0.40196569, 0.62787204, -1.38274215, -0.47712215, 0.03787450, -0.89042175  
#> [2] -0.1864547, -1.4595283, 3.0452626, -1.7587642, -0.2601292, -0.9577118, 0.1502292       
#> [3] -1.41784738, 0.03580967, -1.42221662, -0.02058816, -0.92669073, -1.46536514, 1.53163060
#> [4] -1.1728330, 1.0751828, -1.3334823, 1.5398515, 2.3656481, 0.3532686, 0.2648951

attr(x, "arg")
#> [1] 0.0000000 0.1666667 0.3333333 0.5000000 0.6666667 0.8333333 1.0000000
attr(y, "arg")
#> [1] 0.00 0.25 0.50 0.75 1.00

attr(c(x, y), "arg")
#>  [1] 0.0000000 0.1666667 0.3333333 0.5000000 0.6666667 0.8333333 1.0000000
#>  [8] 0.0000000 0.2500000 0.5000000 0.7500000 1.0000000
attr(c(y, x), "arg")
#>  [1] 0.0000000 0.2500000 0.5000000 0.7500000 1.0000000 0.0000000 0.1666667
#>  [8] 0.3333333 0.5000000 0.6666667 0.8333333 1.0000000

Created on 2020-05-03 by the reprex package (v0.3.0)

This code "works", even though the arg attribute isn't the same for the inputs. (The arg attribute of the output depends on the order of the inputs in this example -- that's not what we do in the package, but that's a bit beside the point for now.)

If you update the new_tfd() function so that returned objects have class = c("tfd_reg", "tfd") but don't change anything else, you get an error when concatenating inputs with different args, but what you'd expect when they have the same args:

library(vctrs)
library(purrr)

new_tfd <- function(x = list(), arg = numeric()) {
  
  new_vctr(
    x, 
    arg = arg,
    class = c("tfd_reg", "tfd"))
  
}

tfd <- function(x = list(), arg = numeric()) {
  new_tfd(x, arg)
}

vec_cast.tfd.tfd <- function(x, to, ...) { x }

vec_ptype2.tfd <- function(x, y, ...) UseMethod("vec_ptype2.tfd")

vec_ptype2.tfd.tfd = function(x, y, ...) {
  
  new_list = list(x, y)
  ret = new_tfd(flatten(new_list), arg = c(attr(x, "arg"), attr(y, "arg")))
  ret
  
}

x = tfd(rerun(3, rnorm(7)), arg = seq(0, 1, l = 7))
y = tfd(list(rnorm(7)), arg = seq(0, 1, l = 5))

c(x, x)
#> <tfd_reg[6]>
#> [1] -1.3367518, 0.7205532, 0.9525022, -1.7270019, -1.1147538, -0.8703902, 0.2954438 
#> [2] -0.2231775, -1.1659598, 1.2668644, -0.3892944, -0.7048431, -0.4917158, 0.6738741
#> [3] -1.1566465, 0.5580849, 1.0650625, -0.3217491, -1.2022741, 0.2559766, -1.9605327 
#> [4] -1.3367518, 0.7205532, 0.9525022, -1.7270019, -1.1147538, -0.8703902, 0.2954438 
#> [5] -0.2231775, -1.1659598, 1.2668644, -0.3892944, -0.7048431, -0.4917158, 0.6738741
#> [6] -1.1566465, 0.5580849, 1.0650625, -0.3217491, -1.2022741, 0.2559766, -1.9605327
c(x, y)
#> Error: Can't combine `..1` <tfd_reg> and `..2` <tfd_reg>.
#> x Some attributes are incompatible.
#> ℹ The author of the class should implement vctrs methods.
#> ℹ See <https://vctrs.r-lib.org/reference/faq-error-incompatible-attributes.html>.

Created on 2020-05-03 by the reprex package (v0.3.0)

Ideally, we'd like to be able to concatenate inputs with different args, using a concatenation function that combines data and attributes in a reasonable way, but we don't understand vctrs well enough to do that.

A brief update -- it seems that these examples differ based on the handling of classes. In the second example, c(x,x) (eventually) uses vec_default_ptype2() successfully while c(x,y) stops at vctrs:::is_same_type() in that same function. This isn't exactly what I'd expected, but maybe double dispatch isn't so intuitive yet. And if that's not a correct diagnosis, any input will be helpful.
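For what it's worth, here is a minimal sketch of one possible way around this. It assumes a vctrs version (0.3.0 or later) where vec_ptype2() and vec_cast() methods are looked up on the first class only and are not inherited, so the methods are named for tfd_reg rather than tfd. The rule for combining arg (a sorted union) is only a placeholder assumption, not what the package does, and this has not been checked against the package itself.

```r
library(vctrs)

new_tfd <- function(x = list(), arg = numeric()) {
  new_vctr(x, arg = arg, class = c("tfd_reg", "tfd"))
}

# Methods are keyed on the first class ("tfd_reg"); the tfd.tfd methods
# from the examples above are not picked up for the subclass.
vec_ptype2.tfd_reg.tfd_reg <- function(x, y, ...) {
  # Return an empty prototype that carries only the combined attribute.
  # Placeholder rule (assumption): sorted union of both arg vectors.
  new_tfd(list(), arg = sort(union(attr(x, "arg"), attr(y, "arg"))))
}

vec_cast.tfd_reg.tfd_reg <- function(x, to, ...) {
  # Keep the data of x, take the arg attribute from the target prototype.
  new_tfd(vec_data(x), arg = attr(to, "arg"))
}

x <- new_tfd(list(rnorm(7), rnorm(7)), arg = seq(0, 1, length.out = 7))
y <- new_tfd(list(rnorm(5)), arg = seq(0, 1, length.out = 5))

xy <- vec_c(x, y)
attr(xy, "arg")
```

In a package these methods would also need to be registered as S3 methods (for example with roxygen2 export tags); defining them at top level as above is only meant for interactive experimentation.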

This has raised another question: what does vctrs do behind the scenes when concatenating objects? Internally, do the arguments passed to vec_ptype2.tfd.tfd contain any data? Or is the data stripped (maybe using vec_slice(x, 0)), so that this function acts only on the attributes and returns the desired prototype? It seems that calling vec_ptype2.tfd.tfd(x, y) can produce something different than c(x, y), especially if you intend concatenation to perform some operation on the data -- I'm guessing that's where vec_cast() comes into play, although again the internal sequence of calls vctrs uses isn't clear to me.
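One way to probe this empirically, rather than guess at the internals, is to have the methods report the sizes of what they receive, and then run the common-type and cast steps by hand. This is only a sketch under the same assumptions as before (methods named for the first class, tfd_reg, and a placeholder union rule for arg). It shows one plausible decomposition of concatenation using the exported functions vec_ptype_common(), vec_cast_common(), and vec_c(); it is not a statement of what vctrs does internally.

```r
library(vctrs)

new_tfd <- function(x = list(), arg = numeric()) {
  new_vctr(x, arg = arg, class = c("tfd_reg", "tfd"))
}

vec_ptype2.tfd_reg.tfd_reg <- function(x, y, ...) {
  # Report whether the inputs still carry data when the method is called.
  message("vec_ptype2 sizes: x = ", vec_size(x), ", y = ", vec_size(y))
  new_tfd(list(), arg = sort(union(attr(x, "arg"), attr(y, "arg"))))
}

vec_cast.tfd_reg.tfd_reg <- function(x, to, ...) {
  message("vec_cast sizes: x = ", vec_size(x), ", to = ", vec_size(to))
  new_tfd(vec_data(x), arg = attr(to, "arg"))
}

x <- new_tfd(list(rnorm(7), rnorm(7)), arg = seq(0, 1, length.out = 7))
y <- new_tfd(list(rnorm(5)), arg = seq(0, 1, length.out = 5))

# Step 1: find the common prototype (attributes only, no data).
ptype <- vec_ptype_common(x, y)

# Step 2: cast every input to that prototype (data is carried over here).
casted <- vec_cast_common(x, y, .to = ptype)

# Both steps in one call, as c() / vec_c() would do.
out <- vec_c(x, y)
```

The messages printed while this runs show directly whether the ptype2 method sees full vectors or empty prototypes in the installed vctrs version. Either way, any operation on the data itself would belong in the cast method, since the prototype returned by the ptype2 method carries no data.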
