Dear Folks--
I have been using tibbles for a while now, and I have not found any situation where they are not at least as useful as data frames. Well, maybe if I happen to have an 11- to 19-row object, but that is a minor case. I have pretty much stopped using plain data frames. And I'm happy about that.
Lately I have been reading the documentation for the vctrs package, as suggested in the "OO Field Guide" chapter of Advanced R, 2nd edition. I cannot claim to have fully understood it yet, but consistency in casting and coercion seems like an unconditional advantage, the only obvious problem being that replacing vectors with vctrs throughout will break some existing code.
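To make the casting and coercion point concrete, here is a minimal sketch of the difference as I understand it, assuming the vctrs package is installed. The functions vec_c() and vec_cast() come from vctrs; the variable names are only for illustration.

```r
library(vctrs)

# Base R silently coerces a number and a string to character.
base_mixed <- c(1, "a")
print(base_mixed)

# vctrs refuses the same combination and raises an error instead.
vctrs_mixed <- tryCatch(vec_c(1, "a"), error = function(e) "error")
print(vctrs_mixed)

# Integer and double still combine, to the richer type (double).
combined <- vec_c(1L, 2.5)
print(class(combined))

# Casting is strict too: a lossless cast works, a lossy one errors.
ok_cast <- vec_cast(2, integer())
lossy_cast <- tryCatch(vec_cast(2.5, integer()), error = function(e) "error")
```

Note that combined comes back as an ordinary double vector, with no new class attached.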
But it is not clear to me that vctrs are intended to be superior general replacements for vectors in the sense that tibbles are a superior replacement for data frames. Hadley implies in several places that they are useful primarily to package developers who are creating new S3 classes. On the other hand, at one point he described them as "a type system for the tidyverse," which suggests to me that vector objects created within the tidyverse, such as the vectors inside of tibbles or returned by dplyr, are going to be vctrs by default. I don't believe that this has happened yet -- the major tidyverse packages do not seem to import or depend on vctrs, unless they do so only indirectly, and few other packages depend on it -- contrast with tibbles. But it seems like that is the goal. Does anyone know if that is right?
I am writing a package that creates some pretty large S3 objects, with millions of rows and hundreds of columns, and I am trying to figure out if there is any advantage or disadvantage to changing all of those columns into vctrs by default. Here are some things I don't know. I don't know if there are performance issues with vctrs. I don't know if they are still changing rapidly, so that relying on them will cause my code to break if I don't keep up. I don't know if existing packages that do complicated things to base R vectors (e.g. survey) will always or almost always work if handed vctrs instead. I don't feel any need for vctrs. But I didn't feel any need for tibbles either, and now I would hate to be without them.
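On the performance question, the kind of check I have in mind is something like the rough sketch below, which assumes vctrs is installed and only times one operation (combining two long vectors). The size n and the variable names are my own choices for illustration, and a single system.time() call is of course a crude measure.

```r
library(vctrs)

n <- 1e6
x <- runif(n)        # a long double vector
y <- sample.int(n)   # a long integer vector

# Time base combining versus vctrs combining on the same inputs.
base_time  <- system.time(base_out  <- c(x, y))
vctrs_time <- system.time(vctrs_out <- vec_c(x, y))

print(base_time)
print(vctrs_time)

# Both routes should produce a double vector of the same length.
stopifnot(is.double(base_out), is.double(vctrs_out))
stopifnot(length(base_out) == length(vctrs_out))
```

A proper comparison would repeat the timing many times and cover the operations the package actually performs, not just combining.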
Do folks have thoughts on this?
Oh, one last point of confusion: In AR 2nd, Hadley says that he used to think prototype-based programming was a good direction for R, and now he thinks it is not. But in the vctrs documentation, he repeatedly refers to the advantages of vctrs over base R vectors as prototypes. So has he changed his mind back? Or decided they are useful only in this special case, or with this infrastructural support? I am guessing that there is some way to read these things consistently given that Hadley is working on both of them at the same time, but I don't know what it is. Can anybody else see it?
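For reference, this is what the vctrs documentation seems to mean by a prototype, as far as I can tell: a zero-length vector that stands for a type. The sketch below assumes vctrs is installed; vec_ptype(), vec_ptype_common() and vec_cast() are vctrs functions, and the variable names are just illustrative.

```r
library(vctrs)

# The prototype of a vector is a zero-length vector of the same type.
int_ptype <- vec_ptype(1:10)
print(int_ptype)

# The common prototype of an integer and a double is a zero-length double.
common_ptype <- vec_ptype_common(1L, 2.5)
print(common_ptype)

# A prototype can serve as the target of a cast.
as_double <- vec_cast(1:3, common_ptype)
print(as_double)
```

Whether this use of the word has anything to do with prototype-based programming in the AR sense is exactly what I can't tell.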