Will rownames on tibbles become impossible in the future?

I often use non-tidyverse functions that work better with rownames, such as PCA or distance matrices. It's not impossible to avoid using rownames, but it's often handier (or I have just gotten into the habit :slight_smile: ).

Now, when I wrangle parts of that data with tidy functions I get constantly reminded that rownames on tibbles are deprecated. I know and fully understand that:

*While a tibble can have row names (e.g., when converting from a regular data frame), they are removed when subsetting with the [ operator. A warning will be raised when attempting to assign non-NULL row names to a tibble. Generally, it is best to avoid row names, because they are basically a character column with different semantics to every other column.*

This isn't a problem for me, yet. However, if in future tibbles entirely refuse to have rownames at all and just drop them, that would be a problem. Is this a likely future?

Well, the future is a potentially long time, and by design tibbles are column-centric and always will be.

That said, you can still explicitly convert row names to a column with rownames_to_column().
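For what it's worth, here is a minimal sketch of that conversion. It assumes the built-in mtcars data as a stand-in for your own data frame, and the column name "car" is just an arbitrary choice for illustration.

```r
library(tibble)

# mtcars ships with R and stores the car names as row names
df <- head(mtcars[, c("mpg", "cyl")], 3)

# Move the row names into an ordinary column ("car" is a made-up name),
# then convert to a tibble; nothing is lost in the conversion
tb <- as_tibble(rownames_to_column(df, var = "car"))
```

After this, tb carries the car names in a regular character column, so they survive subsetting and any other tidy wrangling.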

But if that goes away, just create a vector of your rownames, enframe() it, bind_cols() it onto the tibble, then use

my_tib <- my_tib %>% select(my_rownames, everything())
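Spelled out, that fallback might look something like the sketch below. It assumes dplyr is loaded for bind_cols() and select(), uses mtcars as placeholder data, and keeps my_rownames as the column name from the line above.

```r
library(tibble)
library(dplyr)

df <- head(mtcars[, c("mpg", "cyl")], 3)

# Capture the row names as a one-column tibble; name = NULL skips the index column
rn <- enframe(rownames(df), name = NULL, value = "my_rownames")

# as_tibble() drops row names by default, so they now live only in rn
my_tib <- bind_cols(as_tibble(df), rn)

# Move the former row names to the front
my_tib <- my_tib %>% select(my_rownames, everything())
```

This relies on the row order being unchanged between capturing the names and binding them back, so it is safest to do both steps before any filtering or sorting.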

From the tibble 2.0.1 release notes/blog post:

Row names

Row name handling is stricter. Row names were never supported in tibble() and new_tibble(), and are now stripped by default in as_tibble(). The rownames argument to as_tibble() supports:

  • NULL : remove row names (default),
  • NA : keep row names,
  • A string: the name of the new column that will contain the existing row names, which are no longer present in the result.

The old default can be restored by calling pkgconfig::set_config("tibble::rownames", NA); this also works for packages that import tibble.
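To make the three options concrete, here is a small sketch, assuming tibble 2.0 or later and using mtcars as example data (the column name "car" is arbitrary):

```r
library(tibble)

df <- head(mtcars[, c("mpg", "cyl")], 3)

# Default (rownames = NULL): row names are removed
stripped <- as_tibble(df)

# rownames = NA: row names are kept on the tibble
kept <- as_tibble(df, rownames = NA)

# A string: row names move into a new column of that name
as_col <- as_tibble(df, rownames = "car")

# has_rownames() reports which of the results still carry row names
has_rownames(stripped)
has_rownames(kept)
has_rownames(as_col)
```

Of the three, the string form is probably the most robust for a mixed workflow, since the names then behave like any other column until you need them back as row names.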

There is also some more detail in the NEWS file, in the 2.0.0 breaking changes section.

My interpretation of the above is that there are workarounds for dealing with rownames in tibbles, but that rownames are sort of antithetical to the design of tibbles. I can't speak for the future in its entirety, but hopefully this info will give you some sense of how to keep your workflow robust.


Thank you Mara that's good to know and a very helpful response!

I understand how the tidy philosophy doesn't make sense with rownames. All the same, I do hope that column_to_rownames() will still be maintained. For certain kinds of analysis, especially distance matrices, rownames really do make sense.
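As an illustration of the kind of round trip I mean, here is a rough sketch with made-up data (the sample, x and y columns are invented for the example): wrangle as a tibble, then hand the row names back just before the distance calculation.

```r
library(tibble)

# Toy data: an identifier column plus two numeric measurements
tb <- tibble(
  sample = c("a", "b", "c"),
  x = c(0, 3, 0),
  y = c(0, 4, 1)
)

# column_to_rownames() returns a plain data frame with "sample" as row names
mat <- as.matrix(column_to_rownames(tb, var = "sample"))

# dist() picks up the row names as labels for the distance matrix
d <- dist(mat)
```

The labels on d then come straight from the sample column, which is exactly what makes rownames handy for this kind of output.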

Thanks.

That's kind of what I'm already doing. I just wanted to check that rownames_to_column() and column_to_rownames() won't go away sometime in the future. I'm worried that rownames on tibbles won't just be discouraged but strictly prohibited. I know that as_tibble() strips them, but just running a df through any random tidyverse function (like select()) and getting a tibble output currently doesn't strip them off. Still, the constant message I get about them being deprecated sort of makes it seem like they might soon start always stripping them off :frowning:
