Working with categorical data in R without losing your mind - Amelia McNamara - rstudio::conf(2019L) Video


Categorical data, called “factor” data in R, presents unique challenges in data wrangling. R users often look down on tools like Excel for automatically coercing variables to incorrect data types, but factor data in R can produce very similar issues. The stringsAsFactors=HELLNO movement and standard tidyverse defaults have moved us away from the use of factors, but factors are sometimes still necessary for analysis. This talk will outline common problems arising from categorical variable transformations in R and show strategies to avoid them, using both base R and the tidyverse (particularly dplyr and forcats functions).

