Hello,
I understand that, in R, the categorical variables in a dataset can and should be converted to factors using the factor() function. Once converted, the variable is stored as a set of integer codes plus a list of level labels, a more efficient data structure that R can work with when performing statistical analysis and creating graphs.
In general, R aside, for categorical variables to be used in a regression or ML model, they must first be converted into dummy variables, following the rule that a variable with N levels needs N-1 dummy variables (columns of 1s and 0s), with the remaining level acting as the baseline.
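To make concrete what I mean by the N-1 rule, here is a minimal sketch in plain Python. The function name dummy_encode and the choice of the alphabetically first level as the baseline are my own assumptions for illustration, not any library's API.

```python
def dummy_encode(values, baseline=None):
    """Return (column_names, rows) with one 0/1 column per non-baseline level.

    Hypothetical helper for illustration only.
    """
    levels = sorted(set(values))  # the N distinct levels
    if baseline is None:
        baseline = levels[0]  # assumption: first level in sorted order is the reference
    if baseline not in levels:
        raise ValueError("baseline must be one of the observed levels")
    kept = [lv for lv in levels if lv != baseline]  # the N-1 levels that get a column
    rows = [[1 if v == lv else 0 for lv in kept] for v in values]
    return kept, rows


colors = ["red", "green", "blue", "green"]
columns, rows = dummy_encode(colors)

print(columns)
for value, row in zip(colors, rows):
    print(value, row)
```

With three levels this gives two columns, and the baseline level shows up as a row of all zeros.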
Does this also apply to factors in R, i.e. do factors need to be converted to dummy variables before they are used in a statistical model? For example, as far as I know, Python has no built-in factor type, so there is no choice but to convert categorical variables into dummy variables. Factors in R, however, seem to be a better version of a regular string categorical variable in Python.
Thank you!