# Trying to understand the factor function and ordering

``````
# Temperature
temperature_vector <- c("High", "Low", "High","Low", "Medium")
factor_temperature_vector <- factor(temperature_vector, order = TRUE, levels = c("Low", "Medium", "High"))
factor_temperature_vector
``````

I looked up the `factor` function on RDocumentation and I'm confused: I don't see an `order` argument there. I only see an `ordered` argument.

There are two kinds of "ordering" going on with the `factor` function. Let's start with this example:

``````
temp_vec <- c("High", "Low", "High", "Low", "Medium")
temp_vec <- factor(temp_vec, levels = c("Low", "Medium", "High"))

class(temp_vec)
#> [1] "factor"

sort(temp_vec)
#> [1] Low    Low    Medium High   High
#> Levels: Low Medium High

max(temp_vec)
#> Error in Summary.factor(c(3L, 1L, 3L, 1L, 2L), na.rm = FALSE) :
#>   ‘max’ not meaningful for factors
``````

`temp_vec` is now a `factor` (that is, its class is "factor"). Also, its levels have the order you gave it with the `levels` argument of the `factor` function. If you sort `temp_vec` it will be sorted in the order of the levels (so you can use `factor` to set a sorting order that is different from alphabetical). And if you create a regression model using `temp_vec`, the first level will be treated as the reference level.
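Here is a minimal sketch of both effects, sorting and the reference level. The data and the names `x`, `y`, `alpha_f`, `custom_f`, and `fit` are made up for illustration, and the model assumes R's default contrast settings.

```r
# Hypothetical example data; all names here are for illustration only.
x <- c("High", "Low", "High", "Low", "Medium")
y <- c(30, 10, 32, 12, 20)  # made-up response values

# Default: levels are taken in sorted (alphabetical) order.
alpha_f <- factor(x)
levels(alpha_f)

# Custom level order set with the `levels` argument.
custom_f <- factor(x, levels = c("Low", "Medium", "High"))
levels(custom_f)

# sort() follows the level order, not the alphabet.
as.character(sort(alpha_f))
as.character(sort(custom_f))

# In a model, the first level ("Low") is the reference level, so it
# gets no coefficient of its own (assuming default treatment contrasts).
fit <- lm(y ~ custom_f)
names(coef(fit))
```

The coefficient names should include `custom_fMedium` and `custom_fHigh` but nothing for "Low", because "Low" is absorbed into the intercept.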

But note in the example above that R does not treat any level as less than or greater than another level. That is, `temp_vec` is not an "ordinal" variable. As far as R is concerned, it has three categories with no ordering in terms of "magnitude". With `factor` we've just changed the order for sorting purposes.
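To see what "no magnitude order" means in practice, here is a small sketch. The name `unordered_f` is made up for illustration.

```r
# Hypothetical example data, named for illustration only.
unordered_f <- factor(c("High", "Low", "Medium"),
                      levels = c("Low", "Medium", "High"))

is.ordered(unordered_f)

# Comparisons such as `<` are not defined for unordered factors:
# R returns NA (with a warning) instead of comparing level positions.
cmp <- suppressWarnings(unordered_f < "Medium")
cmp

# Equality tests still work, since they need no ordering.
unordered_f == "Low"
```

Without `suppressWarnings()`, the comparison also emits a warning that `<` is not meaningful for factors.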

To give `temp_vec` an order in terms of magnitude, we turn it into an ordered factor. Either of these will work (note that the argument is named `ordered`; there is no `order` argument, and `order = TRUE` in your code only works because R partially matches it to `ordered`):

``````
# Either line alone is enough
temp_vec <- factor(temp_vec, levels = c("Low", "Medium", "High"), ordered = TRUE)
temp_vec <- ordered(temp_vec, levels = c("Low", "Medium", "High"))

class(temp_vec)
#> [1] "ordered" "factor"

max(temp_vec)
#> [1] High
#> Levels: Low < Medium < High
``````
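If you want to check for yourself that `order = TRUE` and `ordered = TRUE` produce the same result, here is a rough sketch. The names `x`, `lv`, `via_prefix`, and `via_full` are made up for illustration.

```r
# Hypothetical example data; names are for illustration only.
x  <- c("High", "Low", "High", "Low", "Medium")
lv <- c("Low", "Medium", "High")

# `order` is not a formal argument of factor(), but it is a unique
# prefix of `ordered`, so R's partial argument matching accepts it.
via_prefix <- factor(x, order = TRUE, levels = lv)
via_full   <- factor(x, ordered = TRUE, levels = lv)
identical(via_prefix, via_full)

# The formal argument names of factor(), for reference.
names(formals(factor))

# Once the factor is ordered, magnitude comparisons are defined.
via_full < "High"
```

Spelling out `ordered` in full is probably the safer habit, since partial matching stops working as soon as a prefix becomes ambiguous.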

Note how the levels now have a magnitude order with "Low" less than "Medium" and "Medium" less than "High". An ordered factor is different from a non-ordered factor, because R's modeling functions (such as `lm` or `glm`) now take the order into account: by default they code an ordered factor with polynomial contrasts (`contr.poly`) instead of the treatment contrasts used for non-ordered factors.
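One way to see the difference for modeling is to compare the contrast matrices R would use. This is a sketch assuming the default `options("contrasts")` have not been changed; the names `unordered_f` and `ordered_f` are made up for illustration.

```r
# Hypothetical example data; names are for illustration only.
lv <- c("Low", "Medium", "High")
x  <- c("High", "Low", "High", "Low", "Medium")

unordered_f <- factor(x, levels = lv)
ordered_f   <- factor(x, levels = lv, ordered = TRUE)

# The contrast functions R uses by default for each kind of factor.
getOption("contrasts")

# Unordered: one dummy column per non-reference level.
colnames(contrasts(unordered_f))

# Ordered: linear (.L) and quadratic (.Q) polynomial terms.
colnames(contrasts(ordered_f))
```

So a model with `ordered_f` as a predictor reports `.L` and `.Q` terms (trend across the levels) rather than one coefficient per level compared with "Low".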
