There are two kinds of "ordering" going on with the factor function. Let's start with this example:
temp_vec <- c("High", "Low", "High", "Low", "Medium")
temp_vec <- factor(temp_vec, levels = c("Low", "Medium", "High"))
sort(temp_vec)
[1] Low    Low    Medium High   High
Levels: Low Medium High
max(temp_vec)
Error in Summary.factor(c(3L, 1L, 3L, 1L, 2L), na.rm = FALSE) :
  ‘max’ not meaningful for factors
temp_vec is now a factor (that is, its class is "factor"), and its levels have the order you gave them with the levels argument of the factor function. If you sort temp_vec, it will be sorted in the order of the levels, so you can use factor to set a sorting order that differs from alphabetical. And if you fit a regression model using temp_vec, the first level will be treated as the reference level.
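To see both effects in one place, here is a minimal sketch. The name f is just a stand-in for the factor built above, and the response values in y are made up purely for illustration.

```r
# Same data as above; `f` is a stand-in name used only for this illustration
f <- factor(c("High", "Low", "High", "Low", "Medium"),
            levels = c("Low", "Medium", "High"))

# Sorting follows the level order, not alphabetical order
as.character(sort(f))

# Made-up response values, only so there is something to regress on
y <- c(10, 2, 11, 3, 6)
fit <- lm(y ~ f)

# "Low" (the first level) is the reference level and is absorbed into the
# intercept; the other coefficients are differences from "Low"
names(coef(fit))

# relevel() moves a different level to the front, making it the reference
f2 <- relevel(f, ref = "High")
levels(f2)
```

If you only want a different reference level for a model, relevel is usually simpler than retyping the whole levels vector.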
But note in the example above that no level is treated as less than or greater than another. That is, temp_vec is not an "ordinal" variable: as far as R is concerned, its three categories have no ordering in terms of "magnitude". With factor we've just changed the order for sorting purposes.
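A quick way to check this, as a sketch (again with f as a stand-in name): an unordered factor reports FALSE from is.ordered, and comparison operators give NA with a warning rather than a meaningful answer.

```r
# `f` is a stand-in name used only for this illustration
f <- factor(c("High", "Low", "High", "Low", "Medium"),
            levels = c("Low", "Medium", "High"))

# FALSE: the level order is only a display and sorting order
is.ordered(f)

# Comparisons are not defined for unordered factors: R warns and returns NA
cmp <- suppressWarnings(f < "High")
all(is.na(cmp))

# The integer codes follow the level order, but they are labels, not magnitudes
as.integer(f)
```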
To give temp_vec an order in terms of magnitude, we turn it into an ordered factor. Either of these will work (note that the argument is named ordered):
temp_vec <- factor(temp_vec, levels = c("Low", "Medium", "High"), ordered = TRUE)
temp_vec <- ordered(temp_vec, levels = c("Low", "Medium", "High"))
class(temp_vec)
[1] "ordered" "factor"
temp_vec
[1] High   Low    High   Low    Medium
Levels: Low < Medium < High
Note how the levels now have a magnitude order, with "Low" less than "Medium" and "Medium" less than "High". An ordered factor differs from an unordered one because R's modeling functions (such as glm) now take that order into account: by default they code an ordered factor with polynomial contrasts instead of comparing each level to a reference level.
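As a rough sketch of what changes in practice (with o as a stand-in name and made-up response values y): comparisons and max now work, and under R's default contrasts option an ordered factor is coded with contr.poly, which shows up in the coefficient names.

```r
# `o` is a stand-in name used only for this illustration
o <- factor(c("High", "Low", "High", "Low", "Medium"),
            levels = c("Low", "Medium", "High"), ordered = TRUE)

# Element-wise comparison against a level is now meaningful
o < "High"

# max() no longer errors; it returns the highest level present
as.character(max(o))

# Assumes the default setting:
# options(contrasts = c("contr.treatment", "contr.poly"))
# The contrast columns are the linear (.L) and quadratic (.Q) terms
colnames(contrasts(o))

# Made-up response values, only so there is something to regress on
y <- c(10, 2, 11, 3, 6)
names(coef(lm(y ~ o)))
```

If you want an ordered factor for comparisons but treatment-style coefficients in a model, one option is to change the contrasts setting, or to pass an unordered copy of the variable to the model.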