Using `sparse.model.matrix` from the `Matrix` package, you can get dummy variables (now more trendily called one-hot encoding) for factor or factor-like columns of a data frame.

I found some useful commentary on Stack Exchange:

When you have "K" dummy variables then your resulting model will have a.) the intercept term (which is a column of ones) and b.) "K-1" additional columns. The reason is because otherwise the columns of the resulting matrix would not be linearly independent (and, as a result, you wouldn't be able to do OLS). – Steve S Oct 1 '15 at 5:34

Skip a few, then:

@SteveS: In fact R's so friendly that if you try remove the intercept `-1` when you have a single categorical predictor represented as a factor (as in this question), it'll assume you don't really mean that & switch to using sum-to-zero coding; which is of course just a different parametrization. Too friendly, if you ask me. – Scortchi♦ Oct 1 '15 at 8:56
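The linear-dependence point in the first comment can be checked with a small sketch. The factor `f` below is a hypothetical toy example, not data from this question; it uses base `model.matrix` and `qr` to compare column count with rank.

```r
# Hypothetical toy factor with three levels
f <- factor(c("a", "b", "c", "a", "b", "c"))

# Full set of indicator (one-hot) columns, no intercept
onehot <- model.matrix(~ f - 1)

# Adding an all-ones column makes the columns linearly dependent:
# the three indicator columns sum to the intercept column
with_intercept <- cbind(Intercept = 1, onehot)

qr(onehot)$rank          # 3 columns, rank 3
qr(with_intercept)$rank  # 4 columns, rank still 3
```

That rank deficiency only matters for fitting a model such as OLS; it is not a problem if the matrix is just used as an encoding.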

My purpose is not regression, and I want to get the full set of dummy variables without the inserted intercept column. Can anyone tell me how to do that with `sparse.model.matrix`?

Here's a `reprex` illustrating the two options. `Da` is missing from the final example.

```
library(Matrix)
library(magrittr)
# Two numeric variables
# Two factor-like variables, three levels each (k = 6 levels in total)
df <- data.frame(
  stringsAsFactors = FALSE,
  A = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  B = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  C = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  D = c("a", "b", "c", "a", "b", "c", "a", "b", "c")
)
str(df)
#> 'data.frame': 9 obs. of 4 variables:
#> $ A: num 1 1 1 2 2 2 3 3 3
#> $ B: num 1 2 3 1 2 3 1 2 3
#> $ C: chr "a" "a" "a" "b" ...
#> $ D: chr "a" "b" "c" "a" ...
# All-ones intercept column inserted
# k-2 = 4 dummy variables (Ca and Da are dropped)
df %>% sparse.model.matrix(~., .)
#> 9 x 7 sparse Matrix of class "dgCMatrix"
#> (Intercept) A B Cb Cc Db Dc
#> 1 1 1 1 . . . .
#> 2 1 1 2 . . 1 .
#> 3 1 1 3 . . . 1
#> 4 1 2 1 1 . . .
#> 5 1 2 2 1 . 1 .
#> 6 1 2 3 1 . . 1
#> 7 1 3 1 . 1 . .
#> 8 1 3 2 . 1 1 .
#> 9 1 3 3 . 1 . 1
# No intercept column inserted
# k-1 = 5 dummy variables (Ca is kept, Da is still dropped)
df %>% sparse.model.matrix(~.-1, .)
#> 9 x 7 sparse Matrix of class "dgCMatrix"
#> A B Ca Cb Cc Db Dc
#> 1 1 1 1 . . . .
#> 2 1 2 1 . . 1 .
#> 3 1 3 1 . . . 1
#> 4 2 1 . 1 . . .
#> 5 2 2 . 1 . 1 .
#> 6 2 3 . 1 . . 1
#> 7 3 1 . . 1 . .
#> 8 3 2 . . 1 1 .
#> 9 3 3 . . 1 . 1
```

Created on 2019-01-15 by the reprex package (v0.2.1)
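A possible approach, sketched under the assumption that the `contrasts.arg` argument of `sparse.model.matrix` accepts full indicator matrices the same way base `model.matrix` does: build each factor's contrast matrix with `contrasts(x, contrasts = FALSE)`, which keeps one column per level, and pass the resulting list along with the `-1` formula. The names `chr_cols` and `full_contrasts` are only illustrative.

```r
library(Matrix)

# Same toy data as in the reprex
df <- data.frame(
  stringsAsFactors = FALSE,
  A = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  B = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  C = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  D = c("a", "b", "c", "a", "b", "c", "a", "b", "c")
)

# contrasts() only accepts factors, so convert the character columns first
chr_cols <- vapply(df, is.character, logical(1))
df[chr_cols] <- lapply(df[chr_cols], factor)

# For each factor, ask for the full indicator matrix (one column per level)
# instead of the default treatment contrasts, which drop the first level
full_contrasts <- lapply(df[chr_cols], contrasts, contrasts = FALSE)

# Drop the intercept and supply the full-rank-per-factor contrasts
X <- sparse.model.matrix(~ . - 1, data = df, contrasts.arg = full_contrasts)
colnames(X)
```

If this behaves as expected, `colnames(X)` should list every level of both `C` and `D`, including `Da`. The resulting columns are linearly dependent across factors, which is fine for encoding but not for an unregularised regression.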