Hello, first I would like to apologize if I am asking this in the wrong category or if this question has been posted before. I'm a college student and we need to do our econometrics term paper using R. However, our professor never taught us how to run a regression with categorical variables...

The trouble I am having is that, in order to avoid perfect multicollinearity, you need to include only n-1 dummy variables for n categories. How would I do this? For example, I have 5 categories and I want R to include only 4 in the regression and use the excluded one as the base group. This is how my data is set up.

I am analyzing the impact of the height of NBA players on their salary while controlling for position. I want shooting guard (SG) to be my reference group.
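To make the goal concrete, here is a minimal sketch of the n-1 idea with SG as the base group. It assumes position is stored as a single column of labels rather than as separate 0/1 columns; the vector `pos` and its values are made up for illustration and are not my real data.

```r
# Hypothetical toy labels (not the real data): one entry per player
pos <- c("PG", "SG", "SF", "PF", "C", "SG", "PG")

# Turn the labels into a factor and make SG the first (reference) level
Position <- relevel(factor(pos), ref = "SG")
levels(Position)   # SG is listed first, so it is the base group

# Build the design matrix R would use in a regression:
# an intercept plus one dummy column for each non-reference level
X <- model.matrix(~ Position)
colnames(X)        # no column for SG; it is absorbed into the intercept
```

With 5 categories this gives 4 dummy columns plus the intercept, which is the n-1 setup described above.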

So my regression formula is:

Reg2 <- lm(SALARY ~ Height + PG + PF + SF + Center)
summary(Reg2)

Call:
lm(formula = SALARY ~ Height + PG + PF + SF + Center)

Residuals:
     Min       1Q   Median       3Q      Max
-4845982 -2412751  -346673  2224250  7143761

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 82417024   27550269   2.992  0.00514 **
Height       -767197     351353  -2.184  0.03599 *
PG           6282111    2304350   2.726  0.01005 *
PF           5353895    2482486   2.157  0.03819 *
SF           5428714    2335014   2.325  0.02618 *
Center       6404938    3096554   2.068  0.04628 *
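As a sanity check on the setup, the sketch below compares a regression with four hand-made 0/1 dummies (SG left out) against one where position is a single factor with SG as the reference level. Everything here is simulated: the data frame `nba`, its values, and the salary formula are assumptions for illustration only, not my actual data or results.

```r
# Simulated toy data (hypothetical values, 10 players per position)
set.seed(1)
n <- 50
nba <- data.frame(
  Height   = round(rnorm(n, mean = 79, sd = 3)),
  Position = rep(c("PG", "SG", "SF", "PF", "C"), each = 10)
)
nba$SALARY <- 1e6 + 5e4 * nba$Height + rnorm(n, sd = 1e5)

# Approach 1: position as a factor with SG as the reference level
nba$Position <- relevel(factor(nba$Position), ref = "SG")
fit_factor <- lm(SALARY ~ Height + Position, data = nba)

# Approach 2: four manual 0/1 dummies, leaving SG out as the base group
nba$PG     <- as.numeric(nba$Position == "PG")
nba$PF     <- as.numeric(nba$Position == "PF")
nba$SF     <- as.numeric(nba$Position == "SF")
nba$Center <- as.numeric(nba$Position == "C")
fit_dummy <- lm(SALARY ~ Height + PG + PF + SF + Center, data = nba)

# Both models span the same columns, so the fitted values should agree
same_fit <- all.equal(unname(fitted(fit_factor)), unname(fitted(fit_dummy)))
same_fit
```

If the two fits agree, each position coefficient in either model can be read as the difference from SG at a given height, which would suggest that a formula with four dummies and no SG term already treats SG as the base group.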