Hi. Suppose I have one continuous predictor X1 and one categorical predictor X2. I fit a linear regression, and now I want a prediction for a particular value of X1, averaged over all values of X2. I am not sure how to handle X2.

df <- data.frame(salary=c(10,20,30,40,50,5,10,15,20,25),
                 years=c(1,2,3,4,5,1,2,3,4,5),
                 gender=c("M","M","M","M","M","F","F","F","F","F"))
df$gender <- ifelse(df$gender=="F",0,1)
df$gender <- factor(df$gender)
model <- lm(salary ~ years + gender, df)
summary(model)
newdata <- data.frame(years=1, gender=mean(as.numeric(df$gender)))
predict(model, newdata)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -7.5000     3.4069  -2.201 0.063600 .
years         7.5000     0.9449   7.937 9.58e-05 ***
gender1      15.0000     2.6726   5.612 0.000805 ***

I get the following error:

Error: variable 'gender' was fitted with type "factor" but type "numeric" was supplied
In addition: Warning message:
In model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
  variable 'gender' is not a factor.

I realize I can't really average men and women ...
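
For reference, one possible way to get an "averaged over gender" prediction is to predict once per level of the factor and then average those predictions. This is only a sketch, not necessarily the right estimand for every problem. It assumes an additive model like the one above. The names avg_equal and avg_weighted are made up for illustration. The key point is that newdata must supply gender as a factor with the same levels used in the fit, not as a number.

```r
# Same data and model as in the question
df <- data.frame(salary = c(10, 20, 30, 40, 50, 5, 10, 15, 20, 25),
                 years  = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
                 gender = c("M", "M", "M", "M", "M", "F", "F", "F", "F", "F"))
df$gender <- factor(ifelse(df$gender == "F", 0, 1))
model <- lm(salary ~ years + gender, df)

# One row per gender level, all at years = 1.
# gender stays a factor with the levels the model was fitted on.
newdata <- data.frame(years  = 1,
                      gender = factor(levels(df$gender),
                                      levels = levels(df$gender)))
preds <- predict(model, newdata)

# Option 1 (assumption: every level counts equally)
avg_equal <- mean(preds)

# Option 2 (assumption: weight each level by its share of the sample)
w <- prop.table(table(df$gender))
avg_weighted <- sum(preds * as.numeric(w))

print(preds)
print(avg_equal)
print(avg_weighted)
```

With the coefficients shown above, the two per-level predictions at years = 1 are -7.5 + 7.5 = 0 for gender 0 and 0 + 15 = 15 for gender 1, so both averages come out to 7.5 here because the groups are the same size. With unbalanced groups the two options would generally differ, and which one is appropriate depends on the question being asked.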