# lm interaction categorical/continuous variables

Hi there! I've got a question concerning the mathematical formulation of a model containing an interaction between a categorical and a continuous variable, fitted with the `lm` function.
It's built this way:
`lm(Y~X1*X2, data=data)`
with

- Y being the variable to predict
- X1 being a continuous variable
- X2 being a categorical variable.

I obtained these estimates:

- a: intercept
- b1: estimate of the continuous effect
- b2: estimate of the categorical variable, with a different value for each level of X2
- b3: estimate of the interaction, with a different value for each level of X2.

Now the fitted values obtained with this model are pretty good and within the range of the observed data. The thing is, when I try to apply this model in another software package to estimate Y, I formulate it this way:

Y = a + b1\*X1 + b2 + b3\*X1

And this gives values that are nowhere near what should be expected. Do you know how I should translate this interaction in my formula to get a correct estimate of Y?

Because of data privacy, I can't produce a reprex to help, but in my mind, it is more a question related to the mathematical formulation of the interaction within the lm function than something related to coding... Anyway, thanks for your help and I'll be happy to give you any further details.

Can you duplicate this in your other software?

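```r
library(ggplot2)

set.seed(42)
rv1 <- sample.int(3, 1000, replace = TRUE)
rv2 <- sample.int(6, 1000, replace = TRUE)
X1 <- 1:1000
X2 <- factor(c(rep("blue", 500), rep("red", 500)))
Y  <- ifelse(X2 == "blue", X1 * rv1, X1 * rv2)

(mydata <- data.frame(Y, X1, X2))

(my_lm <- lm(Y ~ X1 * X2, data = mydata))

mydata$lm_pred <- predict(my_lm, newdata = mydata)

# Rebuild the prediction by hand from the four coefficients.
# "blue" is the reference level, so its offsets are zero.
manual_pred <- function(a, b) {
  int            <- my_lm$coefficients[["(Intercept)"]]
  x1_coeff       <- my_lm$coefficients[["X1"]]
  x2red_coeff    <- my_lm$coefficients[["X2red"]]
  x1_x2red_coeff <- my_lm$coefficients[["X1:X2red"]]

  int +
    x1_coeff * a +
    ifelse(b == "red", x2red_coeff, 0) +
    ifelse(b == "red", x1_x2red_coeff * a, 0)
}
mydata$lm_manpred <- manual_pred(mydata$X1, mydata$X2)

# The manual prediction (red, dashed) should sit on top of predict() (black, dotted)
ggplot(data = mydata,
       mapping = aes(x = X1, color = X2, y = Y)) +
  geom_point() +
  geom_line(aes(y = lm_pred), color = "black", linetype = 3, size = 2) +
  geom_line(aes(y = lm_manpred), color = "red", linetype = 2, size = 1)
```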
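
In case it helps to see it as a formula: assuming R's default treatment contrasts (the first level of X2 is the reference level), `lm(Y~X1*X2)` should amount to the following for an observation whose X2 level is k. The index k and the symbols b2,k and b3,k are just notation introduced here, not something `lm` prints.

$$
\hat{Y} = a + b_1 X_1 + b_{2,k} + b_{3,k}\, X_1,
\qquad b_{2,\mathrm{ref}} = b_{3,\mathrm{ref}} = 0
$$

So b1 is applied to every observation, while b2 and b3 are only the offsets for the level that observation belongs to, and for the reference level both are zero. In the toy example the reference level is "blue".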

@nirgrahamuk the other software is QGIS. I don't know how to reproduce this kind of code in it. I updated the layers with a column holding the appropriate values of the estimates. In theory, just plugging these values into my formula should do it. To be more specific, the estimates for my categorical variable X2 are positive without the interaction, and when I model that in QGIS I get good results. But when I add the interaction, the X2 estimates become strongly negative and the interaction effect doesn't seem to be strong enough, giving me negative values for the chemical concentration I'm trying to predict...
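
One way to get columns like that without juggling four sets of estimates is to collapse them in R into a single intercept and a single slope per level of X2, and export that small table. This is only a sketch under assumptions: the data frame `toy`, its three levels and all numbers are made up to stand in for the private data, the names `lookup`, `intercept` and `slope` are hypothetical, and it assumes the default treatment contrasts.

```r
# Made-up data standing in for the private data set (hypothetical values)
set.seed(1)
toy <- data.frame(
  X1 = runif(90, 0, 10),
  X2 = factor(rep(c("a", "b", "c"), each = 30))
)
true_icpt  <- c(a = 1, b = 4, c = -2)
true_slope <- c(a = 0.5, b = 1.5, c = 3)
lev_chr <- as.character(toy$X2)
toy$Y <- unname(true_icpt[lev_chr] + true_slope[lev_chr] * toy$X1) +
  rnorm(90, sd = 0.3)

fit <- lm(Y ~ X1 * X2, data = toy)
cf  <- coef(fit)
lv  <- levels(toy$X2)

# Offsets per level; the reference level (first level) keeps 0 for both
b2 <- setNames(rep(0, length(lv)), lv)
b3 <- setNames(rep(0, length(lv)), lv)
for (l in lv[-1]) {
  b2[l] <- cf[[paste0("X2", l)]]     # main-effect offset, e.g. "X2b"
  b3[l] <- cf[[paste0("X1:X2", l)]]  # slope offset, e.g. "X1:X2b"
}

# One row per level: total intercept and total slope for that level
lookup <- data.frame(
  X2        = lv,
  intercept = unname(cf[["(Intercept)"]] + b2),
  slope     = unname(cf[["X1"]] + b3)
)

# Check that the collapsed table reproduces predict()
idx    <- match(toy$X2, lookup$X2)
manual <- lookup$intercept[idx] + lookup$slope[idx] * toy$X1
all.equal(unname(manual), unname(predict(fit)))
```

With `intercept` and `slope` joined onto the layer by X2, the prediction in the QGIS field calculator would then presumably be something like `"intercept" + "slope" * "X1"`, assuming those are the field names. A strongly negative X2 estimate is not wrong in itself here: it is only the shift in intercept relative to the reference level, and it has to be read together with the matching change in slope.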

I'm trying to do it in QGIS right now. Thanks anyway for the time and help. I'm sorry for not being able to be more specific, but the data and topic are kept under a strict privacy policy... I'm well aware it doesn't help my case.