Use a regression model to add a predicted outcome variable to a new data set

Hi, I'm very new to R and am trying to use it to create a linear regression model based on some historic data. I then want to use this model to attribute a pass or fail outcome to a new data set.

Historic data set:

Variable1, Variable2, Variable3, Variable4, Variable5, Variable6, Outcome

Future data set:

Variable1, Variable2, Variable3, Variable4, Variable5, Variable6

I can't work out how to use the model that I created from the historic data to assign the outcome category to the future data set. I want to add this as a new column in the future data set.

The reason I need an outcome for each row of the future data set is that I need to filter on that outcome and then investigate further.

Any help, or a pointer in the right direction, would be appreciated.

Thanks,
F

My model code is:

fModel = glm(Outcome ~ Variable2 + Variable3, data = ADataSet, family = binomial)

You could use augment() from broom.

https://cran.r-project.org/web/packages/broom/vignettes/broom.html
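As a rough sketch of what that could look like: the data frames `historic` and `future` below are simulated stand-ins (not your real data), and the model formula is assumed to be `Outcome ~ Variable2 + Variable3`. `augment()` takes a `newdata` argument and returns the new data with a `.fitted` column added.

```r
library(broom)

set.seed(42)

# Hypothetical stand-in for the historic data: two predictors and a 0/1 outcome
historic <- data.frame(Variable2 = rnorm(200), Variable3 = rnorm(200))
historic$Outcome <- rbinom(200, 1, plogis(0.5 + historic$Variable2 - historic$Variable3))

# Hypothetical stand-in for the future data: same predictors, no outcome
future <- data.frame(Variable2 = rnorm(20), Variable3 = rnorm(20))

# Logistic regression fitted on the historic data
fModel <- glm(Outcome ~ Variable2 + Variable3, data = historic, family = binomial)

# augment() returns the new data plus a .fitted column;
# type.predict = "response" puts .fitted on the probability scale
future_scored <- augment(fModel, newdata = future, type.predict = "response")
head(future_scored)
```

The `.fitted` values are probabilities, so you would still need to pick a cut-off to turn them into pass/fail.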


I didn't understand what you were trying to do.

To fit a model with glm and predict on new data, you can use the following:

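library(dplyr)
library(tibble)

# Two-class data: setosa vs. virginica, coded as 0/1
clf_data <- iris %>%
  select(Petal.Width, Species) %>%
  filter(Species != "versicolor") %>%
  mutate(not_setosa_tf = Species != "setosa",
         not_setosa = as.numeric(not_setosa_tf))

# Logistic regression; glm() will likely warn here because
# Petal.Width separates these two species completely
model_glm <- glm(not_setosa ~ Petal.Width, data = clf_data, family = binomial("logit"))
summary(model_glm)

# Predicted probabilities for the new data, stored as a new column
new_data <- tibble(Petal.Width = seq(0, 2.5, 0.01))
new_data$pred <- predict(model_glm, newdata = new_data, type = "response")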
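Applied to the pass/fail case in the question, the same idea might look roughly like this in base R. Everything here is an assumption for illustration: the `historic` and `future` data frames are simulated, the formula is taken to be `Outcome ~ Variable2 + Variable3`, and the 0.5 cut-off is arbitrary and should be chosen to suit your problem.

```r
set.seed(1)

# Hypothetical historic data with a pass/fail outcome.
# With a factor response, glm() treats the first level ("fail") as 0
# and models the probability of the other level ("pass").
historic <- data.frame(Variable2 = rnorm(200), Variable3 = rnorm(200))
p_true <- plogis(historic$Variable2 + 0.5 * historic$Variable3)
historic$Outcome <- factor(ifelse(rbinom(200, 1, p_true) == 1, "pass", "fail"),
                           levels = c("fail", "pass"))

# Hypothetical future data: same predictors, no outcome yet
future <- data.frame(Variable2 = rnorm(50), Variable3 = rnorm(50))

fModel <- glm(Outcome ~ Variable2 + Variable3, data = historic, family = binomial)

# Predicted probability of "pass" for each future row
future$prob_pass <- predict(fModel, newdata = future, type = "response")

# Turn the probability into a category (0.5 is an assumed threshold)
future$Outcome <- ifelse(future$prob_pass >= 0.5, "pass", "fail")

# Filter on the new column for further investigation
to_investigate <- future[future$Outcome == "fail", ]
head(to_investigate)
```

Keeping the probability column next to the category makes it easy to revisit the threshold later without refitting the model.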