I am using glm to predict a percentage. My code works until I try to predict on my test data: the predictions contain some values, but most of them are NA. I have cleaned my data, so I do not know what could be causing this.
Here is the warning message I get, followed by the code I am using:
    In predict.lm(object, newdata, se.fit, scale = 1, type = if (type == :
      prediction from a rank-deficient fit may be misleading
    library(ggplot2)
    library(dplyr)
    library(caret)

    train=read.csv("C:\\Users\\jbhoo\\Downloads\\Training2.csv")
    attach(train)
    head(train)

    #taking care of missing values##
    sapply(train,function(x) sum(is.na(x))/length(playbyplayorder_id))
    ##Less than 50% replace value with mean, over 50% get rid of attribute##
    position2[is.na(position2)]<-0
    train.clean=na.omit(train)

    ###Find significant features to predicting
    model=glm(reboffensive~.,data = train.clean,family = poisson(link = "log"))
    summary(model)
    p.off=predict(model,newdata=train.clean,type="response")

    ###test data
    test=read.csv("C:\\Users\\jb\\Downloads\\Testing2.csv")
    ##Clean data from NA
    position2[is.na(position2)]<-0

    ##Make Predictions
    test.clean=test[1:124619,]
    p1=predict(model,newdata=test.clean,type="response")