Predictions are filled with NA. Why do I not have a full set of predictions?

jblakesl · October 16, 2020, 4:47am

Hi,

I am using glm to predict a percentage. My code works until I try to predict on my test data. The predictions contain some percentages, but most of them are NA. I have cleaned my data, so I do not know what could be causing this.

Here is the warning message I get as well as the code I am using.

Warning message:
In predict.lm(object, newdata, se.fit, scale = 1, type = if (type == :
prediction from a rank-deficient fit may be misleading

library(ggplot2)
library(dplyr)
library(caret)
train=read.csv("C:\\Users\\jbhoo\\Downloads\\Training2.csv")
attach(train)
head(train)

#taking care of missing values##  
sapply(train,function(x) sum(is.na(x))/length(playbyplayorder_id))
##Less than 50% replace value with mean, over 50% get rid of attribute##
position2[is.na(position2)]<-0
train.clean=na.omit(train)

###Find significant features for prediction
model=glm(reboffensive~.,data = train.clean,family = poisson(link = "log")) 
summary(model)
p.off=predict(model,newdata=train.clean,type="response")
###test data
test=read.csv("C:\\Users\\jb\\Downloads\\Testing2.csv")
##Clean data from NA
position2[is.na(position2)]<-0
##Make Predictions
test.clean=test[1:124619,]
p1=predict(model,newdata=test.clean,type="response")

GreyMerchant · October 16, 2020, 7:16am

Hello,

I really suggest that you create a reprex (see the FAQ "How to do a minimal reproducible example (reprex) for beginners") so that others can replicate this behaviour. That will make it much easier for them to help you.
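
To make that concrete, here is a minimal sketch of what such a reprex could look like. It does not use the original data: the toy data frames, their sizes, and the single-predictor formula are assumptions made for illustration, and only the column names reboffensive and position2 come from the post. It shows two things that may explain NA predictions in code like the above. After attach(), assigning to position2 changes a separate copy rather than the column inside the data frame, and predict() returns NA for every newdata row that has a missing predictor.

```r
# Hypothetical toy data standing in for Training2.csv and Testing2.csv
set.seed(1)
train <- data.frame(
  reboffensive = rpois(50, lambda = 2),
  position2    = rep(c(1, 2, NA, 3, 4), times = 10)  # 10 missing values
)
test <- data.frame(position2 = c(1, NA, 2, NA))

# Pitfall: with the data frame attached, this assignment creates a separate
# object named position2; the column inside `train` is left unchanged.
attach(train)
position2[is.na(position2)] <- 0
detach(train)
rm(position2)
sum(is.na(train$position2))  # the NAs are still in the data frame

# Fit on complete rows only, as in the original post
train.clean <- na.omit(train)
model <- glm(reboffensive ~ position2, data = train.clean,
             family = poisson(link = "log"))

# Rows of newdata with a missing predictor give NA predictions
p1 <- predict(model, newdata = test, type = "response")
is.na(p1)

# Replacing the NAs in the test data frame itself removes the NA predictions
test.fixed <- test
test.fixed$position2[is.na(test.fixed$position2)] <- 0
p2 <- predict(model, newdata = test.fixed, type = "response")
anyNA(p2)
```

If the real test set behaves the same way, sapply(test, function(x) sum(is.na(x))) should show which columns still contain NA. The rank-deficient warning is a separate matter that this sketch does not reproduce.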
