Error in model.frame.default: variable lengths differ

library(MASS)
data(biopsy)
biopsy$ID = NULL
names(biopsy) = c("thick", "u.size", "u.shape", "adhsn", "s.size", "nucl", "chrom", "n.nuc", "mit", "class")
biopsy.v2 = na.omit(biopsy)
smp_size <- floor(0.80 * nrow(biopsy.v2)) # sample size: 80% of the rows (observations)

## set the seed to make your partition reproducible
set.seed(123)
select_sample <- sample(1:nrow(biopsy.v2),smp_size) #selects the rows for training set

training_set <- biopsy.v2[select_sample, ] # creates the training set from the biopsy data using select_sample
validation_set <- biopsy.v2[-select_sample, ] # creates the validation set from the remaining rows, i.e. those not in the training set

nrow(training_set)
nrow(validation_set)
model_1<-lm(biopsy.v2$thick~., data=training_set) #First regression model

summary (model_1)

Error in model.frame.default(formula = biopsy.v2$thick ~ ., data = training_set, :
variable lengths differ (found for 'u.size')

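You are mixing things up. You passed training_set as the data in which lm should interpret the formula, yet the response in the formula is biopsy.v2$thick, which is pulled from the full data set. The response therefore has nrow(biopsy.v2) values while the predictors taken from training_set have only smp_size rows, hence "variable lengths differ". Refer to the column by its bare name instead: lm(thick ~ ., data = training_set).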
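For illustration, here is a minimal sketch of how the corrected call might look, reusing the names from the question. The `pred` object and the `predict()` step at the end are additions for the example and are not part of the original post.

```r
library(MASS)  # provides the biopsy data set

data(biopsy)
biopsy$ID <- NULL
names(biopsy) <- c("thick", "u.size", "u.shape", "adhsn", "s.size",
                   "nucl", "chrom", "n.nuc", "mit", "class")
biopsy.v2 <- na.omit(biopsy)

# 80/20 split, as in the question
set.seed(123)
smp_size <- floor(0.80 * nrow(biopsy.v2))
select_sample <- sample(seq_len(nrow(biopsy.v2)), smp_size)
training_set <- biopsy.v2[select_sample, ]
validation_set <- biopsy.v2[-select_sample, ]

# The mismatch behind the error: the response was taken from the full
# data set, while the predictors were taken from the training subset.
length(biopsy.v2$thick)  # number of rows in biopsy.v2
nrow(training_set)       # smp_size, which is smaller

# Use the bare column name so lm() looks up thick inside `data`
model_1 <- lm(thick ~ ., data = training_set)
summary(model_1)

# Assumed follow-up step (not in the original post): predict on the
# held-out rows. `pred` is a hypothetical name.
pred <- predict(model_1, newdata = validation_set)
head(pred)
```

Because every variable in the formula is now resolved inside training_set, the response and the predictors always have the same number of rows.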
