Hello, I am building a k-nearest neighbors (KNN) model using the gapminder data set of countries, continents and their economic indices. I am using continent as the class label. However, when I run the KNN algorithm, I get the error message below:
Error in knn(condata_train, condata_test, cl = condata_train_labels, k = 15) :
'train' and 'class' have different lengths
I have checked the data carefully, and the training data and classification labels are of equal length. Therefore, I do not understand what it means by "different lengths".
Below is my code:
library(class)
library(dplyr)
library(gapminder)
library(caret)  # preProcess() used below comes from caret
## Continent K-Nearest Neighbors
condata <- head(gapminder,200)
## Clean data
condata <- condata[,-c(1,3)]
condata
summary(condata)
## Normalize data
condata_preprocess <- preProcess(condata[,c(2:4)],method = c("range"))
condata_norm <- predict(condata_preprocess,condata[,c(2:4)])
summary(condata_norm)
## Split data into training and test
condata_ran <- sample(1:nrow(condata),0.7*nrow(condata))
condata_train <- condata_norm[condata_ran,]
condata_test <- condata_norm[-condata_ran,]
nrow(condata_train)
nrow(condata_test)
View(condata_train)
## Create training and test labels
condata_train_labels <- condata[condata_ran,1]
condata_test_labels <- condata[-condata_ran,1]
condata_train_labels
condata_test_labels
nrow(condata_train_labels)
View(cbind.data.frame(condata_train_labels,condata_train))
## Build KNN
condata_knn <- knn(condata_train,condata_test,cl = condata_train_labels,k = 15)
Is there a rule I have not followed?
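One likely explanation, offered as a sketch rather than a confirmed diagnosis: gapminder is a tibble, so condata[condata_ran,1] stays a one-column tibble instead of becoming a vector. nrow() on it reports the number of rows, but length() on a data frame counts columns, and knn() appears to compare length(cl) with the number of training rows. The example below assumes that is what happens here. The names idx, labels_tbl, labels_vec and pred are hypothetical, the fixed 1:140 split only stands in for the random sample, and normalization is skipped to keep it short.

```r
library(class)
library(gapminder)

# Same subset as in the question: continent, lifeExp, pop, gdpPercap
condata <- head(gapminder, 200)[, -c(1, 3)]
idx <- 1:140  # hypothetical fixed training rows, for illustration only

# Single-column indexing on a tibble keeps a one-column tibble
labels_tbl <- condata[idx, 1]
nrow(labels_tbl)    # 140
length(labels_tbl)  # 1, because length() of a data frame counts columns

# Extracting the column itself gives a plain factor, one label per row
labels_vec <- condata[[1]][idx]  # condata$continent[idx] is equivalent
length(labels_vec)  # 140

# With a vector of labels, knn() gets one class per training row
pred <- knn(condata[idx, 2:4], condata[-idx, 2:4], cl = labels_vec, k = 15)
length(pred)  # 60, one prediction per test row
```

If this is the cause, the same change would apply to condata_train_labels and condata_test_labels in the code above, for example condata$continent[condata_ran] and condata$continent[-condata_ran].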