K Nearest Neighbor Algorithm Error

Hello, I am building a KNN model on the gapminder data set of countries, continents, and their economic indicators, with continent as the class label. However, when I run knn(), I get the error message below:

Error in knn(condata_train, condata_test, cl = condata_train_labels, k = 15) : 
  'train' and 'class' have different lengths

I have checked the data carefully, and the training data and the classification labels have the same number of rows, so I do not understand what "different lengths" means.

Below is my code:

library(class)      # knn()
library(caret)      # preProcess(), used below for normalization
library(dplyr)
library(gapminder)
## Continent K-Nearest Neighbors 

condata <- head(gapminder,200)

## Clean data

condata <- condata[,-c(1,3)]

condata
summary(condata)

## Normalize data

condata_preprocess <- preProcess(condata[,c(2:4)],method = c("range"))
condata_norm <- predict(condata_preprocess,condata[,c(2:4)])

summary(condata_norm)

## Split data into training and test

condata_ran <- sample(1:nrow(condata),0.7*nrow(condata))

condata_train <- condata_norm[condata_ran,]
condata_test <- condata_norm[-condata_ran,]

nrow(condata_train)
nrow(condata_test)
View(condata_train)

## Create training and test labels

condata_train_labels <- condata[condata_ran,1]
condata_test_labels <- condata[-condata_ran,1]

condata_train_labels
condata_test_labels
nrow(condata_train_labels)
View(cbind.data.frame(condata_train_labels,condata_train))

## Build KNN

condata_knn <- knn(condata_train,condata_test,cl = condata_train_labels,k = 15)

Is there a rule I have not followed?

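The cl argument must be a vector (or factor) with one element per training row. Because condata is a tibble, condata[condata_ran, 1] returns a one-column tibble rather than a vector: nrow() reports 140 rows, but length() reports 1 (the number of columns), and length() is what knn() compares against the number of training rows. Extract the column itself: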

condata_knn <-
  knn(condata_train,
      condata_test,
      cl = condata_train_labels$continent,
      k = 15)
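
For anyone who wants to see the mismatch directly, here is a minimal sketch on made-up data. The toy tibble and the names toy, train_idx, labels_tbl and labels_vec are hypothetical stand-ins, not part of the original code. It assumes the tibble package is installed and should behave the same way with the gapminder subset:

```r
library(class)   # knn()
library(tibble)  # tibble(); gapminder is a tibble as well

# Hypothetical toy data standing in for the gapminder subset
set.seed(1)
toy <- tibble(
  continent = factor(rep(c("Asia", "Europe"), each = 10)),
  x = c(rnorm(10, 0), rnorm(10, 3)),
  y = c(rnorm(10, 0), rnorm(10, 3))
)
train_idx <- sample(nrow(toy), 14)

labels_tbl <- toy[train_idx, 1]     # single-bracket subsetting keeps a one-column tibble
labels_vec <- toy[[1]][train_idx]   # double brackets (or $continent) give a plain factor

nrow(labels_tbl)    # 14, the number of rows
length(labels_tbl)  # 1, the number of columns
length(labels_vec)  # 14, one label per training row

# With the vector form of the labels, knn() runs without the length error
pred <- knn(toy[train_idx, 2:3], toy[-train_idx, 2:3], cl = labels_vec, k = 3)
length(pred)        # 6, one prediction per test row
```

dplyr::pull(condata[condata_ran, ], continent) or condata$continent[condata_ran] should give the same vector if you prefer to build the labels that way from the start.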

Thanks, it worked! Didn't think of that.
