Hello!
I am trying to create a model using caret. I split the data using the following code:
set.seed(3456)
trainIndex_hairpin <- createDataPartition(data_hairpin$subcellular_location, p = .8,
list = FALSE,
times = 1)
hairpinTrain <- data_hairpin[ trainIndex_hairpin,]
hairpinTest <- data_hairpin[-trainIndex_hairpin,]
After that I started to create the model using the following code:
fitControl <- trainControl(## 10-fold CV
method = "repeatedcv",
number = 10,
## repeated ten times
repeats = 10)
training <- hairpinTrain
testing <- hairpinTest
set.seed(825)
gbmFit2 <- train(hairpinTrain$subcellular_location ~ ., data = hairpinTrain,
method = "gbm",
trControl = fitControl,
## This last option is actually one
## for gbm() that passes through
verbose = FALSE)
gbmFit2
But it always gives me this error:
Error: Can't subset columns that don't exist.
x Column `hairpinTrain` doesn't exist.
This is your private data, so your code isn't runnable by us forumites unless you take steps to make a reprex.
That said, this part of your call
hairpinTrain$subcellular_location ~ .
is a red flag to me. Unless you are an expert doing something very clever, you should not repeat the name of your dataset anywhere in a function call when you are already passing that dataset through an explicit data argument.
That is, subcellular_location is assumed to be a column of the hairpinTrain that you declared you are using:
train(subcellular_location ~ ., data = hairpinTrain,
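To see why the formula matters, here is a minimal base-R sketch. It is my own illustration rather than anything taken from caret's source, and it assumes iris as a stand-in for the private data. It only lists the names each formula refers to and checks which of them are not columns of the data frame.

```r
# Illustration only: iris stands in for the private data (an assumption).
hairpinTrain <- iris
names(hairpinTrain)[names(hairpinTrain) == "Species"] <- "subcellular_location"

# all.vars() lists every symbol in a formula; writing data$column
# contributes two names, the data frame and the column.
bad_vars  <- all.vars(hairpinTrain$subcellular_location ~ .)
good_vars <- all.vars(subcellular_location ~ .)

# Names the formula asks for that are not columns of the data.
# "." is the placeholder for "all other columns", so it is ignored here.
missing_bad  <- setdiff(bad_vars,  c(names(hairpinTrain), "."))
missing_good <- setdiff(good_vars, c(names(hairpinTrain), "."))

print(missing_bad)
print(missing_good)
```

With the data$column form, the leftover name is hairpinTrain itself, which matches the "Column hairpinTrain doesn't exist" message. With the bare column name, nothing is left over.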
Thanks for your consideration.
I also tried this, and it gave me the same error!
It seems to work fine for me. Perhaps there is some eccentricity in your data, though. You can make a reprex like this:
library(caret)
library(tidyverse)
data_hairpin <- iris %>% rename(subcellular_location=Species)
trainIndex_hairpin <- createDataPartition(data_hairpin$subcellular_location, p = .8,
list = FALSE,
times = 1)
hairpinTrain <- data_hairpin[ trainIndex_hairpin,]
hairpinTest <- data_hairpin[-trainIndex_hairpin,]
fitControl <- trainControl(
method = "repeatedcv",
number = 10,
repeats = 10)
gbmFit2 <- train(hairpinTrain$subcellular_location ~ ., data = hairpinTrain,
method = "gbm",
trControl = fitControl,
verbose = FALSE)
# The call above fails with:
# Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) :
#   undefined columns selected
# Using the bare column name in the formula gives a result:
gbmFit2 <- train(subcellular_location ~ ., data = hairpinTrain,
method = "gbm",
trControl = fitControl,
verbose = FALSE)
gbmFit2
Thanks very much, nirgrahamuk. It works!