train, trainControl function not working.

Hello all,

I am trying to perform cross-validation on a model, but I keep getting errors. I have installed the caret package and called library(caret). However, the problem persists. Does anyone have an idea of what I'm missing?

Thanks

Here is a link to the data set I'm using:

set.seed(569)

ctrl <- trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE)
#> Error in trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE): could not find function "trainControl"

model <- train(Fentanyl~
                 Age+Sex+Race+
                 Oxycodone+Hydrocodone+Buprenorphine+
                 Morphine+Codeine+Norbuprenorphine+
                 Naloxone,
               ToTrain,
               method = "glm", family = "poisson",
               trControl = ctrl,
               tuneLength = 10)#end train
#> Error in train(Fentanyl ~ Age + Sex + Race + Oxycodone + Hydrocodone + : could not find function "train"

Created on 2019-02-01 by the reprex package (v0.2.1)

Your error message says: could not find function "train"
You say you loaded the package with library(caret), but that call is not in your reprex. Would you mind re-running it with the package loaded?

I did. I used install.packages("caret")
I went ahead and re-installed it just in case.

Result:

model <- train(Fentanyl~
                 Age+Sex+Race+
                 Oxycodone+Hydrocodone+Buprenorphine+
                 Morphine+Codeine+Norbuprenorphine+
                 Naloxone,
               ToTrain,
               method = "glm", family = "poisson",
               trControl = ctrl,
               tuneLength = 10)#end train
#> Error in train(Fentanyl ~ Age + Sex + Race + Oxycodone + Hydrocodone + : could not find function "train"

Created on 2019-02-01 by the reprex package (v0.2.1)

Sorry, I didn't copy this function call:

ctrl <- trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE)
#> Error in trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE): could not find function "trainControl"

Created on 2019-02-01 by the reprex package (v0.2.1)

I am not talking about installing the package.
I am talking about loading it.
Have you made the library(caret) call?
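
To make the install-versus-load distinction concrete, here is a minimal base R sketch. The helper names is_installed() and is_attached() are made up for illustration; they only wrap system.file() and search(). It uses "stats" so it runs anywhere; swap in "caret" on your machine.

```r
# Installed: the package's files exist in one of the libraries in .libPaths().
# system.file() returns "" when the package cannot be found.
is_installed <- function(pkg) nzchar(system.file(package = pkg))

# Attached: library(pkg) has succeeded in *this* session, so its exported
# functions (for caret: train(), trainControl()) can be found by name.
is_attached <- function(pkg) paste0("package:", pkg) %in% search()

is_installed("stats")                     # shipped with R
is_attached("stats")                      # attached by default at startup
is_attached("aPackageThatDoesNotExist")   # never attached
```

If is_installed("caret") is TRUE but is_attached("caret") is FALSE, the library(caret) call either was not run in that session or failed.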

Also, we don't have access to your data, because apparently it is only available to you and your classmates.

If you plan to keep asking questions about this, please read and follow our policy about homework questions.

Yes, I made the call with library(caret)

I don't see it in your reprex. I am asking you to re-run the whole thing with a library call included in your code.
And pay attention to @andresrcs's note: read the homework policy

Thank you for the notice. I did read this before; my question is about the train and trainControl functions. The assignment is to predict whether fentanyl was involved in any given overdose. I have already built the model for that with glm() and submitted the results. I'm trying to run cross-validation to see which of the variables I chose improve the model. I'm not sure how to share the data set here. Could I add it to my GitHub and post a link?

It may be simpler than you think.

Once more, can you please create the same reprex, but this time I want to see the library call in the code, and see what error is thrown in that case.

Any chance you can make this reprex with a small subset of your data? (small enough to fit into a reasonable datapasta::df_paste(ToTrain) call).
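
If datapasta is not at hand, base R's dput() gives a similar copy-pasteable result. A small sketch, using the built-in mtcars data as a stand-in for ToTrain (the column choice and the name small are arbitrary):

```r
# Take a handful of rows and only the columns the model needs.
small <- head(mtcars[, c("mpg", "cyl", "am")], 5)

# Print R code that recreates the object; paste this into the reprex.
dput(small)

# Evaluating the printed code rebuilds the same data frame.
rebuilt <- eval(parse(text = paste(capture.output(dput(small)), collapse = "\n")))
```

Anyone can then paste the dput() output into their own session and assign it to a name without needing the original CSV file.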

Here's the whole thing. If it's any peace of mind, our competition ends today and I have already maxed out my submissions. However, I still need to know how to do this.

Thanks,

#Author: Sal Meza
#Created: 1/20/19
#Last edited: 1/29/19
#Competion1


ToTrain=read.csv(file = "train.csv", header = TRUE, sep = ",")
#> Warning in file(file, "rt"): cannot open file 'train.csv': No such file or
#> directory
#> Error in file(file, "rt"): cannot open the connection
#ToTrain$Fentanyl = as.factor(ToTrain$Fentanyl)
#train <- train[train$Home_Zip!="Homeless", ] 
#train <- train[train$Home_Zip!="Unknown", ]
#train <- train[train$Education!="NA", ]

ToTest=read.csv(file = "test.csv", header = TRUE, sep = ",")
#> Warning in file(file, "rt"): cannot open file 'test.csv': No such file or
#> directory
#> Error in file(file, "rt"): cannot open the connection
#test <- test[test$Home_Zip!="Homeless", ]
#test <- test[test$Home_Zip!="Unknown", ]
#test <- test[test$Education!="NA", ]

install.packages("AUC")
#> Installing package into 'C:/Users/userOne/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> package 'AUC' successfully unpacked and MD5 sums checked
#> 
#> The downloaded binary packages are in
#>  C:\Users\userOne\AppData\Local\Temp\RtmpeoX4U3\downloaded_packages
install.packages("caret")
#> Installing package into 'C:/Users/userOne/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> package 'caret' successfully unpacked and MD5 sums checked
#> Warning: cannot remove prior installation of package 'caret'
#> 
#> The downloaded binary packages are in
#>  C:\Users\userOne\AppData\Local\Temp\RtmpeoX4U3\downloaded_packages
install.packages("e1071")
#> Installing package into 'C:/Users/userOne/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> package 'e1071' successfully unpacked and MD5 sums checked
#> Warning: cannot remove prior installation of package 'e1071'
#> 
#> The downloaded binary packages are in
#>  C:\Users\userOne\AppData\Local\Temp\RtmpeoX4U3\downloaded_packages

library(AUC)
#> AUC 0.3.0
#> Type AUCNews() to see the change log and ?AUC to get an overview.
library(caret)
#> Error in library(caret): there is no package called 'caret'
library(e1071)
#> Error in library(e1071): there is no package called 'e1071'
library(reprex)

set.seed(569)

ctrl <- trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE)
#> Error in trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE): could not find function "trainControl"

model <- train(Fentanyl~
                 Age+Sex+Race+
                 Oxycodone+Hydrocodone+Buprenorphine+
                 Morphine+Codeine+Norbuprenorphine+
                 Naloxone,
               ToTrain,
               method = "glm", family = "poisson",
               trControl = ctrl,
               tuneLength = 10)#end train
#> Error in train(Fentanyl ~ Age + Sex + Race + Oxycodone + Hydrocodone + : could not find function "train"

#predicted <- predict(model, ToTrain["Fentanyl"], type = "prob")

#actual <- ToTrain["Fentanyl"]

Created on 2019-02-01 by the reprex package (v0.2.1)

As you can clearly read from your error messages, your caret package has never actually loaded. Let's take it from here.
In fact, it looks like neither the data nor the packages are loading at all. The whole thing is riddled with error messages.
Have you tried restarting the session?
How many versions of R and RStudio do you have installed? We're looking for "1 of each", typically :slight_smile:

Apparently you have some problems with your RStudio setup, so let's address those first. Try deleting your .RData file and restarting your R session with Ctrl + Shift + F10.
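
Before and after restarting, a few base R calls can show what the session is actually working with. This is only a diagnostic sketch; the variable names are mine, and it assumes the saved workspace (if any) sits in the current working directory.

```r
# Is there a saved workspace in the working directory that R would
# restore at startup?
has_saved_workspace <- file.exists(file.path(getwd(), ".RData"))

# Which R build is this session running?
r_version <- R.version.string

# Which library folders does this session install into and load from?
libs <- .libPaths()

cat("Saved workspace present:", has_saved_workspace, "\n")
cat("R version:", r_version, "\n")
cat("Library paths:", paste(libs, collapse = "; "), "\n")
```

If the library path shown here differs from the folder named in the install.packages() output, the package is being installed somewhere this session does not look.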

Okay, I did Ctrl+Shift+F10 and added 100 rows of each data set with datapasta. It's kind of hard to look at; let me know if you need something different.

Thanks for the help.

#Author: Sal Meza
#Created: 1/20/19
#Last edited: 1/29/19
#Competion1


ToTrain=read.csv(file = "train.csv", header = TRUE, sep = ",")
#> Warning in file(file, "rt"): cannot open file 'train.csv': No such file or
#> directory
#> Error in file(file, "rt"): cannot open the connection
#ToTrain$Fentanyl = as.factor(ToTrain$Fentanyl)
#train <- train[train$Home_Zip!="Homeless", ] 
#train <- train[train$Home_Zip!="Unknown", ]
#train <- train[train$Education!="NA", ]

ToTest=read.csv(file = "test.csv", header = TRUE, sep = ",")
#> Warning in file(file, "rt"): cannot open file 'test.csv': No such file or
#> directory
#> Error in file(file, "rt"): cannot open the connection
#test <- test[test$Home_Zip!="Homeless", ]
#test <- test[test$Home_Zip!="Unknown", ]
#test <- test[test$Education!="NA", ]


pastaTrain <- tibble::tribble(
                ~Sex, ~Race, ~Age, ~Morphine, ~Codeine, ~Fentanyl, ~Oxycodone, ~Hydrocodone, ~Oxymorphone, ~Hydromorphone, ~Dihydrocodeine, ~Buprenorphine, ~Norbuprenorphine, ~Cotinine,  ~id,
                   1,     1,   39,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  851,
                   1,     1,   25,         1,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0, 1271,
                   0,     1,   48,         1,        1,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  448,
                   1,     1,   55,         0,        0,         0,          1,            0,            0,              0,               0,              0,                 0,         0,  423,
                   0,     1,   29,         0,        0,         0,          0,            1,            0,              0,               0,              0,                 0,         0, 1032,
                   1,     1,   65,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0, 1502,
                   1,     4,   24,         0,        0,         1,          0,            0,            0,              0,               0,              0,                 0,         0, 1557,
                   1,     2,   65,         1,        1,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  681,
                   1,     1,   55,         0,        0,         0,          0,            1,            0,              1,               0,              0,                 0,         0,  981,
                   1,     1,   32,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0, 1353,
                   1,     2,   24,         1,        1,         1,          0,            0,            0,              0,               0,              0,                 0,         0, 1241,
                   1,     2,   38,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0, 1391,
                   0,     1,   48,         0,        0,         1,          0,            0,            0,              0,               0,              0,                 0,         0,  473,
                   1,     1,   59,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  859,
                   1,     1,   44,         0,        0,         0,          1,            0,            1,              0,               0,              0,                 0,         0,  920,
                   1,     4,   41,         0,        0,         1,          0,            0,            0,              0,               0,              0,                 0,         0, 1073,
                   0,     2,    1,         0,        0,         0,          0,            1,            0,              1,               0,              0,                 0,         0,  513,
                   0,     1,   46,         0,        0,         0,          1,            0,            1,              0,               0,              0,                 0,         0,  369,
                   0,     2,   64,         1,        1,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  836
                )




pastaTest <- tibble::tribble(
               ~Sex, ~Race, ~Age, ~Morphine, ~Codeine, ~Oxycodone, ~Hydrocodone, ~Oxymorphone, ~Hydromorphone, ~Dihydrocodeine, ~Buprenorphine, ~Norbuprenorphine, ~Cotinine,  ~id,
                  1,     1,   23,         1,        1,          0,            0,            0,              0,               0,              0,                 0,         0, 1066,
                  1,     1,   48,         1,        1,          0,            1,            1,              1,               0,              0,                 0,         0,  913,
                  1,     1,   53,         0,        0,          1,            0,            1,              0,               0,              0,                 0,         0,   53,
                  0,     1,   47,         1,        0,          0,            1,            0,              1,               0,              0,                 0,         0,   94,
                  1,     2,   42,         1,        1,          0,            0,            0,              0,               0,              0,                 0,         0, 1154,
                  0,     1,   48,         0,        0,          0,            0,            0,              0,               0,              0,                 0,         0,  528,
                  1,     1,   49,         0,        0,          0,            0,            0,              0,               0,              0,                 0,         0,  137,
                  1,     1,   38,         1,        1,          0,            0,            0,              0,               0,              0,                 0,         0,  808,
                  0,     1,   43,         0,        0,          1,            0,            0,              0,               0,              0,                 0,         0, 1310,
                  1,     1,   32,         1,        1,          0,            0,            0,              0,               0,              0,                 0,         0,  995,
                  0,     1,   39,         0,        0,          0,            0,            1,              1,               0,              0,                 0,         0,  477,
                  1,     2,   19,         0,        0,          0,            1,            0,              0,               0,              0,                 0,         0, 1137,
                  0,     1,   35,         0,        0,          1,            1,            0,              0,               0,              0,                 0,         0,  543,
                  1,     2,   42,         0,        0,          0,            0,            0,              0,               0,              0,                 0,         0,    2,
                  0,     1,   28,         0,        0,          0,            0,            0,              0,               0,              0,                 0,         0, 1362,
                  0,     1,   38,         1,        1,          0,            0,            0,              0,               0,              0,                 0,         0,  665,
                  0,     1,   40,         0,        0,          0,            0,            0,              0,               0,              0,                 0,         0,  556,
                  1,     1,   28,         0,        0,          1,            0,            1,              0,               0,              0,                 0,         0,  177,
                  0,     1,    0,         1,        0,          0,            0,            0,              0,               0,              0,                 0,         0, 1472
               )





install.packages("AUC")
#> Installing package into 'C:/Users/userOne/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> package 'AUC' successfully unpacked and MD5 sums checked
#> 
#> The downloaded binary packages are in
#>  C:\Users\userOne\AppData\Local\Temp\Rtmpsj0xqi\downloaded_packages
install.packages("caret")
#> Installing package into 'C:/Users/userOne/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> package 'caret' successfully unpacked and MD5 sums checked
#> Warning: cannot remove prior installation of package 'caret'
#> 
#> The downloaded binary packages are in
#>  C:\Users\userOne\AppData\Local\Temp\Rtmpsj0xqi\downloaded_packages
install.packages("e1071")
#> Installing package into 'C:/Users/userOne/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> package 'e1071' successfully unpacked and MD5 sums checked
#> Warning: cannot remove prior installation of package 'e1071'
#> 
#> The downloaded binary packages are in
#>  C:\Users\userOne\AppData\Local\Temp\Rtmpsj0xqi\downloaded_packages
install.packages("datapasta")
#> Installing package into 'C:/Users/userOne/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> package 'datapasta' successfully unpacked and MD5 sums checked
#> 
#> The downloaded binary packages are in
#>  C:\Users\userOne\AppData\Local\Temp\Rtmpsj0xqi\downloaded_packages
install.packages("reprex")
#> Installing package into 'C:/Users/userOne/Documents/R/win-library/3.5'
#> (as 'lib' is unspecified)
#> package 'reprex' successfully unpacked and MD5 sums checked
#> 
#> The downloaded binary packages are in
#>  C:\Users\userOne\AppData\Local\Temp\Rtmpsj0xqi\downloaded_packages

library(datapasta)
library(AUC)
#> AUC 0.3.0
#> Type AUCNews() to see the change log and ?AUC to get an overview.
library(caret)
#> Error in library(caret): there is no package called 'caret'
library(e1071)
#> Error in library(e1071): there is no package called 'e1071'
library(reprex)

set.seed(569)

ctrl <- trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE)
#> Error in trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE): could not find function "trainControl"

model <- train(Fentanyl~
                 Age+Sex+Race+
                 Oxycodone+Hydrocodone+Buprenorphine+
                 Morphine+Codeine+Norbuprenorphine+
                 Naloxone,
               ToTrain,
               method = "glm", family = "poisson",
               trControl = ctrl,
               tuneLength = 10)#end train
#> Error in train(Fentanyl ~ Age + Sex + Race + Oxycodone + Hydrocodone + : could not find function "train"

#predicted <- predict(model, ToTrain["Fentanyl"], type = "prob")

#actual <- ToTrain["Fentanyl"]

Created on 2019-02-02 by the reprex package (v0.2.1)

Correction: I added 20 rows of each set. I tried 100, but the post was too large. There were also 150 columns in the original data, which I reduced to about 11.

The idea behind a reprex is to make a minimal reproducible example of your problem, and you should not include the install.packages() command unless you're having issues with the installation process itself.

That said, this would be your reprex:

pastaTrain <- tibble::tribble(
    ~Sex, ~Race, ~Age, ~Morphine, ~Codeine, ~Fentanyl, ~Oxycodone, ~Hydrocodone, ~Oxymorphone, ~Hydromorphone, ~Dihydrocodeine, ~Buprenorphine, ~Norbuprenorphine, ~Cotinine,  ~id,
    1,     1,   39,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  851,
    1,     1,   25,         1,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0, 1271,
    0,     1,   48,         1,        1,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  448,
    1,     1,   55,         0,        0,         0,          1,            0,            0,              0,               0,              0,                 0,         0,  423,
    0,     1,   29,         0,        0,         0,          0,            1,            0,              0,               0,              0,                 0,         0, 1032,
    1,     1,   65,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0, 1502,
    1,     4,   24,         0,        0,         1,          0,            0,            0,              0,               0,              0,                 0,         0, 1557,
    1,     2,   65,         1,        1,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  681,
    1,     1,   55,         0,        0,         0,          0,            1,            0,              1,               0,              0,                 0,         0,  981,
    1,     1,   32,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0, 1353,
    1,     2,   24,         1,        1,         1,          0,            0,            0,              0,               0,              0,                 0,         0, 1241,
    1,     2,   38,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0, 1391,
    0,     1,   48,         0,        0,         1,          0,            0,            0,              0,               0,              0,                 0,         0,  473,
    1,     1,   59,         0,        0,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  859,
    1,     1,   44,         0,        0,         0,          1,            0,            1,              0,               0,              0,                 0,         0,  920,
    1,     4,   41,         0,        0,         1,          0,            0,            0,              0,               0,              0,                 0,         0, 1073,
    0,     2,    1,         0,        0,         0,          0,            1,            0,              1,               0,              0,                 0,         0,  513,
    0,     1,   46,         0,        0,         0,          1,            0,            1,              0,               0,              0,                 0,         0,  369,
    0,     2,   64,         1,        1,         0,          0,            0,            0,              0,               0,              0,                 0,         0,  836
)


library(caret)
#> Loading required package: lattice
#> Loading required package: ggplot2

set.seed(569)

ctrl <- trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE)

model <- train(Fentanyl~
                   Age+Sex+Race+
                   Oxycodone+Hydrocodone+Buprenorphine+
                   Morphine+Codeine+Norbuprenorphine,
               data = pastaTrain,
               method = "glm",
               family = "poisson",
               trControl = ctrl,
               tuneLength = 10)
#> Warning in train.default(x, y, weights = w, ...): You are trying to do
#> regression and your outcome only has two possible values Are you trying to
#> do classification? If so, use a 2 level factor as your outcome column.
#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading

#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading

#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading
#> Warning: glm.fit: fitted rates numerically 0 occurred
#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading

#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading
#> Warning: glm.fit: fitted rates numerically 0 occurred
#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading
#> Warning: glm.fit: fitted rates numerically 0 occurred
#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading

#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading
#> Warning: glm.fit: algorithm did not converge
#> Warning: glm.fit: fitted rates numerically 0 occurred
#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading

#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type =
#> ifelse(type == : prediction from a rank-deficient fit may be misleading
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info =
#> trainInfo, : There were missing values in resampled performance measures.
warnings()

I'm not a statistician, so I'm sure other people can better help you understand why you are getting these warning messages.
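
Following the hint in the first warning, one thing to try is making the outcome a two-level factor and fitting a binomial (logistic) model instead of a Poisson one. Below is a rough sketch, not a verified fix. It uses only the Age and Fentanyl columns from the rows pasted above; the name toy, the "no"/"yes" labels, and the choice of 4 folds are my own assumptions, and the caret part runs only if caret can be loaded.

```r
# Age and Fentanyl copied from the 19 pasted rows of pastaTrain.
toy <- data.frame(
  Age = c(39, 25, 48, 55, 29, 65, 24, 65, 55, 32,
          24, 38, 48, 59, 44, 41, 1, 46, 64),
  Fentanyl = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
               1, 0, 1, 0, 0, 1, 0, 0, 0)
)

# A factor outcome tells both glm() and caret this is classification.
toy$Fentanyl <- factor(toy$Fentanyl, levels = c(0, 1), labels = c("no", "yes"))

# Plain logistic regression in base R.
fit <- glm(Fentanyl ~ Age, data = toy, family = binomial)

# The same model under cross-validation, only if caret is usable here.
if (requireNamespace("caret", quietly = TRUE)) {
  set.seed(569)
  ctrl <- caret::trainControl(method = "cv", number = 4)
  model <- caret::train(Fentanyl ~ Age, data = toy,
                        method = "glm", family = "binomial",
                        trControl = ctrl)
  print(model)
}
```

With only 19 rows and 4 positive cases, the resampled estimates will be very noisy, so treat this as a check that the call runs rather than as a result.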

Thanks for the clarification on the reprex().

Can you give the results of sessionInfo()?

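The issue is that caret cannot be installed, despite the output saying it was "successfully unpacked". Note the warning right after it: cannot remove prior installation of package 'caret'.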

Try doing this:

install.packages("devtools", repos = "http://cran.r-project.org")
devtools::install_github("r-lib/pkg")

if (require(pkg)) {
  pkg::pkg_install("caret")
}
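
It may also help to look at what is actually sitting in the library folder. The sketch below is an assumption-laden diagnostic: inspect_pkg_install() is a made-up helper, and the idea that a half-removed caret folder is the culprit is a guess based on the "cannot remove prior installation" warning, not something the output confirms.

```r
# Report whether a package folder exists in a library and whether it looks
# complete (an installed package always has a DESCRIPTION file).
inspect_pkg_install <- function(pkg, lib = .libPaths()[1]) {
  pkg_dir <- file.path(lib, pkg)
  list(
    library         = lib,
    folder_exists   = dir.exists(pkg_dir),
    has_description = file.exists(file.path(pkg_dir, "DESCRIPTION")),
    # Leftover lock directories from an interrupted install, if any.
    lock_dirs       = list.files(lib, pattern = "^00LOCK")
  )
}

# Demonstrated on a package that ships with R; use "caret" on your machine.
str(inspect_pkg_install("stats", lib = .Library))
```

If the caret folder exists but has no DESCRIPTION, closing every R and RStudio session, deleting that folder by hand, and reinstalling in a fresh session is a common thing to try.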

Here is the sessionInfo():

sessionInfo()
#> R version 3.5.2 (2018-12-20)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 17134)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.1252 
#> [2] LC_CTYPE=English_United States.1252   
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_3.5.2  magrittr_1.5    tools_3.5.2     htmltools_0.3.6
#>  [5] yaml_2.2.0      Rcpp_1.0.0      stringi_1.2.4   rmarkdown_1.11 
#>  [9] highr_0.7       knitr_1.21      stringr_1.3.1   xfun_0.4       
#> [13] digest_0.6.18   evaluate_0.12

Created on 2019-02-02 by the reprex package (v0.2.1)