Inconsistent results using train function

Hi All,

Could you help me with a question? My colleagues and I are running the code below, which uses the train function from caret. We are all on the most up-to-date versions of RStudio and R, and all of our libraries are the newest versions, yet we get different results. For example, the highest accuracy I get is with an mtry of 50 (the first row of the output below). When they run the same code, an mtry of 100 has the highest accuracy. I expect you will also get 100 when you run the code below. What explains the difference?

Quick notes: the script takes about 5 minutes to run. I have also provided my session info.

library(caret)   # provides train()
library(dslabs)  # provides the tissue_gene_expression data
library(rpart)
data("tissue_gene_expression")

x <- tissue_gene_expression$x
y <- tissue_gene_expression$y

# Seed once, immediately before the resampling done inside train()
set.seed(1991)
fit <- train(x, y,
             method = "rf",
             nodesize = 1,
             tuneGrid = data.frame(mtry = seq(50, 200, 25)))
fit$results

#  mtry  Accuracy     Kappa  AccuracySD     KappaSD
#1   50 0.9969167 0.9963149 0.008020859 0.009554839
#2   75 0.9940750 0.9928824 0.010599365 0.012700530
#3  100 0.9953688 0.9944118 0.010914864 0.013159516
#4  125 0.9955857 0.9946730 0.011409477 0.013757460
#5  150 0.9939767 0.9927032 0.014178379 0.017140433
#6  175 0.9921557 0.9905142 0.015418882 0.018628942
#7  200 0.9921557 0.9904997 0.014873157 0.017973473

R 3.6 changed the default method that sample() uses after set.seed (https://github.com/wch/r-source/blob/8c1c78a/src/library/base/man/Random.Rd#L173-L175). I haven't run across a solution. If you have a 3.5.3 host, run the same code there; dollars to donuts you won't see the problem.
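To see the change described above in isolation, here is a minimal base-R sketch. It assumes R 3.6.0 or later, where set.seed() accepts a sample.kind argument. The population size of 100 and the names old_draw and new_draw are illustrative only and are not taken from the original script.

```r
# Assumes R >= 3.6.0; earlier versions do not accept sample.kind.

# Pre-3.6.0 sampler. R warns that "Rounding" is non-uniform, so the
# warning is suppressed here on purpose.
suppressWarnings(set.seed(1991, sample.kind = "Rounding"))
old_draw <- sample(100, 10, replace = TRUE)

# Default sampler since 3.6.0. Setting it last also restores the default.
set.seed(1991, sample.kind = "Rejection")
new_draw <- sample(100, 10, replace = TRUE)

print(old_draw)
print(new_draw)
print(identical(old_draw, new_draw))  # same seed, possibly different draws
print(RNGkind())                      # third element is the sample.kind in use
```

If two machines report a different third element from RNGkind(), the same seed can produce different resamples inside train(), which may be enough to change which mtry comes out on top. Whether that is what happened in this thread is not established by the output shown.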

Unfortunately our results differ regardless of version of R. I upgraded to 3.6 hoping to fix the issue.
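One way to narrow this down might be for everyone involved to run the same short diagnostic and compare the output line by line. The sketch below is only a suggestion. The package list is taken from the session info in this thread, and repro_info is a made-up name.

```r
# Packages that plausibly affect the result; adjust as needed.
pkgs <- c("caret", "randomForest", "dslabs")

# Keep only the packages that are actually installed on this machine.
installed <- pkgs[vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

repro_info <- list(
  r_version = R.version.string,
  rng_kind  = RNGkind(),  # generator, normal.kind, and (R >= 3.6.0) sample.kind
  packages  = vapply(installed,
                     function(p) as.character(packageVersion(p)),
                     character(1))
)
str(repro_info)
```

Any difference in rng_kind or in the package versions would be a reasonable first place to look.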

Here's what I get with successive runs:

fit$results
mtry Accuracy Kappa AccuracySD KappaSD
1 50 0.9964757 0.9957840 0.009039221 0.01078329
2 75 0.9963612 0.9956487 0.009195473 0.01096989
3 100 0.9958712 0.9950407 0.009294355 0.01111398
4 125 0.9948835 0.9938393 0.011299499 0.01358540
5 150 0.9951198 0.9940812 0.014035389 0.01697160
6 175 0.9945401 0.9933828 0.014124440 0.01707680
7 200 0.9926509 0.9911178 0.015151838 0.01826565

fit$results
mtry Accuracy Kappa AccuracySD KappaSD
1 50 0.9964757 0.9957840 0.009039221 0.01078329
2 75 0.9963612 0.9956487 0.009195473 0.01096989
3 100 0.9958712 0.9950407 0.009294355 0.01111398
4 125 0.9948835 0.9938393 0.011299499 0.01358540
5 150 0.9951198 0.9940812 0.014035389 0.01697160
6 175 0.9945401 0.9933828 0.014124440 0.01707680
7 200 0.9926509 0.9911178 0.015151838 0.01826565

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rpart_4.1-15    dslabs_0.5.2    caret_6.0-84    ggplot2_3.1.1   lattice_0.20-38

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1          pillar_1.3.1        compiler_3.6.0      gower_0.2.0         plyr_1.8.4         
 [6] iterators_1.0.10    class_7.3-15        tools_3.6.0         ipred_0.9-9         lubridate_1.7.4    
[11] tibble_2.1.1        nlme_3.1-139        gtable_0.3.0        pkgconfig_2.0.2     rlang_0.3.4        
[16] Matrix_1.2-17       foreach_1.4.4       prodlim_2018.04.18  e1071_1.7-1         stringr_1.4.0      
[21] withr_2.1.2         dplyr_0.8.0.1       generics_0.0.2      recipes_0.1.5       stats4_3.6.0       
[26] grid_3.6.0          nnet_7.3-12         tidyselect_0.2.5    data.table_1.12.2   glue_1.3.1         
[31] R6_2.4.0            survival_2.44-1.1   lava_1.6.5          reshape2_1.4.3      purrr_0.3.2        
[36] magrittr_1.5        ModelMetrics_1.2.2  scales_1.0.0        codetools_0.2-16    MASS_7.3-51.4      
[41] splines_3.6.0       randomForest_4.6-14 assertthat_0.2.1    timeDate_3043.102   colorspace_1.4-1   
[46] stringi_1.4.3       lazyeval_0.2.2      munsell_0.5.0       crayon_1.3.4       
