Error: Illegal argument: training_frame of function: grid: Cannot append new models to a grid with different training input

I want to train a random forest regression model, and I have defined a hyperparameter search grid with H2O. However, I get this error:
Error: Illegal argument: training_frame of function: grid: Cannot append new models to a grid with different training input.
The error is raised in the "build grid search" section of the code below (the h2o.grid() call).
This is my code:

library(h2o)

set.seed(200)
dataset1 <- a  # `a` is a data frame

# Split the data into train (70%) and test (30%); the first column is dropped
dataset <- dataset1[, 2:length(dataset1)]
ind <- sample(2, nrow(dataset), replace = TRUE, prob = c(0.70, 0.30))
train_data <- dataset[ind == 1, ]
test_data <- dataset[ind == 2, ]

h2o.init(max_mem_size = "5g")
h2o.no_progress()

# response and feature names
y <- "Expression"
x <- setdiff(names(train_data), y)

# turn training set into h2o object
train.h2o <- as.h2o(train_data)

# hyperparameter grid
hyper_grid.h2o <- list(
  ntrees      = seq(200, 600, by = 100),
  mtries      = seq(10, 30, by = 2),
  sample_rate = c(.55, .632, .70, .80)
)

# random grid search criteria
search_criteria <- list(
  strategy = "RandomDiscrete",
  stopping_metric = "mse",
  stopping_tolerance = 0.005,
  stopping_rounds = 10,
  max_runtime_secs = 30 * 60
)

# build grid search (this is the call that raises the error)
random_grid <- h2o.grid(
  algorithm = "randomForest",
  grid_id = "rf_grid2",
  x = x,
  y = y,
  training_frame = train.h2o,
  hyper_params = hyper_grid.h2o,
  search_criteria = search_criteria
)

# collect the results and sort by our model performance metric of choice
grid_perf2 <- h2o.getGrid(
  grid_id = "rf_grid2",
  sort_by = "mse",
  decreasing = FALSE
)
print(grid_perf2)

# grab the best model
best_model_id <- grid_perf2@model_ids[[1]]
best_model <- h2o.getModel(best_model_id)

# evaluate the model performance on the test set
test_data.h2o <- as.h2o(test_data)
best_model_perf <- h2o.performance(model = best_model, newdata = test_data.h2o)

# RMSE of best model
sqrt(h2o.mse(best_model_perf))

# predict
pred_h2o <- predict(best_model, test_data.h2o)
head(pred_h2o)

How can I fix this error?
Thanks in advance.
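
A likely cause, assuming the script was run more than once in the same H2O session: a grid with the ID "rf_grid2" already exists on the cluster, and each as.h2o() call typically uploads the data under a new frame key, so h2o.grid() tries to append models to the old grid using a different training frame. One possible workaround, shown as a minimal sketch, is to give every run a fresh grid ID. make_grid_id is a hypothetical helper written for illustration, not part of the h2o package.

```r
# Hypothetical helper (not an h2o function): builds a grid ID from a prefix
# and the current timestamp, so a re-run does not reuse an existing grid.
# Note: two calls within the same second return the same ID.
make_grid_id <- function(prefix = "rf_grid") {
  paste0(prefix, "_", format(Sys.time(), "%Y%m%d_%H%M%S"))
}

grid_id <- make_grid_id()
print(grid_id)

# Assumed usage: pass the same value to both calls, e.g.
#   h2o.grid(algorithm = "randomForest", grid_id = grid_id, ...)
#   h2o.getGrid(grid_id = grid_id, sort_by = "mse", decreasing = FALSE)
```

Alternatively, calling h2o.removeAll() before rebuilding the grid clears every object on the cluster, including the old grid. It also deletes all frames and models, so it only fits when nothing else in the session needs to be kept.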
