Cross validation using K folds

I am trying to use the function below to carry out cross-validation on some regression models using K folds instead of a single validation set. I am struggling to work out how to adapt the code to test a model fitted with LASSO regression. I would really appreciate any help with this. Many thanks.

## Set the seed to make the analysis reproducible
set.seed(1)

## 10-fold cross validation
nfolds = 10
## Sample fold-assignment index
## (n is the number of observations; assumed here to equal nrow(X1))
n = nrow(X1)
fold_index = sample(nfolds, n, replace=TRUE)
## Print first few fold-assignments
head(fold_index)

reg_cv = function(X1, y, fold_ind) {
  Xy = data.frame(X1, y=y)
  nfolds = max(fold_ind)
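  ## all.equal() returns a character string (not FALSE) on a mismatch, so wrap it in isTRUE()
  if(!isTRUE(all.equal(sort(unique(fold_ind)), 1:nfolds))) stop("Invalid fold partition.")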
  cv_errors = numeric(nfolds)
  for(fold in 1:nfolds) {
    tmp_fit = lm(y ~ ., data=Xy[fold_ind!=fold,])
    yhat = predict(tmp_fit, Xy[fold_ind==fold,])
    yobs = y[fold_ind==fold]
    cv_errors[fold] = mean((yobs - yhat)^2)
  }
  fold_sizes = numeric(nfolds)
  for(fold in 1:nfolds) fold_sizes[fold] = length(which(fold_ind==fold))
  test_error = weighted.mean(cv_errors, w=fold_sizes)
  return(test_error)
}
## LASSO fit (glmnet() comes from the glmnet package)
library(glmnet)
lasso_fit = glmnet(
  X1,
  y,
  family = "binomial",
  alpha = 1,
  standardize = FALSE,
  lambda = grid
)

Thanks for your help!

fold_ind contains a consecutive set of integers, starting at 1. On line 4, the function creates a vector to hold the mean squared error (MSE) computed for each fold. It then loops over the folds.
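To check my understanding of the last step of the function (the fold-size-weighted mean), here is a minimal sketch on simulated data. The data, `n = 53`, and the balanced fold assignment via `sample(rep(...))` are my own assumptions for illustration and are not part of the original analysis. It suggests that weighting the per-fold MSEs by fold size gives the same number as the MSE pooled over all held-out predictions.

```r
## Sketch: fold-size-weighted mean of per-fold MSEs vs. pooled MSE.
## All data here are simulated purely for illustration.
set.seed(1)
n = 53
nfolds = 5
## Balanced assignment (an assumption, differs from sample(nfolds, n, replace=TRUE)):
## guarantees every fold is non-empty, with sizes 11, 11, 11, 10, 10
fold_ind = sample(rep(1:nfolds, length.out = n))
X1 = data.frame(x1 = rnorm(n), x2 = rnorm(n))
y = 1 + 2 * X1$x1 + rnorm(n)
Xy = data.frame(X1, y = y)

cv_errors = numeric(nfolds)
fold_sizes = numeric(nfolds)
sq_err = numeric(n)                       # squared error of each held-out prediction
for(fold in 1:nfolds) {
  test = fold_ind == fold
  tmp_fit = lm(y ~ ., data = Xy[!test, ])
  yhat = predict(tmp_fit, Xy[test, ])
  sq_err[test] = (y[test] - yhat)^2
  cv_errors[fold] = mean(sq_err[test])
  fold_sizes[fold] = sum(test)
}
weighted_cv = weighted.mean(cv_errors, w = fold_sizes)
pooled_cv = mean(sq_err)
all.equal(weighted_cv, pooled_cv)
```

With folds of unequal size, a plain `mean(cv_errors)` would generally differ slightly from the pooled value, which is presumably why the function uses `weighted.mean`.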

I am required to carry out the cross-validation in a way that gives a fair comparison, which means ensuring that the same folds are used for each model.
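One way this might look for the LASSO is sketched below. It is only a rough sketch under several assumptions: the function name `lasso_cv` is hypothetical, the response is taken to be continuous (so `family = "gaussian"` rather than `"binomial"`), a single fixed `lambda` value is evaluated rather than a whole grid, and `glmnet`'s default standardization is left on. The structure mirrors `reg_cv`, with `lm` swapped for `glmnet` and the data frame swapped for a numeric matrix.

```r
library(glmnet)   # assumed to be installed

## Sketch of a LASSO counterpart to reg_cv (hypothetical name: lasso_cv).
## Assumes a continuous response and a single, fixed value of lambda.
lasso_cv = function(X1, y, fold_ind, lambda) {
  X = as.matrix(X1)                       # glmnet expects a numeric matrix
  nfolds = max(fold_ind)
  if(!isTRUE(all.equal(sort(unique(fold_ind)), 1:nfolds))) stop("Invalid fold partition.")
  cv_errors = numeric(nfolds)
  fold_sizes = numeric(nfolds)
  for(fold in 1:nfolds) {
    test = fold_ind == fold
    ## Fit on the training folds only
    tmp_fit = glmnet(X[!test, , drop = FALSE], y[!test],
                     family = "gaussian", alpha = 1, lambda = lambda)
    ## Predict on the held-out fold at the same lambda
    yhat = predict(tmp_fit, newx = X[test, , drop = FALSE], s = lambda)
    cv_errors[fold] = mean((y[test] - yhat)^2)
    fold_sizes[fold] = sum(test)
  }
  weighted.mean(cv_errors, w = fold_sizes)
}

## Illustrative usage on simulated data (not the original data set)
set.seed(1)
n = 100
X1 = data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y = 1 + 2 * X1$x1 - X1$x2 + rnorm(n)
fold_index = sample(rep(1:10, length.out = n))
lasso_mse = lasso_cv(X1, y, fold_index, lambda = 0.1)
lasso_mse
```

Passing the same `fold_index` to `reg_cv(X1, y, fold_index)` should then give a test error computed on exactly the same partition, which is the fair comparison I am after. To tune `lambda` on the same folds, `cv.glmnet` also accepts a `foldid` argument.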

Thanks for your help
