My book provides the following example of cross-validation:
housing<-read.table("http://www.jaredlander.com/data/housing.csv",sep=",",header=TRUE,stringsAsFactors=FALSE)
names(housing)<-c("neighborhood","class","units","yearbuilt","sqft","income","incomepersqft","expense","valuepersqft","boro")
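cv.work <- function(fun, k, data,
                    cost = function(y, yhat) mean((y - yhat)^2),
                    response = "y", ...)
{
    # randomly assign each row of the data to one of k folds
    folds <- data.frame(Fold = sample(rep(x = 1:k, length.out = nrow(data))),
                        Row = 1:nrow(data))
    error <- 0
    for(f in 1:max(folds$Fold))
    {
        # rows held out for this fold
        theRows <- folds$Row[folds$Fold == f]
        # fit on the remaining rows, then predict the held-out rows
        mod <- fun(data = data[-theRows, ], ...)
        pred <- predict(mod, data[theRows, ])
        # add this fold's cost, weighted by its share of the rows
        error <- error + cost(data[theRows, response], pred) * (length(theRows) / nrow(data))
    }
    return(error)
}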
cv1<-cv.work(lm,5,housing,response = "valuepersqft",formula=valuepersqft~units*sqft+boro)
I ran the book's code as written, but I get a different result from the one in the book. In addition, the numeric value of cv1 changes every time I rerun the cv.work call.
Why do the same code and the same dataset produce different numbers?
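To narrow this down, here is a minimal sketch that isolates only the fold-assignment step of cv.work. The helper name assign_folds, the values n = 10 and k = 5, and the seed 1234 are my own illustrative choices, not from the book. It checks whether the variation could come from the call to sample(), under the assumption that no seed is fixed anywhere else in the session.

```r
# Fold-assignment step from cv.work, isolated into a hypothetical helper.
# n (number of rows) and k (number of folds) are illustrative values.
assign_folds <- function(n, k) {
    sample(rep(x = 1:k, length.out = n))
}

# Without a fixed seed, two calls will usually give different assignments.
a <- assign_folds(10, 5)
b <- assign_folds(10, 5)
print(a)
print(b)

# Setting the same seed before each call makes the assignment repeatable.
set.seed(1234)
c1 <- assign_folds(10, 5)
set.seed(1234)
c2 <- assign_folds(10, 5)
print(identical(c1, c2))
```

If the fold assignment differs between runs, the rows used for fitting and prediction differ too, so the weighted error returned by cv.work would not be expected to match exactly from one run to the next.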
Thanks!