A new function written for prediction that I don't understand.

Hi guys, I am working on a project to predict a specified value, but I cannot understand this newly written function. Can someone please explain it to me? Thank you all.

Here is the data.
https://drive.google.com/file/d/1Il0CQxLMbZi286hxz6J8rND8DoLn1HpM/view?usp=sharing

Here is the code for the fit.

fit1 <- lm(Bodyfat~Triceps+Thigh+Midarm, data=dat)

And here is the code for the prediction.

mypre <- function(data, mod) {
  varnames <- as.character(formula(mod))[3]
  varnames <- trimws(unlist(stringr::str_split(varnames, "\\+")))
  coef <- coef(mod)
  newdata <- data[, varnames]
  len <- length(newdata[[1]])
  y <- vector(length = len)
  for (i in 1:len) {
    y[i] <- sum(coef[-1] * unlist(newdata[i,])) + coef[1]
  }
  return(y)
}

Here's what mypre does when given dat as an argument. In practice, other data with the same columns as dat would be used.

dat <- data.frame(
  Triceps =
    c(19.5, 24.7, 30.7, 29.8, 19.1, 25.6, 31.4, 27.9, 22.1, 25.5, 31.1, 30.4, 18.7, 19.7, 14.6, 29.5, 27.7, 30.2, 22.7, 25.2),
  Thigh =
    c(43.1, 49.8, 51.9, 54.3, 42.2, 53.9, 58.5, 52.1, 49.9, 53.5, 56.6, 56.7, 46.5, 44.2, 42.7, 54.4, 55.3, 58.6, 48.2, 51),
  Midarm =
    c(29.1, 28.2, 37, 31.1, 30.9, 23.7, 27.6, 30.6, 23.2, 24.8, 30, 28.3, 23, 28.6, 21.3, 30.1, 25.7, 24.6, 27.1, 27.5),
  Bodyfat =
    c(11.9, 22.8, 18.7, 20.1, 12.9, 21.7, 27.1, 25.4, 21.3, 19.3, 25.4, 27.2, 11.7, 17.8, 12.8, 23.9, 22.6, 25.4, 14.8, 21.1)
)

fit1 <- lm(Bodyfat ~ Triceps + Thigh + Midarm, data = dat)

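mypre <- function(data, mod) {
  # Right-hand side of the model formula as one string: "Triceps + Thigh + Midarm"
  varnames <- as.character(formula(mod))[3]
  # Split on "+" and trim whitespace to get the predictor names
  varnames <- trimws(unlist(stringr::str_split(varnames, "\\+")))
  coef <- coef(mod)            # intercept first, then one slope per predictor
  newdata <- data[, varnames]  # keep only the predictor columns, in formula order
  len <- length(newdata[[1]])  # number of rows to predict
  y <- vector(length = len)
  for (i in 1:len) {
    # slope * value for each predictor, summed, plus the intercept
    y[i] <- sum(coef[-1] * unlist(newdata[i, ])) + coef[1]
  }
  return(y)
}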

mypre(dat, fit1)
#>  [1] 14.85499 20.21884 20.98668 23.12732 11.75761 22.24372 25.71432 22.27064
#>  [9] 19.59482 20.54838 24.59556 24.99231 15.00940 13.67231 11.81195 23.72747
#> [17] 22.97360 26.78590 18.52628 20.48791

mypre is a laborious way of doing what predict() already does. Here dat stands in for whatever data frame holds the new observations:

predict(fit1, newdata = dat)
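
As a rough illustration of the predict() route, the sketch below refits the model and predicts for two new observations. The data frame new_obs and its values are made up for this example and are not part of the original data; any data frame with Triceps, Thigh and Midarm columns should work the same way.

```r
dat <- data.frame(
  Triceps = c(19.5, 24.7, 30.7, 29.8, 19.1, 25.6, 31.4, 27.9, 22.1, 25.5,
              31.1, 30.4, 18.7, 19.7, 14.6, 29.5, 27.7, 30.2, 22.7, 25.2),
  Thigh   = c(43.1, 49.8, 51.9, 54.3, 42.2, 53.9, 58.5, 52.1, 49.9, 53.5,
              56.6, 56.7, 46.5, 44.2, 42.7, 54.4, 55.3, 58.6, 48.2, 51),
  Midarm  = c(29.1, 28.2, 37, 31.1, 30.9, 23.7, 27.6, 30.6, 23.2, 24.8,
              30, 28.3, 23, 28.6, 21.3, 30.1, 25.7, 24.6, 27.1, 27.5),
  Bodyfat = c(11.9, 22.8, 18.7, 20.1, 12.9, 21.7, 27.1, 25.4, 21.3, 19.3,
              25.4, 27.2, 11.7, 17.8, 12.8, 23.9, 22.6, 25.4, 14.8, 21.1)
)
fit1 <- lm(Bodyfat ~ Triceps + Thigh + Midarm, data = dat)

# Hypothetical new observations (made-up values, for illustration only)
new_obs <- data.frame(
  Triceps = c(20, 28),
  Thigh   = c(45, 55),
  Midarm  = c(25, 30)
)

# One prediction per row of new_obs
pred_new <- predict(fit1, newdata = new_obs)

# Predicting on the training data reproduces the fitted values
pred_dat <- predict(fit1, newdata = dat)
all.equal(unname(pred_dat), unname(fitted(fit1)))
```

Unlike mypre, predict() matches columns by name using the model's own formula, so it does not depend on splitting the formula text on "+".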

Most of mypre is housekeeping. The lifting is done by

    y[i] <- sum(coef[-1] * unlist(newdata[i,])) + coef[1]

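which applies the fit1 coefficients to the new data one row at a time: each slope is multiplied by the matching predictor value, the products are summed, and the intercept is added.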
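
The same arithmetic can be written without the loop as a single matrix product. The sketch below is only meant to show that the manual calculation and predict() agree on this model; the names b, X, manual and builtin are introduced here for illustration, and picking columns via names(b)[-1] assumes a plain additive model with no interactions or transformed terms.

```r
dat <- data.frame(
  Triceps = c(19.5, 24.7, 30.7, 29.8, 19.1, 25.6, 31.4, 27.9, 22.1, 25.5,
              31.1, 30.4, 18.7, 19.7, 14.6, 29.5, 27.7, 30.2, 22.7, 25.2),
  Thigh   = c(43.1, 49.8, 51.9, 54.3, 42.2, 53.9, 58.5, 52.1, 49.9, 53.5,
              56.6, 56.7, 46.5, 44.2, 42.7, 54.4, 55.3, 58.6, 48.2, 51),
  Midarm  = c(29.1, 28.2, 37, 31.1, 30.9, 23.7, 27.6, 30.6, 23.2, 24.8,
              30, 28.3, 23, 28.6, 21.3, 30.1, 25.7, 24.6, 27.1, 27.5),
  Bodyfat = c(11.9, 22.8, 18.7, 20.1, 12.9, 21.7, 27.1, 25.4, 21.3, 19.3,
              25.4, 27.2, 11.7, 17.8, 12.8, 23.9, 22.6, 25.4, 14.8, 21.1)
)
fit1 <- lm(Bodyfat ~ Triceps + Thigh + Midarm, data = dat)

b <- coef(fit1)                        # b[1] is the intercept, b[-1] the slopes
X <- as.matrix(dat[, names(b)[-1]])    # predictor columns, in coefficient order

# intercept + X %*% slopes: every row handled in one step
manual  <- drop(b[1] + X %*% b[-1])
builtin <- predict(fit1, newdata = dat)

# TRUE when the two agree up to floating-point tolerance
all.equal(unname(manual), unname(builtin))
```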
