A new function written for prediction that I don't understand.

Newwww · April 20, 2021, 9:10pm

Hi guys, I am working on a project to predict a specified value, but I cannot understand this newly written function. Can someone please explain it to me? Thank you all.

Here is the data.
https://drive.google.com/file/d/1Il0CQxLMbZi286hxz6J8rND8DoLn1HpM/view?usp=sharing

Here is the code for the fit.

fit1 <- lm(Bodyfat~Triceps+Thigh+Midarm, data=dat)

Then here is the code for prediction.

mypre <- function(data, mod) {
  varnames <- as.character(formula(mod))[3]
  varnames <- trimws(unlist(stringr::str_split(varnames, "\\+")))
  coef <- coef(mod)
  newdata <- data[, varnames]
  len <- length(newdata[[1]])
  y <- vector(length = len)
  for (i in 1:len) {
    y[i] <- sum(coef[-1] * unlist(newdata[i,])) + coef[1]
  }
  return(y)
}

technocrat · April 20, 2021, 10:30pm

Here's what mypre does, given dat as an argument. In practice, other data in the same form as dat would be used.

dat <- data.frame(
  Triceps =
    c(19.5, 24.7, 30.7, 29.8, 19.1, 25.6, 31.4, 27.9, 22.1, 25.5, 31.1, 30.4, 18.7, 19.7, 14.6, 29.5, 27.7, 30.2, 22.7, 25.2),
  Thigh =
    c(43.1, 49.8, 51.9, 54.3, 42.2, 53.9, 58.5, 52.1, 49.9, 53.5, 56.6, 56.7, 46.5, 44.2, 42.7, 54.4, 55.3, 58.6, 48.2, 51),
  Midarm =
    c(29.1, 28.2, 37, 31.1, 30.9, 23.7, 27.6, 30.6, 23.2, 24.8, 30, 28.3, 23, 28.6, 21.3, 30.1, 25.7, 24.6, 27.1, 27.5),
  Bodyfat =
    c(11.9, 22.8, 18.7, 20.1, 12.9, 21.7, 27.1, 25.4, 21.3, 19.3, 25.4, 27.2, 11.7, 17.8, 12.8, 23.9, 22.6, 25.4, 14.8, 21.1)
)

fit1 <- lm(Bodyfat ~ Triceps + Thigh + Midarm, data = dat)

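mypre <- function(data, mod) {
  # right-hand side of the model formula, e.g. "Triceps + Thigh + Midarm"
  varnames <- as.character(formula(mod))[3]
  # split on "+" and trim whitespace to get the predictor names
  varnames <- trimws(unlist(stringr::str_split(varnames, "\\+")))
  # fitted coefficients: intercept first, then one slope per predictor
  coef <- coef(mod)
  # keep only the predictor columns, in formula order
  newdata <- data[, varnames]
  len <- length(newdata[[1]])  # number of rows
  y <- vector(length = len)    # preallocate the result
  for (i in 1:len) {
    # slopes times the predictor values of row i, summed, plus the intercept
    y[i] <- sum(coef[-1] * unlist(newdata[i, ])) + coef[1]
  }
  return(y)
}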

mypre(dat, fit1)
#>  [1] 14.85499 20.21884 20.98668 23.12732 11.75761 22.24372 25.71432 22.27064
#>  [9] 19.59482 20.54838 24.59556 24.99231 15.00940 13.67231 11.81195 23.72747
#> [17] 22.97360 26.78590 18.52628 20.48791
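To see what the first two lines of mypre produce, here is a minimal sketch that inspects the formula on its own. It uses base R's strsplit as a stand-in for stringr::str_split (my substitution, not what mypre calls), and the object name f is just for illustration.

```r
# The formula used for fit1; no data is needed to inspect it
f <- Bodyfat ~ Triceps + Thigh + Midarm

# as.character() on a two-sided formula gives the operator, the
# left-hand side, and the right-hand side, in that order
parts <- as.character(f)
parts[3]
#> [1] "Triceps + Thigh + Midarm"

# Split the right-hand side on "+" and trim the surrounding spaces
varnames <- trimws(unlist(strsplit(parts[3], "+", fixed = TRUE)))
varnames
#> [1] "Triceps" "Thigh"   "Midarm"

# A shorter base R route: all variables in the formula, minus the response
all.vars(f)[-1]
#> [1] "Triceps" "Thigh"   "Midarm"
```

Note that splitting on "+" only works for simple additive formulas like this one; interactions or transformed terms would need a different approach.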

mypre is a laborious way of accomplishing what predict() already does:

predict(fit1, newdata = dat)

Most of mypre is housekeeping. The lifting is done by

    y[i] <- sum(coef[-1] * unlist(newdata[i,])) + coef[1]

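which simply applies the fit1 coefficients to the new data one row at a time: each predictor value is multiplied by its slope, the products are summed, and the intercept coef[1] is added.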
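The same arithmetic can be sketched without the thread's data. The example below is an illustration under assumptions: it uses the built-in mtcars data and a made-up model (mpg ~ wt + hp + disp) as stand-ins so that it runs on its own, and the names fit, b, X, by_row and by_matrix are mine. The same lines should carry over to dat and fit1.

```r
# Stand-in model: mtcars ships with R, so no external data is needed
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)
b <- coef(fit)  # b[1] is the intercept, b[-1] are the slopes

# Predictor columns, in the same order as the slopes
X <- as.matrix(mtcars[, names(b)[-1]])

# Row by row, mirroring the loop in mypre
by_row <- vapply(
  seq_len(nrow(X)),
  function(i) sum(b[-1] * X[i, ]) + b[[1]],
  numeric(1)
)

# All rows at once: intercept plus the matrix product of X and the slopes
by_matrix <- drop(b[[1]] + X %*% b[-1])

# Both should agree with predict() on the same data
all.equal(unname(by_row), unname(predict(fit, newdata = mtcars)))
all.equal(unname(by_matrix), unname(predict(fit, newdata = mtcars)))
```

In everyday use predict() is still the better choice, since it also handles factors, interactions and transformed terms that this hand-rolled arithmetic does not.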
