Analyzing the Lahman library on Baseball

What is wrong with this code?

lm_data <- Teams %>%
  filter(yearID %in% 1961:2018) %>%
  mutate(BB = BB/G, HR = HR/G, R = R/G) %>%
  group_by(yearID) %>%
  filter(term = "BB") %>%
  do(tidy(lm(R ~ BB + HR, data = .)))

If I understand your code correctly, you're predicting runs (R) from walks (BB) and home runs (HR), each per game. The following would do that:

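library(tidyverse)
library(Lahman)

# lm() ignores dplyr grouping, so group_by(yearID) is left out here:
# this fits a single model across all seasons from 1961 to 2018
lm_data <- Teams %>%
  filter(yearID %in% 1961:2018) %>%
  mutate(BB = BB / G, HR = HR / G, R = R / G) %>%  # season totals to per-game rates
  lm(R ~ BB + HR, data = .)
summary(lm_data)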

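As for your filter() call: filter(term = "BB") fails because = is read as a named argument rather than the comparison ==, and there is no term column until tidy() has run. You can either get rid of it or filter walks (BB) by a number, for example filter(BB > 3.0).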
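
If what you were after was one regression per season, keeping only the BB coefficient (which is what the do(tidy(...)) and term filter in your code suggest, though that is my guess), a rough sketch might look like this. It assumes the broom package is installed for tidy(); the name bb_by_year is just made up for the example.

```r
library(tidyverse)
library(broom)    # provides tidy()
library(Lahman)   # provides the Teams data frame

bb_by_year <- Teams %>%
  filter(yearID %in% 1961:2018) %>%
  mutate(BB = BB / G, HR = HR / G, R = R / G) %>%  # per-game rates
  group_by(yearID) %>%
  do(tidy(lm(R ~ BB + HR, data = .))) %>%          # one fitted model per season
  ungroup() %>%
  filter(term == "BB")                             # term exists only after tidy()

head(bb_by_year)
```

The filter on term comes last because that column is created by tidy(), and it uses == rather than =. Note that do() is marked as superseded in current dplyr, so this is one way to do it, not necessarily the recommended one.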

Thank you so much for your clarity on this code.
