Lahman Baseball Data Mutate Question

I'm new here and have limited experience in R, so I would appreciate anyone's thoughts on this. I am using the Lahman package in R to look at baseball stats. For this question I am specifically looking at the Pitching table.

I'm a bit tripped up: I know what I want to do but am not sure how to do it. I think I need to use mutate to create a new column summing up all GS (games started) per pitcher per team for the seasons 1871-2021. The code I have so far is below.

library(Lahman)
library(dplyr)

# Pitchers with at least one GS on the 2011 Milwaukee Brewers (6 total)
MIL <- filter(Pitching, yearID == 2011, teamID == "MIL", GS > 0)
# All pitchers with at least one GS from 1871-2021, excluding 2022
GSPitching <- filter(Pitching, yearID < 2022, GS > 0)

As the 2011 Milwaukee Brewers example shows, I want to total the number of pitchers that had a GS per team, per year. The 2011 Brewers had 6, as seen in the MIL dataframe.

The GSPitching dataframe has all pitchers with GS > 0, i.e. those who appeared in at least one game as a starting pitcher for their team.

What is the best way to sum this up per team and per year? I think it's some type of mutate summing the count of GS, but how do you get there per team, per pitcher, in each distinct year?

I want to get that end product in a dataframe so that, for example, I can see how many pitchers had a GS for any team in those years (e.g. the 1973 Tigers, 2003 Astros) before making some visualizations from it.

Appreciate any guidance here.

I recommend you learn from this general approach:
5 Data transformation | R for Data Science

Thank you, this was helpful! Specifically, the overview of dplyr and using group_by and summarise to create data frames with the sums I wanted.
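
For anyone landing here later, the group_by and summarise approach might look roughly like the sketch below. It assumes the Lahman and dplyr packages are installed; the names starters_per_team and n_starters are made up for illustration and are not from the original post. n_distinct(playerID) is used instead of a plain row count because the Pitching table has one row per stint, so a pitcher with two stints on the same team in one season would otherwise be counted twice.

```r
library(Lahman)
library(dplyr)

# One row per team-season, with the number of distinct pitchers who
# started at least one game (GS > 0). Names here are illustrative only.
starters_per_team <- Pitching %>%
  filter(yearID < 2022, GS > 0) %>%
  group_by(yearID, teamID) %>%
  summarise(n_starters = n_distinct(playerID), .groups = "drop")

# Look up a single team-season, e.g. the 2011 Brewers from the question
mil_2011 <- starters_per_team %>%
  filter(yearID == 2011, teamID == "MIL")
print(mil_2011)
```

summarise collapses each group to a single row, which is why it fits better here than mutate: mutate would keep one row per pitcher and just repeat the team total on each of them. The resulting data frame can be filtered for any other team-season (e.g. yearID == 1973, teamID == "DET") or passed on to a plotting package.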
