Recommendations for Summing over Different Variables

Hi Everyone,

I am relatively new to working with large data sets and am looking for some guidance on how to proceed. I don't have any code yet, as I'm still working out the best way to do what I want. I have an idea of what the code should accomplish, but I'm not sure of the best way to actually do it.

I'm including a snapshot of the table I'm using, so you can have a visual of the columns I'm working with. I'm not sure what else is necessary, but let me know if I can include anything else.

Basically, I am working with a dataset I created that has a list of values in the "Pitcher.ID" column. I want to sum the "Weighted Pitch Count" column for each unique Pitcher.ID number, so that I end up with each unique "Pitcher.ID" value alongside its total.

Thanks in advance!

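Let's say your data frame is called DF. Then you can get the sum of Weighted.Pitch.Count for each pitcher (assuming the ID column is named Pitcher.ID, as in your post) with

library(dplyr)
# Group rows by pitcher ID, then total the weighted pitch counts within each group
Stat <- DF %>% 
    group_by(Pitcher.ID) %>% 
    summarize(SumOfPitch = sum(Weighted.Pitch.Count))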
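
If you would rather not load a package, here is a minimal base R sketch of the same grouped sum. The toy data below is made up purely for illustration, and it assumes the columns are named Pitcher.ID and Weighted.Pitch.Count as described in the question; adjust the names to match your actual table.

```r
# Hypothetical toy data standing in for the real table
DF <- data.frame(
  Pitcher.ID = c(101, 102, 101, 103, 102, 101),
  Weighted.Pitch.Count = c(10.5, 8.0, 12.0, 7.5, 9.0, 4.5)
)

# Sum Weighted.Pitch.Count within each Pitcher.ID (base R, no packages needed)
Stat <- aggregate(Weighted.Pitch.Count ~ Pitcher.ID, data = DF, FUN = sum)

# Rename the summed column so the result matches the dplyr version
names(Stat)[2] <- "SumOfPitch"

print(Stat)
```

The result has one row per unique Pitcher.ID. One difference to be aware of if your data has missing values: the formula interface of aggregate() drops rows containing NA by default, whereas sum() inside summarize() returns NA for a group unless you pass na.rm = TRUE.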
