Creating an average for individuals in a given year

Hi,

I'm having trouble creating a mean score for individuals in my dataset. I want the average for each individual in each year and am struggling to do this.
Here is the dput:
df1 <- structure(list(monkey.id = c("00J", "00J", "00J", "00J", "00J",
"00J", "00J", "00J", "00J", "00J", "00J", "00J", "00J", "00J",
"00J", "00J", "00J", "00O", "00O", "00O"), partner.id = c("03J",
"10S", "14L", "28N", "32S", "35N", "43J", "50S", "52P", "55V",
"57H", "60P", "63V", "64P", "90D", "98T", "K00", "14L", "28N",
"29Z"), data.year = c(2010L, 2010L, 2010L, 2010L, 2010L, 2010L,
2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L,
2010L, 2010L, 2010L, 2010L, 2010L), current.group = c("F", "F",
"F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F",
"F", "F", "F", "F", "F"), DCSI = c(1.27636402820631, 1.40971549383981,
0.755607504698138, 0.646924233474433, 0.638182014103157, 0.638182014103157,
1.33972961825911, 1.9407727004233, 1.29384846694887, 0.646924233474433,
0.704857746919905, 1.27636402820631, 15.0677748544049, 0.651385779912188,
0.699636578424202, 1.33029490263757, 0.651385779912188, 0.814232224890235,
1.37884581149295, 0.679503151706959)), row.names = c(2L, 4L,
6L, 11L, 14L, 16L, 21L, 24L, 25L, 26L, 29L, 31L, 35L, 37L, 48L,
50L, 51L, 62L, 67L, 69L), class = "data.frame")

Each individual has its own unique ID in the "monkey.id" variable. Each has social partners (partner.id), and each relationship is characterised by its strength (DCSI). I have data for several years.
What I want is the average DCSI score that an individual has across all its social partners in a given year. The idea is that a higher average DCSI score means a monkey is more sociable.
Can anybody help please?

This is easily done with the dplyr package: group the data by monkey and year, then take the mean of DCSI within each group.

library(dplyr)

df1 <- structure(list(monkey.id = c("00J", "00J", "00J", "00J", "00J",
                                    "00J", "00J", "00J", "00J", "00J", "00J", "00J", "00J", "00J",
                                    "00J", "00J", "00J", "00O", "00O", "00O"), 
                      partner.id = c("03J","10S", "14L", "28N", "32S", "35N", "43J", "50S", "52P", "55V",
                                     "57H", "60P", "63V", "64P", "90D", "98T", "K00", "14L", "28N",
                                     "29Z"), 
                      data.year = c(2010L, 2010L, 2010L, 2010L, 2010L, 2010L,
                                    2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L,
                                    2010L, 2010L, 2010L, 2010L, 2010L), 
                      current.group = c("F", "F","F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F",
                                        "F", "F", "F", "F", "F"), 
                      DCSI = c(1.27636402820631, 1.40971549383981,
                               0.755607504698138, 0.646924233474433, 0.638182014103157, 0.638182014103157,
                               1.33972961825911, 1.9407727004233, 1.29384846694887, 0.646924233474433,
                               0.704857746919905, 1.27636402820631, 15.0677748544049, 0.651385779912188,
                               0.699636578424202, 1.33029490263757, 0.651385779912188, 0.814232224890235,
                               1.37884581149295, 0.679503151706959)), 
                 row.names = c(2L, 4L,6L, 11L, 14L, 16L, 21L, 24L, 25L, 26L, 29L, 31L, 35L, 37L, 48L,50L, 
                               51L, 62L, 67L, 69L), class = "data.frame")

# Mean DCSI across all partners, for each monkey in each year
STATS <- df1 |> 
  group_by(monkey.id, data.year) |> 
  summarize(Avg = mean(DCSI))
#> `summarise()` has grouped output by 'monkey.id'. You can override using the
#> `.groups` argument.

Created on 2022-06-22 by the reprex package (v2.0.1)
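
A couple of variations that may be useful, sketched on a small toy data frame (the `toy` values below are made up for illustration and are not the poster's data; `n.partners` and `toy_with_avg` are names I've introduced). Passing `.groups = "drop"` to `summarise()` returns an ungrouped result and silences the message shown above, and `mutate()` can be used instead if you would rather keep one row per dyad and attach the monkey-year average as a new column.

```r
library(dplyr)

# Toy data with made-up DCSI values (an assumption for illustration only)
toy <- data.frame(
  monkey.id  = c("A", "A", "A", "A", "B", "B"),
  partner.id = c("P1", "P2", "P1", "P3", "P1", "P2"),
  data.year  = c(2010L, 2010L, 2011L, 2011L, 2010L, 2010L),
  DCSI       = c(1, 3, 2, 6, 0.5, 1.5)
)

# One row per monkey and year, with the number of partner rows alongside the mean.
# .groups = "drop" returns an ungrouped tibble and suppresses the grouping message.
stats <- toy |>
  group_by(monkey.id, data.year) |>
  summarise(Avg = mean(DCSI), n.partners = n(), .groups = "drop")

# Alternative: keep every dyad row and add the monkey-year mean as a column
toy_with_avg <- toy |>
  group_by(monkey.id, data.year) |>
  mutate(Avg = mean(DCSI)) |>
  ungroup()
```

If DCSI can contain missing values, `mean(DCSI, na.rm = TRUE)` would skip them. Whether that is appropriate depends on why the values are missing.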

Thanks very much for your help.
