How to average/mean variables in R based on the level of another variable (and save this as a new variable)?

jcblum · May 23, 2018, 4:31am

Here's a tidyverse approach to summarizing your data:

library(tidyverse)

Paper <- c(
  1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 3, 3, 4, 4, 4, 4, 
  4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 7, 7, 7, 7, 7, 
  7, 8, 8, 8, 8, 9, 9, 9
)

selectedES.prepost <- c(
  0.0048, -0.1420, -0.3044, -1.3024, -0.4052, -0.6066, -0.1961, 
  -1.1187, -0.4585, -0.8251, -0.5328, -1.3623, -0.5450, -0.4982, 
  -0.5714, -0.8793, -0.3677, -0.3976, -0.6136, -0.7047, -0.8580, 
  -0.5024, -0.8018, -0.8927, -0.3106, -0.5893, -0.6677, -1.6663, 
  -1.1769, -0.8384, -0.5632, -0.5237, -0.3458, -0.9957, -0.5331, 
  -0.7413, -0.0311, -0.4936, 0.5422, -0.0340
)

# creating a test dataset
mydata <- data.frame(Paper, selectedES.prepost)

mean_by_Paper <- mydata %>% 
  group_by(Paper) %>% 
  summarize(averaged.ES = mean(selectedES.prepost))

mean_by_Paper
#> # A tibble: 9 x 2
#>   Paper averaged.ES
#>   <dbl>       <dbl>
#> 1     1    -0.509  
#> 2     2    -0.458  
#> 3     3    -0.907  
#> 4     4    -0.572  
#> 5     5    -0.659  
#> 6     6    -0.668  
#> 7     7    -0.852  
#> 8     8    -0.575  
#> 9     9     0.00487

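# You don't have to stop at the mean: summarize() can compute several
# statistics at once. Note that sd() returns NA for a Paper with only
# one row (n = 1), as for Papers 2 and 6 below.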
by_Paper <- mydata %>% 
  group_by(Paper) %>% 
  summarize(
    averaged.ES = mean(selectedES.prepost),
    sd.ES = sd(selectedES.prepost),
    n = n()
  )

by_Paper
#> # A tibble: 9 x 4
#>   Paper averaged.ES  sd.ES     n
#>   <dbl>       <dbl>  <dbl> <int>
#> 1     1    -0.509    0.472     8
#> 2     2    -0.458   NA         1
#> 3     3    -0.907    0.421     3
#> 4     4    -0.572    0.166     8
#> 5     5    -0.659    0.230     6
#> 6     6    -0.668   NA         1
#> 7     7    -0.852    0.493     6
#> 8     8    -0.575    0.409     4
#> 9     9     0.00487  0.519     3

Created on 2018-05-22 by the reprex package (v0.2.0).
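
Since the question also asks about saving the result as a new variable, here is a minimal sketch of one way to do that, using mutate() instead of summarize() so the per-Paper mean is attached to every original row. The five-row dataset is a made-up stand-in (not the real data) so the example stays self-contained, and the name mydata_with_mean is only illustrative.

```r
library(dplyr)

# Hypothetical toy data with the same column names as the example above
mydata <- data.frame(
  Paper = c(1, 1, 2, 3, 3),
  selectedES.prepost = c(0.5, -0.5, -0.25, -1, -2)
)

# mutate() keeps one row per original observation and repeats each
# Paper's mean on every row that belongs to that Paper
mydata_with_mean <- mydata %>%
  group_by(Paper) %>%
  mutate(averaged.ES = mean(selectedES.prepost)) %>%
  ungroup()  # drop the grouping so later steps are not grouped by accident

mydata_with_mean
```

If you have already built a summary table such as mean_by_Paper, joining it back with left_join(mydata, mean_by_Paper, by = "Paper") should give an equivalent result.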

Does that help?