Calculating averages within a mutate

mutate(aggregate_attrib = (attribution_scoreQ2 + attribution_scoreQ27 + attribution_scoreQ5 + attribution_scoreQ16 + attribution_scoreQ14 + attribution_scoreQ13 + attribution_scoreQ21)/7)

I want to create a variable that basically takes the average of all these 7 variables combined. How would I go about doing that? This was my intuition. What am I doing wrong?

I expect you got row-wise averages of those columns? If any of the seven values in a row is NA, the result for that row will be NA too. I'm assuming that dplyr::summarise isn't suitable for you here. If it isn't, and you want group-level averages, you could try this version, which is similar to yours:


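mutate(
    # mean of the seven column totals, dropping NAs at both steps;
    # `.` is the data frame piped in with %>%
    aggregate_attrib = mean(
        colSums(
            select(.,
                attribution_scoreQ2,
                attribution_scoreQ27,
                attribution_scoreQ5,
                attribution_scoreQ16,
                attribution_scoreQ14,
                attribution_scoreQ13,
                attribution_scoreQ21
            ), na.rm = TRUE
        ), na.rm = TRUE
    )
)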

I don't think the colSums version will respect any groupings, because `.` refers to the whole data frame piped into mutate. You might need to replace the colSums/mean section with something like this:

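mutate(
    # across() gives one sum per column (per group, if grouped);
    # unlist() turns that one-row result into a plain vector for mean()
    aggregate_attrib = mean(
        unlist(
            across(
                c(
                    attribution_scoreQ2,
                    attribution_scoreQ27,
                    attribution_scoreQ5,
                    attribution_scoreQ16,
                    attribution_scoreQ14,
                    attribution_scoreQ13,
                    attribution_scoreQ21
                ),
                ~ sum(.x, na.rm = TRUE)
            )
        )
    )
)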

Also, this is slightly tricky because your data isn't 'tidy' (one row per observation), though perhaps it will be if this average is your variable of interest. If not, you might want to consider a tidyr::pivot_longer call to create a combined attribution_score variable and a score_type variable.
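In case it helps, here is a rough sketch of that pivot_longer idea. The toy data, the `id` column and the object names are made up for illustration (only three of the seven score columns are included), so adapt them to your own data:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two respondents, three of the score columns
scores <- tibble(
  id = c("A", "B"),
  attribution_scoreQ2 = c(1, 4),
  attribution_scoreQ27 = c(2, 5),
  attribution_scoreQ5 = c(3, NA)
)

# One row per respondent and question:
# score_type holds "Q2", "Q27", "Q5"; attribution_score holds the value
long_scores <- scores %>%
  pivot_longer(
    cols = starts_with("attribution_score"),
    names_to = "score_type",
    names_prefix = "attribution_score",
    values_to = "attribution_score"
  )

# One average per respondent, ignoring missing answers
respondent_means <- long_scores %>%
  group_by(id) %>%
  summarise(aggregate_attrib = mean(attribution_score, na.rm = TRUE))
```

Once the data is long, the average is an ordinary group_by/summarise, and adding or dropping a question doesn't change the code.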

Take a look at this:

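library("tidyverse")

# toy data: ten students with three random scores each
my_data <- tibble(
  student_id = LETTERS[1:10],
  score_1 = runif(10),
  score_2 = runif(10),
  score_3 = runif(10)
)

my_data %>%
  rowwise() %>%   # treat each row as its own group
  mutate(score_mean = mean(c(score_1, score_2, score_3))) %>%
  ungroup()       # drop the row-wise grouping again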
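If there are NAs in the scores, or the data is large enough that rowwise feels slow, a possible alternative is rowMeans over across(). This is only a sketch on made-up data (the `my_scores` tibble and the new column names are hypothetical):

```r
library(dplyr)

# Hypothetical toy data with one missing score
my_scores <- tibble(
  student_id = c("A", "B", "C"),
  score_1 = c(1, 2, 3),
  score_2 = c(3, NA, 5),
  score_3 = c(5, 4, 7)
)

result <- my_scores %>%
  mutate(
    # strict: NA if any score in the row is missing
    mean_strict = rowMeans(across(c(score_1, score_2, score_3))),
    # lenient: average of the non-missing scores in each row
    mean_na_rm = rowMeans(across(c(score_1, score_2, score_3)), na.rm = TRUE)
  )
```

The columns are listed explicitly rather than picked with starts_with(), so that a column created earlier in the same mutate can't be swept into the average by accident.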

Hope it helps! 🙂
