Calculating averages within a mutate

mutate(aggregate_attrib = (attribution_scoreQ2 + attribution_scoreQ27 + attribution_scoreQ5 + attribution_scoreQ16 + attribution_scoreQ14 + attribution_scoreQ13 + attribution_scoreQ21)/7)

I want to create a variable that basically takes the average of all these 7 variables combined. How would I go about doing that? This was my intuition. What am I doing wrong?

I expect you got a separate average for each row of those columns? If any of the seven values in a row is NA, that row's average will be NA too. I'm assuming that dplyr::summarise isn't suitable for you here. If it isn't, and you want group-level averages, you could try this version, which is similar to yours:


mutate(
    # `.` is the data frame piped into mutate(); this takes the mean of the seven column sums
    aggregate_attrib = mean(
        colSums(
            select(.,
                attribution_scoreQ2,
                attribution_scoreQ27,
                attribution_scoreQ5,
                attribution_scoreQ16,
                attribution_scoreQ14,
                attribution_scoreQ13,
                attribution_scoreQ21
            ), na.rm = TRUE
        ), na.rm = TRUE
    )
)

The colSums call works on the whole data frame passed in as `.`, so it will not respect any groupings. For group-level values you might need to convert the colSums %>% mean section into something like this:

mutate(
    # Sum each column within the current group, then average the seven sums
    aggregate_attrib = mean(
        unlist(
            across(
                c(
                    attribution_scoreQ2,
                    attribution_scoreQ27,
                    attribution_scoreQ5,
                    attribution_scoreQ16,
                    attribution_scoreQ14,
                    attribution_scoreQ13,
                    attribution_scoreQ21
                ),
                ~ sum(.x, na.rm = TRUE)
            )
        )
    )
)

Also, this is slightly tricky because your data isn't 'tidy' (tidy data has one row per observation), though perhaps it will be if this average is your variable of interest? If not, you might want to consider a tidyr::pivot_longer call to create a combined attribution_score variable and a score_type variable.
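As a rough sketch of that pivot_longer idea: the data frame `survey`, the `respondent_id` column, and the three example values below are made up for illustration, and only three of the seven question columns are shown. It assumes the score columns all share the `attribution_score` prefix.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one row per respondent, one column per question
survey <- tibble(
  respondent_id = 1:3,
  attribution_scoreQ2 = c(1, 2, 3),
  attribution_scoreQ5 = c(2, NA, 4),
  attribution_scoreQ13 = c(3, 4, 5)
)

# Reshape to one row per respondent and question;
# names_prefix strips "attribution_score" so score_type holds "Q2", "Q5", ...
survey_long <- survey %>%
  pivot_longer(
    cols = starts_with("attribution_score"),
    names_to = "score_type",
    names_prefix = "attribution_score",
    values_to = "attribution_score"
  )

# Average per respondent, ignoring missing answers
respondent_means <- survey_long %>%
  group_by(respondent_id) %>%
  summarise(aggregate_attrib = mean(attribution_score, na.rm = TRUE))
```

Once the data is long, the same group_by and summarise pattern gives per-question or per-group averages just by changing the grouping variable.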

Take a look at this:

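library("tidyverse")

# Example data: ten students with three random scores each
my_data <- tibble(
  student_id = LETTERS[1:10],
  score_1 = runif(10),
  score_2 = runif(10),
  score_3 = runif(10)
)

# rowwise() makes mean() run once per row instead of over whole columns
my_data %>%
  rowwise() %>%
  mutate(score_mean = mean(c(score_1, score_2, score_3))) %>%
  ungroup()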
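If some scores are missing, a variation on the same idea might look like the sketch below. The `scores` data frame is made up for illustration. It swaps rowwise() for base R's rowMeans() on a cbind() of the columns, which is one possible alternative rather than the approach used above, and it compares the result with plain arithmetic to show how NA propagates.

```r
library(dplyr)

# Hypothetical scores with one missing value (student B, score_2)
scores <- tibble(
  student_id = c("A", "B", "C"),
  score_1 = c(1, 2, 3),
  score_2 = c(3, NA, 5),
  score_3 = c(5, 6, 7)
)

scores_with_mean <- scores %>%
  mutate(
    # Plain arithmetic propagates NA, so student B gets NA here
    score_mean_plain = (score_1 + score_2 + score_3) / 3,
    # rowMeans() with na.rm = TRUE averages only the non-missing values per row
    score_mean = rowMeans(cbind(score_1, score_2, score_3), na.rm = TRUE)
  )
```

With na.rm = TRUE the divisor is the number of non-missing values in each row, so student B's mean is taken over two scores rather than three.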

Hope it helps!
