Hi @GabbyJ !
I don't know if this is a viable solution (it's also way more code
), but I pivoted to long, grouped on the prefix, summarized, then pivoted wider.
It looks like you are missing a 'group' variable for the prefix? i.e. ff, gc, or volume?
library(datapasta)
library(tidyverse)
df <- tibble::tribble(
~id, ~ff_l4l5, ~ff_l5s1, ~gc_l4l5, ~gc_l5s1, ~volume_l4l5, ~volume_l5s1,
1, 24.16, 30.32, 2, 2, 28203.11, 12173.37,
2, 7.61, 12.27, 0, 0, 29032.61, 18699.53,
3, 3.77, 7.87, 1, 1, 16189.05, 7394.11,
4, 18.67, 23.86, 2, 2, 37662.74, 16002.27,
5, 16.59, 31.86, 2, 2, 26130.15, 13880.79
)
df_long <- df %>%
# switch data to long
pivot_longer(cols = -id,
names_to = "variable",
values_to = "value") %>%
# create grouping variable based on regex
mutate(k = case_when(
str_detect(variable, "ff_") ~ "ff",
str_detect(variable, "gc_") ~ "gc",
str_detect(variable, "volume_") ~ "volume",
))
df_long
#> # A tibble: 30 × 4
#> id variable value k
#> <dbl> <chr> <dbl> <chr>
#> 1 1 ff_l4l5 24.2 ff
#> 2 1 ff_l5s1 30.3 ff
#> 3 1 gc_l4l5 2 gc
#> 4 1 gc_l5s1 2 gc
#> 5 1 volume_l4l5 28203. volume
#> 6 1 volume_l5s1 12173. volume
#> 7 2 ff_l4l5 7.61 ff
#> 8 2 ff_l5s1 12.3 ff
#> 9 2 gc_l4l5 0 gc
#> 10 2 gc_l5s1 0 gc
#> # … with 20 more rows
This generates a table of means based on the 'group', so I am not sure if it's more helpful than what you're already doing?
But now we can calculate summary stats by k.
df_avgs <- df_long %>%
# group by id and k
group_by(id, k) %>%
# get mean and any other statistics
summarize(avgs = mean(value)) %>%
# pivot back to wide
pivot_wider(names_from = k, values_from = avgs)
#> `summarise()` has grouped output by 'id'. You can override using the `.groups`
#> argument.
df_avgs
#> # A tibble: 5 × 4
#> # Groups: id [5]
#> id ff gc volume
#> <dbl> <dbl> <dbl> <dbl>
#> 1 1 27.2 2 20188.
#> 2 2 9.94 0 23866.
#> 3 3 5.82 1 11792.
#> 4 4 21.3 2 26833.
#> 5 5 24.2 2 20005.
If you don't have a huge number of k (and a consistent naming convention), it might be worthwhile to create a grouping variable.
Created on 2022-03-02 by the reprex package (v2.0.1)