Combining Specific Rows

I have 19 rows worth of data with columns A and B which looks like this:

A B
0 6.5
1 6.4
2 6.3
3 6.1
4 6.0
... ...

If, for example, I wanted to combine rows A=0,1 and rows A=2,3,4, with column B taking the mean of the joint rows, how would I code this? I've tried using aggregate(), but am not sure how to combine the specific rows when they don't share a common value.

In the end, I want something like:

A B
0-1 6.45
2-4 6.13
... ...

Thanks in advance!

You could manually define groups like this

library(tidyverse)
# Sample data
sample_df <- data.frame(
           A = c(0, 1, 2, 3, 4),
           B = c(6.5, 6.4, 6.3, 6.1, 6)
)

sample_df %>% 
    mutate(group = case_when(
        A <= 1 ~ 1,
        A > 1 ~ 2
    )) %>% 
    group_by(group) %>% 
    summarise(A = paste(first(A), last(A), sep = "-"),
              B = mean(B)) %>% 
    select(-group)
#> # A tibble: 2 x 2
#>   A         B
#>   <chr> <dbl>
#> 1 0-1    6.45
#> 2 2-4    6.13

Created on 2019-11-20 by the reprex package (v0.3.0.9000)

1 Like

This is really helpful, thank you!
With the data I'm using, I actually want to group the rows such that A=0, A=1, A=2-4, A>5 are separate rows (so there are 4 rows altogether). I tried using your code and adjusting it to try to do this, but am not sure how to type A=2-4 as code..

library(tidyverse)
# Sample data
sample_df <- data.frame(
    A = c(0, 1, 2, 3, 4, 5),
    B = c(6.5, 6.4, 6.3, 6.1, 6, 5)
)

sample_df %>% 
    mutate(A = case_when(
        A == 0 ~ "0",
        A == 1 ~ "1",
        A > 1 & A <= 4 ~ "2-4",
        TRUE ~ ">5"),
        A = factor(A, levels = c("0", "1", "2-4", ">5"))) %>% 
    group_by(A) %>% 
    summarise(B = mean(B))
#> # A tibble: 4 x 2
#>   A         B
#>   <fct> <dbl>
#> 1 0      6.5 
#> 2 1      6.4 
#> 3 2-4    6.13
#> 4 >5     5

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.