How to subtract by group?

Hi!

I have a dataframe that sort of has the following structure:

ID Code City Water usage
1 start London 10
2 start London 5
3 start Amsterdam 20
4 middle London 100
5 end London 140
6 end Amsterdam 80

I want to do the following:

  • I want, for instance, to subtract the sum of waterusage in London with the code 'start' from the water usage in London with the code 'middle'. And the similar for middle and end.

I already made a subset of the IDs that have the code 'start' and used aggregate to sum the water use by city.

Any suggestions on how to continue?

Is this the way you want to have your data?

library(tidyverse)

# create data frame
df <- tribble(
  ~id, ~code, ~city, ~water_usage,
  1, "start", "London", 10,
  2, "start", "London", 5,
  3, "start", "Amsterdam", 20,
  4, "middle", "London", 100,
  5, "end", "London", 140,
  6, "end", "Amsterdam", 80)

# summarize water_usage by code and city
df_sum <- df %>% 
  group_by(code, city) %>% 
  summarise(water_usage_sum = sum(water_usage))
df_sum
#> # A tibble: 5 x 3
#> # Groups:   code [3]
#>   code   city      water_usage_sum
#>   <chr>  <chr>               <dbl>
#> 1 end    Amsterdam              80
#> 2 end    London                140
#> 3 middle London                100
#> 4 start  Amsterdam              20
#> 5 start  London                 15

# transform to wide format
df_sum %>% 
  pivot_wider(names_from = code, 
              values_from = water_usage_sum) %>% 
  select(city, start, middle, end) # reorder columns if you like
#> # A tibble: 2 x 4
#>   city      start middle   end
#>   <chr>     <dbl>  <dbl> <dbl>
#> 1 Amsterdam    20     NA    80
#> 2 London       15    100   140

# now you can substract etc. (however, see NA values)

Created on 2020-04-15 by the reprex package (v0.3.0)

(I did not care about NA)

Based on your description, I think you were almost there... you just needed to transform the data frame to a wider format were start, middle and end are the columns.

Thank you so much! This really helped

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.