Replace column with value of max value from another column

I would like to replace the category column with the category corresponding to the max value in the sales column.

My data looks as follows:

df <- data.frame(CATEGORY = c("A","A","A","B","B"), SALES = c(10,20,30,40,50))

I'm looking to fill the CATEGORY variable with "B" since the max value in SALES has a CATEGORY of B

df <- data.frame(CATEGORY = c("B","B","B","B","B"), SALES = c(10,20,30,40,50))

If this can be achieved using dplyr syntax I'd be very grateful if anyone could give me a few pointers.

Thanks

Hi @jgarrigan,
This should do it:

suppressPackageStartupMessages(library(tidyverse))

df <- data.frame(CATEGORY = c("A","A","A","B","B"), SALES = c(10,20,30,40,50))
df
#>   CATEGORY SALES
#> 1        A    10
#> 2        A    20
#> 3        A    30
#> 4        B    40
#> 5        B    50

# Basic approach
df %>% 
  mutate(max_cat = CATEGORY[SALES == max(SALES)])
#>   CATEGORY SALES max_cat
#> 1        A    10       B
#> 2        A    20       B
#> 3        A    30       B
#> 4        B    40       B
#> 5        B    50       B

# Do it, plus tidy-up the output
df %>% 
  mutate(max_cat = CATEGORY[SALES == max(SALES)]) %>% 
  select(-1) %>% 
  rename(CATEGORY = max_cat) %>% 
  select(2 ,1)
#>   CATEGORY SALES
#> 1        B    10
#> 2        B    20
#> 3        B    30
#> 4        B    40
#> 5        B    50

Created on 2022-01-29 by the reprex package (v2.0.1)

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.