Summarise with if_else

Hi,

I'm trying to compute a grouped summary with a condition, but I can't get the syntax right.

I want to add a column called percent showing how much of the total belongs to a given segment (industry).

library(tidyverse)

df <- tibble(
  year = c(2020,2020,2020,2020,2020),
  zip_code = c(90001,90001,90001,90001,90001),
  industry = c("Tech","Wholesale","Tech", "Manufacturing", "Wholesale"),
  total = c(30500353,12958359,2638403,559844,9177618)
)

df %>% group_by(industry) %>% summarize(total = sum(total),
                                        percent = total/ sum(if_else(industry == "Tech")))
#> Error: Problem with `summarise()` column `percent`.
#> i `percent = total/sum(if_else(industry == "Tech"))`.
#> x argument "true" is missing, with no default
#> i The error occurred in group 1: industry = "Manufacturing".

Created on 2021-09-23 by the reprex package (v2.0.1)

I'm not sure how to use if_else() to restrict the sum the way I want.

Any insight would be greatly appreciated.

Perhaps this?

df %>% 
  group_by(industry) %>% 
  mutate(segmentTotal = sum(total), 
         percentOfSegment = (total / segmentTotal) * 100)

Which yields,

# A tibble: 5 × 6
# Groups:   industry [3]
   year zip_code industry         total segmentTotal percentOfSegment
  <dbl>    <dbl> <chr>            <dbl>        <dbl>            <dbl>
1  2020    90001 Tech          30500353     33138756            92.0
2  2020    90001 Wholesale     12958359     22135977            58.5
3  2020    90001 Tech           2638403     33138756             7.96
4  2020    90001 Manufacturing   559844       559844           100
5  2020    90001 Wholesale      9177618     22135977            41.5
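
If what you are after is instead each industry's share of the overall total, something along these lines might work. This is only a sketch: the column name percent_of_all and the object name shares are made up for illustration, and it relies on summarise() dropping the single grouping level so that the later sum(total) runs over all industries.

```r
library(dplyr)

df <- tibble(
  year = c(2020, 2020, 2020, 2020, 2020),
  zip_code = c(90001, 90001, 90001, 90001, 90001),
  industry = c("Tech", "Wholesale", "Tech", "Manufacturing", "Wholesale"),
  total = c(30500353, 12958359, 2638403, 559844, 9177618)
)

shares <- df %>%
  group_by(industry) %>%
  # One row per industry; the result is no longer grouped afterwards
  summarise(total = sum(total)) %>%
  # sum(total) is now the grand total across all industries
  mutate(percent_of_all = total / sum(total) * 100)

shares
```

Doing the percentage in a separate mutate() step keeps the per-industry sum and the grand total from getting mixed up inside one summarise() call.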

Here is what your code returns once the assignment to percent, which fails because of the incomplete if_else() call, is left out.

suppressPackageStartupMessages({
  library(dplyr)
})

df_ <- tibble(
  year = c(2020,2020,2020,2020,2020),
  zip_code = c(90001,90001,90001,90001,90001),
  industry = c("Tech","Wholesale","Tech", "Manufacturing", "Wholesale"),
  total = c(30500353,12958359,2638403,559844,9177618)
)

df_ %>% group_by(industry) %>% summarise(Total = sum(total))
#> # A tibble: 3 × 2
#>   industry         Total
#>   <chr>            <dbl>
#> 1 Manufacturing   559844
#> 2 Tech          33138756
#> 3 Wholesale     22135977

percent is assigned the result of an expression containing if_else(), which needs at least three arguments, but only one is supplied here. That is what the error message means by argument "true" is missing.

if_else(SOMETHING, result1, result2)

Here, we have only SOMETHING.
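
For illustration, here is a sketch of a complete if_else() call, assuming the goal was to compare each industry's total against the Tech total (the original post does not say so explicitly). The names tech_total, ratios and ratio_to_tech are my own. The Tech sum is computed before grouping because, inside a grouped summarise(), industry == "Tech" is evaluated one group at a time and would give 0 for every other industry.

```r
library(dplyr)

df <- tibble(
  industry = c("Tech", "Wholesale", "Tech", "Manufacturing", "Wholesale"),
  total = c(30500353, 12958359, 2638403, 559844, 9177618)
)

# All three arguments supplied: condition, value when TRUE, value when FALSE.
# Rows that are not "Tech" contribute 0 to the sum.
tech_total <- sum(if_else(df$industry == "Tech", df$total, 0))

ratios <- df %>%
  group_by(industry) %>%
  summarise(total = sum(total)) %>%
  # Ratio of each industry's total to the Tech total computed above
  mutate(ratio_to_tech = total / tech_total)

ratios
```

Note that if_else() is strict about types, so the TRUE and FALSE values should both be numeric here.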

