Select Rows who's Sum Value = 80% of the Total

Good day.
I want to select companies that generate 80% of the company's sales.

I have 3 columns

  • Company name
  • The sum of the value of orders
    -% share in total sales

Will you help to achieve such an effect?

familiarise yourself with dplyrs filter functionality

you may have issues relating to variable type. If your colums for % is a numeric (maybe just a decimal representation) it should be straightforward. However if its a character representation i.e. string like '74%' with a percentage character, you would need to convert it to a numeric type before attempting to filter.

1 Like

Here's one possible dplyr solution. I presume you want the companies with the largest order values accounting for 80% of total sales.

library(dplyr, warn.conflicts = FALSE)

df <- tribble(
  ~company_name, ~total_value,
  "ABC Corp", 600,
  "DEF Inc.", 1000,
  "GHI Holdings", 300,
  "XYZ Inc.", 500
)

df %>% 
  arrange(desc(total_value)) %>% 
  mutate(share = cumsum(total_value / sum(total_value))) %>% 
  filter(lag(share, default = 0) < 0.8)
#> # A tibble: 3 x 3
#>   company_name total_value share
#>   <chr>              <dbl> <dbl>
#> 1 DEF Inc.            1000 0.417
#> 2 ABC Corp             600 0.667
#> 3 XYZ Inc.             500 0.875

Created on 2020-09-07 by the reprex package (v0.3.0)

cumsum
You helped me a lot, thank you.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.