Count specific categories in multiple collums and multiplicate them with a sum row

I want to count how often each category occurs at my dataframe:
To do so I need to count the catogory in each row and multiply this number with the sum of collumn 5.

(column c4 is not necessary for my analysis)

preferred output would be:
Analytics = First Row 11 +Second Row 220/ +Third row 0*100 etc..

Ads= 253

Identification= ..

My data looks like this:

tracker_category <- data.frame = c("Tracker1", "Tracker2", "Tracker3", "Tracker4","Tracker5","Tracker6"), 
c1 = c("Analytics", "Crash", "Location", "Identification", "Analytics", "Ads"), 
c2 = c("Ads", "Analytics", "Location", "Analytics", "Identification", "Ads"), 
c3 = c("Identification", "Analytics", "Ads", "Ads", "Analytics", "Location"),
c4 = c("url1.com","ur2.com","url3.com","url4.com","url5.com","url6.com"),
sum_tracker = c(1,20,100,0,5,76))

Hi @paulll,
Welcome to the RStudio Community Forum.

I think I understand what you are requesting. See the following Reproducible Example:

# This dataframe code was corrected.
tracker_category <- data.frame(Tracker = c("Tracker1", "Tracker2", "Tracker3", "Tracker4","Tracker5","Tracker6"), 
c1 = c("Analytics", "Crash", "Location", "Identification", "Analytics", "Ads"), 
c2 = c("Ads", "Analytics", "Location", "Analytics", "Identification", "Ads"), 
c3 = c("Identification", "Analytics", "Ads", "Ads", "Analytics", "Location"),
c4 = c("url1.com","ur2.com","url3.com","url4.com","url5.com","url6.com"),
sum_tracker = c(1,20,100,0,5,76))

tracker_category
#>    Tracker             c1             c2             c3       c4 sum_tracker
#> 1 Tracker1      Analytics            Ads Identification url1.com           1
#> 2 Tracker2          Crash      Analytics      Analytics  ur2.com          20
#> 3 Tracker3       Location       Location            Ads url3.com         100
#> 4 Tracker4 Identification      Analytics            Ads url4.com           0
#> 5 Tracker5      Analytics Identification      Analytics url5.com           5
#> 6 Tracker6            Ads            Ads       Location url6.com          76

suppressPackageStartupMessages(library(tidyverse))

long.df <- pivot_longer(tracker_category, cols=c("c1","c2","c3"),
                        names_to="origin_col",
                        values_to="category")
long.df %>% 
  group_by(Tracker, category) %>% 
  summarise(num=n(),
            val=mean(sum_tracker)) %>% 
  mutate(product=num*val) -> mid.df
#> `summarise()` has grouped output by 'Tracker'. You can override using the `.groups` argument.
mid.df
#> # A tibble: 14 x 5
#> # Groups:   Tracker [6]
#>    Tracker  category         num   val product
#>    <chr>    <chr>          <int> <dbl>   <dbl>
#>  1 Tracker1 Ads                1     1       1
#>  2 Tracker1 Analytics          1     1       1
#>  3 Tracker1 Identification     1     1       1
#>  4 Tracker2 Analytics          2    20      40
#>  5 Tracker2 Crash              1    20      20
#>  6 Tracker3 Ads                1   100     100
#>  7 Tracker3 Location           2   100     200
#>  8 Tracker4 Ads                1     0       0
#>  9 Tracker4 Analytics          1     0       0
#> 10 Tracker4 Identification     1     0       0
#> 11 Tracker5 Analytics          2     5      10
#> 12 Tracker5 Identification     1     5       5
#> 13 Tracker6 Ads                2    76     152
#> 14 Tracker6 Location           1    76      76

mid.df %>%
  group_by(category) %>% 
  summarise(final = sum(product)) -> last.df
last.df
#> # A tibble: 5 x 2
#>   category       final
#>   <chr>          <dbl>
#> 1 Ads              253
#> 2 Analytics         51
#> 3 Crash             20
#> 4 Identification     6
#> 5 Location         276

Created on 2021-06-03 by the reprex package (v2.0.0)

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.