Here is an example where I recoded your missing values as "Unknown" and then used functions from the dplyr package to get what I think you want.
DF <- data.frame(User.Type = rep(c("Unknown", "Customer", "Subscriber"), each = 4),
                 trip.type = rep(c("Avg", "long", "Short", "Unknown"), 3),
                 num_trips = c(891, 3817, 31, 77, 4123, 20888, 1378, 388, 266299,
                               225671, 203048, 734))
library(dplyr)
Summary <- DF %>% group_by(User.Type) %>%
  summarize(Total = sum(num_trips))
Summary
#> # A tibble: 3 x 2
#>   User.Type   Total
#>   <fct>       <dbl>
#> 1 Customer    26777
#> 2 Subscriber 695752
#> 3 Unknown      4816
DF <- left_join(DF, Summary, by = "User.Type") %>%
  mutate(Fraction = num_trips/Total)
DF
#>     User.Type trip.type num_trips  Total    Fraction
#> 1     Unknown       Avg       891   4816 0.185008306
#> 2     Unknown      long      3817   4816 0.792566445
#> 3     Unknown     Short        31   4816 0.006436877
#> 4     Unknown   Unknown        77   4816 0.015988372
#> 5    Customer       Avg      4123  26777 0.153975427
#> 6    Customer      long     20888  26777 0.780072450
#> 7    Customer     Short      1378  26777 0.051462076
#> 8    Customer   Unknown       388  26777 0.014490047
#> 9  Subscriber       Avg    266299 695752 0.382749888
#> 10 Subscriber      long    225671 695752 0.324355517
#> 11 Subscriber     Short    203048 695752 0.291839621
#> 12 Subscriber   Unknown       734 695752 0.001054974
Created on 2020-04-21 by the reprex package (v0.3.0)
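If you would rather skip the separate summary and join, a grouped mutate() should give the same Total and Fraction columns in one step. This is only a sketch of an alternative, not a change to the approach above; DF2 is a name I made up so the original DF is left alone.

```r
library(dplyr)

# Rebuild the example data so this block runs on its own.
DF <- data.frame(User.Type = rep(c("Unknown", "Customer", "Subscriber"), each = 4),
                 trip.type = rep(c("Avg", "long", "Short", "Unknown"), 3),
                 num_trips = c(891, 3817, 31, 77, 4123, 20888, 1378, 388,
                               266299, 225671, 203048, 734))

# DF2 is a hypothetical name. Within each User.Type group, compute the
# group total and each row's share of it, then drop the grouping.
DF2 <- DF %>%
  group_by(User.Type) %>%
  mutate(Total = sum(num_trips),
         Fraction = num_trips / Total) %>%
  ungroup()

DF2
```

The fractions within each User.Type should add up to 1, which is a quick way to check the result.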
In the future, please provide a reproducible example so people do not have to manually reproduce your data.
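One common way to share data is dput(), which prints R code that rebuilds the object so others can paste it straight into their session. A minimal sketch follows; the trips data frame and its values are made up for illustration and are not your actual data.

```r
# A small stand-in for the real data (hypothetical name and values).
trips <- data.frame(User.Type = c("Customer", "Subscriber", NA),
                    num_trips = c(4123, 266299, 891))

# dput() prints R code that recreates the object; paste that output into the post.
dput(trips)

# For a large data frame, sharing only the first few rows is usually enough.
dput(head(trips, 2))
```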
Thank you so much. I will definitely provide sample reproducible data next time. Much appreciated.