Here is an example where I recoded your missing values as "Unknown" and then used functions from the dplyr package to get what I think you want: the fraction of trips of each trip type within each user type.
# Rebuild the data by hand; missing values are recoded as "Unknown"
DF <- data.frame(User.Type = rep(c("Unknown", "Customer", "Subscriber"), each = 4),
                 trip.type = rep(c("Avg", "long", "Short", "Unknown"), 3),
                 num_trips = c(891, 3817, 31, 77, 4123, 20888, 1378, 388, 266299,
                               225671, 203048, 734))
library(dplyr)
# Total number of trips for each user type
Summary <- DF %>% group_by(User.Type) %>%
  summarize(Total = sum(num_trips))
Summary
#> # A tibble: 3 x 2
#> User.Type Total
#> <fct> <dbl>
#> 1 Customer 26777
#> 2 Subscriber 695752
#> 3 Unknown 4816
# Attach each user type's total to its rows, then compute the within-group fraction
DF <- left_join(DF, Summary, by = "User.Type") %>%
  mutate(Fraction = num_trips/Total)
DF
#> User.Type trip.type num_trips Total Fraction
#> 1 Unknown Avg 891 4816 0.185008306
#> 2 Unknown long 3817 4816 0.792566445
#> 3 Unknown Short 31 4816 0.006436877
#> 4 Unknown Unknown 77 4816 0.015988372
#> 5 Customer Avg 4123 26777 0.153975427
#> 6 Customer long 20888 26777 0.780072450
#> 7 Customer Short 1378 26777 0.051462076
#> 8 Customer Unknown 388 26777 0.014490047
#> 9 Subscriber Avg 266299 695752 0.382749888
#> 10 Subscriber long 225671 695752 0.324355517
#> 11 Subscriber Short 203048 695752 0.291839621
#> 12 Subscriber Unknown 734 695752 0.001054974
Created on 2020-04-21 by the reprex package (v0.3.0)
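If you do not need the separate Summary table, the same fractions can probably be computed in one step by using mutate() on the grouped data instead of summarize() followed by left_join(). This is only a sketch of that alternative, using the same hand-built DF as above; the name DF2 is just something I made up for the result.

```r
library(dplyr)

# Same hand-built data as in the example above
DF <- data.frame(
  User.Type = rep(c("Unknown", "Customer", "Subscriber"), each = 4),
  trip.type = rep(c("Avg", "long", "Short", "Unknown"), 3),
  num_trips = c(891, 3817, 31, 77, 4123, 20888, 1378, 388,
                266299, 225671, 203048, 734)
)

# Inside a grouped mutate(), sum() is evaluated once per User.Type,
# so Total repeats on every row of the group and no join is needed.
DF2 <- DF %>%
  group_by(User.Type) %>%
  mutate(Total = sum(num_trips),
         Fraction = num_trips / Total) %>%
  ungroup()

DF2
```

The ungroup() at the end just returns an ordinary ungrouped tibble, so later steps are not accidentally done per user type.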
In the future, please provide a reproducible example so people do not have to manually reproduce your data.
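One common way to share data is to paste the output of dput() for a few rows, since that text can be copied straight back into R. A minimal sketch, assuming your data frame looks like the DF I built above (sample_rows is just a name I picked for illustration):

```r
# Hand-built stand-in for the real data
DF <- data.frame(
  User.Type = rep(c("Unknown", "Customer", "Subscriber"), each = 4),
  trip.type = rep(c("Avg", "long", "Short", "Unknown"), 3),
  num_trips = c(891, 3817, 31, 77, 4123, 20888, 1378, 388,
                266299, 225671, 203048, 734)
)

# Take a small sample and print R code that recreates it exactly
sample_rows <- head(DF, 4)
dput(sample_rows)
```

Pasting what dput() prints into your post lets others rebuild the same data frame by assigning that text to a variable.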
Thank you so much. I will definitely provide sample reproducible data next time. Much appreciated.