Hello,
I am struggling to create a filtered variable and a total in the same summary.
The code I tried does not produce the right numbers. I would like both the quantity and the % in the same summary: grouped by type, with the % of films rated above the mean and each type's % of the total. So there should be 4 columns in total.
This seems pretty straightforward, but I keep getting errors, so the code below is the only version that runs for me.
Thank you so much.
Christine
reprex::reprex_info()
#> Warning in as.POSIXlt.POSIXct(Sys.time()): unknown timezone 'zone/tz/2017c.
#> 1.0/zoneinfo/Europe/Paris'
#> Created by the reprex package v0.1.1.9000 on 2018-01-06
suppressWarnings(library(dplyr))
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
data.frame(stringsAsFactors = FALSE,
  title_type = c("Feature film", "Feature film", "TV Movie",
                 "Documentary", "Documentary", "Feature film",
                 "Feature film", "Feature film", "TV Movie", "TV Movie",
                 "Feature film", "Feature film", "TV Movie",
                 "Documentary", "Documentary", "Feature film",
                 "Feature film", "Feature film", "TV Movie", "TV Movie"),
  imdb_rating = c(2, 3, 2, 10, 8, 3, 3, 2, 8, 5,
                  2, 3, 2, 10, 8, 3, 3, 2, 8, 5)) %>%
  group_by(title_type) %>%
  summarise(count = n(),
            count_high = length(which(imdb_rating > mean(imdb_rating)))) %>%
  mutate(ratio = count / sum(count))
#> # A tibble: 3 x 4
#>   title_type   count count_high ratio
#>   <chr>        <int>      <int> <dbl>
#> 1 Documentary      4          2   0.2
#> 2 Feature film    10          6   0.5
#> 3 TV Movie         6          2   0.3
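
For reference, here is a minimal sketch of one way the four-column summary described above might be built. It assumes "above the mean" refers to the mean rating within each type, as in the code above; if the overall mean is meant, it would have to be computed before `group_by()`. The names `films`, `pct_high` and `pct_total` are illustrative only.

```r
suppressWarnings(library(dplyr))

# Same data as in the reprex: the ten rows are simply repeated twice.
films <- data.frame(
  stringsAsFactors = FALSE,
  title_type = rep(c("Feature film", "Feature film", "TV Movie",
                     "Documentary", "Documentary", "Feature film",
                     "Feature film", "Feature film", "TV Movie", "TV Movie"), 2),
  imdb_rating = rep(c(2, 3, 2, 10, 8, 3, 3, 2, 8, 5), 2)
)

result <- films %>%
  group_by(title_type) %>%
  summarise(
    count = n(),
    # The mean of a logical vector is the share of TRUE values,
    # here the share of films rated above their own type's mean.
    pct_high = 100 * mean(imdb_rating > mean(imdb_rating))
  ) %>%
  # summarise() drops the single grouping level, so sum(count)
  # is the total across all types.
  mutate(pct_total = 100 * count / sum(count))

print(result)
```

Taking the mean of the logical comparison avoids a separate filter step, which is why the count and both percentages can sit in one `summarise()` plus one `mutate()`.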