I'm new to R and I'm trying to calculate some statistics on a database. I'm stuck and would really appreciate some help.
I was able to group the data and calculate the statistics, but only for one grouping and one column.
I want my script to:
- cut the database based on the selected column (done)
- calculate group statistics for the first specified column (done)
- repeat the same statistics calculation for a different column (not sure how)
- repeat the above steps for a different cut column (not sure how)
I was able to put variables (called Benchmark and Component) in the code, but I don't know how to loop over them, so that if I specify several column names in a variable, the script picks the first one, runs the steps, and then repeats them for the second one, and so on.
I attached a screenshot of a sample database (yellow: columns by which the database should be cut, one at a time; green: columns for which statistics should be calculated, one at a time).
My code so far:

    library(dplyr)

    # Read the raw database and print it to check the import
    RAW <- read.csv('H:/RawDatabase/Database for R.csv', header = TRUE)
    RAW

    # Column to cut the database by, and column to calculate statistics for
    Benchmark_Name <- 'Benchmark_1'
    Component <- "BASE"

    # Extract the benchmark column as a vector so it can be used in filter()
    Benchmark <- RAW[[Benchmark_Name]]

    # Per-organisation observation count and mean of the selected component
    results_by_org <- RAW %>%
      filter(Benchmark == 1) %>%
      group_by(CODE) %>%
      summarise(Obs = length(na.omit(!!sym(Component))),
                Mean_CPY = mean(!!sym(Component), na.rm = FALSE))

    # Summary statistics across organisations, with the component name as a suffix
    final_stats <- results_by_org %>%
      summarise(!!paste0("Orgs_", Component) := nrow(na.omit(results_by_org)),
                !!paste0("Obs_", Component) := sum(na.omit(Obs)),
                !!paste0("P25_", Component) := quantile(Mean_CPY, probs = 0.25, na.rm = TRUE, type = 6),
                !!paste0("Mean_", Component) := mean(na.omit(Mean_CPY)),
                !!paste0("P50_", Component) := quantile(Mean_CPY, probs = 0.5, na.rm = TRUE, type = 6),
                !!paste0("P75_", Component) := quantile(Mean_CPY, probs = 0.75, na.rm = TRUE, type = 6)) %>%
      mutate(Benchmark_Name = Benchmark_Name) %>%
      select(Benchmark_Name, 1:ncol(.))
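To make the goal concrete, here is a minimal sketch of the kind of loop I am after, assuming the steps above can be wrapped in a helper function. The helper name benchmark_stats, the toy data frame, and the Benchmark_2 and BONUS columns are made up for illustration and are not from my real database. Unlike my script, the sketch stacks one row per benchmark/component pair instead of building suffixed column names, which may or may not be the right layout:

```r
library(dplyr)

# Toy stand-in for the real CSV (hypothetical values, for illustration only)
RAW <- data.frame(
  CODE        = c("A", "A", "B", "B", "C", "C"),
  Benchmark_1 = c(1, 1, 1, 1, 0, 0),
  Benchmark_2 = c(0, 0, 1, 1, 1, 1),
  BASE        = c(10, 20, 30, NA, 50, 60),
  BONUS       = c(1, 3, 5, 7, 9, 11)
)

# Hypothetical helper: the same steps as in the script, for ONE benchmark
# column and ONE component column, both passed as strings
benchmark_stats <- function(data, benchmark_name, component) {
  data %>%
    filter(.data[[benchmark_name]] == 1) %>%
    group_by(CODE) %>%
    summarise(
      Obs      = sum(!is.na(.data[[component]])),
      Mean_CPY = mean(.data[[component]], na.rm = FALSE)
    ) %>%
    summarise(
      Orgs = sum(!is.na(Mean_CPY)),
      Obs  = sum(Obs),
      P25  = quantile(Mean_CPY, probs = 0.25, na.rm = TRUE, type = 6, names = FALSE),
      Mean = mean(Mean_CPY, na.rm = TRUE),
      P50  = quantile(Mean_CPY, probs = 0.50, na.rm = TRUE, type = 6, names = FALSE),
      P75  = quantile(Mean_CPY, probs = 0.75, na.rm = TRUE, type = 6, names = FALSE)
    ) %>%
    mutate(Benchmark_Name = benchmark_name, Component = component) %>%
    select(Benchmark_Name, Component, everything())
}

# Column names to loop over (assumed names)
benchmarks <- c("Benchmark_1", "Benchmark_2")
components <- c("BASE", "BONUS")

# Every benchmark/component combination, one per row
combos <- expand.grid(benchmark = benchmarks, component = components,
                      stringsAsFactors = FALSE)

# Run the helper for each combination and stack the one-row results
all_stats <- bind_rows(lapply(seq_len(nrow(combos)), function(i) {
  benchmark_stats(RAW, combos$benchmark[i], combos$component[i])
}))

print(all_stats)
```

With the two vectors at the top holding the real column names, the same call would cover every pair, but I am not sure whether this is the idiomatic way to do it in R.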