Updating a function to get a combined summary

I have created a function that returns a summary (average and percentiles). Now I want that summary for particular subsets, so I have created the subsets accordingly.

But my function is not working properly.

I am trying to update the function so that it takes a list of datasets, labels each summary with the name of its list element, and combines the summaries with rbind.

I have no idea how to pass "ALL" and "MM" as the variable names in my function so that the summaries for both are row-bound together.

df <- data.frame(Name = c("asdf","kjhgf","cvbnm","rtyui","cvbnm","jhfd","cvbnm","sdfghj","cvbnm","dfghj","cvbnm"),
                 sale=c(27,28,27,16,14,25,14,14,19,18,28),
                 city=c("CA","TX","MN","NY","TX","MT","HU","KL","TX","SA","TX"),
                 Dept = c("HH","MM","NN","MM","AA","VV","MM","HU","JJ","MM","ZZ"))


df1<- df
df$cc1<-1
df2<- subset(df, Dept == 'MM')
df$cc2<-ifelse(df$Dept == 'MM',1,NA)
lst<-list(df$cc1, df$cc2)
listd<-list("ALL" = df1, "MM" =df2)

#I want to run my function for listd so that i can get a  combined summary for all variables in listd
tt2<-function(data,var,footer,Name_of_variable,decimal){
  for (d in 1:length(data)) {
    cat('\n\n#### ', names(data)[d], '\n\n')
    md<-data[[d]]
    table_list<-list()
    for (i in 1:length(d))
      table_list[[i]]<-t1(md,var,footer,decimal,Name_of_variable)
    tt<- do.call(rbind,table_list)
  } 
  cat(knit_print(tt))
  cat('\n\n')
}
t1<-function(dataset,var,Suff,decimal,Name_of_variable){
  numdig <- if (decimal == TRUE) {1} else {0}
  var <- rlang::parse_expr(var) 
summ_tab1<- dataset %>% filter(!is.na(!!var)) %>%   summarise(
  q25 = format(round(quantile(!! var,  type=6, probs = seq(0, 1, 0.25), na.rm=TRUE)[2],digits = numdig),nsmall = numdig),
  Median = format(round(quantile(!! var, type=6, probs = seq(0, 1, 0.25), na.rm=TRUE)[3],digits = numdig),nsmall = numdig),
  Average = format(round( mean(!! var, na.rm=TRUE),digits = numdig),nsmall = numdig),
  q75 = format(round(quantile(!! var, type=6, probs = seq(0, 1, 0.25), na.rm=TRUE)[4],digits = numdig) ,nsmall = numdig),
  N = sum(!is.na(!!var)))
summ_tab<-summ_tab1 %>%  
  mutate(" "=!!Name_of_variable,
         q25 = q25,
         Median =Median,
         Average =Average,
         q75 = q75)%>%
  dplyr::rename(
    `25th percentile` = q25,
    `75th percentile` = q75)%>%select(" ",N,everything())
summ_tab1
}


tt2(data = listd,var = "sale",Name_of_variable = "listd",decimal = TRUE)

Previously I was getting a summary like the one below:

ALL
q25 Median Average q75 N
1 14.5 17 19 25.5 4
MM
q25 Median Average q75 N
1 14.5 17 19 25.5 4

But now the output summary should have the name of each variable ("ALL", "MM") as the row label, as in the image below.

[image: desired output]

suppressPackageStartupMessages({
  library(dplyr)
  library(pander)
  library(purrr)
})

df <- data.frame(
  Name = c("asdf", "kjhgf", "cvbnm", "rtyui", "cvbnm", "jhfd", "cvbnm", "sdfghj", "cvbnm", "dfghj", "cvbnm"),
  sale = c(27, 28, 27, 16, 14, 25, 14, 14, 19, 18, 28),
  city = c("CA", "TX", "MN", "NY", "TX", "MT", "HU", "KL", "TX", "SA", "TX"),
  Dept = c("HH", "MM", "NN", "MM", "AA", "VV", "MM", "HU", "JJ", "MM", "ZZ")
)

df %>%
  select(-Name, -city) %>%
  group_by(Dept) -> dat

mk_stats <- function(x) {
  N <- length(x[[1]])
  Median <- median(x[[1]])
  Average <- mean(x[[1]])
  q25 <- quantile(x[[1]])[2]
  q75 <- quantile(x[[1]])[4]
  cbind(q25, Median, Average, q75, N) -> ALL
  x %>% filter(Dept == "MM") -> MM
  N <- length(MM[[1]])
  Median <- median(MM[[1]])
  Average <- mean(MM[[1]])
  q25 <- quantile(MM[[1]])[2]
  q75 <- quantile(MM[[1]])[4]
  cbind(q25, Median, Average, q75, N) -> MM
  as.data.frame(rbind(ALL, MM)) %>%
    `rownames<-`(., c("ALL", "MM")) %>%
    pander()
}

mk_stats(dat)
       q25    Median   Average   q75    N
ALL    15     19       20.91     27     11
MM     15.5   17       19        20.5   4

Created on 2020-09-22 by the reprex package (v0.3.0.9001)
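
One detail worth noting when comparing the two outputs: the question's t1 calls quantile() with type = 6, while mk_stats relies on the default (type = 7). A minimal sketch using the four MM sales values from the data above shows how the two definitions differ; the names mm_sales, q_type6 and q_type7 are only for illustration.

```r
# Sales for the four rows where Dept == "MM" in the example data
mm_sales <- c(28, 16, 14, 18)

# type = 6 is what the question's t1 requests
q_type6 <- quantile(mm_sales, probs = c(0.25, 0.75), type = 6, names = FALSE)

# type = 7 is R's default, which mk_stats uses implicitly
q_type7 <- quantile(mm_sales, probs = c(0.25, 0.75), type = 7, names = FALSE)

print(q_type6)  # 14.5 25.5
print(q_type7)  # 15.5 20.5
```

This accounts for the MM row showing 14.5 and 25.5 in the question but 15.5 and 20.5 in the mk_stats output. Passing type = 6 to the quantile() calls would make the two agree.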

Hi, thanks for your answer. Is there a way I can change my own function to get the desired result? I have several more functions of the same type, and I would like to update those accordingly as well.

The problem presented was that tt2 and t1 weren't working as required, and the desired output indicated only a single subset. mk_stats addresses that situation. A more general solution requires a problem specification with the cases to be covered.

Choosing between mk_stats and tt2/t1 is a matter of preference, which is why the answer didn't attempt to modify the latter.
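
For readers comparing the two approaches, one likely reason the original tt2 ends up with a single summary is visible in its loops: table_list is re-created on every pass of the outer loop, and 1:length(d) always evaluates to 1 because d is a single index. The toy example below is only a sketch with placeholder data (subsets, kept), not the original code.

```r
# Placeholder data standing in for the list of data frames
subsets <- list(ALL = 1:3, MM = 4:5)

# Pattern from tt2: the list is reset inside the outer loop,
# so only the result for the last element survives.
for (d in 1:length(subsets)) {
  table_list <- list()
  for (i in 1:length(d)) {  # length(d) is always 1: d is a single index
    table_list[[i]] <- names(subsets)[d]
  }
}
print(length(table_list))  # 1
print(table_list[[1]])     # "MM"

# Accumulating pattern: create the list once, fill one slot per element.
kept <- list()
for (d in seq_along(subsets)) {
  kept[[d]] <- names(subsets)[d]
}
print(unlist(kept))        # "ALL" "MM"
```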

Hi, sorry for not making my question clear. I have just updated it.
I hope it is easier to understand now.


And I'm sorry to be unclear: I don't find modifying the code productive. That's why I offered a different solution.

How can I create a function along these lines, with inputs such as

tt2 <- function(data, var, footer, Name_of_variable, decimal)
data = ""              # input data
var = ""               # variable on which the calculation will be done
Name_of_variable = ""  # a list of labels such as "ALL" and "MM"
decimal = ""           # whether values should be shown with one decimal place

so that I get the output accordingly? Can you help, please?

Sorry. I am not going to work with you on tt2

No, I am asking whether you can help me turn mk_stats into a function like the one below. If that works for me, I will modify all my other functions accordingly.

A function like

mk_stats <- function(data, var, footer, Name_of_variable, decimal)
data = ""              # input data
var = ""               # variable on which the calculation will be done
Name_of_variable = ""  # a list of labels such as "ALL" and "MM"
decimal = ""           # whether values should be shown with one decimal place

Hi, I have resolved it by iterating over the list of variables. Thanks for your help all the same.

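tt2 <- function(data, var, Name_of_variable, footer, decimal) {
  table_list <- list()
  # Build one summary table per dataset, labelled with the matching name.
  for (d in seq_along(data)) {
    md <- data[[d]]
    # tt1 is the per-dataset summary helper (its definition is not shown in this thread).
    table_list[[d]] <- tt1(md, var, Name_of_variable[d], footer, decimal)
  }
  # Stack the per-dataset tables into a single summary.
  do.call(rbind, table_list)
}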
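
Since tt1 is not shown in the thread, the following is a self-contained sketch of the same idea in base R rather than the poster's actual code. The helper names summarise_one and summarise_list, and the "Subset" column, are hypothetical; the statistics follow the question's t1 (quantile type 6, one decimal place when decimal is TRUE).

```r
# Hypothetical helper: one formatted summary row for a single data frame.
summarise_one <- function(dataset, var, label, decimal = TRUE) {
  digits <- if (isTRUE(decimal)) 1 else 0
  x <- dataset[[var]]
  x <- x[!is.na(x)]  # drop missing values before summarising
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), type = 6, names = FALSE)
  fmt <- function(v) format(round(v, digits), nsmall = digits)
  data.frame(
    Subset = label,
    N = length(x),
    `25th percentile` = fmt(q[1]),
    Median = fmt(q[2]),
    Average = fmt(mean(x)),
    `75th percentile` = fmt(q[3]),
    check.names = FALSE,       # keep the column names with spaces
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper: one row per named element of `data`, row-bound together.
summarise_list <- function(data, var, decimal = TRUE) {
  rows <- lapply(names(data), function(nm) {
    summarise_one(data[[nm]], var, nm, decimal)
  })
  do.call(rbind, rows)
}

# Example data taken from the question (only the columns that are used)
df <- data.frame(
  sale = c(27, 28, 27, 16, 14, 25, 14, 14, 19, 18, 28),
  Dept = c("HH", "MM", "NN", "MM", "AA", "VV", "MM", "HU", "JJ", "MM", "ZZ")
)
listd <- list(ALL = df, MM = subset(df, Dept == "MM"))

result <- summarise_list(listd, var = "sale", decimal = TRUE)
print(result)
```

Here the labels come from names(listd) instead of a separate Name_of_variable argument, so the labels and the datasets cannot drift out of step. The MM row reproduces the type 6 values from the question's earlier output (14.5, 17.0, 19.0, 25.5, N = 4).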
