# Automated summary statistics for many numerical variables

I have a table with 3 numerical variables and 1 logical variable with 2 values.
I need to get the minimum, maximum, average, median, and first and third quartiles of all the numerical variables for each value of the logical variable.
How can I do that with a single piece of code instead of writing separate code for each variable?
Here is the data frame:

``````
# A tibble: 10 × 4
       a     b c         d
   <int> <int> <lgl> <int>
 1     1     1 TRUE      1
 2     2     2 FALSE     2
 3     3     3 TRUE      3
 4     4     4 FALSE     4
 5     5     5 TRUE      5
 6     6     6 FALSE     6
 7     7     7 TRUE      7
 8     8     8 FALSE     8
 9     9     9 TRUE      9
10    10    10 FALSE    10

structure(list(a = 1:10, b = 1:10, c = c(TRUE, FALSE, TRUE, FALSE,
TRUE, FALSE, TRUE, FALSE, TRUE, FALSE), d = 1:10), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))
``````

Hi @juandmaz, I found a possible solution. Is it what you want? I mainly referred to the following package:
doBy: Groupwise Statistics, LSmeans, Linear Estimates, Utilities (r-project.org)

``````
# install.packages("doBy")  # run once if doBy is not installed
library(doBy)
#> Warning: package 'doBy' was built under R version 4.2.3

data188 <- structure(list(a = 1:10, b = 1:10, c = c(TRUE, FALSE, TRUE, FALSE,
  TRUE, FALSE, TRUE, FALSE, TRUE, FALSE), d = 1:10), class = c("tbl_df",
  "tbl", "data.frame"), row.names = c(NA, -10L))

# Apply summary() to a, b and d separately for each value of c
data189 <- summaryBy(a + b + d ~ c, data188, FUN = summary)
data189
#> # A tibble: 2 × 19
#>   c     a.Min. `a.1st Qu.` a.Median a.Mean a.3rd…¹ a.Max. b.Min. b.1st…² b.Med…³
#>   <lgl>  <dbl>       <dbl>    <dbl>  <dbl>   <dbl>  <dbl>  <dbl>   <dbl>   <dbl>
#> 1 FALSE      2           4        6      6       8     10      2       4       6
#> 2 TRUE       1           3        5      5       7      9      1       3       5
#> # … with 9 more variables: b.Mean <dbl>, `b.3rd Qu.` <dbl>, b.Max. <dbl>,
#> #   d.Min. <dbl>, `d.1st Qu.` <dbl>, d.Median <dbl>, d.Mean <dbl>,
#> #   `d.3rd Qu.` <dbl>, d.Max. <dbl>, and abbreviated variable names
#> #   ¹​`a.3rd Qu.`, ²​`b.1st Qu.`, ³​b.Median
``````

Created on 2023-07-11 with reprex v2.0.2
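
If installing an extra package is not an option, the same idea can probably be sketched in base R with `aggregate()`. This is only a sketch under assumptions: `df` and the helper `six_stats()` are names made up for this example, and `quantile()` is left at its default method (type 7), which may differ from what other software reports.

```r
# Same data as in the question, rebuilt as a plain data frame
df <- data.frame(a = 1:10, b = 1:10, c = rep(c(TRUE, FALSE), 5), d = 1:10)

# Hypothetical helper: the six requested statistics as a named vector
six_stats <- function(x) {
  c(
    min    = min(x),
    q1     = unname(quantile(x, 0.25)),
    median = median(x),
    mean   = mean(x),
    q3     = unname(quantile(x, 0.75)),
    max    = max(x)
  )
}

# cbind() on the left-hand side summarises several columns at once
res <- aggregate(cbind(a, b, d) ~ c, data = df, FUN = six_stats)

# aggregate() returns matrix columns; flatten them into a.min, a.q1, ...
res <- do.call(data.frame, res)
res
```

The result should have one row per value of `c` and one column per variable and statistic, much like the `summaryBy()` output above.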

``````
library(tidyverse)

some_data <- structure(list(a = 1:10, b = 1:10, c = c(
  TRUE, FALSE, TRUE, FALSE,
  TRUE, FALSE, TRUE, FALSE, TRUE, FALSE
), d = 1:10), class = c(
  "tbl_df",
  "tbl", "data.frame"
), row.names = c(NA, -10L))

# One row per value of c, one column per variable/statistic pair (e.g. a_min)
(wide_version <- some_data |>
  group_by(c) |>
  summarise(
    across(where(is.numeric),
      .fns = list(
        min = min,
        max = max,
        mean = mean,
        median = median,
        q_low = ~ quantile(.x, probs = .25),
        q_high = ~ quantile(.x, probs = .75)
      )
    )
  ))

# One row per variable/statistic pair, one column per value of c
(long_version <- pivot_longer(wide_version,
  cols = -c
) |>
  pivot_wider(names_from = c))
``````
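
A possible variation on the code above, in case a fully long table is easier to filter or plot: give the summary columns an unambiguous separator through the `.names` argument of `across()`, then split them back apart with `names_sep` in `pivot_longer()`. The double underscore separator and the name `tidy_stats` are arbitrary choices for this sketch, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
some_data <- tibble(a = 1:10, b = 1:10, c = rep(c(TRUE, FALSE), 5), d = 1:10)

tidy_stats <- some_data |>
  group_by(c) |>
  summarise(
    across(
      where(is.numeric),
      list(
        min = min,
        max = max,
        mean = mean,
        median = median,
        q_low = ~ quantile(.x, probs = .25),
        q_high = ~ quantile(.x, probs = .75)
      ),
      # "__" cannot be confused with the single "_" inside q_low / q_high
      .names = "{.col}__{.fn}"
    )
  ) |>
  # Split e.g. "a__q_low" into variable = "a" and statistic = "q_low"
  pivot_longer(
    cols = -c,
    names_to = c("variable", "statistic"),
    names_sep = "__"
  )

tidy_stats
```

Each row then holds one statistic of one variable for one value of `c`, so something like `filter(tidy_stats, statistic == "median")` picks out a single statistic.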

Yes, it works! Thanks

@juandmaz You are welcome! If I solved your problem, would you mind giving me a like or clicking the solution button on my post? Thanks a lot.

