How do I change the order of the skimmer functions in the table generated by skim?

Hi,

I'm using the skimr package, and I added two summary functions to the list that skim uses: iqr_na_rm, which computes the interquartile range after removing any NA values from the variable being skimmed, and median_na_rm, the equivalent function for the median. By default, these new summary functions (called skimmers in the skimr documentation) appear at the end of the table. Instead, I'd like median and iqr to appear after mean and sd. Reprex:

library(skimr)
#> 
#> Attaching package: 'skimr'
#> The following object is masked from 'package:stats':
#> 
#>     filter
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)

iqr_na_rm <- function(x) IQR(x, na.rm = TRUE)
median_na_rm <- function(x) median(x, na.rm = TRUE)

skim_with(numeric = list(p50 = NULL, median = median_na_rm, iqr = iqr_na_rm),
          integer = list(p50 = NULL, median = median_na_rm, iqr = iqr_na_rm))

msleep %>%
  group_by(vore) %>%
  skim(sleep_total)
#> Skim summary statistics
#>  n obs: 83 
#>  n variables: 11 
#>  group variables: vore 
#> 
#> ── Variable type:numeric ───────────────────────────────────────────────────────────────────────────────────
#>     vore    variable missing complete  n  mean   sd  p0  p25   p75 p100
#>    carni sleep_total       0       19 19 10.38 4.67 2.7 6.25 13    19.4
#>    herbi sleep_total       0       32 32  9.51 4.88 1.9 4.3  14.22 16.6
#>  insecti sleep_total       0        5  5 14.94 5.92 8.4 8.6  19.7  19.9
#>     omni sleep_total       0       20 20 10.93 2.95 8   9.1  10.93 18  
#>     <NA> sleep_total       0        7  7 10.19 3    5.4 8.65 12.15 13.7
#>      hist median   iqr
#>  ▃▇▂▇▆▃▂▃   10.4  6.75
#>  ▆▇▁▂▂▆▇▅   10.3  9.92
#>  ▇▁▁▁▁▁▃▇   18.1 11.1 
#>  ▆▇▂▁▁▁▁▂    9.9  1.83
#>  ▃▃▁▁▃▇▁▇   10.6  3.5

Created on 2019-03-14 by the reprex package (v0.2.1)

As you can see, median and iqr are printed at the end of the table, after the sparkline histogram. I'd like them to be printed after sd and before p0. Is that possible?

A little :fairy: told me that you can do this by simply using:

msleep %>%
  group_by(vore) %>%
  skim_to_list(sleep_total) %>%
  .[["numeric"]] %>%
  dplyr::select(vore, variable, missing, complete, n, mean, sd,
                median, iqr, p0, p25, p75, p100, hist)

If you want the result to show up nicely in a .Rmd file, add one last step to the end of the pipeline (kable comes from the knitr package, so load it first):

%>%
  kable()

Enjoy!!!
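If typing out every column name feels fragile, the same reordering can probably be done by moving only the two new columns. Here is a minimal sketch on a toy data frame, not on real skim output: the data frame `skimmed` is a hypothetical stand-in with a few of the column names from the table above, and `relocate()` assumes dplyr 1.0.0 or later, which is newer than the version used in the reprex.

```r
library(dplyr)

# Hypothetical stand-in for the numeric part of the skim output.
# Values are copied from the first two rows of the table above and
# stored as plain numbers to keep the example small.
skimmed <- data.frame(
  vore = c("carni", "herbi"),
  variable = "sleep_total",
  mean = c(10.38, 9.51),
  sd = c(4.67, 4.88),
  p0 = c(2.7, 1.9),
  p100 = c(19.4, 16.6),
  median = c(10.4, 10.3),
  iqr = c(6.75, 9.92),
  stringsAsFactors = FALSE
)

# Move median and iqr so they sit right after sd.
# All other columns keep their relative order.
reordered <- skimmed %>%
  relocate(median, iqr, .after = sd)

names(reordered)
```

With the real output you would apply the same `relocate()` call to the data frame extracted with `.[["numeric"]]`, assuming it has the column names shown in the printed table.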
