I'm thinking of providing a summary() method for my new package. The package currently provides map() variants that automatically wrap safely() and quietly(), giving you nice formatting for the output when you look at it in a tibble. The idea for a summary function is to have something that, given a safely or quietly mapped column, tells you how many elements of the column had results, warnings, etc. Something like:
summary.mapped_quietly <- function(object, ...) {
  # 1. work out how many elements of object contain results, warnings, etc.
  # 2. print out a report of this:
  #    "4 elements contain results, 3 elements contain warnings..."
  # 3. return a tidy version of this invisibly
}
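To make the three steps concrete, here is a minimal sketch. It assumes each element of the column is the list that purrr::quietly() returns (result, output, warnings, messages). The constructor as_mapped_quietly() is hypothetical and only tags a list with the class name, since the package's real internals aren't shown here. The report wording is deliberately rough.

```r
library(purrr)

# Hypothetical constructor: tags a list of quietly() outputs with the class.
as_mapped_quietly <- function(x) {
  structure(x, class = c("mapped_quietly", "list"))
}

summary.mapped_quietly <- function(object, ...) {
  # Helper: count the elements for which a predicate is TRUE
  count_if <- function(pred) sum(vapply(object, pred, logical(1)))

  # 1. work out how many elements contain each component
  counts <- c(
    result  = count_if(function(el) !is.null(el$result)),
    warning = count_if(function(el) length(el$warnings) > 0),
    message = count_if(function(el) length(el$messages) > 0),
    output  = count_if(function(el) any(nzchar(el$output)))
  )

  # 2. print a short report
  cat(sprintf("%d elements contain %s", counts, paste0(names(counts), "s")),
      sep = ", ")
  cat("\n")

  # 3. return the counts invisibly (a named vector here; see the options below)
  invisible(counts)
}

# sqrt(-1) returns NaN with a warning, so it counts as a result and a warning
x <- as_mapped_quietly(map(c(1, -1, 4), quietly(sqrt)))
counts <- summary(x)
```

The only open design question in the sketch is what step 3 hands back.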
But that output could look like:
# a) a named vector
result = c(result = 5, warning = 3, message = 1, output = 0)
#  result warning message  output
#       5       3       1       0
# b) a wide data frame with 1 row
result = tibble(result = 5, warning = 3, message = 1, output = 0)
# # A tibble: 1 x 4
#   result warning message output
#    <dbl>   <dbl>   <dbl>  <dbl>
# 1      5       3       1      0
# c) a long data frame with two columns
result = tibble(
  component = c("result", "warning", "message", "output"),
  count = c(5, 3, 1, 0))
# # A tibble: 4 x 2
#   component count
#   <chr>     <dbl>
# 1 result        5
# 2 warning       3
# 3 message       1
# 4 output        0
I'm not sure which of these best fits the likely use cases of this package. I imagine you would want to use this in a tidy workflow with summarise(), but I'm not sure if there's a one-step way to produce several summary columns with one function.
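On the one-step question: as far as I know, dplyr 1.0.0 and later let summarise() unpack an unnamed expression that returns a data frame into several columns. A minimal sketch, assuming that behaviour and a hypothetical helper count_components() (not part of the package) that returns the wide one-row shape from option b):

```r
library(dplyr)
library(purrr)

# Hypothetical helper: one-row tibble of counts for a column of
# quietly() outputs.
count_components <- function(col) {
  tibble(
    result  = sum(map_lgl(col, ~ !is.null(.x$result))),
    warning = sum(map_lgl(col, ~ length(.x$warnings) > 0)),
    message = sum(map_lgl(col, ~ length(.x$messages) > 0)),
    output  = sum(map_lgl(col, ~ any(nzchar(.x$output))))
  )
}

df <- tibble(group = c("a", "a", "b"), input = c(1, -1, 4)) %>%
  mutate(root = map(input, quietly(sqrt)))

# The unnamed data-frame result is spread into one column per component
out <- df %>%
  group_by(group) %>%
  summarise(count_components(root))
```

If that holds, the wide one-row shape would be the one that drops straight into summarise(), while the long shape would need a pivot first.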
The closest analogue I can think of is broom::glance(), which produces a one-row summary, but of a single object rather than an entire column. So usually you'd map glance() over a list-column and (I imagine) bolt it on to the side of a data frame using bind_cols().
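For comparison, the glance() pattern described above might look roughly like this. The models and column names are made up purely for illustration, using the built-in mtcars data:

```r
library(dplyr)
library(purrr)
library(broom)

# One lm() per cylinder group, stored in a list-column
by_cyl <- split(mtcars, mtcars$cyl)
models <- map(by_cyl, ~ lm(mpg ~ wt, data = .x))
df <- tibble(cyl = names(models), model = unname(models))

# glance() gives one row per model; stack the rows, then bolt them on
glanced <- bind_cols(df, bind_rows(map(df$model, glance)))
```

Here each element contributes its own row, whereas a column-level summary collapses the whole column into a single row.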
Does anyone have any thoughts on what the best approach here is?
EDIT: another approach is not to sweat the details too much, concentrate on summary()'s utility as an interactive function, and separately implement a helper for each component, like count_errors(), count_results(), etc., that could each be used separately in a summarise() call.
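A rough sketch of what those helpers could look like for a safely-mapped column. The function bodies are my assumption, based only on safely() returning a list with result and error elements:

```r
library(dplyr)
library(purrr)

# Assumed implementations: count elements whose component is not NULL
count_results <- function(col) sum(map_lgl(col, ~ !is.null(.x$result)))
count_errors  <- function(col) sum(map_lgl(col, ~ !is.null(.x$error)))

# log("a") throws an error; the other two inputs succeed
df <- tibble(input = list(1, "a", 10)) %>%
  mutate(logged = map(input, safely(log)))

out <- df %>%
  summarise(
    n_results = count_results(logged),
    n_errors  = count_errors(logged)
  )
```

Each helper returns a single number, so it works in summarise() without relying on any data-frame unpacking.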