Quickly scanning purrr side effects, and using pillar to style the report

rensa · October 26, 2018, 3:08am

I've started using purrr::safely() and purrr::quietly() to tidily do things to nested data frames, like building a regression model or printing a ggplot. I soon realised that I wanted to see at a glance when things were going wrong, and in which cases, so I built a couple of helpers, safely_status() and quietly_status(), to scan the output. Here's how they work:

library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 3.5.1
#> Warning: package 'dplyr' was built under R version 3.5.1

# here they are: safely_status() and quietly_status() ================================

safely_status = function(x, .prefix_err = 'Error:') {
  map(x,
    ~ if (is.null(.$result)) {
      paste(.prefix_err, .$error$message)
    } else { 'OK' }) %>%
  as_vector()
}

quietly_status = function(x, .prefix_err = 'Error:', .prefix_warn = 'Warning:',
  .prefix_msg = 'Message:', .text_ok = 'OK') {
  # quietly() names its components `warnings` and `messages`; spell them out
  # rather than relying on partial matching by `$`
  map(x,
      ~ if (is_empty(.$warnings)) {
        if (is_empty(.$messages)) {
          .text_ok
        } else { paste(.prefix_msg, .$messages) }
      } else { paste(.prefix_warn, .$warnings) }) %>%
  as_vector()
}

# example use =======================================================================

safe_log = safely(log)
quiet_log = quietly(log)

test = 
  # tidy up and trim down for the example
  mtcars %>%
  rownames_to_column(var = "car") %>%
  as_data_frame() %>%
  select(car, cyl, disp, wt) %>%
  # spike some rows in cyl == 4 to make them fail
  mutate(wt = case_when(
    wt < 2 ~ -wt,
    TRUE ~ wt)) %>%
  # nest and do some operations quietly()
  nest(-cyl) %>%
  mutate(
    qlog = map(data, ~ quiet_log(.$wt)),
    qlog_status = quietly_status(qlog),
    # optional: overwrite the "quiet" output with just successful results
    qlog = map(qlog, "result"))

test
#> # A tibble: 3 x 4
#>     cyl data              qlog       qlog_status           
#>   <dbl> <list>            <list>     <chr>                 
#> 1     6 <tibble [7 x 3]>  <dbl [7]>  OK                    
#> 2     4 <tibble [11 x 3]> <dbl [11]> Warning: NaNs produced
#> 3     8 <tibble [14 x 3]> <dbl [14]> OK

Created on 2018-10-26 by the reprex package (v0.2.0).

So this is nice, but there are some flaws: for one, a single call might return more than one warning or message, or both. I'm thinking of reworking this into a traffic-light system, where there are grey blocks (like those currently used for sparklines in tibble) for each returned type (result, warning, message, output), and they're coloured (and, for colour-blind users, otherwise altered, say full-height vs. quarter-height) if they're present in the output.
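As a rough illustration of the "more than one warning or message" problem, here is a minimal sketch that collapses everything a quietly() call captured into a single string per element. The name quietly_status_all() and its arguments are made up for this example; only purrr::quietly(), map() and map_chr() are real purrr functions.

```r
library(purrr)

# Hypothetical helper: report every captured warning and message,
# instead of only the first kind that is found.
quietly_status_all <- function(x, .text_ok = "OK", .sep = "; ") {
  map_chr(x, function(el) {
    parts <- c(
      if (length(el$warnings) > 0) paste("Warning:", el$warnings),
      # messages keep their trailing newline, so trim it
      if (length(el$messages) > 0) paste("Message:", trimws(el$messages))
    )
    if (length(parts) == 0) .text_ok else paste(parts, collapse = .sep)
  })
}

# A toy function that always emits a message and sometimes a warning
noisy <- quietly(function(x) {
  message("checking input")
  if (any(x < 0)) warning("negative input")
  log(abs(x))
})

res <- map(list(1, -1), noisy)
status <- quietly_status_all(res)
status
```

Because the result is still one string per element, it can be dropped into mutate() in the same way as quietly_status().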

So I have two questions:

1. Am I implementing this the best way in the first place (mapping over a list column and returning a string for each element)?
2. Is pillar the right way to go about styling the status column? Are there any good tutorials for using pillar this way?

Thanks

rensa · October 28, 2018, 3:29am

I've neeeaaarly gotten this going in a package called collateral, so if anyone with experience extending tibble can help me get over the line with it, I'd really appreciate it

devtools::install_github('rensa/collateral')

I've changed approach: instead of pulling a character vector out of the mapped safely() and quietly() outputs and then trying to style that, I'm replacing safely() and quietly() with my own variants that return wrapped versions of purrr:::capture_error() and purrr:::capture_output() (the functions that actually do the work under the hood). The upshot is that the output of a function wrapped with safely() or quietly() (before mapping) has an S3 class attached (predictably, "safely" or "quietly").

The rest follows the extending tibble vignette largely by implementing a format() function for these classes and then attempting to register implementations of pillar_shaft() for them.
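To make the idea concrete, here is a minimal sketch of a classed wrapper with a format() method. It is not the collateral source: it wraps the exported purrr::quietly() rather than the internal capture functions, and the names quietly_classed() and "quietly_result" are invented for the example. The rule for the "O" flag (non-empty captured output) is also an assumption and may differ from what collateral does.

```r
library(purrr)

# Hypothetical wrapper: behaves like purrr::quietly(), plus a class tag
quietly_classed <- function(.f) {
  quiet_f <- purrr::quietly(.f)
  function(...) structure(quiet_f(...), class = "quietly_result")
}

# One flag per component: Result, Output, Message, Warning ("_" when absent)
format.quietly_result <- function(x, ...) {
  el <- unclass(x)
  paste(
    if (is.null(el$result)) "_" else "R",
    if (any(nzchar(el$output))) "O" else "_",
    if (length(el$messages) > 0) "M" else "_",
    if (length(el$warnings) > 0) "W" else "_"
  )
}

print.quietly_result <- function(x, ...) {
  cat(format(x), "\n")
  invisible(x)
}

quiet_log <- quietly_classed(log)
quiet_log(-1)
```

In a package, the methods would also need to be registered in the NAMESPACE (for example via roxygen2 @export tags) for dispatch to work outside the package's own environment.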

When looking at individual elements of safely() or quietly() output, this works as expected and is styled correctly. The list column of mapped output is another story: it looks like a regular list column, apart from the type label supplied by type_sum():

library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 3.5.1
#> Warning: package 'dplyr' was built under R version 3.5.1
library(collateral)
#> 
#> Attaching package: 'collateral'
#> The following objects are masked from 'package:purrr':
#> 
#>     quietly, safely

quiet_log = collateral::quietly(log)

test =
  # tidy up and trim down for the example
  mtcars %>%
  tibble::rownames_to_column(var = "car") %>%
  tibble::as_data_frame() %>%
  dplyr::select(car, cyl, disp, wt) %>%
  # spike some rows in cyl == 4 to make them fail
  dplyr::mutate(wt = dplyr::case_when(
    wt < 2 ~ -wt,
    TRUE ~ wt)) %>%
  # nest and do some operations quietly()
  tidyr::nest(-cyl) %>%
  dplyr::mutate(qlog = map(data, ~ quiet_log(.$wt)))

test
#> # A tibble: 3 x 3
#>     cyl data              qlog        
#>   <dbl> <list>            <list>      
#> 1     6 <tibble [7 x 3]>  <collat [4]>
#> 2     4 <tibble [11 x 3]> <collat [4]>
#> 3     8 <tibble [14 x 3]> <collat [4]>

test$qlog[[2]]
#> R O _ W

Created on 2018-10-28 by the reprex package (v0.2.0).

(note that reprex doesn't capture the ANSI colours)

Ideally R O _ W would replace <collat [4]>! I'm not quite sure why pillar_shaft.quietly() isn't working, but it's definitely present in the environment if I check for it with pillar_shaft.quietly. I'm wondering if it's not being registered as an S3 method properly, as:

> pillar_shaft(test$qlog)
Error in pillar_shaft(test$qlog) : could not find function "pillar_shaft"
> collateral::pillar_shaft(test$qlog)
Error: 'pillar_shaft' is not an exported object from 'namespace:collateral'
> pillar_shaft.quietly(test$qlog)
[1] "pillar_shaft dispatched"
Called from: eval(expr, p)
Browse[1]> n
debug at C:\Users\rensa\Code\collateral/R/pillar.r#18: pillar::new_pillar_shaft_simple(out, align = "left", width = 7, 
    min_width = 7, na_indent = 0)
Browse[2]> 
0.963174317773006, 1.05605267424931 , 1.16782735768958 , 1.24126858906963 , 1.23547147138531 , 1.23547147138531 , 1.01884732019925 ,                  
0.841567185678219, 1.16002091679675 , 1.14740245283754 , 0.78845736036427 , NaN              , NaN              , 0.902191807494653, NaN              , 0.76080582903376 , NaN              , 1.02245092770255 ,                  , NaNs produced    
1.23547147138531, 1.27256559579155, 1.4036429994545 , 1.31640823365572, 1.3297240096315 , 1.65822807660353, 1.69083355063809, 1.67616154447701, 1.25846098961001, 1.23401692567431, 1.34547236659964, 1.34677360295761, 1.15373158788919, 1.27256559579155,

But also, pillar_shaft doesn't appear in the output of:

> devtools::missing_s3()
Loading collateral
 [1] "[.quietly"            "[.safely"             "c.quietly"            "c.safely"            
 [5] "format.quietly"       "format.safely"        "is_vector_s3.quietly" "is_vector_s3.safely" 
 [9] "obj_sum.quietly"      "obj_sum.safely"       "print.quietly"        "print.safely"        
[13] "type_sum.quietly"     "type_sum.safely"

I'm a liiiiittle bit out of my depth on this part, so if anyone has tips, that would be great

rensa · October 28, 2018, 3:59am

I feel like maybe I'm misunderstanding the section on list columns in that vignette, and the latlon example class is built on complex numbers, so a column of latlon objects is still a vector... that might explain this:

> pillar_shaft(test$qlog[[2]])
[1] "pillar_shaft dispatched"
Called from: eval(expr, p)
Browse[1]> n
debug at C:\Users\rensa\Code\collateral/R/pillar.r#18: pillar::new_pillar_shaft_simple(out, align = "left", width = 7, 
    min_width = 7, na_indent = 0)
Browse[2]> 
R O _ W
> pillar_shaft(test$qlog)
<collat>
<collat>
<collat>

Not sure what to do about it, though: it seems like pillar_shaft gets called on the column rather than on each element, so unless there's a class attached to the list column itself (and safely() and quietly() aren't necessarily used inside map(), so that may not be the case), I don't know what I can do.
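One way to picture "a class attached to the list column" is the sketch below: the whole list gets a class, and pillar_shaft() is implemented for that class, building one string per element. The class name "quietly_col" and the helper as_quietly_col() are hypothetical; pillar_shaft() and new_pillar_shaft_simple() are the pillar functions already used above. Whether a tibble keeps such a class through subsetting and printing is not shown here, and probably depends on also providing methods like `[`.

```r
library(pillar)
library(purrr)

# Hypothetical constructor: tag the whole list, not its elements
as_quietly_col <- function(x) {
  structure(x, class = c("quietly_col", "list"))
}

# Dispatches on the column; formats each element into a short flag string
pillar_shaft.quietly_col <- function(x, ...) {
  out <- vapply(unclass(x), function(el) {
    paste(
      if (length(el$messages) > 0) "M" else "_",
      if (length(el$warnings) > 0) "W" else "_"
    )
  }, character(1))
  pillar::new_pillar_shaft_simple(out, align = "left")
}

quiet_log <- quietly(log)
col <- as_quietly_col(list(quiet_log(1), quiet_log(-1)))
shaft <- pillar_shaft(col)
```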

rensa · November 2, 2018, 6:52am

So it turns out I've gotten this working! I ended up throwing out the previous approach and creating two map variants instead:

map_safely() automatically wraps safely(), and
map_quietly() automatically wraps quietly().

Both format their output (although not in knitted documents... yet!). Very happy with the result!

library(tidyverse)
library(collateral)

test =
  # tidy up and trim down for the example
  mtcars %>%
  rownames_to_column(var = "car") %>%
  as_data_frame() %>%
  select(car, cyl, disp, wt) %>%
  # spike some rows in cyl == 4 to make them fail
  mutate(wt = dplyr::case_when(
    wt < 2 ~ -wt,
    TRUE ~ wt)) %>%
  # nest and do some operations quietly()
  nest(-cyl) %>%
  mutate(qlog = map_quietly(data, ~ log(.$wt)))

test
#> # A tibble: 3 x 3
#>     cyl data              qlog
#>   <dbl> <list>            <collat>
#> 1     6 <tibble [7 x 3]>  R O _ _
#> 2     4 <tibble [11 x 3]> R O _ W
#> 3     8 <tibble [14 x 3]> R O _ _

[Screenshot: collateral output]
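For anyone curious what a map variant along these lines might look like, here is a minimal sketch. It is not the collateral implementation: map_quietly_sketch() and the "quietly_mapped" class are invented names, and the styling step is left out. It only shows the shape of the idea: wrap the function with quietly() inside the map call, then class the returned list so that column-level methods have something to dispatch on.

```r
library(purrr)

# Hypothetical map variant: wraps .f in quietly() and classes the result
map_quietly_sketch <- function(.x, .f, ...) {
  out <- purrr::map(.x, purrr::quietly(.f), ...)
  structure(out, class = c("quietly_mapped", "list"))
}

# Weights by cylinder count, with one value spiked so that log() warns
nested <- split(mtcars$wt, mtcars$cyl)
nested[["4"]][1] <- -nested[["4"]][1]

res <- map_quietly_sketch(nested, log)

# Number of captured warnings per group
counts <- vapply(unclass(res), function(el) length(el$warnings), integer(1))
counts
```

Since quietly() runs the function through as_mapper(), a formula such as ~ log(.x) should work in place of log here too.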

cderv · November 2, 2018, 7:02am

Wahou ! nice ! Very nice feature and nice to have the color! thank you!
