Cannot print histograms with skimr

Hi @Andrea!

Thanks for sticking with it and sharing what you find!!

Not sure if it will help you, but here is what I have put together to do what I need for now, until skimr eventually works on Windows with .Rmd.

The code for it can be found in a script within the repository. I've been using it to create a .Rmd file that I keep with the rest of my project documentation, as a reference for how the variables in my data sets are distributed.

Source the R script in your .Rmd file and then use the function I wrote, make_plots(), to get a distribution plot and a summary stats table. It's not perfect, but for getting a rough impression of the data and documenting it, it's been working for me.

I really like having this for exploring the data set and looking for issues. My collaborators really like it too (they're old SAS users who like to see PROC MEANS and PROC FREQ).

Let me know if you try it.

Cheers

Thanks! It looks similar to DataExplorer, even though DataExplorer is missing the skimr table. I like very much the fact that make_plots doesn't require me to call a different function for categorical vs continuous variables.

I can't even get it to print a wide format when I render the Rmarkdown to HTML. Any ideas why?

Hi! Are you referring to my .Rmd example? Or to @ejlatour's script?

Using your example it worked (I got the wide format summary table), although the histogram still doesn't show up. I think my issue was that I used knitr::kable instead of skimr::kable.

Yep, under Windows histograms are not working for anyone in the rendered HTML file (they do show up in the console). By the way, there was an interesting blog post by @kevinushey yesterday about encodings in R: https://kevinushey.github.io/blog/2018/02/21/string-encoding-and-r/ I guess it could be related to what we're seeing here... I don't think it will be possible to show the histograms in HTML under Windows anytime soon. It's not a big deal, though: what I use most are skimr summaries, not its histograms.

The same problem occurs on macOS using skimr_1.0.2. My workaround is to render to markdown and use a markdown engine (Marked 2, an OSX app) to convert to PDF. That works fine.

Hi @Andrea,

I don't have a solution to your problem, but I have a thought that might (or might not) shift your thinking about it. I feel that it might not matter if skimr doesn't render properly: packages like skimr and DataExplorer are convenient for getting a first look at the data with one line of code. They can be huge time savers when you want an idea of the overall data structure.

Wanting to render any of their outputs as HTML or PDF, however, might be worth pausing to think about. Rendered documents are usually meant to be shared with others (reports, papers, etc.) and I am not sure these packages produce that kind of output. I don't think that it is their goal.

Of course, I cannot speak to your work, of which I know nothing. And there could be many applications that I fail to see where it makes sense to use skimr outputs with rmarkdown. But I suggest you think about it. Maybe (and maybe not!), after having a first look at some data with skimr, you might want to spend time extracting something else from your data to render and share.

This is just a thought, though, and I will be happy to be corrected with good reasons why this could be a good idea! :slight_smile:

I find it useful to print out and annotate etc. as I'm working, along with stacked marginal tables (e.g. mDF <- reshape2::melt(df); table(mDF$Var2, mDF$value)).
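
For anyone who wants to try the stacked marginal tables idea, here is a minimal sketch. It rests on some assumptions: the toy data frame is made up for illustration, and it is converted with as.matrix() before melting. reshape2::melt() only produces a Var2 column for matrix or array input; on a plain data frame the column is called variable instead.

```r
# Toy data frame of categorical answers (hypothetical data, for illustration only)
df <- data.frame(
  q1 = c("yes", "no", "yes", "yes"),
  q2 = c("no", "no", "yes", "no"),
  stringsAsFactors = FALSE
)

# Melting a matrix gives one row per cell, with columns
# Var1 (row index), Var2 (original column name) and value
mDF <- reshape2::melt(as.matrix(df))

# Cross-tabulate variable name against observed level:
# each row of the result is the marginal table for one variable
marginals <- table(mDF$Var2, mDF$value)
print(marginals)
```

Each row of the result is the frequency table for one variable, so all the marginals sit stacked in a single printout that is easy to annotate by hand.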