My colleagues have come up with a preferred style for boxplots that they create using MATLAB. I'd like to replicate the style in ggplot2.
The features of interest are:
The colour and fill aesthetics are the same, but the middle (median) line is white.
An errorbar style for the whiskers ('serifs' or 'crossbars' at the ends of the whiskers).
We always use stat = 'identity', as we choose to present 2.5, 25, 50, 75 and 97.5 percentiles.
I can't work out how to use stat_boxplot to add an errorbar in this case. If it is possible, can I also control the width/length of the crossbars?
Is there a clean way to colour the middle line white? I think it would require overdrawing, and by accident I found I can do this with stat_boxplot(geom = 'errorbar', aes(y = median, group = year)), but it doesn't seem like a very clean way to do it (I assume I've actually drawn two lines at the same y value, with a zero-length line between them?).
Thanks for your time and help. A not-very-developed reprex follows.
So this, featuring geom_boxplot overdrawn with geom_errorbar and geom_linerange, gives me what I'd like.
So I've sort of answered my own question. I guess I could do it more cleanly with geom_polygon rather than geom_boxplot with 0 width lines. But maybe not, as geom_boxplot automatically handles widths and positioning if there are multiple boxplots for each year (e.g. two variables).
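For anyone reading without the reprex, the overdrawing approach can be sketched roughly as follows. This is not the original reprex: the data frame, its column names (year, q025, q25, q50, q75, q975) and the colours are made up for illustration, and the median is overdrawn with a zero-height geom_errorbar rather than geom_linerange.

```r
library(ggplot2)

# Hypothetical summary data: one row per year, percentiles precomputed.
df <- data.frame(
  year = factor(2001:2004),
  q025 = c(1.0, 1.5, 2.0, 1.2),
  q25  = c(2.0, 2.5, 3.0, 2.2),
  q50  = c(3.0, 3.5, 4.0, 3.2),
  q75  = c(4.0, 4.5, 5.0, 4.2),
  q975 = c(5.0, 5.5, 6.0, 5.2)
)

p <- ggplot(df, aes(x = year)) +
  # Whiskers with crossbars; `width` sets the crossbar length.
  geom_errorbar(aes(ymin = q025, ymax = q975),
                colour = "steelblue", width = 0.3) +
  # Box built from the precomputed percentiles; colour and fill match.
  geom_boxplot(
    aes(ymin = q025, lower = q25, middle = q50, upper = q75, ymax = q975),
    stat = "identity", position = "identity",
    colour = "steelblue", fill = "steelblue", width = 0.6
  ) +
  # Median overdrawn in white: ymin == ymax gives one horizontal line
  # spanning the same width as the box.
  geom_errorbar(aes(ymin = q50, ymax = q50), colour = "white", width = 0.6)

p
```

The layer order matters here: the whiskers go first so the box covers their middle section, and the white median goes last so it sits on top of the box.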
I am pretty happy with what I have there, but I'd like the flexibility of having it in a custom geom so that I can benefit from faceting etc. So I've tried to create a geom, copying the structure from what I can see in the geom_boxplot and GeomBoxplot code, and using the bulk of the code in the reprex above:
Ah, sorry, I don’t know how I was able to miss that. In ggplot2 the parameters available to a geom are by default deduced from the arguments of its draw_group/draw_panel methods. But since these aren’t used there, you have to put them in the extra_params field of the class.
For the aesthetics, the aesthetics a geom reacts to are deduced from the required_aes, default_aes, optional_aes and non_missing_aes fields. You don’t provide any default for fill, so the geom assumes you don’t use it.
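To make the two points above concrete, here is a minimal sketch of a geom that declares a parameter through extra_params and gives fill a default. It is not the geom_kat() code from this thread: GeomPctBox, geom_pctbox(), median_colour and the example data are hypothetical names made up for illustration, and it assumes ggplot2 3.4.0 or later (it uses the linewidth aesthetic).

```r
library(ggplot2)  # assumes ggplot2 >= 3.4.0 (linewidth aesthetic)

# Hypothetical geom, for illustration only.
GeomPctBox <- ggproto("GeomPctBox", Geom,
  required_aes = c("x", "ymin", "lower", "middle", "upper", "ymax"),
  # fill has a default here, so ggplot2 knows the geom responds to it.
  default_aes = aes(colour = "grey30", fill = "grey30", linewidth = 0.5,
                    linetype = 1, alpha = NA),
  # width is used only in setup_data(), not in draw_group(), so it is
  # declared here to be accepted as a parameter.
  extra_params = c("na.rm", "width"),
  setup_data = function(data, params) {
    w <- if (is.null(params$width)) 0.6 else params$width
    data$xmin <- data$x - w / 2
    data$xmax <- data$x + w / 2
    data
  },
  # median_colour is picked up as a parameter because it is an argument here.
  draw_group = function(data, panel_params, coord, median_colour = "white") {
    box <- data
    box$ymin <- data$lower
    box$ymax <- data$upper
    med <- data
    med$x <- data$xmin
    med$xend <- data$xmax
    med$y <- data$middle
    med$yend <- data$middle
    med$colour <- median_colour
    grid::grobTree(
      GeomErrorbar$draw_panel(data, panel_params, coord),  # whiskers
      GeomRect$draw_panel(box, panel_params, coord),       # box
      GeomSegment$draw_panel(med, panel_params, coord)     # median line
    )
  },
  draw_key = draw_key_rect
)

geom_pctbox <- function(mapping = NULL, data = NULL, ..., width = NULL,
                        median_colour = "white", na.rm = FALSE,
                        show.legend = NA, inherit.aes = TRUE) {
  layer(
    geom = GeomPctBox, stat = "identity", position = "identity",
    mapping = mapping, data = data,
    show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(width = width, median_colour = median_colour,
                  na.rm = na.rm, ...)
  )
}

# Made-up data to exercise the sketch.
df <- data.frame(year = factor(2001:2003),
                 q025 = c(1, 2, 1.5), q25 = c(2, 3, 2.5), q50 = c(3, 4, 3.5),
                 q75 = c(4, 5, 4.5), q975 = c(5, 6, 5.5))

p <- ggplot(df, aes(x = year, ymin = q025, lower = q25, middle = q50,
                    upper = q75, ymax = q975)) +
  geom_pctbox(colour = "steelblue", fill = "steelblue", width = 0.5)
```

GeomPctBox$parameters(TRUE) and GeomPctBox$aesthetics() list what the geom accepts, which is a quick way to check that width and fill are recognised. In this sketch the whisker crossbars share the box width; a separate crossbar width would need its own parameter.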
When I run the updated geom (again with the data from the original example), I get the following:
Error in `geom_kat()`:
! Problem while converting geom to grob.
ℹ Error occurred in the 1st layer.
Caused by error in `data_frame()`:
! Can't recycle `x` (size 4) to size 8.
And a secondary question: how can I specify default columns in default_aes? I've tried x = year, x = "year" and x = ~year so far.
Change of strategy over the weekend: I switched from trying to go from ggplot code to a geom to hacking the geom_boxplot code from GitHub (deleting outlier bits, switching to errorbar instead of segment whiskers, etc). I seem to have something that works now and will post that as an answer once I've polished it a wee bit.
In the meantime, I'm still interested to know how I can specify a column name as the default (e.g. to have the x, ymin, etc. aesthetics default to the year, q025, etc. columns).
So, @thomasp85 thanks for your help with the previous code, that has helped me tremendously, and don't spend any time thinking about the last error report. I had something missing from the data being passed to GeomErrorbar - I hit the same problem over the weekend with my new code and solved it.
Back in a bit with a solution, just in case anyone finds it helpful in the future.
Fantastic progress. Regarding the default aesthetics, there is no way to have a default that refers to a column name in the data. The technical reason is that the defaults are only added way later when the original dataset has been lost. The non-technical reason is that it is a bad idea because you couple your code tightly with the format of the data, something that ggplot2 in general goes against...
If you are dead-set on having a default column, you can modify geom_kat() to inspect the mapping argument and add an x mapping if none exists, but that would prevent you from setting the mapping in the ggplot() call, so it is not really something I'd recommend.
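For completeness, the mapping inspection could look something like the sketch below. The helper name with_default_x() is hypothetical, and the column names year, site and q50 are only placeholders.

```r
library(ggplot2)

# Hypothetical helper: add x = year to a mapping only when x is not mapped.
with_default_x <- function(mapping = NULL) {
  if (is.null(mapping)) mapping <- aes()
  if (!"x" %in% names(mapping)) {
    mapping$x <- aes(x = year)$x
  }
  mapping
}

m_default <- with_default_x()                        # x falls back to year
m_custom  <- with_default_x(aes(x = site, y = q50))  # x = site is kept
```

Inside geom_kat() this would be applied to the mapping argument before it is handed to layer(). The drawback is the one described above: a layer-level x takes precedence over an x set in ggplot(), so the default silently wins there.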