# Mean and median in one boxplot

Hi,
I need to place medians and means on one boxplot and annotate it:

``````
means <- aggregate(weight ~ group, PlantGrowth, mean)

medians <- aggregate(weight ~ group, PlantGrowth, median)

PlantGrowth |>
  summarize(ymin = quantile(weight, 0),
            lower = quantile(weight, 0.25),
            median = median(weight),
            mean = mean(weight),
            upper = quantile(weight, 0.75),
            ymax = quantile(weight, 1)) %>%
  ggplot(aes(x = group, y = weight, fill = group)) +
  geom_boxplot(stat = 'identity',
               aes(ymin = ymin, lower = lower, middle = mean, upper = upper, ymax = ymax)) +
  stat_summary(fun = mean, colour = "darkred", geom = "line",
               shape = 18, size = 3, show.legend = FALSE) +
  geom_text(data = means, aes(label = weight, y = weight + 0.08)) +
  geom_text(data = medians, aes(label = weight, y = weight + 0.08))
``````

but it throws an error:

``````
Error in FUN(X[[i]], ...) : object 'group' not found
``````

What am I doing wrong?

You summarised over PlantGrowth without applying any grouping, so it collapsed to a single row and the `group` column no longer exists.
Did you intend a grouped analysis? If so, the first step towards that would be:

``````
PlantGrowth |>
  group_by(group) |>
  summarize(ymin = quantile(weight, 0),
            lower = quantile(weight, 0.25),
            median = median(weight),
            mean = mean(weight),
            upper = quantile(weight, 0.75),
            ymax = quantile(weight, 1))
``````

Ah, thank you, but it still errors:

``````
PlantGrowth |>
  group_by(group) |>
  summarize(ymin = quantile(weight, 0),
            lower = quantile(weight, 0.25),
            median = median(weight),
            mean = mean(weight),
            upper = quantile(weight, 0.75),
            ymax = quantile(weight, 1)) +
  ggplot(aes(x = group, y = weight, fill = group)) +
  geom_boxplot(stat = 'identity',
               aes(ymin = ymin, lower = lower, middle = mean, upper = upper, ymax = ymax)) +
  stat_summary(fun = mean, colour = "darkred", geom = "line",
               shape = 18, size = 3, show.legend = FALSE) +
  geom_text(data = means, aes(label = weight, y = weight + 0.08)) +
  geom_text(data = medians, aes(label = weight, y = weight + 0.08))
``````

The immediate issue is that you used a `+` between the data creation step (`summarise`) and the start of the ggplot commands. ggplot layers are chained together with `+`, but only after the first `ggplot()` call; the data step has to be piped into it.
After that, you may need to rethink what you intend with `y=weight`, because your `summarise` has not preserved `weight`; it only contains various aggregations of it.
I generally advise against chaining data transformations into a ggplot call for anything non-trivial.
Make a dataset, or several, and load them explicitly into ggplot2 as needed. At least that's my preference; I find it easier to debug.
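The advice above can be sketched as two separate steps. This is only an illustration under a few assumptions: the names `box_stats` and `p_box` are made up here, the box's middle line is set to the median rather than the mean, and the mean is drawn as a separate point. Because the whiskers come straight from `min()` and `max()`, they run to the extremes instead of following the usual 1.5 × IQR rule.

```r
library(dplyr)
library(ggplot2)

# Step 1: build the per-group summary as its own object so it can be inspected.
box_stats <- PlantGrowth |>
  group_by(group) |>
  summarise(
    ymin   = min(weight),
    lower  = quantile(weight, 0.25),
    median = median(weight),
    mean   = mean(weight),
    upper  = quantile(weight, 0.75),
    ymax   = max(weight)
  )

# Step 2: hand the finished data frame to ggplot(); `+` only appears after ggplot().
# There is no `weight` column any more, so every y aesthetic names a summary column.
p_box <- ggplot(box_stats, aes(x = group, fill = group)) +
  geom_boxplot(
    aes(ymin = ymin, lower = lower, middle = median, upper = upper, ymax = ymax),
    stat = "identity"
  ) +
  geom_point(aes(y = mean), colour = "darkred", shape = 18, size = 3,
             show.legend = FALSE)

p_box
```

Printing `box_stats` before plotting makes it easy to confirm that there is one row per group and to see which columns are available to `aes()`.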

I simply want to make a boxplot of weight by group with medians and means on it. I might have overcomplicated it a bit.
Starting over:

``````
ggplot(data = PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot() +
  stat_summary(fun = mean, colour = "darkred", geom = "line",
               shape = 18, size = 3, show.legend = FALSE) +
  geom_text(data = means, aes(label = weight, y = weight + 0.08)) +
  geom_text(data = medians, aes(label = weight, y = weight + 0.08))
``````

I am almost there, but how do I add lines for the means?
Something like this:
https://stackoverflow.com/questions/69444091/mean-and-median-boxplot-legend-for-geom-boxplot-in-the-ggplot2-function

I think "crossbar" is the way to go.

``````
ggplot(data = PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot() +
  stat_summary(fun = mean, color = "darkred", geom = "crossbar", show.legend = FALSE) +
  geom_text(data = means, aes(label = weight, y = weight + 0.08)) +
  geom_text(data = medians, aes(label = weight, y = weight + 0.08))
``````
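For readers who have not met "crossbar" before: `stat_summary()` computes one summary value per group, and the crossbar geom draws a horizontal bar at that value. The variant below is a sketch, not the code from the answer. It additionally passes `fun.min` and `fun.max` (both set to `mean`) so that the bar's lower and upper limits are given explicitly and it collapses to a single line. The `width = 0.75` and the name `p_mean` are assumptions chosen here, the former to roughly match the default box width.

```r
library(ggplot2)

p_mean <- ggplot(PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot() +
  # One mean per group; ymin and ymax are set to the same value,
  # so the crossbar has zero height and shows up as a line.
  stat_summary(
    fun = mean, fun.min = mean, fun.max = mean,
    geom = "crossbar", colour = "darkred", width = 0.75,
    show.legend = FALSE
  )

p_mean
```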

I had never heard of "crossbar" before, thank you very much. One more question: how do I annotate the plot so that people know which line is the median and which is the mean?
The annotation could be on the boxplots or in the legend.

Perhaps this sort of idea

``````
library(dplyr)
library(ggplot2)

# One row per group, with ready-made label strings for the mean and the median
smry_text <- PlantGrowth |>
  group_by(group) |>
  summarise(across(.cols = weight,
                   .fns = list(mean = mean, median = median))) |>
  mutate(mean_text = paste0("Mean : ", weight_mean),
         median_text = paste0("\n\nMedian : ", weight_median))

ggplot(data = PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot() +
  stat_summary(fun = mean, color = "darkred", geom = "crossbar", show.legend = TRUE) +
  # Both labels are anchored at the mean; the "\n\n" prefix pushes the median text below
  geom_text(data = smry_text, aes(label = median_text, y = weight_mean),
            nudge_y = -.2) +
  geom_text(data = smry_text, aes(label = mean_text, y = weight_mean),
            nudge_y = -.2, color = "darkred", fontface = "bold")
``````


Thank you very much indeed, that is exactly what I wanted and a very elegant solution.
In the meantime I have tried this:

``````r
library(tidyverse)
data(PlantGrowth)

means <- aggregate(weight ~ group, PlantGrowth, mean)

medians <- aggregate(weight ~ group, PlantGrowth, median)

ggplot(data = PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot() +
  stat_summary(fun = mean, color = "darkred", geom = "crossbar", show.legend = TRUE) +
  geom_text(data = means, aes(label = weight, y = weight + 0.08)) +
  geom_text(data = medians, aes(label = weight, y = weight + 0.08)) +
  geom_errorbar(aes(ymin = min(weight), ymax = max(weight)), linetype = 1, width = 0.5) +
  stat_summary(
    fun = mean, geom = "errorbar", aes(ymax = ..y.., ymin = ..y.., color = "Mean"),
    width = 1.15, linetype = "dashed"
  ) +
  stat_summary(
    fun = median, geom = "errorbar", aes(ymax = ..y.., ymin = ..y.., color = "Median"),
    width = 1.15, linetype = "solid"
  ) +
  scale_colour_manual("Stats", values = c(Median = "black", Mean = "darkred"))
``````

Created on 2022-09-27 with reprex v2.0.2

I have one more question, if I may: how do I tweak my legend? I mean making the boxes under the Stats title a bit bigger, making the lines inside the rectangles more visible, etc. I would be grateful for a hint on where to start. Thank you.

You can change the legend's appearance via ggplot2's theme. Here's a brief guide to resizing the legend:
How to Change Legend Size in ggplot2 (With Examples) (statology.org)
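As a possible starting point for the legend question, the sketch below enlarges the legend keys and text through `theme()` and thickens the lines inside the keys with `guide_legend(override.aes = ...)`. All sizes are arbitrary values picked for illustration, `p_legend` is a made-up name, the plot is simplified (no fill, no text labels), and the `linewidth` aesthetic assumes ggplot2 3.4.0 or later (older versions used `size` for line thickness).

```r
library(ggplot2)

p_legend <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot() +
  stat_summary(fun = mean, fun.min = mean, fun.max = mean, geom = "errorbar",
               aes(colour = "Mean"), width = 0.75) +
  stat_summary(fun = median, fun.min = median, fun.max = median, geom = "errorbar",
               aes(colour = "Median"), width = 0.75) +
  scale_colour_manual("Stats", values = c(Median = "black", Mean = "darkred")) +
  # Thicker lines in the legend keys only; assumes ggplot2 >= 3.4.0 for `linewidth`
  guides(colour = guide_legend(override.aes = list(linewidth = 1.2))) +
  theme(
    legend.key.size  = unit(1.2, "cm"),                     # overall key size
    legend.key.width = unit(1.8, "cm"),                     # wider keys, longer lines
    legend.title     = element_text(size = 14, face = "bold"),
    legend.text      = element_text(size = 12)
  )

p_legend
```

`override.aes` changes only how the keys are drawn, so the lines on the plot itself keep their original thickness.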

Hi,
when I add horizontal whiskers (to Nir's code) with geom_errorbar, something is wrong: all three whiskers have the same length, which does not match the data. I think this is because of this line of code:

``````
# geom_errorbar(aes(ymin=min(weight),ymax=max(weight)), linetype = 1, width = 0.15)

smry_text <- PlantGrowth |>
  group_by(group) |>
  summarise(across(.cols = weight,
                   .fns = list(mean = mean, median = median))) |>
  mutate(mean_text = paste0("Mean : ", weight_mean),
         median_text = paste0("\n\nMedian : ", weight_median))

ggplot(data = PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot() +
  stat_summary(fun = mean, color = "darkred", geom = "crossbar", show.legend = TRUE) +
  geom_errorbar(aes(ymin = min(weight), ymax = max(weight)), linetype = 1, width = 0.15) +
  geom_text(data = smry_text, aes(label = median_text, y = weight_mean),
            nudge_y = -.2) +
  geom_text(data = smry_text, aes(label = mean_text, y = weight_mean),
            nudge_y = -.2, color = "darkred", fontface = "bold")
``````

How should I change it so that it is correct, please?
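One likely explanation is that `min(weight)` and `max(weight)` inside `aes()` are evaluated over the whole `weight` column rather than per group, so every group receives the same pair of limits. A possible fix, sketched below, is to let the boxplot stat compute the whisker ends per group and draw them with the errorbar geom. The name `p_whiskers` is made up for this example, and the text labels are left out to keep it short.

```r
library(ggplot2)

p_whiskers <- ggplot(PlantGrowth, aes(x = group, y = weight, fill = group)) +
  # Whisker caps: stat_boxplot() works per group, so each cap sits at that
  # group's own whisker ends. Added first so the box is drawn on top of it.
  stat_boxplot(geom = "errorbar", width = 0.15) +
  geom_boxplot() +
  stat_summary(fun = mean, fun.min = mean, fun.max = mean,
               geom = "crossbar", colour = "darkred", show.legend = FALSE)

p_whiskers
```

The caps follow the boxplot's own whisker rule, so points drawn as outliers lie beyond them. If the caps should instead mark each group's raw minimum and maximum, `stat_summary(fun.min = min, fun.max = max, geom = "errorbar")` is an alternative worth trying.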

In the meantime I have managed to do this, but I still can't label the second outlier, the one the arrow points to. Please help: