Ggbeeswarm plot

I have created this plot:

using this code:


ggplot(data = dats2,
       aes(x = sex, y = age)) +
  theme_bw(base_size = 16) +
  geom_quasirandom(col = "black", varwidth = TRUE, groupOnX = TRUE, alpha = 3/4, size = 2)

How should I interpret the position of a point along the x-axis, i.e. its distance from the midline of each group and from the value 1.5?

Sex variable is categorical: 1 = women, 2 = men
Age variable is numerical.

How should I read this plot?

What is the difference between the points I have encircled in red?

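It is hard to tell without the data, but you are not plotting sex as a categorical variable. An x-axis that shows values such as 1.5 suggests sex is stored as a number rather than a factor.

See the difference in the x-axis here: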


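library(ggplot2)
library(ggbeeswarm)
dat1 <- data.frame(sex = as.factor(sample(1:2, 20, replace = TRUE)), yy = rnorm(20))  # sex stored as a factor
ggplot(dat1, aes(sex, yy)) + geom_beeswarm()  # discrete x-axis: one swarm per factor level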

The points are spread more or less randomly along the x-axis, and they are spread wider where more points fall in the same range of y values. So the points at 2.25 aren't "more men" than the points at 1.75: everything in the cloud around 2 is male, and the other cloud is female.
You could also use categorical labels on the x-axis that state "female" and "male" directly.
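
A minimal sketch of that suggestion, assuming sex is stored as the numbers 1 and 2. The data frame below is a made-up stand-in for dats2 (the real data is not shown in the thread), and groupOnX is left out because recent ggbeeswarm versions may warn that it is deprecated:

```r
library(ggplot2)
library(ggbeeswarm)

# Hypothetical stand-in for dats2: sex coded 1/2 as numbers, age numeric
set.seed(1)
dats2 <- data.frame(sex = sample(1:2, 200, replace = TRUE),
                    age = round(rnorm(200, mean = 50, sd = 10)))

# Turn the numeric codes into a labelled factor so the x-axis becomes discrete
dats2$sex <- factor(dats2$sex, levels = c(1, 2), labels = c("women", "men"))

p <- ggplot(dats2, aes(x = sex, y = age)) +
  theme_bw(base_size = 16) +
  geom_quasirandom(varwidth = TRUE, alpha = 3/4, size = 2)
p
```

With the factor in place the axis shows only "women" and "men", so there is no 1.5 tick left to interpret.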

Thank you both for your kind replies. So can we say that it is another form of scatter plot that shows the distribution of the data in a different way?

Kind of. The idea is to avoid overplotting, so that every data point stays visible even where points would otherwise overlap.
If you replaced it with geom_point(), you would just see a black line between 40 and 60, because many data points overlap there. With the beeswarm, overlapping points are moved apart along the x-axis. The width of the cloud therefore reflects how many points overlap at that age, and so it shows the distribution.
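
To see the overplotting for yourself, here is a small sketch with simulated data. The data frame demo and its values are invented for illustration and are not the data from the question:

```r
library(ggplot2)
library(ggbeeswarm)

set.seed(42)
# Made-up data: ages rounded to whole years, so many points share coordinates
demo <- data.frame(sex = factor(sample(c("women", "men"), 300, replace = TRUE)),
                   age = round(rnorm(300, mean = 50, sd = 8)))

# Number of rows that sit exactly on top of an earlier row
n_overlapping <- sum(duplicated(demo))

# Same data twice: geom_point() stacks duplicates, geom_quasirandom() spreads them
p_point <- ggplot(demo, aes(sex, age)) + geom_point(alpha = 3/4)
p_swarm <- ggplot(demo, aes(sex, age)) + geom_quasirandom(alpha = 3/4)

p_point
p_swarm
```

In the first plot each group collapses into a vertical line, while in the second the same points fan out sideways where the ages are most common.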