geom_sina vs geom_jitter on the vertical axis (geom_sina with a distribution of ints)


#1

(hope this is a reasonable place for ggforce questions)

I’m intrigued by geom_sina, but it doesn’t seem to work for my data. The data is a distribution of ints between 1 and 5 (no in-between values). Looking at the distribution using geom_jitter works fine:

library(tidyverse)
library(ggforce)

# create int version of area
midwest$area_ints <- (midwest$area * 100) %>% floor()

ggplot(midwest, aes(state, area_ints)) + 
  geom_jitter() 

That works in the sense that it jitters the points both horizontally and vertically, so one can see the distribution. geom_sina spreads the values horizontally as expected, but not vertically. geom_violin works as well (and actually shows the issue best when combined with geom_sina):

library(tidyverse)
library(ggforce)

# create int version of area
midwest$area_ints <- (midwest$area * 100) %>% floor()

ggplot(midwest, aes(state, area_ints)) + 
  geom_violin() + geom_sina() 

I just get horizontal lines at each integer value, as though the jittering were happening only horizontally and not vertically as well. The expected behavior, if I understand correctly, is that geom_sina() should also spread the values vertically, essentially matching the shape of a violin plot but using the jitter-with-transparency approach.

The same thing happens with sinaplot directly, so I’m now wondering whether this belongs with the sinaplot devs instead?

Thanks,
James


#2

Hmmm. Seems like intended behavior; the ggforce visual guide says:

geom_sina is inspired by the strip chart and the violin plot and operates by letting the normalized density of points restrict the jitter along the x-axis.

I’ll post this as an issue/enhancement at the ggforce repo.
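
In the meantime, one possible workaround (a rough sketch, not anything built into ggforce) is to add a little vertical noise to the values myself with base R's jitter() before handing them to geom_sina, while still drawing the violin from the untouched ints. The area_jit column name and the amount of 0.3 are my own assumptions; anything below 0.5 should keep neighbouring integers from overlapping.

```r
library(ggplot2)  # provides the midwest data set
library(ggforce)  # provides geom_sina()

# create int version of area, as in the examples above
midwest$area_ints <- floor(midwest$area * 100)

# assumption: +/- 0.3 of uniform noise is enough to break up the stripes;
# area_jit is a made-up column name used only for plotting
set.seed(1)  # make the noise reproducible
midwest$area_jit <- jitter(midwest$area_ints, amount = 0.3)

# violin is drawn from the real ints, points from the jittered copy
p <- ggplot(midwest, aes(state)) +
  geom_violin(aes(y = area_ints)) +
  geom_sina(aes(y = area_jit), alpha = 0.4)
p
```

The points still get their horizontal spread from geom_sina, but note that it now estimates the density from the jittered values rather than the original ints, so the outline of the point cloud may differ slightly from the violin.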