Geom_violin weight warning

I have searched for an answer to this but am drawing a blank. I'm trying to weight observations in a ggplot2 violin plot (geom_violin), but I get this warning:

In density.default(x, weights = w, bw = bw, adjust = adjust, kernel = kernel, :
sum(weights) != 1 -- will not get true density

I have reproduced the warning below using the mtcars dataset (my x is a factor, y is continuous and the weight is numeric, as in this example). In my own dataset I need to apply a weighting to control for the time the data was collected.

library(ggplot2)
mtcars$cyl <- as.factor(mtcars$cyl)
myplot3 <- ggplot(mtcars, mapping=aes(x=cyl, y=drat, weights = qsec/sum(qsec)))+ 
  geom_violin(scale = "count") + 
  labs(title="My plot", 
       x="Name",
       y="Position")
myplot3

Can anyone advise on how to apply the weighting correctly? I also want to keep scale = "count", because my own data has skewed abundances between factor levels, which I need to show graphically.

Thanks in advance.

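The problem is that you are plotting by group (number of cylinders), but not weighting within those groups. The density is estimated separately for each cylinder group, and qsec/sum(qsec) normalises the weights over the whole dataset, so the weights within each group add up to less than one. That is what the warning means by not getting a "true density".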
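If it helps to see this numerically, here is a quick check in base R. It is only a sketch: `mt`, `w` and `group_sums` are names I made up for illustration, and it works on a fresh copy of mtcars.

```r
# Fresh copy of the built-in data, with cyl as a factor as in the question
mt <- datasets::mtcars
mt$cyl <- as.factor(mt$cyl)

# Weights normalised over the whole dataset, as in the original plot
w <- mt$qsec / sum(mt$qsec)

# Total weight inside each cylinder group
group_sums <- tapply(w, mt$cyl, sum)
print(group_sums)

# The group totals only add up to one when taken together
print(sum(group_sums))
```

Each group total is a fraction of one, and since the density is estimated once per group, each group triggers its own warning.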

Below I've separated out the steps to better show what I mean:

suppressPackageStartupMessages(library(tidyverse))
mtcars$cyl <- as.factor(mtcars$cyl)
myplot3 <- ggplot(mtcars, mapping=aes(x=cyl, y=drat, weights = qsec/sum(qsec)))+ 
  geom_violin(scale = "count") + 
  labs(title="My plot", 
       x="Name",
       y="Position")

myplot3
#> Warning in density.default(x, weights = w, bw = bw, adjust = adjust, kernel
#> = kernel, : sum(weights) != 1 -- will not get true density

#> Warning in density.default(x, weights = w, bw = bw, adjust = adjust, kernel
#> = kernel, : sum(weights) != 1 -- will not get true density

#> Warning in density.default(x, weights = w, bw = bw, adjust = adjust, kernel
#> = kernel, : sum(weights) != 1 -- will not get true density



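# Total qsec within each cylinder group
mtcars_summary <- mtcars %>%
  group_by(cyl) %>%
  summarise(cyl_qsec = sum(qsec))

# Attach each group's total to its rows so the weights can be normalised per group
mtcars <- mtcars %>%
  left_join(mtcars_summary, by = "cyl")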

ggplot(mtcars, mapping=aes(x=cyl, y=drat, weight = qsec/cyl_qsec))+ 
  geom_violin(scale = "count") + 
  labs(title="My plot", 
       x="Name",
       y="Position")
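The same per-group normalisation can probably be done without the separate summary and join, using a grouped mutate. This is a sketch rather than the only way to do it: `mt`, `w`, `check` and `p` are names I made up, it starts from a fresh copy of mtcars, and it uses `weight`, which is the aesthetic name listed in the geom_violin documentation.

```r
library(dplyr)
library(ggplot2)

# Fresh copy so this does not depend on the earlier modifications to mtcars
mt <- datasets::mtcars
mt$cyl <- as.factor(mt$cyl)

# Normalise qsec within each cylinder group so each group's weights sum to one
mt <- mt %>%
  group_by(cyl) %>%
  mutate(w = qsec / sum(qsec)) %>%
  ungroup()

# Sanity check: total weight per group
check <- mt %>%
  group_by(cyl) %>%
  summarise(total = sum(w))
print(check)

p <- ggplot(mt, aes(x = cyl, y = drat, weight = w)) +
  geom_violin(scale = "count") +
  labs(title = "My plot", x = "Name", y = "Position")
p
```

The sanity check is worth keeping around with your own data: if any group total is not one, you should expect the warning to come back for that group.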

Created on 2018-08-16 by the reprex package (v0.2.0.9000).


This is exactly what I needed to do but for the life of me couldn't work it out. Thanks so much for explaining.
