Plotting ggplot2 with vroom tibble crashes R

Not too sure whether to post this as an issue or how to troubleshoot this.

Certain plots drawn with tibbles created by vroom crash the R session. The session doesn't crash with readr tibbles. I'm running R 3.6.2 on Windows 10, and I've reproduced the same behaviour on RStudio Cloud with R 3.6.0.

# this crashes R 
library(ggplot2)
spotify1 <- vroom::vroom("https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-21/spotify_songs.csv?raw=true")
ggplot(spotify1, aes(track_popularity)) + geom_bar() + facet_wrap(vars(playlist_genre)) 

# this is fine 
library(ggplot2)
spotify2 <- readr::read_csv("https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-01-21/spotify_songs.csv?raw=true")
ggplot(spotify2, aes(track_popularity)) + geom_bar() + facet_wrap(vars(playlist_genre)) 

Oooh, nasty bug!

I provide not a fix but a dodge:

library(dplyr)  # provides %>% and filter()
ggplot(spotify1 %>% filter(TRUE), aes(track_popularity)) + geom_bar() + facet_wrap(vars(playlist_genre))

Thanks, this fixes it!

Would you mind explaining what filter(TRUE) changes that allows this to work? I don't see any differences between the attributes of spotify1 and filter(spotify1, TRUE). Cheers!

My understanding is that vroom does its own clever thing of loading the data lazily, reading values only as they are needed, and that this lazy reading goes wrong when ggplot tries to access the columns. By applying filter(TRUE) (which every row passes), I am forcing dplyr to build a new, 'hardened' data frame that is no longer backed by vroom, so ggplot can access it like any other.

This is only my hypothesis; I don't really understand vroom internals.
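
If the lazy loading really is the culprit (still only an assumption), another thing to try is switching it off when reading the file. As far as I know, recent vroom versions have an altrep argument for this, though older releases may name it differently. A minimal sketch follows; the temp file and its four rows are a made-up stand-in for the Spotify data, and I haven't confirmed this avoids the crash:

```r
library(vroom)
library(ggplot2)

# Hypothetical stand-in data so the sketch runs without a download
path <- tempfile(fileext = ".csv")
writeLines(
  c("playlist_genre,track_popularity",
    "pop,66", "pop,66", "rock,41", "rap,70"),
  path
)

# altrep = FALSE asks vroom to read all values up front instead of lazily
# col_types = "cd": first column character, second column double
songs <- vroom(path, delim = ",", col_types = "cd", altrep = FALSE)

# Same plot as in the original post, built on the eagerly read tibble
p <- ggplot(songs, aes(track_popularity)) +
  geom_bar() +
  facet_wrap(vars(playlist_genre))
```

Calling print(p) draws the plot. If the crash goes away with altrep = FALSE on the real file, that would support the lazy-loading explanation.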

