Extracting the counts of each bin in geom_histogram

Hi, is there an easy way to extract the counts of each bin in a geom_histogram and save them as a data frame? Thank you.

Calculate the range of your vector and divide it into equal-width intervals with the cut_width() function from ggplot2, then count the values that fall in each interval.

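library(ggplot2) # cut_width()
library(dplyr)   # count(), %>%
library(tibble)  # tibble()

# geom_histogram() uses bins = 30 by default;
# change this if you set the bins argument yourself
bins <- 30

# let's create a vector
x <- rnorm(10000, mean = 0, sd = 1)

minx <- min(x)
maxx <- max(x)

# equal bin width across the range of x
width_ <- (maxx - minx) / bins

# note: these bins are equal-width, but their edges may not line up
# exactly with the bins that geom_histogram() draws
tibble(x = x, w = cut_width(x, width = width_)) %>%
  count(w) %>%
  print(n = Inf)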
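
Since the question asks for the counts saved as a data frame rather than printed, here is a minimal sketch of the same idea that stores the result. The names `bin_counts`, `bin`, and `count` are my own choices for illustration, and the seed is only there to make the example repeatable.

```r
library(ggplot2) # cut_width()
library(dplyr)   # count(), %>%
library(tibble)  # tibble()

set.seed(42) # arbitrary seed, only for repeatability
x <- rnorm(10000, mean = 0, sd = 1)

# same width calculation as above, assuming 30 bins
width_ <- (max(x) - min(x)) / 30

# count the values in each interval and keep the result
# as a plain data frame instead of printing it
bin_counts <- tibble(x = x) %>%
  count(bin = cut_width(x, width = width_), name = "count") %>%
  as.data.frame()

head(bin_counts)
```

Every value of x lands in exactly one interval, so the counts should add up to length(x).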

Thank you so much. Is it possible to do this for two datasets using facet_wrap? That is, the x vector is split into 500 rows for one graph and 500 rows for the other.

There is also a function in {ggplot2} called layer_data() which can extract the data computed while building a ggplot. For example:


library(ggplot2)
library(tibble) # enframe()

my_data <- rnorm(100, 15, 5)

g_hist <- ggplot(enframe(my_data)) +
  geom_histogram(aes(x = value))

dat <- layer_data(g_hist)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

head(dat)
#>   y count        x     xmin     xmax    density     ncount   ndensity
#> 1 1     1 3.979584 3.581626 4.377543 0.01256413 0.09090909 0.09090909
#> 2 1     1 4.775501 4.377543 5.173459 0.01256413 0.09090909 0.09090909
#> 3 0     0 5.571418 5.173459 5.969376 0.00000000 0.00000000 0.00000000
#> 4 1     1 6.367335 5.969376 6.765293 0.01256413 0.09090909 0.09090909
#> 5 1     1 7.163252 6.765293 7.561210 0.01256413 0.09090909 0.09090909
#> 6 2     2 7.959168 7.561210 8.357127 0.02512825 0.18181818 0.18181818
#>   flipped_aes PANEL group ymin ymax colour   fill size linetype alpha
#> 1       FALSE     1    -1    0    1     NA grey35  0.5        1    NA
#> 2       FALSE     1    -1    0    1     NA grey35  0.5        1    NA
#> 3       FALSE     1    -1    0    0     NA grey35  0.5        1    NA
#> 4       FALSE     1    -1    0    1     NA grey35  0.5        1    NA
#> 5       FALSE     1    -1    0    1     NA grey35  0.5        1    NA
#> 6       FALSE     1    -1    0    2     NA grey35  0.5        1    NA

# Now we can rebuild the histogram "manually"
g_rect <- ggplot(dat) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = count))

patchwork::wrap_plots(g_hist, g_rect)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2022-10-15 by the reprex package (v2.0.1)
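
For the follow-up question about facet_wrap, the same layer_data() approach should carry over, because the computed data has a PANEL column identifying the facet each bin belongs to. The following is a rough sketch, not a tested recipe for your data: the data frame `df`, its column `half`, and the names `g_facet` and `facet_counts` are made up for illustration, and I set `bins = 30` explicitly to silence the message.

```r
library(ggplot2)
library(dplyr)
library(tibble)

set.seed(1) # arbitrary seed, only for repeatability

# hypothetical data: 1000 values split into two groups of 500 rows
df <- tibble(
  value = rnorm(1000),
  half  = rep(c("first 500", "second 500"), each = 500)
)

g_facet <- ggplot(df, aes(x = value)) +
  geom_histogram(bins = 30) +
  facet_wrap(~half)

# one row per bin per facet; PANEL is 1 for the first facet, 2 for the second
facet_counts <- layer_data(g_facet) %>%
  select(PANEL, xmin, xmax, count)

head(facet_counts)
```

Summing `count` within each PANEL should give back the 500 rows that went into that facet, which is a quick way to check that the split is what you expect.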
