Geom_bar for one level of a variable

ggplot2
tidyverse
rstudio

#1

Hello,

I am trying to create a bar graph of one level of a categorical variable. However, I am stuck on how to code ggplot2 properly so I can pick out that level. Here is my code below.

     head(Re_Space_wo2)

       Remain Gender NM1_SpacePump       Freq        Prop
     1      3      2             0 230580.383 0.247448980
     2      3      2             1 166398.214 0.178571429
     3      3      1             0 118855.867 0.127551020
     4      2      2             0  99838.929 0.107142857
     5      4      2             0  80821.990 0.086734694
     6      3      1             1  61805.051 0.066326531

I would like the graph to show results for only the following variables and levels: Remain, all levels; Gender, level 2 only; and NM1_SpacePump, all levels. I would like to keep the Freq and Prop columns as well. Could someone point me in the right direction?

It's only been about a month since I started exploring ggplot2. All help is greatly appreciated.

Thank you.


#2

Hi @quipmaster! It sounds like you want to filter your data before you plot it—that is, only keep rows that meet certain conditions (like being a given level of a factor). A really easy way to do this is with the filter function in dplyr:

library(dplyr)      # you could also use library(tidyverse)

filtered_data <- Re_Space_wo2 %>% filter(
  Remain %in% 2:4,                 # Remain is any of 2, 3 or 4...
  Gender %in% 1:2,                 # AND Gender is any of 1 or 2...
  NM1_SpacePump %in% 0:1)          # AND NM1_SpacePump is any of 0 or 1

This code keeps rows that meet all three conditions. Another way to express it would be to join the three conditions with ampersands instead of commas: Remain %in% 2:4 & Gender %in% 1:2 & NM1_SpacePump %in% 0:1.
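To see that commas and ampersands really do the same thing, here is a minimal sketch. The data frame demo is a hypothetical stand-in built from the six rows of head() output in the question (Freq and Prop left out for brevity), and the Gender == 2 condition is just for illustration:

```r
library(dplyr)

# Hypothetical stand-in for Re_Space_wo2: the six rows shown in the question
demo <- data.frame(
  Remain        = c(3, 3, 3, 2, 4, 3),
  Gender        = c(2, 2, 1, 2, 2, 1),
  NM1_SpacePump = c(0, 1, 0, 0, 0, 1)
)

# Conditions separated by commas are combined with AND...
with_commas <- demo %>% filter(Remain %in% 2:4, Gender == 2)

# ...so this version with an explicit & keeps the same rows
with_amps <- demo %>% filter(Remain %in% 2:4 & Gender == 2)

identical(with_commas, with_amps)   # compare the two results
nrow(with_commas)                   # number of rows kept
```

On these six rows both versions keep the four rows where Gender is 2, since every Remain value already falls in 2:4.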

The other thing going on here that you might not see in other languages is %in%. It checks each value in your data against a whole set of values, so it's a lot shorter and more readable than Remain == 2 | Remain == 3 | …
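If it helps to see %in% on its own, here is a small base R sketch. The vector remain is made up for illustration: it holds the Remain values from the question plus an extra 1 so that one element fails the test.

```r
# Hypothetical vector: Remain values from the question, plus a 1 at the end
remain <- c(3, 3, 3, 2, 4, 3, 1)

# One logical value per element: is it any of 2, 3 or 4?
short_form <- remain %in% 2:4

# The long-hand equivalent with == and |
long_form <- remain == 2 | remain == 3 | remain == 4

identical(short_form, long_form)   # TRUE
short_form                         # TRUE for the first six values, FALSE for the 1
```

One difference worth knowing about: a missing value gives FALSE with %in% (NA %in% 2:4 is FALSE), whereas NA == 2 gives NA.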

If you'd actually like to keep rows that meet any of the three criteria, you can join the conditions with a vertical bar (|), which is R's OR operator (not to be confused with the %>% pipe):

filtered_data <- Re_Space_wo2 %>% filter(
  Remain %in% 2:4 |               # Remain is any of 2, 3 or 4...
  Gender %in% 1:2 |               # OR Gender is any of 1 or 2...
  NM1_SpacePump %in% 0:1)         # OR NM1_SpacePump is any of 0 or 1

Then just use filtered_data in ggplot2 in place of the original data. I hope that helps!
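For the specific case in the question (Gender level 2 only, every level of Remain and NM1_SpacePump), a rough sketch of the whole thing might look like the following. It makes several assumptions: demo is a hypothetical stand-in rebuilt from the six rows of head() output, Prop is used as the bar height, and the choice of x axis, fill and dodged bars is only one possible layout. Because the data already contains one row per combination, the sketch uses geom_col(), which is equivalent to geom_bar(stat = "identity").

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for Re_Space_wo2: the six rows shown in the question
demo <- data.frame(
  Remain        = c(3, 3, 3, 2, 4, 3),
  Gender        = c(2, 2, 1, 2, 2, 1),
  NM1_SpacePump = c(0, 1, 0, 0, 0, 1),
  Freq          = c(230580.383, 166398.214, 118855.867,
                    99838.929, 80821.990, 61805.051),
  Prop          = c(0.247448980, 0.178571429, 0.127551020,
                    0.107142857, 0.086734694, 0.066326531)
)

# Keep only Gender level 2; filter() drops rows, not columns,
# so Freq and Prop are still there afterwards
gender2 <- demo %>% filter(Gender == 2)

# Bars of Prop for each Remain level, split by NM1_SpacePump.
# factor() makes ggplot2 treat the numeric codes as categories.
p <- ggplot(gender2,
            aes(x = factor(Remain), y = Prop, fill = factor(NM1_SpacePump))) +
  geom_col(position = "dodge") +
  labs(x = "Remain", y = "Prop", fill = "NM1_SpacePump")

p
```

Swapping y = Prop for y = Freq would plot the frequencies instead. No conditions are needed for Remain or NM1_SpacePump here, because leaving a variable out of filter() keeps all of its levels.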