ggplot boxplot excluding missing data

Hi, all! I'm very new to RStudio, but thankfully I've been given the code I need (I just have to rename everything and make sure it runs).

When I create a boxplot using the following line of code, it creates a boxplot including a blank category for the "AgeLet" group.

LengthPlotZ <- ggplot(FDataZ, aes(x=SexLet, y=Length_z, fill=AgeLet)) +
geom_boxplot()
LengthPlotZ


How do I exclude missing values? Any help is appreciated. If you need any more information to help, please let me know!

I mentioned only the "AgeLet" category, but what I meant to ask is: how do I omit missing data from all variables?

You could write a small filter for each variable and omit the NA values.

na.omit(FDataZ) # Drops every row of the data frame that contains an NA. You can also apply it to a specific variable with $, e.g. na.omit(FDataZ$Length_z), which returns that column without its NAs.

library(dplyr)
filter(FDataZ, AgeLet <= 64) # Choose whatever cutoff value you need.
filter(FDataZ, AgeLet >= 60)
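
To make the difference between the whole-data-frame and single-column forms concrete, here is a minimal base R sketch. The `toy` data frame and its values are made up as a stand-in for FDataZ; only the column names come from the question.

```r
# Toy stand-in for FDataZ; the values are made up for illustration.
toy <- data.frame(
  SexLet   = c("F", "M", NA, "F"),
  AgeLet   = c("AHY", "HY", "SY", "ASY"),
  Length_z = c(0.5, NA, -1.2, 0.3),
  stringsAsFactors = FALSE
)

# Drops every row that has an NA in any column.
all_complete <- na.omit(toy)

# Returns a numeric vector, not a data frame: the other columns are gone.
length_only <- na.omit(toy$Length_z)

# Keeps whole rows, but only checks Length_z for NA.
length_rows <- toy[!is.na(toy$Length_z), ]

nrow(all_complete)   # 2
length(length_only)  # 3
nrow(length_rows)    # 3
```

If you want to keep the data frame intact for plotting, the third form is probably the one you are after, since the second one cannot be passed to ggplot() as data.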

It looks like you have blank values in the AgeLet variable. A solution would be to filter out the blank strings in AgeLet:

filtered_data <- FDataZ %>% dplyr::filter(!AgeLet %in% '')

Or specify the values you wish to keep in AgeLet:

filtered_data <- FDataZ %>% dplyr::filter(AgeLet %in% c("AHY", "ASY", "HY", "SY"))
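
One thing worth knowing is that a blank string is not the same as NA, so NA-based tools will not catch it. Below is a rough sketch of converting blanks to NA first, assuming AgeLet is a character column (not a factor). The `toy` data frame is invented for illustration; `na_if()` is a dplyr function.

```r
library(dplyr)

# Toy stand-in for FDataZ; the values are made up for illustration.
toy <- data.frame(
  SexLet   = c("F", "M", "F", "M"),
  AgeLet   = c("AHY", "", "SY", NA),
  Length_z = c(0.5, 0.1, -1.2, 0.3),
  stringsAsFactors = FALSE
)

# A blank string is not NA, so na.omit() keeps the "" row.
nrow(na.omit(toy))  # 3

# Turn "" into NA, then drop the rows where AgeLet is NA.
cleaned <- toy %>%
  mutate(AgeLet = na_if(AgeLet, "")) %>%
  filter(!is.na(AgeLet))

nrow(cleaned)  # 2
```

After this step, blanks and true NAs are treated the same way, so a single NA filter handles both.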


Hey, thanks! This works great.

Would I be able to filter NAs in multiple categories with this? If so, how would I format it?
filtered_data <- FDataZ %>% dplyr::filter(!AgeLet %in% '')

Thank you!

Use na.omit(FDataZ) and it will drop all rows containing an NA. Use with caution though: if you have lots of NAs in your data, you may end up throwing away more data than you would wish to!
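
To illustrate that caution, here is a small base R sketch that compares na.omit() on the whole data frame with a check restricted to the columns the boxplot actually uses. The `toy` data and the `Notes` column are hypothetical; only SexLet, AgeLet and Length_z come from the original question.

```r
# Toy stand-in for FDataZ; "Notes" is a hypothetical extra column.
toy <- data.frame(
  SexLet   = c("F", "M", "F", "M"),
  AgeLet   = c("AHY", "HY", NA, "SY"),
  Length_z = c(0.5, 0.1, -1.2, 0.3),
  Notes    = c(NA, NA, NA, "recaptured"),
  stringsAsFactors = FALSE
)

# na.omit() checks every column, so the mostly empty Notes column costs rows.
nrow(na.omit(toy))  # 1

# Check only the columns the boxplot uses.
plot_cols <- c("SexLet", "AgeLet", "Length_z")
plot_data <- toy[complete.cases(toy[, plot_cols]), ]

nrow(plot_data)  # 3
```

Comparing the two row counts before plotting is a quick way to see how much data each approach discards.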
