Can we eliminate the NAs that are occurring in ggplot? I tried na.rm but it's not working

vinayprakash808 · October 1, 2019, 6:35pm

I have a data frame, but there are NAs in a value column

 p12 <- data.frame(Var1 = c("A","asgfg","B","dds","dfg","dfh"),ColA=c("NA","NA",2.1,2.1,4.2,2.1))

So when I plot

ggplot(data=p12,aes(x=Var1,y=ColA,fill=Var1))+geom_bar(stat = "identity")+coord_flip()

It is even taking the NAs into account. Is there a way to eliminate this? I need only the actual values to be plotted.

FJCC · October 1, 2019, 7:12pm

Your example happens to be very confusing because the NAs in ColA are not missing values; they are strings that happen to consist of the two letters NA. This causes the entire column to be interpreted as a factor (on R 4.0 and later, as a character vector) rather than as numbers. There is further confusion because the other two values in the column, 2.1 and 4.2, would differ by a factor of two if they were interpreted as numbers, so they look plausibly spaced on the axis. Examine these variations of your data.

library(ggplot2)
#Original
p12 <- data.frame(Var1 = c("A","asgfg","B","dds","dfg","dfh"),
                  ColA=c("NA","NA",2.1,2.1,4.2,2.1))
ggplot(data=p12,aes(x=Var1,y=ColA,fill=Var1))+geom_bar(stat = "identity")+coord_flip()


#Actual NA
p12_2 <- data.frame(Var1 = c("A","asgfg","B","dds","dfg","dfh"),
                  ColA=c(NA,NA,2.1,2.1,4.2,2.1))

ggplot(data=p12_2,aes(x=Var1,y=ColA,fill=Var1))+geom_bar(stat = "identity")+coord_flip()
#> Warning: Removed 2 rows containing missing values (position_stack).


#Strings that are not "NA"
p12_3 <- data.frame(Var1 = c("A","asgfg","B","dds","dfg","dfh"),
                    ColA=c("foo","bar",2.1,2.1,4.2,2.1))
ggplot(data=p12_3,aes(x=Var1,y=ColA,fill=Var1))+geom_bar(stat = "identity")+coord_flip()



#Different ColA values
p12_4 <- data.frame(Var1 = c("A","asgfg","B","dds","dfg","dfh"),
                    ColA=c("foo","bar",200.1,200.1,4.2,200.1))

ggplot(data=p12_4,aes(x=Var1,y=ColA,fill=Var1))+geom_bar(stat = "identity")+coord_flip()

Created on 2019-10-01 by the reprex package (v0.2.1)

In your actual data, you probably want the text NA to be interpreted as an NA, not as a string. How are the data being entered?
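
A quick way to check which situation you are in is to look at the column before plotting. The following is only a sketch in base R using the p12 data frame from this thread; `col_a_num` and `p12_fixed` are names made up for illustration. The conversion goes through as.character() so that it should behave the same whether ColA is stored as a factor or as a character vector.

```r
# Same data as in the question: the "NA" entries are strings
p12 <- data.frame(Var1 = c("A","asgfg","B","dds","dfg","dfh"),
                  ColA = c("NA","NA",2.1,2.1,4.2,2.1))

# The column is not numeric, and R sees no missing values in it
is.numeric(p12$ColA)    # FALSE
anyNA(p12$ColA)         # FALSE
sum(p12$ColA == "NA")   # 2 rows hold the literal text "NA"

# Convert to numbers; the text "NA" becomes a real missing value
# (col_a_num and p12_fixed are illustrative names, not from the thread)
col_a_num <- as.numeric(as.character(p12$ColA))
p12_fixed <- data.frame(Var1 = p12$Var1, ColA = col_a_num)

sum(is.na(p12_fixed$ColA))   # 2
```

After this conversion, the data frame is in the same state as the "Actual NA" variation above.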

vinayprakash808 · October 1, 2019, 7:22pm

Thanks. I need the categories with NA to be removed. In this case, I do not want "A" and "asgfg" to appear in the plot.

FJCC · October 1, 2019, 7:41pm

You can manually filter out the "NA" rows as shown below, but it would be far better to read the data in cleanly in the first place. Then you would not have to convert ColA to numeric.

library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#Original
p12 <- data.frame(Var1 = c("A","asgfg","B","dds","dfg","dfh"),
                  ColA=c("NA","NA",2.1,2.1,4.2,2.1))
p12 %>% filter(ColA != "NA") %>% 
  mutate(ColA = as.numeric(as.character(ColA))) %>% 
  ggplot(aes(x=Var1,y=ColA,fill=Var1))+geom_col()+coord_flip()

Created on 2019-10-01 by the reprex package (v0.2.1)
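
As for reading the data in cleanly, here is a minimal sketch. It assumes the data come from a CSV file; since I do not know how your data are actually entered, the inline `csv_text` string and the names `p12_clean` and `p12_plot` are made up for illustration. read.csv() treats the text NA as a missing value by default (its na.strings argument), so ColA arrives as a numeric column and no conversion is needed.

```r
# Hypothetical CSV content standing in for the real input file
csv_text <- "Var1,ColA
A,NA
asgfg,NA
B,2.1
dds,2.1
dfg,4.2
dfh,2.1"

# na.strings = "NA" is the default; it is spelled out here for clarity
p12_clean <- read.csv(text = csv_text, na.strings = "NA",
                      stringsAsFactors = FALSE)

is.numeric(p12_clean$ColA)   # TRUE
sum(is.na(p12_clean$ColA))   # 2

# Drop the rows with missing ColA before plotting
p12_plot <- p12_clean[!is.na(p12_clean$ColA), ]
nrow(p12_plot)               # 4
```

The filtered data frame can then go into the same ggplot() call as above, without the as.numeric() step.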

vinayprakash808 · October 1, 2019, 7:49pm

Perfect, thanks a lot.
