Random data points at the bottom of a ggplot scatter plot

Hi, R community. This is my first help thread so I apologize if I don't give the most descriptive information.

I'm trying to plot two variables from a data frame my econometrics professor gave me, taking the log of each. The result is a normal-looking scatter plot; however, there is a line of points along the x axis that doesn't correspond to any of my data. All I need is some help removing those points. I do have NAs in my data frame, but even omitting them with na.omit() doesn't help. Changing the limits on the y axis doesn't resolve the issue either. Thank you all in advance!

part_c <- ggplot(Terrorism_Data, aes(y = log(ftmpop), x = log(gdppc))) +
  geom_point(color = "steelblue2") +
  labs(x = "ln(GDP per Capita)", y = "ln(Fatalities per million persons)") +
  ggtitle("ln Relationship")

Those points come from rows where ftmpop is 0, so log(ftmpop) returns -Inf, and ggplot2 draws infinite values at the edge of the panel. Because -Inf is not NA, na.omit() leaves those rows in place. Notice in the example below that I get similar points at X values of 2, 6, and 8. You can remove these points with the filter() function from the dplyr package.

library(dplyr)
library(ggplot2)

part_c <- Terrorism_Data %>%
  filter(ftmpop > 0) %>%  # drop rows where log(ftmpop) would be -Inf
  ggplot(aes(y = log(ftmpop), x = log(gdppc))) +
  geom_point(color = "steelblue2") +
  labs(x = "ln(GDP per Capita)", y = "ln(Fatalities per million persons)") +
  ggtitle("ln Relationship")
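A quick way to see why na.omit() made no difference, as a minimal base-R sketch. The vector x is made up for illustration and is not taken from Terrorism_Data; the point is only that log(0) gives -Inf, which R does not count as missing.

```r
# Hypothetical values for illustration, not from Terrorism_Data
x <- c(10, 0, NA, 5)
lx <- log(x)

lx[2]            # log(0) is -Inf, not NA
is.na(lx)        # TRUE only for the third element
is.finite(lx)    # FALSE for both the -Inf and the NA

# na.omit() drops the NA but keeps the -Inf
kept <- na.omit(lx)
length(kept)     # 3 values remain, one of them -Inf
```

So a filter on the original variable (ftmpop > 0) or a check with is.finite() on the logged value is needed; NA handling alone will not catch these rows.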

Example showing similar points:

library(ggplot2)
DF <- data.frame(X = 1:10, Y = log(c(10, 0, 44, 23, 105, 0, 87, 0, 33, 6)))
ggplot(DF, aes(X, Y)) + geom_point(size = 3)

Created on 2019-10-27 by the reprex package (v0.3.0.9000)
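If you would rather not depend on dplyr, the same rows can probably be dropped in base R with is.finite(). This is only a sketch using the DF from the example above; DF_finite is an illustrative name, not something from the original data.

```r
library(ggplot2)

DF <- data.frame(X = 1:10, Y = log(c(10, 0, 44, 23, 105, 0, 87, 0, 33, 6)))

# Keep only rows where Y is a finite number (drops -Inf, Inf, NA and NaN)
DF_finite <- DF[is.finite(DF$Y), ]

nrow(DF_finite)  # 7 rows remain; the rows at X = 2, 6 and 8 are gone
ggplot(DF_finite, aes(X, Y)) + geom_point(size = 3)
```

Unlike filter(ftmpop > 0), this checks the logged value itself, so it would also catch non-finite results from other causes, such as NaN from the log of a negative number.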


Thank you. Worked out perfectly.
