Getting "-Inf" in R

I have been doing some work in R on the National Survey of College Graduates. I was trying to look at the difference in salaries between different majors. To do this:

  1. I first took the log of people's salaries and put it into a separate column:

projectdata$logsalary <- log(projectdata$salary)

  2. Then, just to see how it worked, I took the mean of the column, which came out like this:

mean(projectdata$logsalary)
[1] -Inf

What does the -Inf mean? When I take the mean of salary without the log, it works.

  3. When I try to run a regression on it, I get this:

genderreg <- lm(projectdata$logsalary ~ projectdata$female)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
NA/NaN/Inf in 'y'

Does anyone have any idea what is going on here?

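Are there any 0s in projectdata$salary? log(0) is -Inf in R, and a single -Inf value makes the mean of the whole column -Inf, which is also why lm() complains about Inf in 'y'.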
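A quick way to check, sketched on made-up numbers (the `salary` vector below is hypothetical toy data, not the survey itself; on the real data you would use projectdata$salary instead):

```r
# Hypothetical toy salaries standing in for projectdata$salary
salary <- c(52000, 0, 48000, 61000)

# The log of zero is negative infinity in R
log_of_zero <- log(0)

# Count how many salaries are exactly zero
n_zero <- sum(salary == 0, na.rm = TRUE)

# One -Inf among the logs drags the mean down to -Inf
mean_log <- mean(log(salary))

print(log_of_zero)
print(n_zero)
print(mean_log)
```

If the count of zeros is above 0 on your data, that would explain both the -Inf mean and the lm() error.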

For the purpose of looking into the logged salary, I'd either

  • set anything <1 to 1 before taking the log
  • or set the logs of those values to be NA

depending on whether you want the 0s averaged in or not.

Or take the median.
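Here is a rough sketch of those options on made-up data. The data frame below is hypothetical toy data that only borrows the column names from the question, and `logsalary_floor` and `logsalary_na` are names I made up for illustration:

```r
# Hypothetical toy data; column names follow the question
projectdata <- data.frame(
  salary = c(52000, 0, 48000, 61000, 0, 45000),
  female = c(0, 1, 1, 0, 0, 1)
)

# Option 1: raise anything below 1 to 1, so its log is 0 instead of -Inf
projectdata$logsalary_floor <- log(pmax(projectdata$salary, 1))

# Option 2: set the logs of those values to NA so they are left out
projectdata$logsalary_na <- ifelse(projectdata$salary < 1,
                                   NA,
                                   log(projectdata$salary))

# Both means are now finite; the NA version needs na.rm = TRUE
mean_floor <- mean(projectdata$logsalary_floor)
mean_na    <- mean(projectdata$logsalary_na, na.rm = TRUE)

# The median of the raw logs stays finite as long as fewer than
# half of the salaries are zero
median_log <- median(log(projectdata$salary))

# With R's default na.action (na.omit), lm() drops the NA rows
genderreg <- lm(logsalary_na ~ female, data = projectdata)
```

Option 1 keeps the zero earners in the average and in the regression, while option 2 leaves them out, so the choice changes what the result means.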
