Hello there,
I have a data frame containing negative numbers that shouldn't be there (data errors). I'd like to replace each negative value with the mean of the column it appears in.
Here's an example dataframe:
      var1 var2 var3
Site1    1    3   -5
Site2    2    6    6
Site3   -3    4    3
Site4    4   -7    2
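To make this reproducible, the example can be rebuilt roughly like this. This is a minimal sketch that assumes every column is numeric and that the site names are row names rather than a column; df1 is the name used in the rest of the post.

```r
# Rebuild the example data frame.
# Assumption: numeric columns only, with site names stored as row names.
df1 <- data.frame(
  var1 = c(1, 2, -3, 4),
  var2 = c(3, 6, 4, -7),
  var3 = c(-5, 6, 3, 2),
  row.names = c("Site1", "Site2", "Site3", "Site4")
)
```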
I thought about first replacing the negative values with 0:
df1[df1<0] <- 0
Which results in this:
      var1 var2 var3
Site1    1    3    0
Site2    2    6    6
Site3    0    4    3
Site4    4    0    2
Now I want to replace the 0s with the mean of each column (computed with the 0s included, e.g. (1 + 2 + 0 + 4) / 4 = 1.75 for var1), so that it looks like this:
      var1 var2 var3
Site1 1.00 3.00 2.75
Site2 2.00 6.00 6.00
Site3 1.75 4.00 3.00
Site4 4.00 3.25 2.00
But I don't know how to do this quickly or efficiently. Any insight into how to approach this would be greatly appreciated.
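For concreteness, here is a minimal sketch of the two steps in base R. It is one possible approach, not necessarily the best one. It assumes df1 holds only numeric columns and that the mean is taken after the negatives are set to 0, which is what the target table implies. The names bad and x are only illustrative.

```r
# Example data from the post (assumed numeric, site names as row names).
df1 <- data.frame(
  var1 = c(1, 2, -3, 4),
  var2 = c(3, 6, 4, -7),
  var3 = c(-5, 6, 3, 2),
  row.names = c("Site1", "Site2", "Site3", "Site4")
)

# Process one column at a time and write the results back into df1.
# Using df1[] keeps the data frame structure and the row names.
df1[] <- lapply(df1, function(x) {
  bad <- x < 0          # remember which entries were negative
  x[bad] <- 0           # step 1: set the negatives to 0
  x[bad] <- mean(x)     # step 2: overwrite them with the column mean (0s included)
  x
})

df1
```

Tracking the negative positions in bad, rather than searching for 0s afterwards, means a genuine 0 in the data would be left alone. If the mean should instead ignore the bad values entirely, mean(x[!bad]) could be used in place of mean(x), though that would give different numbers from the table above.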
Thank you so much!