How to replace outliers with the 5th and 95th percentile values in R

Hi,

I have 9 univariate outliers in my dataframe.
I need to "tuck" those outliers so that they fit inside 3 standard deviations from the mean, or basically assign them the value of the 5th percentile if they're below the lower limit, and the value of the 95th percentile if they're above the upper limit.

I tried this code:

x <- ozone$pressure_height
qnt <- quantile(x, probs=c(.25, .75), na.rm = T)
caps <- quantile(x, probs=c(.05, .95), na.rm = T)
H <- 1.5 * IQR(x, na.rm = T)
x[x < (qnt[1] - H)] <- caps[1]
x[x > (qnt[2] + H)] <- caps[2]

with my own data, but it doesn't change anything in my dataframe, and after plotting it the outliers remain the same.

I don't want to remove the outliers - just change their values.
Any help is appreciated!

I also tried:

fun <- function(x){
    quantiles <- quantile( x, c(.05, .95 ) )
    x[ x < quantiles[1] ] <- quantiles[1]
    x[ x > quantiles[2] ] <- quantiles[2]
    x
}
fun( yourdata )

and it gave me this error:

Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) : 
  undefined columns selected

Any idea why it's not working?

When you got this error, what was 'yourdata'? I would guess you passed a data frame when you should have passed a vector. Your function works fine if it is passed the appropriate argument, i.e. a single numeric column.
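To make that guess concrete, here is a minimal sketch of calling such a function on one column, or on every numeric column, instead of on the whole data frame. The name cap_5_95, the added na.rm = TRUE, and the use of iris as a stand-in for 'yourdata' are my assumptions, not something from your post.

```r
# Hypothetical helper: same idea as the function in the question,
# with na.rm = TRUE added so missing values do not break quantile().
cap_5_95 <- function(x) {
  q <- quantile(x, probs = c(0.05, 0.95), na.rm = TRUE)
  x[which(x < q[1])] <- q[1]  # which() drops NA comparisons
  x[which(x > q[2])] <- q[2]
  x
}

yourdata <- iris  # stand-in; replace with your own data frame

# Pass a single numeric column (a vector), not the data frame itself,
# and assign the result back so the data frame actually changes.
yourdata$Petal.Length <- cap_5_95(yourdata$Petal.Length)

# Or cap every numeric column, leaving factors such as Species untouched.
num_cols <- vapply(yourdata, is.numeric, logical(1))
yourdata[num_cols] <- lapply(yourdata[num_cols], cap_5_95)
```

Calling cap_5_95(yourdata) directly would hand quantile() a whole data frame, which is most likely where the "undefined columns selected" error comes from.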




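# Inline version: cap a single numeric vector at its 5th and 95th percentiles
x <- iris$Petal.Length
caps <- quantile(x, probs = c(.05, .95), na.rm = TRUE)
x[x < caps[1]] <- caps[1]
x[x > caps[2]] <- caps[2]


# Put the original and capped values side by side and mark what changed
df <- data.frame(ipl = iris$Petal.Length,
           x = x)
df$changemarked <- ifelse(df$x != df$ipl, "CHANGED", "")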


fun <- function(x){
  quantiles <- quantile( x, c(.05, .95 ) )
  x[ x < quantiles[1] ] <- quantiles[1]
  x[ x > quantiles[2] ] <- quantiles[2]
  x
}
y <- fun( iris$Petal.Length )

identical(x, y)
# TRUE: your function gives the same result as the inline code above
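As for the first attempt in the question: that code changes the copy x, not ozone$pressure_height, so the data frame and any plot made from it stay the same unless x is assigned back. A minimal sketch of that last step, using iris$Sepal.Width as a stand-in because I don't have your ozone data (df2 is just an illustrative name):

```r
df2 <- iris  # stand-in for the 'ozone' data frame in the question

x <- df2$Sepal.Width  # this is a copy of the column
qnt  <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
caps <- quantile(x, probs = c(0.05, 0.95), na.rm = TRUE)
H <- 1.5 * IQR(x, na.rm = TRUE)

# Values beyond the 1.5 * IQR fences get the 5th / 95th percentile
x[which(x < qnt[1] - H)] <- caps[1]
x[which(x > qnt[2] + H)] <- caps[2]

# The step missing from the original attempt: write the copy back
df2$Sepal.Width <- x

# Number of rows whose value was replaced
sum(df2$Sepal.Width != iris$Sepal.Width)
```

Note that this version only touches points outside the 1.5 * IQR fences, whereas the function above caps everything below the 5th or above the 95th percentile, so the two approaches will generally change a different number of values.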
