Using `caret::BoxCoxTrans` to help transform a highly skewed feature

Hi RStudio,
I have a question regarding `caret::BoxCoxTrans`. As you may recall, it estimates the lambda (for highly skewed data) so that a Box-Cox transformation can be carried out.

I have a feature/attribute that is highly skewed: skewness = 2.912, and the ratio of maximum value to minimum value is 208.8. The actual minimum is zero, so I used the smallest value that is NOT zero to calculate the ratio. (Is that correct or not?)
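For readers who want to reproduce the two summary numbers, here is a minimal base R sketch of the calculation as described above. The vector `x` is a made-up stand-in, not the original data, and the moment-based skewness formula is an assumption (packages such as e1071 offer several variants that can differ slightly).

```r
# Hypothetical stand-in for the feature; the real data are not shown in the post.
x <- c(0, 0, 0.004, 0.01, 0.0107, 0.031, 0.052, 0.159)

# Ratio of the maximum to the smallest non-zero value, as described in the post.
smallest_positive <- min(x[x > 0])
ratio <- max(x) / smallest_positive

# Simple moment-based sample skewness (one of several common definitions).
d <- x - mean(x)
skewness <- mean(d^3) / mean(d^2)^1.5

cat("max / smallest non-zero value:", ratio, "\n")
cat("moment-based skewness:", skewness, "\n")
```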

However, when I use `caret::BoxCoxTrans` to determine the lambda value, I get the following message:

[1] 0.004016064 0.009803922 0.000000000 0.052083333 0.010677344 0.030769231
Box-Cox Transformation

3500 data points used to estimate Lambda

Input data summary:
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.000000 0.000000 0.006093 0.008548 0.012821 0.159420 

Lambda could not be estimated; no transformation is applied

Any thoughts?

Does this mean that at least one of the observations in your dataset is 0?

If so, then this is the expected behaviour. You cannot apply a Box-Cox transformation to non-positive observations; all observations have to be strictly positive.

It's also mentioned in the documentation:

If `any(y <= 0)` or if `length(unique(y)) < numUnique`, lambda is not estimated and no transformation is applied.
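To see which of the two documented conditions is responsible, you can evaluate them directly. This is a rough sketch on simulated data (the vector `y` is invented for illustration), and `num_unique <- 3` assumes the default value of the `numUnique` argument. The caret calls only run if the package is installed. Dropping the zeros here is purely a diagnostic step, not a recommended way to handle them.

```r
# Simulated stand-in for the feature: right-skewed, with some exact zeros.
# These values are made up for illustration, not the original data.
set.seed(1)
y <- c(rep(0, 50), rexp(450, rate = 100))

# The two documented conditions under which lambda is not estimated.
num_unique <- 3  # assumed to mirror the default of the numUnique argument
has_non_positive <- any(y <= 0)
too_few_unique <- length(unique(y)) < num_unique

cat("any(y <= 0):", has_non_positive, "\n")
cat("length(unique(y)) < numUnique:", too_few_unique, "\n")

# Optional check against caret itself, only if the package is available.
if (requireNamespace("caret", quietly = TRUE)) {
  bc_all <- caret::BoxCoxTrans(y)         # zeros present
  bc_pos <- caret::BoxCoxTrans(y[y > 0])  # strictly positive subset only
  print(bc_all$lambda)
  print(bc_pos$lambda)
}
```

If the first condition is `TRUE`, as it is for this simulated vector, the message in the question is what the documentation describes.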

I've found this round-up/discussion of transformations for data that include zeroes helpful in the past:
https://robjhyndman.com/hyndsight/transformations/
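As a rough illustration of the kind of comparison you might run, here is a base R sketch that applies a few zero-tolerant transformations and compares their skewness. The candidates (square root, log with a small added constant, scaled inverse hyperbolic sine) are my own picks and not necessarily the ones discussed at the link. The `shift` constant is an arbitrary, hypothetical choice, and the data are simulated.

```r
# Simulated stand-in for a right-skewed feature with exact zeros (not real data).
set.seed(1)
y <- c(rep(0, 50), rexp(450, rate = 100))

# Simple moment-based sample skewness (one of several common definitions).
skew <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Hypothetical constant: half the smallest non-zero value. Results depend on it.
shift <- min(y[y > 0]) / 2

# Candidate transformations that are all defined at zero.
transformed <- list(
  raw          = y,
  sqrt         = sqrt(y),
  log_shift    = log(y + shift),
  asinh_scaled = asinh(y / shift)
)

# Compare skewness across the candidates.
skews <- sapply(transformed, skew)
print(round(skews, 3))
```

Which one works best depends on the data and on how the constant is chosen, so it is worth checking the transformed distribution (for example with a histogram) rather than relying on skewness alone.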
