Hi everyone,
I'm currently working on a data set of heavy metals, and I need to transform the data so that it follows a normal distribution. Here is the distribution of the Ni data after removing some outliers:
I first ran the Shapiro-Wilk test on the untransformed data to see whether a transformation was necessary:
shapiro.test(clean.donnees.Ni$Ni)
and I get this result:
Shapiro-Wilk normality test
data: clean.donnees.Ni$Ni
W = 0.97966, p-value < 2.2e-16
This seems rather strange to me, because the distribution looks almost exactly like a normal distribution.
But that was not the end of the surprises!
I went ahead and applied a Box-Cox transformation to my data using the AID package:
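library(AID)         # provides boxcoxnc()
library(data.table)  # clean.donnees.Ni is a data.table (needed for `:=`)
Cdbx.Nii <- boxcoxnc(clean.donnees.Ni$Ni, verbose = FALSE) # Estimate the Box-Cox lambda
clean.donnees.Ni[!is.na(Ni), Cdbx.Nii := Cdbx.Nii$tf.data] # Create column with transformed data
alpha <- Cdbx.Nii$lambda.hat # Estimated lambda (stored as "alpha"; not a significance level)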
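For reference, here is a minimal sketch of what I understand the Box-Cox transformation to be doing once lambda has been estimated. The helper `box_cox()` and the vector `x` are hypothetical names with made-up values, used only for illustration; they are not part of the AID package, and I am assuming `boxcoxnc()` uses this standard one-parameter form.

```r
# One-parameter Box-Cox transform written out by hand (illustrative only).
# box_cox() is a hypothetical helper, not a function from the AID package.
box_cox <- function(x, lambda) {
  # The transform is only defined for strictly positive values
  stopifnot(all(x > 0, na.rm = TRUE))
  if (abs(lambda) < 1e-8) {
    log(x)                    # limiting case: lambda = 0
  } else {
    (x^lambda - 1) / lambda   # general case
  }
}

# Made-up positive values standing in for the Ni concentrations
x <- c(0.5, 1, 2, 4, 8)

box_cox(x, 1)    # lambda = 1 only shifts the data (x - 1)
box_cox(x, 0)    # lambda = 0 is the natural logarithm
box_cox(x, 0.5)  # a square-root-like transform
```

If that assumption holds, applying `box_cox()` to the non-missing Ni values with the estimated `lambda.hat` should give values comparable to `Cdbx.Nii$tf.data`.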
Then I ran the Shapiro-Wilk test on the transformed data:
shapiro.test(Cdbx.Nii$tf.data)
and these are my results:
Shapiro-Wilk normality test
data: Cdbx.Nii$tf.data
W = 0.99377, p-value = 9.967e-10
Given how the data are distributed, I really don't understand why the Shapiro-Wilk test still rejects normality.
How can I resolve this? Any help would be appreciated!