 Understanding a strange Shapiro–Wilk test

Hi everyone,

I'm currently working on a dataset of heavy metals, and I need to apply a transformation to make the data follow a normal distribution. You can see here the distribution of the Ni data after deleting some outliers:

So I computed the Shapiro–Wilk test on my initial data to see whether that was necessary:

shapiro.test(clean.donnees.Ni$Ni)

and I got this result:

Shapiro-Wilk normality test

data:  clean.donnees.Ni$Ni
W = 0.97966, p-value < 2.2e-16

This seems rather strange to me because the distribution almost looks like a true normal distribution.
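This behaviour is easy to reproduce. A minimal sketch with simulated data (not the real Ni values): a sample that is only slightly right-skewed still gives a W statistic close to 1, yet a tiny p-value once n is in the thousands.

```r
# Simulated sketch: nearly normal data with a mild right skew.
# With n = 3000, W stays close to 1 but the p-value collapses.
set.seed(42)
x <- rnorm(3000) + 0.5 * rexp(3000)  # almost normal, slightly skewed
res <- shapiro.test(x)
res$statistic  # W close to 1
res$p.value    # yet far below 0.05
```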

But that wasn't the end of the surprises!

I went ahead and applied a Box–Cox transformation to my data using the AID package:

Cdbx.Nii <- boxcoxnc(clean.donnees.Ni$Ni, verbose = FALSE) # Find the best lambda

clean.donnees.Ni[!is.na(Ni), Cdbx.Nii := Cdbx.Nii$tf.data] # Create a column with the transformed data
alpha <- Cdbx.Nii$lambda.hat
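For reference, the transform that `boxcoxnc()` applies is just the standard Box–Cox formula, which fits in a few lines of base R (`boxcox_tf` is a hypothetical helper name and the data below is made up):

```r
# Hypothetical helper showing the Box-Cox formula itself; boxcoxnc()
# estimates lambda (lambda.hat) and returns the transformed data in
# tf.data, but the transform is simply:
boxcox_tf <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x)      # lambda = 0: log transform
  else (x^lambda - 1) / lambda        # otherwise: power transform
}

x <- c(1, 2, 5, 10)   # toy positive data (Box-Cox requires x > 0)
boxcox_tf(x, 0.5)     # square-root-like shrinkage of large values
boxcox_tf(x, 0)       # identical to log(x)
```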

and then I tried the Shapiro–Wilk test on my transformed data:

shapiro.test(Cdbx.Nii$tf.data)

and these are my results:

Shapiro-Wilk normality test

data:  Cdbx.Nii$tf.data
W = 0.99377, p-value = 9.967e-10

With this distribution of data, I really don't understand how the Shapiro–Wilk test can still fail.

I hope you can help me resolve this!

I'm guessing your sample size is relatively large, i.e. > 50?

The Shapiro–Wilk test is most appropriate for small sample sizes (< 50), although it can also handle larger samples; the Kolmogorov–Smirnov test is often used instead for n ≥ 50.

Exactly, ~3000 observations.
So I tried a Kolmogorov–Smirnov test and even a Lilliefors test, but the results are still nowhere near a normal distribution... I don't know what to think.
I ran a few tests with the ks.test() function. As a reminder, Cdbx.Nii is my transformed nickel data, so it is almost normal, with this distribution:

> ks.test(Cdbx.Nii$tf.data, "pnorm", mean = mean(Cdbx.Nii$tf.data), sd = sd(Cdbx.Nii$tf.data))

One-sample Kolmogorov-Smirnov test

data:  Cdbx.Nii$tf.data
D = 0.030362, p-value = 0.01007
alternative hypothesis: two-sided

I tried using other reference distributions (such as dnorm and qnorm), but the p-value is < 2.2e-16.
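A side note that may explain those < 2.2e-16 results: ks.test() expects the *cumulative* distribution function as its second argument. pnorm is the CDF; dnorm (the density) and qnorm (the quantile function) are not, so comparing against them tests the wrong thing entirely. A small sketch with simulated data:

```r
# ks.test() compares the empirical CDF to the reference CDF you name.
# Passing the density (dnorm) instead tests against the wrong function,
# so its tiny p-value says nothing about normality.
set.seed(1)
x <- rnorm(500)
p_pnorm <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value  # plausible fit
p_dnorm <- ks.test(x, "dnorm", mean = mean(x), sd = sd(x))$p.value  # meaningless, near 0
c(p_pnorm, p_dnorm)
```

Also note that estimating mean and sd from the same data invalidates the plain KS p-value, which is exactly the problem the Lilliefors test corrects for.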

Another example, with a Lilliefors test:

> lillie.test(Cdbx.Nii$tf.data)

Lilliefors (Kolmogorov-Smirnov) normality test

data:  Cdbx.Nii$tf.data
D = 0.030362, p-value = 2.126e-06

On histograms it is very hard to see small deviations from normality. It is much easier to use a Q-Q plot (with qqnorm() and qqline()), which makes it more obvious where the data is non-normal.
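For example (toy data, base R graphics only): an exponential sample can look vaguely bell-ish in a coarse histogram, but its Q-Q plot bends sharply away from the reference line in the tail.

```r
# Q-Q plot sketch with clearly non-normal toy data: the curvature in
# the plot exposes the skew far more clearly than a histogram does.
set.seed(7)
x <- rexp(300)
hist(x)               # coarse view of the shape
qqnorm(x); qqline(x)  # right tail curves away from the line
qq <- qqnorm(x, plot.it = FALSE)  # the quantile pairs, if you want numbers
```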

As for statistical tests, their power directly depends on the sample size, so with a huge sample like yours any slight deviation from normality will make the p-value ridiculously small.
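To see this concretely, a quick simulation (made-up, mildly skewed data): the same distribution "passes" the test at n = 50 and fails decisively at n = 3000.

```r
# Same mildly skewed distribution at two sample sizes: only the power
# of the test changes, not the shape of the data.
set.seed(123)
skewed <- function(n) rnorm(n) + 0.5 * rexp(n)
p_small <- shapiro.test(skewed(50))$p.value    # often > 0.05: "looks normal"
p_big   <- shapiro.test(skewed(3000))$p.value  # tiny: same shape, more power
c(p_small, p_big)
```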

Remember: the p-value is not a measure of how important an effect is, just of whether random noise could cause it; so with a big enough sample size any irrelevantly small effect will bring your p-value to basically 0.

Which brings me to the real point: you almost never need a normality test. This rant explains it better than I can, and gives a paper to cite.

If the conditions of the CLT are met, your sample will be approximately normally distributed. But not exactly (since the population has no reason to be exactly normal), so a big enough sample size will give a small p-value. Conversely, with a very small sample from something very non-normal, a normality test would fail to reject the null (because the sample is too small) and mislead you into concluding your data are normal. Normality tests are almost never relevant; the only difficulty is convincing the reviewers of that fact.