Results Eta2() and anova()

Hello,
I am trying to test the association between one qualitative variable (3 modalities) and one quantitative variable.
So I used both the Eta2() function from BioStatR and anova(), but the results seem to be incoherent :thinking:
Eta2 does not indicate a relation between the variables, but the ANOVA seems to indicate a significant one.

eta2(dvar_sna$v1_egoduree, dvar_sna$situation_emploi)
[1] 0.0971701

anova(lm(dvar_sna$v1_egoduree ~ dvar_sna$situation_emploi))

Response: dvar_sna$v1_egoduree
                            Df Sum Sq Mean Sq F value    Pr(>F)    
dvar_sna$situation_emploi    2   3667 1833.41  97.888 < 2.2e-16 ***
Residuals                 1819  34069   18.73

Do you have some advice :grimacing:?
Thank you !!

Are you concerned that the eta2 value is small but the p value of the ANOVA is also small? They are describing different things. eta2 describes how much of the variance in the dependent variable is accounted for by the independent variable. The p value describes the probability of seeing that result (edit: or one more extreme) if the experiment were repeated many times and the null hypothesis were true. If both values are small, it means that not much of the variation in the dependent variable is explained, but the result is probably not due to noise or "luck".
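As a sanity check, eta-squared can be recomputed by hand from the ANOVA table you posted (SS between divided by SS total). This is a minimal sketch using the rounded sums of squares printed in your output:

```r
# Rounded sums of squares from the ANOVA table in the question
ss_between  <- 3667    # dvar_sna$situation_emploi
ss_residual <- 34069   # Residuals

# eta-squared: share of total variance explained by the grouping variable
eta_sq <- ss_between / (ss_between + ss_residual)
eta_sq  # approximately 0.0972, matching the eta2() result above
```

So the two outputs agree with each other: the effect is almost certainly real (tiny p value), but it explains under 10% of the variance.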

Here is an example from linear regression where p is small but so is R-squared. Only a little of the variance of y is explained by x (R-squared is small) but the effect is far beyond what you are likely to see if there were no relationship at all (p is small).

set.seed(1)
DF <- data.frame(x = seq(0,10, 0.01), y = seq(1,2, 0.001) + rnorm(1001, 0, 1))
summary(lm(y ~ x, data = DF))
#> 
#> Call:
#> lm(formula = y ~ x, data = DF)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.0023 -0.6797 -0.0207  0.7050  3.8203 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  1.03291    0.06539  15.795  < 2e-16 ***
#> x            0.09132    0.01132   8.064 2.09e-15 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.035 on 999 degrees of freedom
#> Multiple R-squared:  0.06112,    Adjusted R-squared:  0.06018 
#> F-statistic: 65.03 on 1 and 999 DF,  p-value: 2.095e-15
plot(DF$x, DF$y)
abline(a = 1.03, b = 0.0913)

Created on 2020-05-02 by the reprex package (v0.3.0)


I knew I would get in trouble trying to write a quick description of a p value.

No, no, no trouble at all @FJCC,
I respect your experience here in this forum, but
I was trying to point the discussion toward this remarkable article.
Best,
Andrzej

Hi again,
If I may ask @FJCC, what did you mean by:

regarding your specific code example?
regards,
Andrzej

I was trying to express what is said in item 9 of the paper you posted: " The P value refers not only to what we observed, but also observations more extreme than what we observed (where “extremity” is measured in a particular way)."
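That "more extreme" part can be illustrated by simulation. A hedged sketch (the sample size of 30 and the number of replicates are arbitrary choices, not from this thread): when the null hypothesis is true, the fraction of simulated test statistics at least as extreme as an observed one approximates that observation's p value.

```r
set.seed(123)

# One observed two-sample t test (both groups drawn from the same null)
x <- rnorm(30); y <- rnorm(30)
obs <- t.test(x, y)
t_obs <- abs(obs$statistic)

# Simulate many datasets where the null is true and record each statistic
t_null <- replicate(5000, abs(t.test(rnorm(30), rnorm(30))$statistic))

# Fraction of null statistics at least as extreme as the observed one
mean(t_null >= t_obs)  # close to obs$p.value
```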

Thanks for the link to that paper! I read it quickly last night (it was late for me) and bookmarked it for future reference.


Thank you so much for your explanation and the example ! You made it really clear to me. :+1:

Thank you for this reference !!

Here is another one:

http://daniellakens.blogspot.com/2015/06/why-you-should-use-omega-squared.html

With this famous quote:
"Eta-squared (η²) and partial eta-squared (ηp²) are biased effect size estimators. I knew this, but I never understood how bad it was. Here’s how bad it is: If η² was a flight from New York to Amsterdam, you would end up in Berlin."
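For comparison, omega-squared can be computed from the same ANOVA numbers posted earlier in this thread. A minimal sketch using the rounded values from that table; it comes out slightly below eta-squared, which is exactly the bias the quote is about:

```r
# Rounded values from the ANOVA table earlier in the thread
ss_between  <- 3667
ss_residual <- 34069
df_between  <- 2
ms_residual <- 18.73

ss_total <- ss_between + ss_residual
omega_sq <- (ss_between - df_between * ms_residual) / (ss_total + ms_residual)
eta_sq   <- ss_between / ss_total

c(eta_sq = eta_sq, omega_sq = omega_sq)  # omega_sq is slightly smaller
```

With a large sample like this one (n = 1822) the correction is tiny; the bias matters much more in small samples.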

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.