An AOV (analysis of variance, ANOVA) can be performed on a data.frame with either the lm() function or the aov() function, since an AOV is just an ordinary linear regression fit. It works whether or not the sample sizes are equal (it would be pretty useless if it were restricted to equal sizes). Here is an example of how to calculate an AOV in R with your data:
Europeans <- c(16, 22, 11, 14, 19, 16, 23, 22, 13, 23, 21, 18, 15, 16, 13, 13)
Americans <- c(20, 16, 16, 18, 21, 7, 21, 19, 9, 19, 19, 12, 15, 16, 20, 30, 15, 23, 17)
Australasians <- c(22, 18, 18, 18, 19, 17, 19, 17, 12, 13, 19, 21, 14, 14, 16, 17, 15, 18, 14, 15, 20, 16, 16, 17)
Data <- data.frame(
Continent = as.factor(c(rep('Europeans',length(Europeans)), rep('Americans',length(Americans)), rep('Australasians',length(Australasians)))),
value = c(Europeans,Americans,Australasians)
)
model_lm <- lm(value ~ Continent, data = Data)
model_aov <- aov(value ~ Continent, data = Data)
summary(model_lm)
#>
#> Call:
#> lm(formula = value ~ Continent, data = Data)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -10.526 -2.357 0.125 2.125 12.474
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 17.5263 0.8937 19.611 <2e-16 ***
#> ContinentAustralasians -0.6513 1.1962 -0.544 0.588
#> ContinentEuropeans -0.3388 1.3218 -0.256 0.799
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 3.896 on 56 degrees of freedom
#> Multiple R-squared: 0.005274, Adjusted R-squared: -0.03025
#> F-statistic: 0.1485 on 2 and 56 DF, p-value: 0.8624
summary.aov(model_lm)
#> Df Sum Sq Mean Sq F value Pr(>F)
#> Continent 2 4.5 2.253 0.148 0.862
#> Residuals 56 849.8 15.175
summary(model_aov)
#> Df Sum Sq Mean Sq F value Pr(>F)
#> Continent 2 4.5 2.253 0.148 0.862
#> Residuals 56 849.8 15.175
Created on 2022-11-08 by the reprex package (v2.0.1)
As you can see, your sample shows no statistically significant difference in means among the groups (F = 0.148 on 2 and 56 degrees of freedom, p = 0.862). In the lm() output, the intercept is the mean of the baseline group (Americans, the first factor level alphabetically), and its significance only tells you that this mean differs from zero. The other two coefficients are not slopes but the differences between the Australasian and European means and that baseline mean. Neither difference is significant, which is consistent with (though not a substitute for) the non-significant AOV.
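To make the reading of the coefficients concrete, here is a minimal sketch that checks them against the plain group means. It reuses the data from the example; the helper names group_means, coefs and baseline are only illustrative, and it assumes R's default treatment contrasts.

```r
# Same data as in the example above
Europeans <- c(16, 22, 11, 14, 19, 16, 23, 22, 13, 23, 21, 18, 15, 16, 13, 13)
Americans <- c(20, 16, 16, 18, 21, 7, 21, 19, 9, 19, 19, 12, 15, 16, 20, 30, 15, 23, 17)
Australasians <- c(22, 18, 18, 18, 19, 17, 19, 17, 12, 13, 19, 21, 14, 14, 16, 17, 15, 18, 14, 15, 20, 16, 16, 17)
Data <- data.frame(
  Continent = factor(rep(c("Europeans", "Americans", "Australasians"),
                         times = c(length(Europeans), length(Americans), length(Australasians)))),
  value = c(Europeans, Americans, Australasians)
)

# Plain group means as a named numeric vector (names in factor-level order)
group_means <- sapply(split(Data$value, Data$Continent), mean)
coefs <- coef(lm(value ~ Continent, data = Data))

# With default treatment contrasts the first factor level is the baseline
baseline <- levels(Data$Continent)[1]

# The intercept is the baseline group mean
all.equal(unname(coefs["(Intercept)"]), group_means[[baseline]])

# The remaining coefficients are differences from the baseline mean
all.equal(unname(coefs[-1]), unname(group_means[-1] - group_means[[baseline]]))
```

Both comparisons should return TRUE. If you want a different baseline group, relevel() on the factor changes which level the intercept refers to.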
In conclusion, you cannot say that there is a statistically significant difference in means between your defined groups with the given sample data.
Kind regards
Edit: As a side note, do not run several t-tests as a standard procedure. Every additional test is another chance of a false positive, so with enough groups you are likely to find some statistically significant result purely by chance. Use the AOV as an omnibus test (i.e. it tells you whether there is anything to find, but not where exactly) to see if there is anything to see at all. If it is significant, specify your hypotheses and test those specific cases, instead of testing everything and adjusting your hypotheses to fit the results.
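As a rough illustration of why unadjusted pairwise t-tests are risky, the sketch below computes the chance of at least one false positive across all pairwise comparisons. The function fwer() is a hypothetical helper, not part of base R, and it assumes the tests are independent, which pairwise tests on shared groups are not, so treat the number as an approximation. The second part shows one possible follow-up, pairwise.t.test() with a Holm correction, which you would normally only look at after a significant AOV.

```r
# Hypothetical helper: approximate family-wise error rate when every pair of
# groups is compared at level alpha (assumes independent tests)
fwer <- function(n_groups, alpha = 0.05) {
  n_tests <- choose(n_groups, 2)  # number of pairwise comparisons
  1 - (1 - alpha)^n_tests
}

fwer(3)  # three groups, three comparisons: about 0.14 instead of 0.05

# Same data as in the example above
Europeans <- c(16, 22, 11, 14, 19, 16, 23, 22, 13, 23, 21, 18, 15, 16, 13, 13)
Americans <- c(20, 16, 16, 18, 21, 7, 21, 19, 9, 19, 19, 12, 15, 16, 20, 30, 15, 23, 17)
Australasians <- c(22, 18, 18, 18, 19, 17, 19, 17, 12, 13, 19, 21, 14, 14, 16, 17, 15, 18, 14, 15, 20, 16, 16, 17)
Data <- data.frame(
  Continent = factor(rep(c("Europeans", "Americans", "Australasians"),
                         times = c(length(Europeans), length(Americans), length(Australasians)))),
  value = c(Europeans, Americans, Australasians)
)

# Pairwise t-tests with Holm-adjusted p-values (pooled standard deviation)
posthoc <- pairwise.t.test(Data$value, Data$Continent, p.adjust.method = "holm")
posthoc$p.value
```

TukeyHSD() on the aov() fit is another common choice for this step. Which correction is appropriate depends on the hypotheses you specified beforehand.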