Question about how to analyze seed germination data (percentages) in a factorial design.

Hello friends! I hope you are well.
I have a problem, or rather a question.

I have a data set on seed germination, with two categorical variables and one response variable, the germination percentage. I cannot use a parametric analysis (ANOVA) because the data do not meet the assumptions of normality and homogeneity of variance. Could you help me work out how to analyze these data? Below is an example of what my data look like. The factors are two seed treatments (water and KNO3) and the accessions (ACESSOS, which are the species); the response is the germination (GER) I evaluated for each combination.

TRAT ACESSOS GER
agua 7754_04_06 0
agua 7754_04_06 0
agua 7754_04_06 0
agua 7754_04_06 0
kno3 7754_04_06 0
kno3 7754_04_06 0
kno3 7754_04_06 0
kno3 7754_04_06 0
agua ARI_31_05 61
agua ARI_31_05 33
agua ARI_31_05 50
agua ARI_31_05 74
kno3 ARI_31_05 5
kno3 ARI_31_05 7
kno3 ARI_31_05 0
kno3 ARI_31_05 2
agua ARI_06_06 71
agua ARI_06_06 30
agua ARI_06_06 58
agua ARI_06_06 30
kno3 ARI_06_06 3
kno3 ARI_06_06 0
kno3 ARI_06_06 0
kno3 ARI_06_06 0

I would suggest starting with some exploration of the data. Something like this:

library(ggplot2)

# One panel per accession; within each panel, one box per seed treatment.
ggplot(data = your_data, mapping = aes(x = factor(TRAT), y = GER)) +
    geom_boxplot() +
    facet_grid(~ factor(ACESSOS))

Here your_data needs to be a data frame. I'm assuming the categorical variables are stored as strings, so I convert them to factors inside the ggplot call.
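In case it helps, here is a minimal sketch of how the example data from the question could be entered as a data frame and summarized per treatment and accession before plotting. The names germ, cell_mean and cell_sd are just placeholders I made up, and the values are typed in from the post, so treat this as a starting point for exploration rather than an analysis.

```r
# Example data copied from the question; `germ` is a hypothetical name.
# Each accession has 4 replicates in water (agua) followed by 4 in KNO3.
germ <- data.frame(
  TRAT    = rep(rep(c("agua", "kno3"), each = 4), times = 3),
  ACESSOS = rep(c("7754_04_06", "ARI_31_05", "ARI_06_06"), each = 8),
  GER     = c( 0,  0,  0,  0,   0, 0, 0, 0,
              61, 33, 50, 74,   5, 7, 0, 2,
              71, 30, 58, 30,   3, 0, 0, 0),
  stringsAsFactors = TRUE  # store TRAT and ACESSOS as factors directly
)

# Mean and standard deviation of germination in each treatment x accession cell
cell_mean <- with(germ, tapply(GER, list(TRAT, ACESSOS), mean))
cell_sd   <- with(germ, tapply(GER, list(TRAT, ACESSOS), sd))

print(cell_mean)
print(cell_sd)
```

With the data in this shape, germ can be passed as your_data in the ggplot call. The cell summaries also make the variance problem visible: accession 7754_04_06 has no germination at all in either treatment, so its standard deviation is zero, while the water-treated ARI cells vary widely.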

I suggest you look at https://www.statforbiology.com/2019/stat_survival_germination/. The code is on GitHub.
