Hi all,
I'm struggling with analysing a data set that is predominantly qualitative, categorical data. Each entry records the presence (yes/no, coded 1/0) of each of three mosquito species, together with the season, year, month, locality, collection method and sex associated with that entry. I've attached some sample data below that will hopefully clarify. I have also tried making some of the data continuous by converting it to count data.
| Orig collection date | Month | Year | Season | Locality | An. merus | An. arabiensis | An. quadriannulatus | Identification | Count | Sex | Collection Method |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 27/10/2009 | October | 2009 | Spring | Block A | 1 | 0 | 0 | An. merus | 1 | Male | Larvae |
| 27/10/2009 | October | 2009 | Spring | Block A | 1 | 0 | 0 | An. merus | 1 | Male | Larvae |
| 30/04/2015 | April | 2015 | Autumn | Block A | 0 | 0 | 1 | An. quadriannulatus | 1 | Male | Outdoor pot/bucket |
| 30/04/2015 | April | 2015 | Autumn | Block A | 0 | 0 | 1 | An. quadriannulatus | 1 | Male | Outdoor pot/bucket |
| 16/03/2016 | March | 2016 | Autumn | Vlakbult | 0 | 1 | 0 | An. arabiensis | 1 | Female | CO₂ tent |
| 16/03/2016 | March | 2016 | Autumn | Vlakbult | 0 | 1 | 0 | An. arabiensis | 1 | Female | CO₂ tent |
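To make the "count data" conversion concrete, here is a minimal sketch in plain Python of how the rows can be summed into abundance per species and factor level. The variable and function names (`rows`, `abundance_by`) are made up for illustration, and the data are only the six sample rows shown above, not the full data set.

```python
from collections import Counter

# The six sample rows from the table above, reduced to the columns used here:
# (year, season, locality, identification, count, sex, collection method)
rows = [
    (2009, "Spring", "Block A", "An. merus", 1, "Male", "Larvae"),
    (2009, "Spring", "Block A", "An. merus", 1, "Male", "Larvae"),
    (2015, "Autumn", "Block A", "An. quadriannulatus", 1, "Male", "Outdoor pot/bucket"),
    (2015, "Autumn", "Block A", "An. quadriannulatus", 1, "Male", "Outdoor pot/bucket"),
    (2016, "Autumn", "Vlakbult", "An. arabiensis", 1, "Female", "CO₂ tent"),
    (2016, "Autumn", "Vlakbult", "An. arabiensis", 1, "Female", "CO₂ tent"),
]

# Position of each grouping factor within a row tuple (hypothetical names).
FACTOR_INDEX = {"year": 0, "season": 1, "locality": 2, "method": 6}
SPECIES_INDEX = 3
COUNT_INDEX = 4


def abundance_by(rows, factor):
    """Sum the Count column for every (species, factor level) pair."""
    totals = Counter()
    for row in rows:
        key = (row[SPECIES_INDEX], row[FACTOR_INDEX[factor]])
        totals[key] += row[COUNT_INDEX]
    return totals


season_counts = abundance_by(rows, "season")
for (species, season), n in sorted(season_counts.items()):
    print(species, season, n)
```

The same function can be called with `"year"`, `"locality"` or `"method"` to get the other groupings. Pairs that never occur in the data are simply absent from the result and read as 0 when looked up.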
Because of the nature of the data, parametric tests aren't appropriate, and a non-parametric test such as Kruskal-Wallis also doesn't work. A friend suggested I try correlation tests, so I tried Spearman's rank correlation, and additionally Wilcoxon signed-rank tests. I got some results from those, but I'm worried that the assumptions of the tests are not met. My data are not normally distributed, nor are the variances homogeneous, as per the Shapiro-Wilk and Levene's tests. I would like to do multivariable, multiple-comparison tests if possible. The Wilcoxon signed-rank test gave me a significant p-value, but I don't know which post-hoc test to run after it.
My research questions are as follows:
- Is there a significant difference in species abundance across seasons? I.e. species 1's abundance in Summer, Autumn, Winter and Spring vs. species 2 and species 3.
- Is there a significant difference in species abundance across years? I.e. species 1's abundance in Year 1 etc. vs. species 2 and species 3.
- Is there a significant difference in species abundance between collection methods?
- Is there a significant difference in species abundance between locations?
Any advice on how to analyse the data would be greatly appreciated!