Hi everyone. I have the following problem: for two of my datasets, each containing a categorical variable and a numeric variable of interest, the kruskal.test() function gives me exactly the same result.
The datasets do share similarities: the numeric variable contains quite different values in each dataset, but the categorical variable has exactly the same content in both. Both datasets have the same number of data points, and the categorical and numeric variables have the same names in both datasets.
Now, let me show you what I mean exactly:
> kruskal.test(Signaltonoise ~ MatrixSolution, data = Datasheet_2_matrix_sel_105)
Kruskal-Wallis rank sum test
data: Signaltonoise by MatrixSolution
Kruskal-Wallis chi-squared = 18.701, df = 3, p-value = 0.0003152
> kruskal.test(Signaltonoise ~ MatrixSolution, data = Datasheet_2_matrix_sel_163)
Kruskal-Wallis rank sum test
data: Signaltonoise by MatrixSolution
Kruskal-Wallis chi-squared = 18.701, df = 3, p-value = 0.0003152
My categorical variable (MatrixSolution) splits my data points (21 in total) in both sets the same way:
category 1 --> datapoints 1 to 5 (5 in total)
category 2 --> datapoints 6 to 11 (6 in total)
category 3 --> datapoints 12 to 17 (6 in total)
category 4 --> datapoints 18 to 21 (4 in total)
The two numerical vectors (Signaltonoise) are:
For the dataset that ends with 105 (data point index, then Signaltonoise value):
|1|3.2752879|
|2|1.9166651|
|3|2.5643237|
|4|2.3300389|
|5|2.2994027|
|6|1.1736778|
|7|1.0620759|
|8|0.5249439|
|9|0.7423361|
|10|1.1883668|
|11|0.4138182|
|12|30.1478089|
|13|36.7350398|
|14|16.2086811|
|15|26.2752890|
|16|35.4236749|
|17|25.2327129|
|18|10.8551473|
|19|10.8011864|
|20|12.1467999|
|21|9.4906094|
For the dataset that ends with 163 (data point index, then Signaltonoise value):
|1|4.0289918|
|2|2.8699921|
|3|2.8377330|
|4|3.1811226|
|5|2.5667746|
|6|1.6326522|
|7|1.6269232|
|8|1.2101360|
|9|0.8997288|
|10|1.2155427|
|11|0.4847995|
|12|81.2482026|
|13|77.7213035|
|14|40.7294365|
|15|61.1245143|
|16|82.0395821|
|17|66.3434549|
|18|23.2610273|
|19|20.6216288|
|20|26.6061189|
|21|25.1866147|
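In case it helps with reproducing this, here is a minimal sketch that rebuilds both datasets from the values listed above. The object names d105 and d163 and the category labels "1" to "4" are placeholders for this post, not my real names. As far as I understand, kruskal.test() works on the ranks of the values rather than on the values themselves, so the sketch also prints the rank sum per category for each dataset as one possible thing to compare.

```r
# Category labels "1".."4" are placeholders; group sizes are 5, 6, 6 and 4.
MatrixSolution <- factor(rep(c("1", "2", "3", "4"), times = c(5, 6, 6, 4)))

# Signaltonoise values for the dataset that ends with 105
sn_105 <- c(3.2752879, 1.9166651, 2.5643237, 2.3300389, 2.2994027,
            1.1736778, 1.0620759, 0.5249439, 0.7423361, 1.1883668, 0.4138182,
            30.1478089, 36.7350398, 16.2086811, 26.2752890, 35.4236749, 25.2327129,
            10.8551473, 10.8011864, 12.1467999, 9.4906094)

# Signaltonoise values for the dataset that ends with 163
sn_163 <- c(4.0289918, 2.8699921, 2.8377330, 3.1811226, 2.5667746,
            1.6326522, 1.6269232, 1.2101360, 0.8997288, 1.2155427, 0.4847995,
            81.2482026, 77.7213035, 40.7294365, 61.1245143, 82.0395821, 66.3434549,
            23.2610273, 20.6216288, 26.6061189, 25.1866147)

# Placeholder data frames standing in for my two real datasets
d105 <- data.frame(Signaltonoise = sn_105, MatrixSolution = MatrixSolution)
d163 <- data.frame(Signaltonoise = sn_163, MatrixSolution = MatrixSolution)

# Same calls as in my session, on the rebuilt data
res_105 <- kruskal.test(Signaltonoise ~ MatrixSolution, data = d105)
res_163 <- kruskal.test(Signaltonoise ~ MatrixSolution, data = d163)
print(res_105)
print(res_163)

# Rank all 21 values within each dataset, then sum the ranks per category
ranksum_105 <- tapply(rank(d105$Signaltonoise), d105$MatrixSolution, sum)
ranksum_163 <- tapply(rank(d163$Signaltonoise), d163$MatrixSolution, sum)
print(ranksum_105)
print(ranksum_163)
```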
I have other pairs of datasets that share the same similarities as the two datasets described above, and there I do not have this problem. In theory it is possible that these two datasets really do produce exactly the same Kruskal-Wallis output, but I find that very hard to believe.
Can anybody explain to me what's going on? Thanks a lot in advance.