1) Dealing with NAs, and 2) designating subgroups for analyses, e.g., dunnTest

library(reprex)
library(FSA)

FSA v0.8.30. See citation('FSA') if used in publication.

Run fishR() for related website and fishR('IFAR') for related book.

#the minimally representative data set
View(test2)
test2

A tibble: 20 x 5

     REG   COM   NUM   AGE EDU
 1     1     1   101    63 2
 2     1     1   102    56 6
 3     1     1   103    33 3
 4     1     1   104    49 6
 5     1     1   105    65 3
 6     1     2   201    73 6
 7     1     2   202    70 5
 8     1     2   203    32 6
 9     1     2   204    33 6
10     1     2   205    49 6
11     2     5   501    67 3
12     2     5   502    81 3
13     2     5   503    42 6
14     2     5   504    55 Na
15     2     5   505    82 Na
16     2     6   601    75 1
17     2     6   602    62 0
18     2     6   603    63 0
19     2     6   604    67 2
20     2     6   605    79 0

Note that there are two regions (REG) and, across these, four communities (COM). Each row is a household, identified by an individual number (NUM).
#The goal is to be able to conduct analyses at different levels: between REG, between all COM, and between COM within a region. This last designation is what I can't seem to get right.
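For anyone who wants to run the examples, here is a minimal sketch that rebuilds the data set from the values printed above. It assumes a plain base-R `data.frame` instead of the original tibble, and it keeps `EDU` as character with the literal string "Na", since that is how the column appears in the printout.

```r
# Rebuild test2 from the printed values (base data.frame, an assumption;
# the original object is a tibble)
test2 <- data.frame(
  REG = rep(1:2, each = 10),
  COM = rep(c(1L, 2L, 5L, 6L), each = 5),
  NUM = c(101:105, 201:205, 501:505, 601:605),
  AGE = c(63L, 56L, 33L, 49L, 65L, 73L, 70L, 32L, 33L, 49L,
          67L, 81L, 42L, 55L, 82L, 75L, 62L, 63L, 67L, 79L),
  EDU = c("2", "6", "3", "6", "3", "6", "5", "6", "6", "6",
          "3", "3", "6", "Na", "Na", "1", "0", "0", "2", "0"),
  stringsAsFactors = FALSE
)

# Check the structure: two regions, four communities, five households each
str(test2)
table(test2$REG, test2$COM)
```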
# 1st question: because of the NAs, I can't convert EDU from character to integer. How can I address the NA issue here?
EDU <- as.numeric(test2$EDU)
Warning message:
NAs introduced by coercion
class(test2$EDU)
[1] "character"
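Two things seem to be going on here. The missing values are stored as the text "Na", which R does not recognise as `NA`, and the conversion result is assigned to a separate object `EDU`, so `test2$EDU` itself never changes. A minimal sketch of one way around both, using four rows copied from the data above in a small hypothetical data frame named `edu_demo`:

```r
# Four rows from the data above (NUM 503, 504, 505, 601); edu_demo is a
# hypothetical name used only for this illustration
edu_demo <- data.frame(
  NUM = c(503L, 504L, 505L, 601L),
  EDU = c("6", "Na", "Na", "1"),
  stringsAsFactors = FALSE
)

# Recode the text "Na" as a real missing value first
edu_demo$EDU[edu_demo$EDU == "Na"] <- NA

# Convert and assign back into the data frame column
edu_demo$EDU <- as.integer(edu_demo$EDU)

class(edu_demo$EDU)
sum(is.na(edu_demo$EDU))
```

If the data are read from a file with `read.csv()`, passing `na.strings = "Na"` at import is another option, assuming "Na" is the only missing-value code in the file.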

#I'll give examples using AGE, which is an integer
AGE <- as.numeric(test2$AGE)
class(test2$AGE)
[1] "integer"

kruskal.test(AGE ~ REG,
             data = test2)

    Kruskal-Wallis rank sum test

data: AGE by REG
Kruskal-Wallis chi-squared = 4.025, df = 1, p-value = 0.04483

#these work
kruskal.test(AGE ~ COM,
             data = test2)

    Kruskal-Wallis rank sum test

data: AGE by COM
Kruskal-Wallis chi-squared = 4.0894, df = 3, p-value = 0.252

#Below is the same variable, but among COM in REG 1

kruskal.test(AGE ~ COM, REG == 1,
             data = test2)

    Kruskal-Wallis rank sum test

data: AGE by COM
Kruskal-Wallis chi-squared = 0.011043, df = 1, p-value = 0.9163

#That works; e.g., df = 1 shows that only the two communities in REG 1 are compared.
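The call above works because the formula method of `kruskal.test()` takes a `subset` argument, and the unnamed `REG == 1` ends up matched to it. A sketch of the same test with the argument named, plus one way to run the test within every region at once (only the three columns needed are rebuilt here, as a base `data.frame`):

```r
# Columns needed for the test, copied from the data above
test2 <- data.frame(
  REG = rep(1:2, each = 10),
  COM = rep(c(1L, 2L, 5L, 6L), each = 5),
  AGE = c(63L, 56L, 33L, 49L, 65L, 73L, 70L, 32L, 33L, 49L,
          67L, 81L, 42L, 55L, 82L, 75L, 62L, 63L, 67L, 79L)
)

# Same test as above, with the subset argument named explicitly
k1 <- kruskal.test(AGE ~ COM, data = test2, subset = REG == 1)
k1

# One test per region: split the data by REG, then test COM within each piece
by_region <- lapply(split(test2, test2$REG),
                    function(d) kruskal.test(AGE ~ COM, data = d))

# Collect the p-values, named by region
sapply(by_region, function(k) k$p.value)
```

Naming `subset =` makes the intent visible and avoids relying on argument position, which differs from function to function.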
#2nd question: How do I make the same designation for other analyses? For example, Dunn's test, also among COM in REG 1. I tried the same designation, REG == 1, but it doesn't work.
DT <- dunnTest(AGE ~ COM, REG == 1,
               data = test2,
               method = "bh")

Warning messages:
1: COM was coerced to a factor.
2: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
3: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
4: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
5: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
6: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
7: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
8: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
9: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
10: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used

#I'm brand new to RStudio, obviously! Your suggestions would be appreciated.
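As far as I can tell, `dunnTest()` has no `subset` argument, so an unnamed `REG == 1` is probably matched to a different parameter, which would explain the `altp` warnings. A minimal sketch of a workaround that does not depend on the function at all: subset the data first, then pass the smaller data frame. The name `reg1` is hypothetical, and the `dunnTest()` call is guarded so it only runs if the FSA and dunn.test packages are installed. With only two communities per region there is a single pairwise comparison, so this pattern is mainly useful when a region has three or more communities.

```r
# Columns needed for the test, copied from the data above
test2 <- data.frame(
  REG = rep(1:2, each = 10),
  COM = rep(c(1L, 2L, 5L, 6L), each = 5),
  AGE = c(63L, 56L, 33L, 49L, 65L, 73L, 70L, 32L, 33L, 49L,
          67L, 81L, 42L, 55L, 82L, 75L, 62L, 63L, 67L, 79L)
)

# Keep only region 1 (reg1 is a hypothetical name)
reg1 <- subset(test2, REG == 1)

# Make COM a factor after subsetting so only the communities present
# in this region are levels (this also avoids the coercion warning)
reg1$COM <- factor(reg1$COM)

# Run Dunn's test on the subset, if the packages are available
if (requireNamespace("FSA", quietly = TRUE) &&
    requireNamespace("dunn.test", quietly = TRUE)) {
  DT <- FSA::dunnTest(AGE ~ COM, data = reg1, method = "bh")
  print(DT)
}
```

The same `subset()` step works for any analysis function, whether or not it offers its own `subset` argument.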
