I also have this information for my sample, along with the counts. Now I want to know whether the columns are distributed significantly differently. What test can I use?
I don't think a chi-square test is suitable here. I referred to the book 'R IN ACTION', which says: "Chi-square tests are often used to assess the relationship between two categorical variables. The null hypothesis is typically that the variables are independent versus a research hypothesis that they aren't."
For example, ethnicity versus individuals expected to be promoted.
In my case, I am comparing the same categorical variable: one distribution comes from the sample, the other from the population (the true values). The goal of this step is to show that there is no bias in the responses, in other words, that my sample is representative of the population. What test should I use?
The chi-square test here would tell you whether the distribution in the sample differs from the distribution in the population. You do not want that, since it would mean the sample is not representative. Hence a p-value greater than 0.05 is desirable, suggesting that the sample is not significantly different from the population.
In my sample, we have
Bundle administrator    Bundle member
712                     476
It looks like my sample is representative. However, my code and its output are:
bundle <- c(712, 476)
res <- chisq.test(bundle, p = c(1224/1823, 599/1823))
res
Chi-squared test for given probabilities
data: bundle
X-squared = 27.989, df = 1, p-value = 1.22e-07
The result is highly significant. I don't understand why.
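One way to see where the statistic comes from is to compare the observed and expected counts by hand. This is only a minimal sketch: it reuses the counts above and assumes that 1224 and 599 (out of 1823) are the population counts for the two categories, as the `p` argument implies. The names `pop_p`, `expected`, `x2` and `p_value` are made up for illustration.

```r
# Observed sample counts: bundle administrator, bundle member
bundle <- c(712, 476)

# Population proportions (assumption: 1224 and 599 out of 1823
# are the population counts for the same two categories)
pop_p <- c(1224, 599) / 1823

# Sample proportions next to population proportions
round(bundle / sum(bundle), 3)   # 0.599 0.401
round(pop_p, 3)                  # 0.671 0.329

# Expected counts if the sample followed the population proportions
expected <- sum(bundle) * pop_p

# Goodness-of-fit statistic and p-value computed by hand
x2 <- sum((bundle - expected)^2 / expected)
p_value <- pchisq(x2, df = length(bundle) - 1, lower.tail = FALSE)
```

Under these assumptions the sample has about 59.9% administrators against about 67.1% in the population. With 1188 observations, a gap of roughly 7 percentage points is large enough to give a very small p-value, which would be consistent with the `chisq.test` output above.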