How to interpret chi-square in R

muhammadinfo · September 2, 2020, 5:16pm

Hi,
After running a chi-square test I got the results below. How should I interpret them?

chisq.test(FL16_pe$LASTDEN3, FL16_pe$DIABETE3)

Pearson's Chi-squared test

data: FL16_pe$LASTDEN3 and FL16_pe$DIABETE3
X-squared = 263.38, df = 2, p-value < 2.2e-16

technocrat · September 2, 2020, 8:14pm

The null hypothesis, H0, is that the two variables are independent; the alternative, H1, is that they are not. Having first selected a significance level \alpha, if the test statistic X^2 has a p-value \leq \alpha, H0 is rejected in favor of H1; otherwise we fail to reject H0.
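As a minimal sketch of that decision rule, the snippet below runs the test on a made-up 3 x 2 table of counts. The table, its labels, and the choice of \alpha = 0.05 are assumptions for illustration only; they are not the FL16_pe data.

```r
# Hypothetical 3 x 2 table of counts (invented for illustration)
tab <- matrix(c(120, 30,
                 90, 60,
                 40, 80),
              nrow = 3, byrow = TRUE,
              dimnames = list(RowVar = c("A", "B", "C"),
                              ColVar = c("No", "Yes")))

alpha <- 0.05             # significance level, chosen before running the test
res   <- chisq.test(tab)  # Pearson's chi-squared test of independence

res$parameter             # degrees of freedom: (3 - 1) * (2 - 1) = 2

# Decision rule: compare the p-value with alpha
if (res$p.value <= alpha) {
  decision <- "reject H0"
} else {
  decision <- "fail to reject H0"
}
decision
```

With these invented counts the p-value is far below 0.05, so the rule rejects H0.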

muhammadinfo · September 2, 2020, 9:03pm

Thank you for your explanation. I am very new to this area, could you please elaborate more?
I still did not understand the connection between these two variables!

Thanks so much!

technocrat · September 2, 2020, 9:14pm

For a value of \alpha = 0.001, for example, the p-value for the test statistic is far smaller: p-value < 2.2e-16. This means that, if the two variables really were independent, the probability of observing an X^2 at least as large as 263.38 would be below 2.2 \times 10^{-16}. That leads to rejecting H0, the null hypothesis, and concluding that the variables are not independent: LASTDEN3 and DIABETE3 are associated. The test by itself does not say how strong the association is or which categories drive it.
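The reported p-value can be recomputed from the statistic and degrees of freedom in the output. The sketch below uses only the two numbers printed there (X-squared = 263.38, df = 2); the variable names are just for illustration.

```r
x2 <- 263.38   # X-squared reported in the output
df <- 2        # degrees of freedom reported in the output

# Upper-tail probability that a chi-squared(df) variable exceeds x2,
# i.e. the p-value under the null hypothesis of independence
p <- pchisq(x2, df = df, lower.tail = FALSE)
p

# R prints very small p-values as "< 2.2e-16" because the default
# printing threshold is the machine epsilon
.Machine$double.eps
p < .Machine$double.eps
```

So "< 2.2e-16" is a display limit rather than the exact value; the actual tail probability is smaller still.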

FJCC · September 2, 2020, 9:39pm

Below is a simple example that might help clarify what the chi-squared test is looking at. Let's say we have two samples of objects that have two properties: color and height. The color can be black or white and the height can be low or high. We want to test whether color and height are independent of each other. In the first sample, half of the objects are high and half are low, while 2/3 of them are black and 1/3 are white. Within the black objects, it is also true that half are high and half are low, and the same is true of the white objects. The chi-squared test gives a p-value of 1, showing that we cannot reject the null hypothesis that color and height are independent. Knowing the color of an object does not help us in knowing the height.
In the second sample, the overall population again has 1/2 high and 2/3 black, but within the black population three quarters of the objects are high and within the white population all of them are low. It seems high/low is not independent of black/white, and indeed the chi-squared test gives a low p-value.

ColorVec <- c(rep("Black", 20), rep("White", 10))
DF <- data.frame(Ht = rep(c("High", "Low"), each = 30),
                 Color = rep(ColorVec, 2))
table(DF$Ht, DF$Color)
#>       
#>        Black White
#>   High    20    10
#>   Low     20    10
chisq.test(DF$Ht, DF$Color)
#> 
#>  Pearson's Chi-squared test
#> 
#> data:  DF$Ht and DF$Color
#> X-squared = 0, df = 1, p-value = 1


ColorVec2 <- c(rep("Black", 40), rep("White", 20))
DF2 <- data.frame(Ht = rep(c("High", "Low"), each = 30),
                 Color = ColorVec2)
table(DF2$Ht, DF2$Color)
#>       
#>        Black White
#>   High    30     0
#>   Low     10    20
chisq.test(DF2$Ht, DF2$Color)
#> 
#>  Pearson's Chi-squared test with Yates' continuity correction
#> 
#> data:  DF2$Ht and DF2$Color
#> X-squared = 27.075, df = 1, p-value = 1.957e-07

Created on 2020-09-02 by the reprex package (v0.3.0)
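To see where an association comes from, and not only that one exists, it can help to compare the observed counts with the counts expected under independence. The sketch below rebuilds the second sample and inspects components of the object returned by `chisq.test()`; the names `res2` and `props` are introduced here for illustration.

```r
# Rebuild the second sample from the example
DF2 <- data.frame(Ht = rep(c("High", "Low"), each = 30),
                  Color = c(rep("Black", 40), rep("White", 20)))

res2 <- chisq.test(DF2$Ht, DF2$Color)

res2$observed   # counts actually seen
res2$expected   # counts expected if Ht and Color were independent
res2$stdres     # standardized residuals; large absolute values mark
                # the cells that contribute most to the statistic

# Share of High/Low within each color (each column sums to 1)
props <- prop.table(table(DF2$Ht, DF2$Color), margin = 2)
props
```

Under independence each row would hold 20 black and 10 white objects. The High/Black cell is above its expected count and the High/White cell is below it, which is the pattern the test picks up.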
