Chi-square test of independence

Hi, are crimes and storms independent? I was expecting the chi-square test to tell me no, that a disproportionate number of crimes occur when there are storms. Where is my work incorrect? Thank you.

# Are crimes and storms independent?
# 2x2 contingency table: rows are weather, columns are outcome.
df <- data.frame(no_crime = c(11, 32), crime = c(40, 167))
rownames(df) <- c("no_storm", "storm")
df
chi <- chisq.test(df)
p <- chi$p.value
cat("p =", round(p, 3), "\n")
# Failing to reject the null does not show that the variables are independent.
if (p < .05) {
  "Reject the null hypothesis and conclude that the two variables are dependent."
} else {
  "Fail to reject the null hypothesis: no significant evidence that the two variables are dependent."
}

Hi Jerry!

I am fascinated by your question. Looks like your R code is right. You pass a contingency table to chisq.test.

Two things. First, why do you expect a lower p-value? When there is no storm, there is no crime 11 times and crime 40 times. That’s about 22% no crime. When there is a storm, there is no crime 32 times and crime 167 times, about 16% no crime. These are not that different, so the p-value of .472 seems to be accurate.
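
For anyone who wants to check those percentages, here is a minimal sketch using prop.table. The object names counts and row_props are my own, not from the original post; the counts are the ones Jerry posted.

```r
# Counts from the original post: rows are weather, columns are outcome.
# matrix() fills column by column, so this matches Jerry's data frame.
counts <- matrix(c(11, 32, 40, 167), nrow = 2,
                 dimnames = list(c("no_storm", "storm"),
                                 c("no_crime", "crime")))

# Row proportions: the share of each outcome within each weather condition.
row_props <- prop.table(counts, margin = 1)
round(row_props, 3)
```

The no_crime column should come out at roughly 0.216 without a storm and 0.161 with one.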

Second, we can also debate whether a dichotomous cut-off for p-values is a good way to determine significance. Either way, the effect size matters as much as or more than the p-value.
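
To put a number on the effect size, here is a rough sketch. The thread does not say which measure to use, so the odds ratio and the phi coefficient are my own picks for a 2x2 table, and the variable names are hypothetical.

```r
# Counts from the original post: rows are weather, columns are outcome.
counts <- matrix(c(11, 32, 40, 167), nrow = 2,
                 dimnames = list(c("no_storm", "storm"),
                                 c("no_crime", "crime")))

# Odds ratio: odds of crime with a storm relative to odds of crime without one.
odds_storm    <- counts["storm", "crime"] / counts["storm", "no_crime"]
odds_no_storm <- counts["no_storm", "crime"] / counts["no_storm", "no_crime"]
odds_ratio    <- odds_storm / odds_no_storm

# Phi coefficient: sqrt(chi-square / N), using the uncorrected statistic.
chi_uncorrected <- chisq.test(counts, correct = FALSE)
phi <- sqrt(unname(chi_uncorrected$statistic) / sum(counts))

round(c(odds_ratio = odds_ratio, phi = phi), 3)
```

With these counts the odds ratio is about 1.44 and phi is about 0.06, which fits the reading that the association is weak.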

Best,
Tom

Tom, thanks for your note. Now that I think about it,

P(crime) = 207/250 = .828
P(storm) = 199/250 = .796
P(crime & storm) = 167/250 = .668
P(crime | storm) = P(crime & storm) / P(storm) = .668/.796 = .839, which is pretty close to P(crime) = .828, so I guess the variables are pretty close to independent.
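
The same arithmetic can be reproduced from the table, as a sketch. The variable names are mine, and the last line adds the product P(crime) * P(storm), which is what P(crime & storm) would be under exact independence.

```r
# Counts from the original post: rows are weather, columns are outcome.
counts <- matrix(c(11, 32, 40, 167), nrow = 2,
                 dimnames = list(c("no_storm", "storm"),
                                 c("no_crime", "crime")))
n <- sum(counts)                                      # 250 observations

p_crime             <- sum(counts[, "crime"]) / n     # 207/250
p_storm             <- sum(counts["storm", ]) / n     # 199/250
p_crime_and_storm   <- counts["storm", "crime"] / n   # 167/250
p_crime_given_storm <- p_crime_and_storm / p_storm    # 167/199

# Joint probability expected if crime and storm were exactly independent.
expected_joint <- p_crime * p_storm

round(c(p_crime = p_crime, p_storm = p_storm,
        p_crime_and_storm = p_crime_and_storm,
        p_crime_given_storm = p_crime_given_storm,
        expected_joint = expected_joint), 3)
```

The expected joint probability is about .659, close to the observed .668.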

The Yates correction is doing quite a bit of work here. The uncorrected statistic is 0.859, while the Yates-corrected statistic is only 0.516.
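
Here is a sketch of where those two numbers come from, computed by hand from the expected counts. The names expected, chi_plain and chi_yates are my own. It assumes every |observed - expected| is at least 0.5, which holds for this table, so subtracting a flat 0.5 matches what chisq.test does.

```r
# Counts from the original post: rows are weather, columns are outcome.
counts <- matrix(c(11, 32, 40, 167), nrow = 2,
                 dimnames = list(c("no_storm", "storm"),
                                 c("no_crime", "crime")))

# Expected counts under independence: row total * column total / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson statistic without any correction.
chi_plain <- sum((counts - expected)^2 / expected)

# Yates: shrink each absolute deviation by 0.5 before squaring.
chi_yates <- sum((abs(counts - expected) - 0.5)^2 / expected)

round(c(uncorrected = chi_plain, yates = chi_yates), 3)
```

Every cell deviates from its expected count by about 2.23, so taking 0.5 off each deviation is a sizeable cut.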

I always learn something new here. For all my p-value doubts, I did just look at .516 and say, "Yep, that looks high. Probably correct."

You can see that chisq.test did use the Yates correction by inspecting the 'method' element of the result:

df <- data.frame(no_crime = c(11, 32), crime = c(40, 167))
rownames(df) <- c("no_storm", "storm")
df
chi <- chisq.test(df)

# Name of the test that was run, including any continuity correction
chi$method

But is there a way to ask it not to perform the Yates correction?

Here you are:

chi2 <- chisq.test(df, correct=FALSE)
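
To see the two versions side by side, something like this sketch should work. The names with_yates, without_yates and comparison are hypothetical; the only chisq.test argument used is correct, as in the line above.

```r
# Counts from the original post: rows are weather, columns are outcome.
counts <- matrix(c(11, 32, 40, 167), nrow = 2,
                 dimnames = list(c("no_storm", "storm"),
                                 c("no_crime", "crime")))

with_yates    <- chisq.test(counts)                   # default for 2x2 tables
without_yates <- chisq.test(counts, correct = FALSE)

# Collect the method label, test statistic and p-value of both runs.
comparison <- data.frame(
  method    = c(with_yates$method, without_yates$method),
  statistic = unname(c(with_yates$statistic, without_yates$statistic)),
  p_value   = c(with_yates$p.value, without_yates$p.value)
)
comparison
```

The p-value drops from about .472 to about .354 without the correction, so the conclusion at the .05 level does not change either way.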

The printed output does say "with Yates' continuity correction", but the OP's code prints only the p-value and a custom decision rule, which hides the fact that the correction was a choice.
