When to use Yates' continuity correction?

Hello,

Can someone please tell me when I need to use Yates' continuity correction (using "correct = TRUE" instead of "correct = FALSE") when conducting a chi-squared test? Please see an example of my code below.

chisq.test(mydata$obesity, mydata$social_status, correct = TRUE)

x <- matrix(c(17, 13, 8, 20), ncol = 2)
chisq.test(x, correct = TRUE)

And the results are:

Pearson's Chi-squared test with Yates' continuity correction

data: x
X-squared = 3.5862, df = 1, p-value = 0.05826

Can you please explain why you used correct = TRUE in your example?

The correction can be appropriate if you have a small number of counts in one or more of the cells of the table. I have seen 5 mentioned as a threshold. How many counts do you have?

The chi-square is a continuous distribution, while the underlying binomial distributions are discrete. With large enough cell counts this tends not to matter much, but with small cell counts it can matter quite a bit, because the chi-square approximation improves as the cell counts increase. That is why prop.test and chisq.test include the correction as an option.
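To see the size of the adjustment, here is a minimal sketch using the 2 x 2 table from the question. It runs chisq.test with and without the correction, then recomputes the corrected statistic by hand by subtracting 0.5 from each |observed - expected| before squaring. The variable names are my own, chosen for illustration.

```r
# 2 x 2 table from the question
x <- matrix(c(17, 13, 8, 20), ncol = 2)

with_yates    <- chisq.test(x, correct = TRUE)
without_yates <- chisq.test(x, correct = FALSE)

# Yates' statistic by hand: shrink each |O - E| by 0.5 before squaring
expected      <- without_yates$expected
yates_by_hand <- sum((abs(x - expected) - 0.5)^2 / expected)

unname(with_yates$statistic)     # about 3.586, matching the output in the question
unname(without_yates$statistic)  # larger than the corrected value
yates_by_hand                    # agrees with with_yates$statistic

with_yates$p.value               # about 0.058
without_yates$p.value            # below 0.05 for this table
```

On this particular table the two versions land on opposite sides of 0.05, which is why the choice of correct matters here.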

Because if you want the test with the continuity correction, you must set correct = TRUE (if not, correct = FALSE).
https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/chisq.test

Anyway, I ran it for you; note the difference in the results:

Have a nice day!

Sorry for not making this clearer, but my question is: why do you need to use Yates' continuity correction in your case?

@rwalker
Thank you so much! So, if I have a small number of counts in one or more of the cells of the table (e.g. <5), why not just use Fisher's exact test? What's the difference between Fisher's exact vs Chi-squared with correction?

Fisher's exact test obviates the need for even considering the correction. Same for tests associated with Barnard and Boschloo. In short, one of the three aforementioned exact tests is likely better than the chi-square test.
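As a rough illustration, here is Fisher's exact test on the same 2 x 2 table used earlier in the thread, next to the Yates-corrected chi-squared p-value. Only base R functions from the stats package are used; the variable names are mine.

```r
# Same 2 x 2 table as in the question
x <- matrix(c(17, 13, 8, 20), ncol = 2)

fisher_res <- fisher.test(x)   # exact test, conditional on both margins
fisher_res$p.value             # two-sided exact p-value
fisher_res$estimate            # conditional MLE of the odds ratio

# Yates-corrected approximation, for comparison
chisq.test(x, correct = TRUE)$p.value
```

fisher.test has no correct argument at all, because nothing is being approximated by a continuous distribution.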

So in summary, if I have <5 in one of the cells of the table, I should use Fisher's exact and if not, I should use the chi-squared test (with correction), is this right?

In other words, we should avoid using the chi-squared test without correction altogether?

Fisher's exact test was designed for small cell problems; once the table is sufficiently dense, a chi-square test is usually performed because the approximation is far better. 5 is rather arbitrary, but the intuition is correct. The chi-square test without correction is a perfectly valid approximation as the sample sizes grow large, and large samples obviate the need for the correction. So I do not think it is correct to say that we should avoid the chi-squared test without correction altogether.
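A minimal sketch of that decision, assuming the rule of thumb is applied to the expected counts (which is how it is usually stated) rather than the observed ones. pick_test is a hypothetical helper written for this post, not part of any package, and the default threshold of 5 is just the arbitrary value mentioned above.

```r
# Hypothetical helper: choose a test from the smallest expected count.
# The threshold of 5 is the rule of thumb from this thread, not a hard rule.
pick_test <- function(tab, min_expected = 5) {
  # Expected counts under independence: row total * column total / n
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  if (min(expected) < min_expected) {
    fisher.test(tab)                   # sparse table: exact test
  } else {
    chisq.test(tab, correct = FALSE)   # dense table: plain chi-squared
  }
}

dense  <- matrix(c(17, 13, 8, 20), ncol = 2)  # table from the question
sparse <- matrix(c(3, 1, 2, 6), ncol = 2)     # made-up small table

pick_test(dense)$method   # Pearson's chi-squared test
pick_test(sparse)$method  # Fisher's exact test
```

The same expected counts are also available as chisq.test(tab)$expected if you prefer to inspect them directly.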

when some cell counts are low, typically understood to mean “below 10” or “below 5”. The Yates’ Correction, therefore, is used when conducting a Pearson’s Chi-squared test on 2 × 2 contingency tables and prevents overestimation of statistical significance;

So in my opinion, and from reading the article, you should write:

correct = TRUE (when a cell count is below 10 or 5)

And write:

correct = FALSE (when the counts are greater than 10)

Greetings from Chile, my friend!!

In Intro to Categorical Data Analysis by Agresti, the author recommends keeping correct = FALSE. I can't recall off the top of my head why, though.

Also, in the text the author mentions exact binomial tests/functions one could use instead of Fisher's exact test for small sample sizes.

Thanks for providing this here.
