How to prepare a data for Cochran-Mantel-Haenszel Test ?

Hi All,
I have got this dataframe:

dt <- readr::read_fwf("
gender age_group diagnosis
male     young    x
female   child    y
female   adult    x
male     old      z")

dt %<>%
  janitor::row_to_names(row_number = 1)

I have watched this and read this:
https://www.youtube.com/watch?v=LUuLGLGI650

https://stackoverflow.com/questions/34750987/3-way-chi-squared-test-in-r

Like a user Tybran in the comment section said:"I think this video misses one very important detail: the difference between a significant effect (in this case the female group) and the a nonsignificant effect (the male group) is not per se a significant difference of itself. So it is important to recognize that with this method, the three-way interaction is not directly tested but instead 2 two-way interactions are descriptively compared".

Because I believe that R can do what SPSS can't, I want to do sort of chi square test for 4x3 table which would be a task for Cochran-Mantel-Haenszel Test I suppose. I prepared my dt dataframe but when I want to do CMH test it errors to:

Error in mantelhaen.test(dt) : if 'x' is not an array, 'y' must be given

or: when I have done:

dt = table(dt)

mantelhaen.test(dt)

Error in mantelhaen.test(df) : sample size in each stratum must be > 1

How to rectify this, please ? In my opinion there are more than 1 level in each variable in my dt dataframe.

is it intentional or an oversight that young/child and adult/old are presumable describing the same categories; otherwise with no overlaps, how would any analysis be attempted ?

Don't you need four variables to do a three-way?

library(magrittr)
dt <- readr::read_fwf("
gender age_group diagnosis
male     young    x
female   child    y
female   adult    x
male     old      z")
#> Rows: 5 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> 
#> chr (3): X1, X2, X3
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

dt %<>%
  janitor::row_to_names(row_number = 1)

dt_f <- ftable(diagnosis ~ gender + age_group, dt)

mantelhaen.test(dt_f)
#> Error in mantelhaen.test(dt_f): 'x' must be a 3-dimensional array

Created on 2023-01-21 with reprex v2.0.2

I do not know, I just followed those links:
https://stackoverflow.com/questions/34750987/3-way-chi-squared-test-in-r

and:

https://stackoverflow.com/questions/44953507/arranging-a-3-dimensional-contingency-table-in-r-in-order-to-run-a-cochran-mante?noredirect=1&lq=1

and on SO it works somehow, but for me it does not.

Kindly have a look at the links I provided, please. I am a bit confused as on SO it works, but when I tried their code, it errors.

I have found this as well, where they used Titanic datase but in a special shape and form:
https://cran.r-project.org/web/packages/samplesizeCMH/vignettes/samplesizeCMH-introduction.html

and CMH test works for them. I do not know how to convert my data like that.

Additionally I reprexed code that I suppose worked in 2017, but does not work for me at present.
https://stackoverflow.com/questions/44953507/arranging-a-3-dimensional-contingency-table-in-r-in-order-to-run-a-cochran-mante?noredirect=1&lq=1

library(stats)

ex.label <- c("A","A","A","A","A","A","A","B","B","B")
ex.status  <- c("+","+","-","+","-","-","-","+","+","-")
ex.diag <- c("X","X","Z","Y","Y","Y","X","Y","Z","Z")
ex.data <- data.frame(ex.label,ex.diag,ex.status)

ex.ftable <- ftable(ex.data)

mtx3D <- aperm(array(t(as.matrix(ex.ftable)),c(2,2,3)),c(2,1,3)) 

stats::mantelhaen.test(mtx3D, exact=FALSE)
#> Error in stats::mantelhaen.test(mtx3D, exact = FALSE): sample size in each stratum must be > 1

Created on 2023-01-22 with reprex v2.0.2

More data is needed to be comparable to S/O

library(magrittr)
dt <- readr::read_fwf("
gender age_group diagnosis
male     young    x
female   child    y
female   adult    x
male     old      z")
#> Rows: 5 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> 
#> chr (3): X1, X2, X3
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# fix the data structure
colnames(dt) <- dt[1,]
dt <- dt[-1,]
# compare our data
ftable(dt)
#>                  diagnosis x y z
#> gender age_group                
#> female adult               1 0 0
#>        child               0 1 0
#>        old                 0 0 0
#>        young               0 0 0
#> male   adult               0 0 0
#>        child               0 0 0
#>        old                 0 0 1
#>        young               1 0 0
# to example in S/O
ex.label <- c("A","A","A","A","A","A","A","B","B","B")
ex.status  <- c("+","+","-","+","-","-","-","+","+","-")
ex.diag <- c("X","X","Z","Y","Y","Y","X","Y","Z","Z")
ex.data <- data.frame(ex.diag, ex.label, ex.status)
( ex.ftable <- ftable(ex.data) )
#>                  ex.status - +
#> ex.diag ex.label              
#> X       A                  1 2
#>         B                  0 0
#> Y       A                  2 1
#>         B                  0 1
#> Z       A                  1 0
#>         B                  1 1
( mtx3D <- aperm(array(t(as.matrix(ex.ftable)),c(2,2,3)),c(2,1,3)) )
#> , , 1
#> 
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    0    0
#> 
#> , , 2
#> 
#>      [,1] [,2]
#> [1,]    2    1
#> [2,]    0    1
#> 
#> , , 3
#> 
#>      [,1] [,2]
#> [1,]    1    0
#> [2,]    1    1
mantelhaen.test(mtx3D, exact=FALSE)
#> 
#>  Mantel-Haenszel chi-squared test with continuity correction
#> 
#> data:  mtx3D
#> Mantel-Haenszel X-squared = 0.23529, df = 1, p-value = 0.6276
#> alternative hypothesis: true common odds ratio is not equal to 1
#> 95 percent confidence interval:
#>  NaN NaN
#> sample estimates:
#> common odds ratio 
#>               Inf
mantelhaen.test(mtx3D, exact=TRUE)
#> 
#>  Exact conditional test of independence in 2 x 2 x k tables
#> 
#> data:  mtx3D
#> S = 4, p-value = 0.5
#> alternative hypothesis: true common odds ratio is not equal to 1
#> 95 percent confidence interval:
#>  0.1340796       Inf
#> sample estimates:
#> common odds ratio 
#>               Inf

Created on 2023-01-22 with reprex v2.0.2

Thank you very much for your help, what did I miss ?

I just think there are two few cases in the original dt after running the S/O, which worked and only apparent difference was the number of cases. Too many empties

Thank you very much indeed,

Additionally I have found this link which provides good explanation of the interpretation of CMH results:
https://stats.stackexchange.com/questions/88315/how-to-interpret-cochran-mantel-haenszel-test

1 Like

Would it be possible to convert object ex.data to array ?

For example with the usage of array() function ?

Yes, that's an intermediate step

ex.label <- c("A","A","A","A","A","A","A","B","B","B")
ex.status  <- c("+","+","-","+","-","-","-","+","+","-")
ex.diag <- c("X","X","Z","Y","Y","Y","X","Y","Z","Z")
ex.data <- data.frame(ex.diag, ex.label, ex.status)
ex.data
#>    ex.diag ex.label ex.status
#> 1        X        A         +
#> 2        X        A         +
#> 3        Z        A         -
#> 4        Y        A         +
#> 5        Y        A         -
#> 6        Y        A         -
#> 7        X        A         -
#> 8        Y        B         +
#> 9        Z        B         +
#> 10       Z        B         -
ex.ftable <- ftable(ex.data)
ex.ftable
#>                  ex.status - +
#> ex.diag ex.label              
#> X       A                  1 2
#>         B                  0 0
#> Y       A                  2 1
#>         B                  0 1
#> Z       A                  1 0
#>         B                  1 1
as.matrix(ex.ftable)
#>                 ex.status
#> ex.diag_ex.label - +
#>              X_A 1 2
#>              X_B 0 0
#>              Y_A 2 1
#>              Y_B 0 1
#>              Z_A 1 0
#>              Z_B 1 1
t(as.matrix(ex.ftable))
#>          ex.diag_ex.label
#> ex.status X_A X_B Y_A Y_B Z_A Z_B
#>         -   1   0   2   0   1   1
#>         +   2   0   1   1   0   1
array(t(as.matrix(ex.ftable)))
#>  [1] 1 2 0 0 2 1 0 1 1 0 1 1
aperm(array(t(as.matrix(ex.ftable)),c(2,2,3)),c(2,1,3))
#> , , 1
#> 
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    0    0
#> 
#> , , 2
#> 
#>      [,1] [,2]
#> [1,]    2    1
#> [2,]    0    1
#> 
#> , , 3
#> 
#>      [,1] [,2]
#> [1,]    1    0
#> [2,]    1    1

Thank you very much for your kind reply.

My last question would be: is there a way to quickly replace zeros in the following dt_ftable ?

As you explained there were too many zeros in my original dt dataframe. I have added some rows but still there are many empties. I was wondering if I could replace them quickly with random numbers (preserving some degree of sense), just for learning purposes, thank you.

``` r
dt_ftable <- structure(c(
  12L, 0L, 0L, 0L, 0L, 0L, 0L, 12L, 0L, 12L, 0L, 0L,
  0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 12L, 0L
), dim = c(
  8L,
  3L
), class = "ftable", row.vars = list(gender = c("female", "male"), 
age_group = c("adult", "child", "old", "young")),
 col.vars = list(
  diagnosis = c("x", "y", "z")
))

dt_ftable
#>                  diagnosis  x  y  z
#> gender age_group                   
#> female adult               12  0  0
#>        child                0 12  0
#>        old                  0  0  0
#>        young                0  0  0
#> male   adult                0  0  0
#>        child                0  0  0
#>        old                  0  0 12
#>        young               12  0  0

Created on 2023-01-23 with reprex v2.0.2

I wouldn't try to dig the object out of the ftable object. One of the advantages of fake data is that it's possible to just pull it out of the air.

gender <- c("male","female")
age_group <- c("child","young","adult","old")
diagnosis <- c("x","y","z")

set.seed(42)
dt <- data.frame(
    gender = sample(gender,100,replace = TRUE),
    age_group = sample(age_group,100,replace = TRUE),
    diagnosis = sample(diagnosis,100, replace = TRUE))
(dt_f <- ftable(diagnosis ~ gender + age_group, dt))
#>                  diagnosis x y z
#> gender age_group                
#> female adult               6 2 3
#>        child               5 4 7
#>        old                 3 6 3
#>        young               6 6 5
#> male   adult               3 2 2
#>        child               6 3 2
#>        old                 3 3 6
#>        young               6 6 2

mtx3D <- aperm(array(t(as.matrix(dt_f)),c(2,2,3)),c(2,1,3)) 

mantelhaen.test(mtx3D, exact=FALSE)
#> 
#>  Mantel-Haenszel chi-squared test with continuity correction
#> 
#> data:  mtx3D
#> Mantel-Haenszel X-squared = 0.0028369, df = 1, p-value = 0.9575
#> alternative hypothesis: true common odds ratio is not equal to 1
#> 95 percent confidence interval:
#>  0.4166994 3.3758304
#> sample estimates:
#> common odds ratio 
#>          1.186047