Hi All,
I've so far found very few options for internal cluster validation using k-modes.
However, I recently found a paper that uses a validation metric known as the SW/SB ratio:
- SW = standard deviation within groups (clusters).
- SB = standard deviation between groups (clusters).
Yet, I've not personally seen an example in R using standard k-modes packages such as klaR.
Therefore, I wondered if anyone would be able to show a worked example of the SW/SB ratio in R using a toy data set?
All help would be appreciated.
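library(klaR)
# kmodes() picks its initial modes at random, so fix the seed for reproducibility
set.seed(123)
# Toy data: five categorical variables, each cycling through levels 1, 2, 3
x1 <- rep(1:3, times = 40)
x2 <- rep(1:3, times = 40)
x3 <- rep(1:3, times = 40)
x4 <- rep(1:3, times = 40)
x5 <- rep(1:3, times = 40)
dat <- data.frame(x1, x2, x3, x4, x5)
# Fit k-modes with 3 clusters; cluster labels are stored in km$cluster
km <- kmodes(dat, 3)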
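To make the question concrete, here is a minimal base-R sketch of one way the ratio might be computed for categorical data. It is not taken from the paper, and everything in it is an assumption on my part: distances are simple-matching (Hamming) counts, SW is the average over clusters of the root-mean-square distance from each member to its cluster mode, and SB is the square root of the summed squared distances from each cluster mode to the overall mode, divided by K - 1. The names `col_mode` and `sw_sb_ratio` are hypothetical helpers, and the paper may well define the terms differently.

```r
# Most frequent value of a vector, returned as a string (first level wins ties).
col_mode <- function(v) {
  tab <- table(v)
  names(tab)[which.max(tab)]
}

# Hypothetical helper: SW/SB ratio for categorical data.
# ASSUMPTIONS (not from the paper): Hamming distance, modes as cluster centres.
sw_sb_ratio <- function(df, cluster) {
  # Compare all values as strings so every column is treated as categorical.
  m <- as.matrix(data.frame(lapply(df, as.character), stringsAsFactors = FALSE))
  ids <- sort(unique(cluster))
  stopifnot(length(cluster) == nrow(m), length(ids) >= 2)

  global_mode <- apply(m, 2, col_mode)
  within_sd  <- numeric(length(ids))
  between_sq <- numeric(length(ids))

  for (i in seq_along(ids)) {
    rows   <- m[cluster == ids[i], , drop = FALSE]
    mode_k <- apply(rows, 2, col_mode)
    # Number of mismatching attributes between each member and its cluster mode.
    d <- rowSums(rows != matrix(mode_k, nrow(rows), ncol(rows), byrow = TRUE))
    within_sd[i]  <- sqrt(mean(d^2))
    # Squared number of mismatches between the cluster mode and the overall mode.
    between_sq[i] <- sum(mode_k != global_mode)^2
  }

  sw <- mean(within_sd)
  sb <- sqrt(sum(between_sq) / (length(ids) - 1))
  # If sb is 0 the ratio is Inf or NaN, i.e. the cluster modes do not differ.
  c(SW = sw, SB = sb, ratio = sw / sb)
}

# Small hand-made example with known cluster labels.
toy <- data.frame(
  a = c("x", "x", "x", "y", "y", "y"),
  b = c("p", "p", "q", "r", "r", "r")
)
cl  <- c(1, 1, 1, 2, 2, 2)
res <- sw_sb_ratio(toy, cl)
print(res)
```

With a klaR fit, the same function could presumably be called as `sw_sb_ratio(dat, km$cluster)`. Under this definition a smaller ratio indicates tighter, better-separated clusters, but I would like to know whether this matches what the paper intends.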
The original paper is here.
https://iopscience.iop.org/article/10.1088/1757-899X/1087/1/012085/pdf