So far I have found very few options for internal cluster validation with k-modes.
However, I recently found a paper that uses a validation metric known as the SW/SB ratio:
- SW = standard deviation within the groups.
- SB = standard deviation between the groups.
Yet, I've not personally seen an example in R using standard k-modes packages such as klaR.
Therefore, I wondered if anyone would be able to show a worked example of the SW/SB ratio in R using a toy data set?
All help would be appreciated.

    library(klaR)
    x1 = rep(1:3, times = 40)
    x2 = rep(1:3, times = 40)
    x3 = rep(1:3, times = 40)
    x4 = rep(1:3, times = 40)
    x5 = rep(1:3, times = 40)
    dat <- data.frame(x1, x2, x3, x4, x5)
    km <- kmodes(dat, 3)

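For what it's worth, here is a rough base-R sketch of how I currently read the ratio. It rests on assumptions of mine that may not match the paper: since the data are categorical, I measure "deviation" with the simple-matching distance (the number of variables on which two rows differ), take SW as the mean distance of each observation to the mode of its own cluster, and take SB as the mean distance between all pairs of cluster modes. The names `sw_sb_ratio`, `toy` and `grp` are made up for illustration, and the function needs at least two clusters.

```r
# Hypothetical helper: SW/SB ratio under the assumptions stated above.
sw_sb_ratio <- function(dat, cluster) {
  # Treat every variable as categorical by comparing values as strings
  m <- as.matrix(as.data.frame(lapply(dat, as.character),
                               stringsAsFactors = FALSE))
  ids <- sort(unique(cluster))

  # Mode of every variable within each cluster (one row per cluster)
  modes <- do.call(rbind, lapply(ids, function(k) {
    apply(m[cluster == k, , drop = FALSE], 2,
          function(v) names(which.max(table(v))))
  }))

  # SW (assumed): mean number of mismatches between an observation
  # and the mode of the cluster it was assigned to
  own <- modes[match(cluster, ids), , drop = FALSE]
  sw <- mean(rowSums(m != own))

  # SB (assumed): mean number of mismatches between pairs of cluster modes
  pairs <- combn(length(ids), 2)
  sb <- mean(apply(pairs, 2,
                   function(p) sum(modes[p[1], ] != modes[p[2], ])))

  c(SW = sw, SB = sb, ratio = sw / sb)
}

# Toy data as in the question, with a little noise added to x5
toy <- data.frame(x1 = rep(1:3, times = 40),
                  x2 = rep(1:3, times = 40),
                  x3 = rep(1:3, times = 40),
                  x4 = rep(1:3, times = 40),
                  x5 = rep(1:3, times = 40))
toy$x5[1:12] <- 1L

# Known grouping used in place of a fitted clustering
grp <- rep(1:3, times = 40)

res <- sw_sb_ratio(toy, grp)
print(res)
```

With a fitted model, the cluster assignments from klaR could be passed in instead of `grp`, e.g. `sw_sb_ratio(dat, km$cluster)`. Under my assumptions a smaller ratio would mean more compact, better separated clusters, but I am not sure this is how the paper defines SW and SB.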
The original paper is here.