Hi All,

I've so far found very few options for internal cluster validation using k-modes.

However, I recently found a paper that uses a validation metric known as the SW/SB ratio.

  • SW = standard deviation within the groups.
  • SB = standard deviation between the groups.

Yet, I've not personally seen an example in R using standard k-modes packages such as klaR.

Therefore, I wondered if anyone would be able to show a worked example of the SW/SB ratio in R using a toy data set?

All help would be appreciated.


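library(klaR)  # provides kmodes()

# Toy data: five categorical variables, 120 rows
x1 <- rep(1:3, times = 40)
x2 <- rep(1:3, times = 40)
x3 <- rep(1:3, times = 40)
x4 <- rep(1:3, times = 40)
x5 <- rep(1:3, times = 40)

dat <- data.frame(x1, x2, x3, x4, x5)

set.seed(1)  # kmodes() chooses its initial modes at random
km <- kmodes(dat, 3)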
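For what it's worth, here is a rough sketch of how such a ratio could be computed in base R. It does not reproduce the paper's exact formula; it rests on my own assumptions: distances are simple matching counts (the number of mismatching attributes, as in k-modes), SW is the root mean squared distance of each row to its cluster mode, and SB is the size-weighted root mean squared distance of each cluster mode to the overall mode. The helper names (`stat_mode`, `mismatch`, `sw_sb_ratio`) and the small `toy` data set are made up for illustration.

```r
# Most frequent value of a vector (ties go to the first level in table order).
stat_mode <- function(x) {
  tab <- table(x)
  names(tab)[which.max(tab)]
}

# Simple matching distance: number of attributes on which each row of `data`
# differs from the reference row `ref` (a character vector, one per column).
mismatch <- function(data, ref) {
  m <- as.matrix(data)
  rowSums(m != matrix(ref, nrow(m), ncol(m), byrow = TRUE))
}

# ASSUMED definitions, not taken from the paper:
#   SW = sqrt(mean squared distance of rows to their own cluster mode)
#   SB = sqrt(size-weighted mean squared distance of cluster modes to the
#        overall mode)
# Lower SW/SB suggests tighter, better separated clusters. If every cluster
# mode equals the overall mode, SB is 0 and the ratio is not finite.
sw_sb_ratio <- function(data, cluster) {
  data[] <- lapply(data, as.character)   # treat every column as categorical
  global_mode <- vapply(data, stat_mode, character(1))
  ids <- sort(unique(cluster))

  ss_within  <- numeric(length(ids))
  ss_between <- numeric(length(ids))
  for (i in seq_along(ids)) {
    members <- data[cluster == ids[i], , drop = FALSE]
    mode_k  <- vapply(members, stat_mode, character(1))
    ss_within[i]  <- sum(mismatch(members, mode_k)^2)
    ss_between[i] <- nrow(members) * sum(mode_k != global_mode)^2
  }

  n  <- nrow(data)
  sw <- sqrt(sum(ss_within) / n)
  sb <- sqrt(sum(ss_between) / n)
  c(SW = sw, SB = sb, ratio = sw / sb)
}

# Hypothetical toy data with a hand-assigned clustering, so the result is
# reproducible without running any clustering package.
toy <- data.frame(
  colour = c("red", "red", "red", "blue", "blue", "blue"),
  shape  = c("round", "round", "square", "square", "square", "square"),
  size   = c("small", "small", "small", "large", "large", "small")
)
cl <- c(1, 1, 1, 2, 2, 2)

res <- sw_sb_ratio(toy, cl)
print(round(res, 3))  # SW ~ 0.577, SB ~ 1.581, ratio ~ 0.365
```

With klaR, the cluster assignments from the fit above could be passed in directly, i.e. `sw_sb_ratio(dat, km$cluster)`. The function recomputes the cluster modes from the assignments rather than reading `km$modes`, so it should work with any clustering that returns a label per row.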

The original paper is here.

