Does anyone know how to obtain the distribution of the variables within each cluster, beyond the standard summary statistics?
For example, suppose I obtained three clusters by running pam (partitioning around medoids, from the cluster package) on a data set called trial.
The per-cluster descriptive statistics are given by:
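library(cluster)  # pam()
library(dplyr)    # %>%, mutate(), group_by(), do()

# gower_dist is a dissimilarity object computed beforehand from trial
pam_fit <- pam(gower_dist, diss = TRUE, k = 3)

# attach the cluster labels to the data and summarise each cluster
pam_results <- trial %>%
  mutate(cluster = pam_fit$clustering) %>%
  group_by(cluster) %>%
  do(the_summary = summary(.))

pam_results$the_summary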
This gives me the minimum, maximum, median and the other summary() statistics for each variable within each cluster.
However, how would I plot the distribution of a single variable within a given cluster, say 'age' within cluster 1?
Would appreciate any feedback.
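
For reference, here is a minimal sketch of the kind of plot I have in mind, not a definitive solution. It assumes that gower_dist comes from daisy() with the Gower metric and that ggplot2 is acceptable for plotting. The trial data frame below is a made-up stand-in (the columns age, weight and sex are hypothetical), since the real data is not shown here.

```r
library(cluster)   # daisy(), pam()
library(dplyr)     # %>%, mutate(), filter()
library(ggplot2)   # histograms

# Hypothetical stand-in for the real `trial` data set.
set.seed(1)
trial <- data.frame(
  age    = round(runif(60, 18, 80)),
  weight = rnorm(60, mean = 70, sd = 10),
  sex    = factor(sample(c("F", "M"), 60, replace = TRUE))
)

# Assumption: the dissimilarities were computed with the Gower metric.
gower_dist <- daisy(trial, metric = "gower")
pam_fit <- pam(gower_dist, diss = TRUE, k = 3)

# Attach the cluster labels to the original rows.
clustered <- trial %>%
  mutate(cluster = factor(pam_fit$clustering))

# Distribution of age within cluster 1 only.
p_one <- clustered %>%
  filter(cluster == "1") %>%
  ggplot(aes(x = age)) +
  geom_histogram(bins = 15)

# Distribution of age in every cluster, one panel per cluster.
p_all <- ggplot(clustered, aes(x = age)) +
  geom_histogram(bins = 15) +
  facet_wrap(~ cluster)

print(p_one)
print(p_all)
```

The idea is simply to keep the cluster label as a column next to the raw data, so any variable can be filtered or faceted by cluster. Swapping geom_histogram() for geom_density() would give a smoothed curve instead of bars.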