Intra-Cluster Analysis

Dear All,

I wanted to ask whether anyone knows how to look at the distribution of a variable within each cluster, beyond the standard summary statistics.

For example, suppose I obtained three clusters from an unsupervised pam routine on a data set called trial.

The descriptive statistics would be given by the command:

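library(cluster)  # pam()
library(dplyr)    # %>%, mutate(), group_by(), do()

# gower_dist is assumed to be a precomputed dissimilarity object for trial
pam_fit <- pam(gower_dist, diss = TRUE, k = 3)
pam_results <- trial %>%
  mutate(cluster = pam_fit$clustering) %>%
  group_by(cluster) %>%
  do(the_summary = summary(.))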

Here, I would obtain the minimum, quartiles, median, mean and maximum of each numeric variable within each cluster.

However, if I wanted to plot the distribution of a variable within a certain cluster, say 'age' within cluster 1, would anybody know how to do this?

Would appreciate any feedback.


If I read your question correctly, another option would be to plot the distributions directly. ggplot2's histogram and density geoms, or an empirical CDF (stat_ecdf), could be a nice way to see how a variable such as age is spread within each cluster.
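
In case it helps, here is a minimal sketch of what those plots could look like. Your trial data isn't shown, so the data frame below is a made-up stand-in (the columns age, income and sex are my assumptions), and I'm assuming gower_dist came from cluster::daisy() with metric = "gower". The plot object names are only for illustration.

```r
library(cluster)  # daisy(), pam()
library(dplyr)    # %>%, mutate(), filter()
library(ggplot2)  # plotting

# Hypothetical stand-in for `trial`; replace with the real data set.
set.seed(1)
trial <- data.frame(
  age    = c(rnorm(30, 30, 4), rnorm(30, 50, 5), rnorm(30, 70, 4)),
  income = c(rnorm(30, 20, 3), rnorm(30, 40, 4), rnorm(30, 30, 3)),
  sex    = factor(sample(c("F", "M"), 90, replace = TRUE))
)

# Assumption: the dissimilarities are Gower distances computed with daisy().
gower_dist <- daisy(trial, metric = "gower")
pam_fit <- pam(gower_dist, diss = TRUE, k = 3)

# Attach the cluster labels as a factor so ggplot2 treats them as groups.
clustered <- trial %>%
  mutate(cluster = factor(pam_fit$clustering))

# Histogram of age within cluster 1 only.
p_hist <- clustered %>%
  filter(cluster == "1") %>%
  ggplot(aes(x = age)) +
  geom_histogram(bins = 15)

# Overlaid density curves of age, one per cluster.
p_dens <- ggplot(clustered, aes(x = age, fill = cluster)) +
  geom_density(alpha = 0.4)

# Empirical CDF of age, one curve per cluster.
p_ecdf <- ggplot(clustered, aes(x = age, colour = cluster)) +
  stat_ecdf()
```

Printing p_hist, p_dens or p_ecdf draws the corresponding plot. Adding facet_wrap(~ cluster) to the histogram (without the filter step) would show all three clusters side by side instead of just cluster 1.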

Thank you, Curtis. I appreciate your feedback and will follow up on your links :)