Intra-Cluster Analysis

Dear All,

I wanted to ask if anyone knows how to obtain the distribution of the variables within each cluster, beyond the standard summary statistics.

For example, suppose I obtained three clusters from an unsupervised pam routine on a data set called trial.

The descriptive statistics would be given by the following commands:

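library(cluster)  # pam()
library(dplyr)    # %>%, mutate(), group_by(), do()

# gower_dist is assumed to be a Gower dissimilarity, e.g. daisy(trial, metric = "gower")
pam_fit <- pam(gower_dist, diss = TRUE, k = 3)
pam_results <- trial %>%
  mutate(cluster = pam_fit$clustering) %>%
  group_by(cluster) %>%
  do(the_summary = summary(.))
pam_results$the_summary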

Here, I would obtain the max/min/median of each variable within each cluster.

However, if I wanted to plot the distribution of information within a certain cluster, say 'age' within cluster 1, then would anybody know how to do this?

Would appreciate any feedback.

Best

If I read your question correctly, another option would be to plot the distributions directly. ggplot2's histogram and density geoms, and the empirical CDF (stat_ecdf), could be a nice way to see things.
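
A minimal sketch of that idea, in case it helps. The trial data frame below is a made-up stand-in (the real data set is not shown in this thread), and computing gower_dist with daisy() is my assumption about how it was built. The column names age and sex are only illustrative.

```r
library(cluster)   # daisy(), pam()
library(dplyr)     # %>%, mutate(), filter()
library(ggplot2)   # histogram, density and ECDF layers

# Hypothetical stand-in for `trial`; replace with the real data set.
set.seed(1)
trial <- data.frame(
  age = round(runif(60, 18, 80)),
  sex = factor(sample(c("F", "M"), 60, replace = TRUE))
)

# Assumption: gower_dist is a Gower dissimilarity computed with daisy().
gower_dist <- daisy(trial, metric = "gower")
pam_fit <- pam(gower_dist, diss = TRUE, k = 3)

# Attach the cluster labels as a factor so ggplot2 treats them as groups.
clustered <- trial %>%
  mutate(cluster = factor(pam_fit$clustering))

# Distribution of age within cluster 1 only.
cluster1 <- clustered %>% filter(cluster == 1)
p_hist <- ggplot(cluster1, aes(x = age)) +
  geom_histogram(bins = 15)

# Density and empirical CDF of age for all clusters, for comparison.
p_dens <- ggplot(clustered, aes(x = age, fill = cluster)) +
  geom_density(alpha = 0.4)
p_ecdf <- ggplot(clustered, aes(x = age, colour = cluster)) +
  stat_ecdf()

print(p_hist)
print(p_dens)
print(p_ecdf)
```

Filtering first gives a plot for a single cluster. Mapping cluster to fill or colour instead puts all clusters on one plot, which tends to make differences between them easier to spot.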

Thank you, Curtis. I appreciate your feedback and will follow up on your links :)
