How to perform a two-step Cluster Analysis in R?

I'm trying to run the two-step cluster analysis known from SPSS in R, since I don't have an SPSS license. I came across the 'prcr' package, which has a command for this kind of analysis. Strangely, I have to specify how many clusters I want, whereas the advantage of the two-step analysis is that it determines the optimal number in the first step and then uses it in the second step with k-means.

Does anyone know how I can implement this procedure in R?

Here is my code:

daten.iris <- na.omit(iris)

daten.iris <- as.data.frame(scale(daten.iris[, -5]))  # scale the NA-free data and keep it as a data frame

library(prcr)

twostep.res <- create_profiles_cluster(daten.iris, Sepal.Width, Sepal.Length, Petal.Width, Petal.Length, n_profiles = 3)

I appreciate any help.

I have searched nearly everywhere but couldn't find anything.

Hi @Mingabua,

I don't know an automatic way of doing it, but there is one simple solution.

You can use the NbClust function from the NbClust package to find the best number of clusters. It computes a wide range of validity indices and reports the number of clusters that most of them vote for.
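
A minimal sketch of the NbClust step, assuming the scaled iris measurements from the question and k-means as the clustering method. The range of 2 to 10 clusters and the variable names (x, nb, votes, k_best) are my own choices for illustration:

```r
library(NbClust)

# Scaled numeric columns of iris (Species dropped), as in the question
x <- scale(na.omit(iris)[, -5])

set.seed(123)  # k-means inside NbClust starts from random centers
nb <- NbClust(x, distance = "euclidean", min.nc = 2, max.nc = 10,
              method = "kmeans", index = "all")

# The first row of Best.nc holds the number of clusters proposed by each index
votes <- nb$Best.nc[1, ]
# Keep only real proposals (graphical indices do not propose a number)
votes <- votes[!is.na(votes) & votes >= 2]

# Majority vote across the indices
k_best <- as.integer(names(which.max(table(votes))))
k_best
```

The result depends on the clustering method and the range you allow, so it is worth checking whether the vote is close before relying on it.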

Another option is to use only the elbow method, via factoextra::fviz_nbclust with method = "wss".
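
A short sketch of the elbow plot, again assuming the scaled iris data from the question. The values k.max = 10 and nstart = 25 are just choices I picked for the example:

```r
library(factoextra)

# Scaled numeric columns of iris (Species dropped), as in the question
x <- scale(na.omit(iris)[, -5])

set.seed(123)  # k-means uses random starts
# Total within-cluster sum of squares for k = 1..10; nstart is passed on to kmeans
p <- fviz_nbclust(x, FUNcluster = kmeans, method = "wss", k.max = 10, nstart = 25)
print(p)
```

Look for the value of k where the curve bends and flattens out. Reading the elbow is a judgment call, so it is less automatic than the NbClust vote.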

Then use stats::kmeans to run the cluster analysis, specifying that optimal number.
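
A minimal sketch of that final step with base R only. Here k = 3 is an assumption taken from the n_profiles = 3 in the question, so replace it with whatever number the previous step suggests:

```r
# Scaled numeric columns of iris (Species dropped), as in the question
x <- scale(na.omit(iris)[, -5])

k <- 3  # assumption: the number of clusters chosen in the previous step

set.seed(123)  # make the random starts reproducible
km <- kmeans(x, centers = k, nstart = 25)

km$size  # number of observations in each cluster
# Cross-tabulate clusters against the known species as a sanity check
table(cluster = km$cluster, species = iris$Species)
```

Using several random starts (nstart) makes the solution less dependent on the initial centers.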

This is a tutorial I found