I have a data frame with columns 'x' and 'y' holding the x/y coordinates of a scatterplot I've made using ggplot2. I'm looking for some way to ask, "how many clusters exist here?" I understand that some user input may be required to define what counts as a 'cluster'.
I have had some success using Seurat, because it contains a function to label clusters. However, it works more like finding the clusters that correspond to a vector of labels provided by the user (e.g., I provide 5 unique labels, so it goes and finds 5 clusters).
The LabelClusters function is very useful for labeling clusters starting solely from x/y coordinates:
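    library(umap)
    library(Seurat)
    library(ggplot2)  # needed for ggplot(), aes(), geom_point()

    # Project the four numeric iris columns to 2D with UMAP
    my_umap <- umap(iris[, c(1:4)])
    my_umap <- as.data.frame(my_umap$layout)
    my_umap$id <- iris$Species
    colnames(my_umap) <- c("x", "y", "id")

    # Scatterplot colored by the known labels, then label each group
    p <- ggplot(my_umap, aes(x = x, y = y, color = id)) + geom_point()
    LabelClusters(plot = p, id = 'id', color = "black")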
However, I need to detect the total number of clusters in these data without providing labels. By this I mean first detecting how many clusters exist. In this example, maybe we would see 5 clusters instead of 3.
Can this be achieved in some way?
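
To make the goal concrete, here is a minimal sketch of one heuristic that might apply: run k-means for a range of candidate cluster counts and keep the count with the highest average silhouette width. This is only an illustration under stated assumptions, not a recommended answer: the data frame `df` is a hypothetical stand-in built from three synthetic blobs (not the UMAP output above), the candidate range `2:8` is an arbitrary choice, and k-means assumes roughly round clusters, which UMAP layouts do not always produce. It uses `silhouette()` from the `cluster` package.

```r
library(cluster)  # provides silhouette()

set.seed(42)

# Hypothetical stand-in for the x/y data frame: three synthetic blobs
blob_centers <- data.frame(x = c(0, 6, 3), y = c(0, 0, 6))
df <- do.call(rbind, lapply(seq_len(nrow(blob_centers)), function(i) {
  data.frame(x = rnorm(50, mean = blob_centers$x[i], sd = 0.6),
             y = rnorm(50, mean = blob_centers$y[i], sd = 0.6))
}))

coords <- df[, c("x", "y")]
d <- dist(coords)        # pairwise Euclidean distances
candidate_k <- 2:8       # assumed search range; adjust to the data

# Average silhouette width for each candidate number of clusters
avg_sil <- sapply(candidate_k, function(k) {
  km <- kmeans(coords, centers = k, nstart = 25)
  mean(silhouette(km$cluster, d)[, "sil_width"])
})

# Keep the k with the highest average silhouette width
best_k <- candidate_k[which.max(avg_sil)]

# Attach the inferred cluster labels so they can be plotted or labeled
df$cluster <- factor(kmeans(coords, centers = best_k, nstart = 25)$cluster)
```

The resulting `df$cluster` column could then be used in place of `id` in the plot above, so that the labels come from the data rather than from a user-supplied vector. The silhouette criterion is one of several possible choices, and the "right" number of clusters still depends on what one wants to call a cluster.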