Adding colors to clustering plot

Hello fellow R users,

I am quite new to R, and I want to change the colors of a clustering plot.

Below is a sample of my data. I have 3 dimensions (each with values between 0 and 2). I want 3 clusters (k-means method) based on these 3 dimensions, as in the first plot of my example.

Now, I would like to assign a color of my choosing to each of these clusters. I know that with the plot() function I am supposed to pass the colors in the col= argument, but in the example I followed to build my plot, col= is used to tell R that I want separate colors for separate clusters.

I have tried a few things with the fpc package that are not satisfactory, and the tutorials I found only describe how to plot clusters from 2 dimensions.

(Sorry, I can't post images; you have to run the code to see what I am talking about.) Would anyone have a way to choose the colors on my clustering plot?

``` r
# LIBRARIES
library("dplyr")
library("ggplot2")
library("fpc")

# DATAFRAME
data <- tibble::tribble(
  ~Ws, ~Biais, ~Alpha,
  1.591708569, 0.461244729, 1.777250762,
  1.958865971, 0.543320189, 1.436227231,
  1.990803599, 0.521928246, 0.982365297,
  1.990218758, 0.578461771, 1.24661541,
  1.637449971, 0.512539274, 1.883289713,
  0.927799193, 0.50656114, 1.104014063,
  1.591708569, 0.461244729, 1.277250762,
  1.958865971, 0.443329489, 1.036227231,
  1.997803599, 0.521928246, 0.982465297,
  1.990218758, 0.524471771, 1.24661541,
  1.237489971, 0.512539274, 1.883226713,
  0.905599193, 0.50346114, 1.106014063,
  1.578708569, 0.461244729, 1.377250762,
  1.958865971, 0.543329489, 1.036227231,
  1.997803599, 0.497218246, 0.982465297,
  1.990218758, 0.524471771, 1.24661541,
  1.637489971, 0.512539274, 1.383226713,
  0.905599193, 0.50346114, 1.106014063
)


# CODE

k <- 3 # number of clusters
fitK <- kmeans(data, k)
fitK
str(fitK)
fitK$cluster
clusterlist <- fitK$cluster
dataclust <- cbind(data, clusterlist)

plot(data, col = fitK$cluster)


# TRIED SOLUTIONS

# with fpc package, doesn't show the 3 dimensions, makes weird plots
# solution 1
plotcluster(data,
  fitK$cluster,
  pch = fitK$cluster,
  clvecd = c(1, 2, 3),
  col = c("green", "purple", "red")
)

# solution 2
vcol <- c("green", "purple", "red")
plotcluster(data, fitK$cluster, col = vcol[fitK$cluster])
```

<sup>Created on 2021-07-01 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)</sup>

The plot your code produces doesn't seem to be the format you are going for, does it? This may not be exactly what you asked for, but you could make a 3D scatterplot and color the points by their cluster.

``` r
library(plotly)
plot_ly(
  x = dataclust$Ws, y = dataclust$Biais, z = dataclust$Alpha,
  type = "scatter3d", mode = "markers",
  color = dataclust$clusterlist
)
```
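If you would rather keep the base `plot()` view and pick the colors yourself, one option is to index a palette vector by the cluster labels, the same trick as in your "solution 2" but passed to `plot()`. This is only a minimal sketch: the `toy` data frame is made up for illustration, and `vcol` and `point_cols` are just names I chose.

``` r
# Sketch: map each k-means cluster label to a hand-picked colour.
# The data below are made up for illustration; swap in your own data frame.
set.seed(42)  # kmeans() starts from random centres, so fix the seed
toy <- data.frame(
  Ws    = runif(18, 0, 2),
  Biais = runif(18, 0, 2),
  Alpha = runif(18, 0, 2)
)

fit <- kmeans(toy, centers = 3)

# One colour per cluster label (1, 2, 3), in that order
vcol <- c("green", "purple", "red")

# Indexing the palette by the label vector gives one colour per row
point_cols <- vcol[fit$cluster]

# plot() on a data frame draws the scatterplot matrix of all 3 dimensions
plot(toy, col = point_cols, pch = 19)
```

Keep in mind that k-means numbers its clusters arbitrarily, so without `set.seed()` the same group of points may get a different label (and therefore a different color) from one run to the next. For the plotly version, I believe the analogous approach is to pass `color = factor(dataclust$clusterlist)` together with `colors = vcol`, but I have not checked that against your data.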

Thank you, I went with this elegant solution. Thanks for your time and your answer!
