R: circlize_dendrogram(): ERROR - Not all labels are unique. Therefore, we pad the labels with a running number, so to be able to produce the plot

Issue

I want to produce a "fan" shaped dendrogram using the function circlize_dendrogram() from the package dendextend (which draws the plot with circlize). I would like to cluster the branches and tips by dolphin whistle subtypes (see tip labels below) measured from a spectrogram and then group them by the Country vector (three classes: 'Spain', 'Holland', and 'France') that they belong to. Hence, I would like to colour the branches, tips, and tip labels (see below) the same colour as the group/cluster they belong to (e.g. one colour per country).

However, I got this warning message (see below) because the function circlize_dendrogram() has detected duplicates among the whistle-type codes (the desired tip labels) and, in response, has padded the tip labels with a running number (see diagram 1). In my study, many whistle types (i.e. tip label codes) have been detected in more than one country, which is why the function considers the tip labels not to be unique.

Error Message

Warning message:
In circlize_dendrogram(dend_rainbow, dend_track_height = 0.8, labels_track_height = 0.1,  :
  Not all labels are unique. Therefore, we pad the labels with a running number, so to be able to produce the plot.

I am unsure how to specify in the R code (see below) that the colour of the branches, tips, and tip labels should reflect the country (the vector 'Country') that each whistle subtype belongs to, or how to prevent the running numbers from being added to the tip labels, since the tip labels are unique within each Country.

In addition, the output of my R code in diagram 1 cannot be correct: there cannot be more than one tip label for the same whistle type within one class of the vector 'Country' (see data below), because each row in the data frame holds a $Whistle_Type_Sub that is unique within its country. For instance, you cannot have two 'EA' types or two 'DD1' types for France. There should be one tip label for each whistle type code per Country, i.e. EA, DD1, F, AC etc.

Would anyone be able to help solve this error? I would be deeply appreciative.

Many thanks

Data frame structure

'data.frame':   62 obs. of  3 variables:
 $ Country         : Factor w/ 3 levels "Spain","France",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Whistle_Type_Sub: Factor w/ 28 levels "A","AA","AA1",..: 1 24 25 11 2 18 7 5 9 13 ...
 $ N               : int  25 64 31 5 4 5 3 10 2 1 ...

Desired labels

mylabels<- c("A", "E", "EA", "BE", "AA","D", "B", "AD", "BC", "CA", "AA1", "DD1", "ED", "DC", "C", "AC", "ADC", "DE", 
             "EA",  "A", "E", "DE", "F", "BE", "D", "EE1", "B", "CA", "DB",  "BB", "AA1", "ED", "AD",  "DD1", "AA", 
             "A","ED", "E","DD", "DD1","CDC", "C", "AC", "D",  "F", "EE1","BCB", "DC", "ADC", "DE",  "CA", "AA",  "BE",  
             "CBC", "B", "EA",  "AA1", "AD",  "BB",  "CD", "CB",  "DB")
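
The duplication in the label vector above can be checked directly. The following is a minimal sketch in base R, assuming a small subset (the first three labels for each country, in the row order of the data); the names labels_subset, country_subset and combined are hypothetical and not part of the original script. It shows that the codes repeat across countries but become unique once the country is pasted onto them.

```r
# First three tip labels per country, taken from the label vector above
labels_subset  <- c("A", "E", "EA",   # Spain
                    "EA", "A", "E",   # France
                    "A", "ED", "E")   # Holland
country_subset <- rep(c("Spain", "France", "Holland"), each = 3)

# The bare whistle-type codes repeat across countries
has_dupes_alone <- anyDuplicated(labels_subset) > 0

# Pasting the country onto each code gives one distinct label per row
combined <- paste(country_subset, labels_subset, sep = "_")
has_dupes_combined <- anyDuplicated(combined) > 0

print(has_dupes_alone)     # TRUE
print(has_dupes_combined)  # FALSE
```

Whether labels such as "Spain_A" are acceptable on the plot is a presentation choice; the point is only that uniqueness has to hold across the whole label vector, not within each country.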

R Code

library(cluster) #agnes function
library(usedist) #change label names
library(ape) #create fan data frame
library('dendextend')
library(RColorBrewer) #Customise the colour palette
library(phytools)

#DENDROGRAM 
Cluster.Country.Dist<-dist(cbind(Sub_Whistle_Count[1:2], 2*(as.numeric(Sub_Whistle_Count$Whistle_Type_Sub)-2)))
Cluster.Country.Dist

##tip labels
mylabels<- c("A", "E", "EA", "BE", "AA","D", "B", "AD", "BC", "CA", "AA1", "DD1", "ED", "DC", "C", "AC", "ADC", "DE", 
             "EA",  "A", "E", "DE", "F", "BE", "D", "EE1", "B", "CA", "DB",  "BB", "AA1", "ED", "AD",  "DD1", "AA", 
             "A","ED", "E","DD", "DD1","CDC", "C", "AC", "D",  "F", "EE1","BCB", "DC", "ADC", "DE",  "CA", "AA",  "BE",  
             "CBC", "B", "EA",  "AA1", "AD",  "BB",  "CD", "CB",  "DB")


#Change the rows and column label names in the dist object
NewLabels<-dist_setNames(Cluster.Country.Dist, mylabels[1:62])
NewLabels

#Hierarchical Clustering algorithm on the dataset using hclust() 
Cluster.Country.hcl<-hclust(NewLabels)
Cluster.Country.hcl

#Create a dendrogram object
dend = as.dendrogram(Cluster.Country.hcl) 
dend 

#Connecting label color with the country label
geo <- factor(c("Spain", "France", "Holland"))
(mycol<-brewer.pal(3, "Dark2")[geo])

#State the colour of the tip and branch labels 
#k=3; three countries' three clades
dend_Final <- dend %>% 
                 color_branches(k=3, col=mycol) %>% 
                 color_labels(k=3, col=mycol)


#Open a new graphics window
dev.new()

#Set all four plot margins to zero
par(mar = rep(0, 4))

#Produce the "fan" type dendrogram
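#dend_rainbow (the object named in the warning above) is not defined in this script; dend_Final is the coloured dendrogram built above
circlize_dendrogram(dend_Final, dend_track_height = 0.8, labels_track_height = 0.1, labels = TRUE)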
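
For reference, one possible way to get unique tip labels and one label colour per country is sketched below. It is a sketch under stated assumptions, not a confirmed fix: the toy data frame, the clustering on N alone, and the names toy, tip_labels, m, pal and leaf_country are all hypothetical. It relies on order.dendrogram() from base R and the `labels_colors<-` replacement function from dendextend.

```r
library(dendextend)

# Hypothetical toy data: two whistle types recorded in each of three countries
toy <- data.frame(
  Country = c("Spain", "Spain", "France", "France", "Holland", "Holland"),
  Whistle_Type_Sub = c("A", "EA", "A", "EA", "A", "EA"),
  N = c(25, 31, 54, 26, 23, 9)
)

# Unique tip labels: country and whistle type pasted together
tip_labels <- paste(toy$Country, toy$Whistle_Type_Sub, sep = "_")

# One-column matrix of counts; its row names become the dendrogram labels
# (assumption: clustering on N only, purely for illustration)
m <- matrix(toy$N, ncol = 1, dimnames = list(tip_labels, "N"))
dend <- as.dendrogram(hclust(dist(m)))

# One colour per country (first three Dark2 hex codes written out)
pal <- c(Spain = "#1B9E77", France = "#D95F02", Holland = "#7570B3")

# order.dendrogram() maps each leaf, left to right, back to its row in toy
leaf_country <- as.character(toy$Country[order.dendrogram(dend)])
labels_colors(dend) <- unname(pal[leaf_country])

# Plotting is left commented out because it needs the circlize package:
# circlize_dendrogram(dend, dend_track_height = 0.8, labels_track_height = 0.1)
```

Because the labels are now unique, the padding with running numbers should not be needed. Note that color_branches(k = 3) colours branches by where the tree is cut into three clusters, which matches the countries only if the clustering happens to separate them, so colouring the labels by country, as here, is a separate step.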

Data

structure(list(Country = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Spain", "France", 
"Holland"), class = "factor"), Whistle_Type_Sub = structure(c(1L, 
24L, 25L, 11L, 2L, 18L, 7L, 5L, 9L, 13L, 3L, 22L, 26L, 20L, 12L, 
4L, 6L, 23L, 25L, 1L, 24L, 23L, 28L, 11L, 18L, 27L, 7L, 13L, 
19L, 8L, 3L, 26L, 5L, 22L, 2L, 1L, 26L, 24L, 21L, 22L, 17L, 12L, 
4L, 18L, 28L, 27L, 10L, 20L, 6L, 23L, 13L, 2L, 11L, 15L, 7L, 
25L, 3L, 5L, 8L, 16L, 14L, 19L), .Label = c("A", "AA", "AA1", 
"AC", "AD", "ADC", "B", "BB", "BC", "BCB", "BE", "C", "CA", "CB", 
"CBC", "CD", "CDC", "D", "DB", "DC", "DD", "DD1", "DE", "E", 
"EA", "ED", "EE1", "F"), class = "factor"), N = c(25L, 64L, 31L, 
5L, 4L, 5L, 3L, 10L, 2L, 1L, 2L, 3L, 2L, 1L, 1L, 3L, 2L, 4L, 
26L, 54L, 20L, 10L, 18L, 7L, 7L, 10L, 2L, 3L, 2L, 2L, 2L, 1L, 
1L, 1L, 1L, 23L, 1L, 13L, 10L, 5L, 4L, 8L, 9L, 9L, 20L, 1L, 1L, 
9L, 1L, 9L, 2L, 6L, 3L, 1L, 10L, 9L, 2L, 3L, 1L, 2L, 2L, 3L)), row.names = c(NA, 
-62L), class = "data.frame")

Diagram 1: Output from R-code

Diagram 2: Desired tip label output without running numbers, although the branches, tips and tip labels are not the same colours
