Issue
I want to produce a "fan" shaped dendrogram using the function circlize_dendrogram() from the package dendextend (which draws the plot with circlize). I would like to cluster the branches and tips by dolphin whistle subtypes (see tip labels below) measured from a spectrogram and then group them by the Country vector (three classes: 'Spain', 'Holland', and 'France') to which they belong. Hence, I would like to colour the branches, tips, and tip labels (see below) the same colour as the group/cluster they belong to (e.g. one colour per country).
However, I got this warning message (see below) because circlize_dendrogram() detected duplicates among the whistle-type codes (my desired tip labels) and, in response, padded the tip labels with a running number (see diagram 1). In my study, many whistle types (i.e. tip label codes) were detected in more than one country, which is why the function considers the tip labels as not being unique.
Error Message
Warning message:
In circlize_dendrogram(dend_rainbow, dend_track_height = 0.8, labels_track_height = 0.1, :
Not all labels are unique. Therefore, we pad the labels with a running number, so to be able to produce the plot.
I am unsure how to specify in the R code (see below) that the colour of the branches, tips, and tip labels should reflect the Country each whistle subtype belongs to, and how to prevent the running numbers being appended to the tip labels, since the tip labels are unique within each Country.
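To make the goal concrete, here is a minimal sketch of the direction I have in mind, on toy data rather than my real data frame. The toy values, the composite "Country_Type" row names, the choice to cluster on N only, and the hex colours are all assumptions made for illustration. It relies on order.dendrogram() returning the original row index of each tip in plotting order, and on the dendextend assignment function labels_colors<-.

```r
library(dendextend)

# Hypothetical toy data with the same column names as Sub_Whistle_Count
toy <- data.frame(
  Country = factor(c("Spain", "Spain", "France", "France", "Holland", "Holland")),
  Whistle_Type_Sub = c("A", "EA", "A", "EA", "A", "F"),
  N = c(25, 31, 54, 26, 23, 20)
)

# Composite labels are unique across rows, unlike the bare whistle codes
rownames(toy) <- paste(toy$Country, toy$Whistle_Type_Sub, sep = "_")

# Cluster on N only (an assumption made for this sketch)
dend <- as.dendrogram(hclust(dist(toy["N"])))

# One colour per country, looked up in the order the tips are drawn
country_cols <- c(Spain = "#1B9E77", France = "#D95F02", Holland = "#7570B3")
tip_country <- as.character(toy$Country[order.dendrogram(dend)])
labels_colors(dend) <- unname(country_cols[tip_country])
```

With unique labels such as "Spain_A", I would expect circlize_dendrogram(dend) not to pad the labels, but the tips would then no longer show the bare whistle codes, which is what I am hoping to keep.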
In addition, the output of my R code in diagram 1 cannot be correct: there cannot be more than one tip label of the same whistle type within one class of the vector 'Country' (see data below), because each row in the data frame contains a $Whistle_Type_Sub value that is unique within its Country. For instance, there cannot be two 'EA' types or two 'DD1' types for France. There should be one tip label for each whistle type code per Country, i.e. EA, DD1, F, AC etc.
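The uniqueness claim above can be checked with base R's anyDuplicated(). The sketch below uses a small hypothetical data frame with the same column names as mine; the values are made up for illustration.

```r
# Hypothetical stand-in for Sub_Whistle_Count (same column names, toy values)
toy <- data.frame(
  Country = c("Spain", "Spain", "France", "France", "Holland"),
  Whistle_Type_Sub = c("A", "EA", "A", "EA", "A")
)

# Whistle codes repeat when countries are ignored ...
dup_overall <- anyDuplicated(toy$Whistle_Type_Sub) > 0

# ... but no (Country, Whistle_Type_Sub) pair occurs twice
dup_within_country <- anyDuplicated(toy[c("Country", "Whistle_Type_Sub")]) > 0
```

Here dup_overall is TRUE and dup_within_country is FALSE, which is the situation I describe: the labels only look duplicated once the Country information is dropped.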
Would anyone be able to help solve this error? I would be deeply appreciative.
Many thanks
Data frame structure
'data.frame': 62 obs. of 3 variables:
$ Country : Factor w/ 3 levels "Spain","France",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Whistle_Type_Sub: Factor w/ 28 levels "A","AA","AA1",..: 1 24 25 11 2 18 7 5 9 13 ...
$ N : int 25 64 31 5 4 5 3 10 2 1 ...
Desired labels
mylabels<- c("A", "E", "EA", "BE", "AA","D", "B", "AD", "BC", "CA", "AA1", "DD1", "ED", "DC", "C", "AC", "ADC", "DE",
"EA", "A", "E", "DE", "F", "BE", "D", "EE1", "B", "CA", "DB", "BB", "AA1", "ED", "AD", "DD1", "AA",
"A","ED", "E","DD", "DD1","CDC", "C", "AC", "D", "F", "EE1","BCB", "DC", "ADC", "DE", "CA", "AA", "BE",
"CBC", "B", "EA", "AA1", "AD", "BB", "CD", "CB", "DB")
R Code
library(cluster) #agnes function
library(usedist) #change label names
library(ape) #create fan data frame
library('dendextend')
library(RColorBrewer) #Customise the colour palette
library(phytools)
#DENDROGRAM
Cluster.Country.Dist<-dist(cbind(Sub_Whistle_Count[1:2], 2*(as.numeric(Sub_Whistle_Count$Whistle_Type_Sub)-2)))
Cluster.Country.Dist
##tip labels
mylabels<- c("A", "E", "EA", "BE", "AA","D", "B", "AD", "BC", "CA", "AA1", "DD1", "ED", "DC", "C", "AC", "ADC", "DE",
"EA", "A", "E", "DE", "F", "BE", "D", "EE1", "B", "CA", "DB", "BB", "AA1", "ED", "AD", "DD1", "AA",
"A","ED", "E","DD", "DD1","CDC", "C", "AC", "D", "F", "EE1","BCB", "DC", "ADC", "DE", "CA", "AA", "BE",
"CBC", "B", "EA", "AA1", "AD", "BB", "CD", "CB", "DB")
#Change the rows and column label names in the dist object
NewLabels<-dist_setNames(Cluster.Country.Dist, mylabels[1:62])
NewLabels
#Hierarchical Clustering algorithm on the dataset using hclust()
Cluster.Country.hcl<-hclust(NewLabels)
Cluster.Country.hcl
#Create a dendrogram object
dend = as.dendrogram(Cluster.Country.hcl)
dend
#Connecting label color with the country label
geo <- factor(c("Spain", "France", "Holland"))
(mycol<-brewer.pal(3, "Dark2")[geo])
#State the colour of the tip and branch labels
#k=3; three countries' three clades
dend_Final <- dend %>%
color_branches(k=3, col=mycol) %>%
color_labels(k=3, col=mycol)
#Open a new graphics window
dev.new()
#State the dimensions of the graphics window
par(mar = rep(0, 4))
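#Produce the "fan" type dendrogram
#dend_Final is the dendrogram built above; the original call used dend_rainbow, which is not defined in this script
circlize_dendrogram(dend_Final, dend_track_height = 0.8, labels_track_height = 0.1, labels = TRUE)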
Data
structure(list(Country = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Spain", "France",
"Holland"), class = "factor"), Whistle_Type_Sub = structure(c(1L,
24L, 25L, 11L, 2L, 18L, 7L, 5L, 9L, 13L, 3L, 22L, 26L, 20L, 12L,
4L, 6L, 23L, 25L, 1L, 24L, 23L, 28L, 11L, 18L, 27L, 7L, 13L,
19L, 8L, 3L, 26L, 5L, 22L, 2L, 1L, 26L, 24L, 21L, 22L, 17L, 12L,
4L, 18L, 28L, 27L, 10L, 20L, 6L, 23L, 13L, 2L, 11L, 15L, 7L,
25L, 3L, 5L, 8L, 16L, 14L, 19L), .Label = c("A", "AA", "AA1",
"AC", "AD", "ADC", "B", "BB", "BC", "BCB", "BE", "C", "CA", "CB",
"CBC", "CD", "CDC", "D", "DB", "DC", "DD", "DD1", "DE", "E",
"EA", "ED", "EE1", "F"), class = "factor"), N = c(25L, 64L, 31L,
5L, 4L, 5L, 3L, 10L, 2L, 1L, 2L, 3L, 2L, 1L, 1L, 3L, 2L, 4L,
26L, 54L, 20L, 10L, 18L, 7L, 7L, 10L, 2L, 3L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 23L, 1L, 13L, 10L, 5L, 4L, 8L, 9L, 9L, 20L, 1L, 1L,
9L, 1L, 9L, 2L, 6L, 3L, 1L, 10L, 9L, 2L, 3L, 1L, 2L, 2L, 3L)), row.names = c(NA,
-62L), class = "data.frame")
Diagram 1: Output from R-code
Diagram 2: Desired tip label output without running numbers, although, the branches, tips and tip labels are not the same colours