What is the purpose of a dendrogram on a correlation heat map?

What does the dendrogram on a heat map do? What value or information does it add? I read somewhere that it helps you check the logical relationships between variables, but how? If anyone knows, has resources on this, or could explain it, that would be great.

To see the sort of thing I’m referring to, see this link and the branches on the outskirts of the matrix:

https://www.datanovia.com/en/blog/how-to-create-an-interactive-correlation-matrix-heatmap-in-r/

Thank you

Hi,

Most heatmaps are clustered so that the most closely related rows and columns sit next to each other. There are many different clustering algorithms, but if you use the heatmap() function in R, the default is hclust(), which performs hierarchical clustering. "Hierarchical" is the key word here, as the hierarchy is what the dendrogram draws.
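To make that concrete, here is a minimal sketch of the clustering step on its own, assuming the defaults of heatmap() (dist() for distances, hclust() for clustering) and using cor(mtcars) purely as example data. The names cm and hc are just illustrative:

```r
# Example data: correlation matrix of the built-in mtcars dataset
cm <- cor(mtcars)

# Euclidean distances between the rows of the correlation matrix,
# then hierarchical clustering (complete linkage is the hclust() default)
hc <- hclust(dist(cm))

# The same tree that heatmap() draws in the margins, plotted by itself
plot(as.dendrogram(hc))

# Order of the variables along the leaves of the tree
hc$labels[hc$order]
```

heatmap() additionally reorders the branches (by row and column means, as far as I can tell from the documentation), so the leaf order in the heatmap can differ from hc$order even though the tree is the same.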

Dendrograms show how related data points are. Here is a short blog that introduces the topic:

Details on the heatmap function are found in the documentation:
https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/heatmap

Hope this helps,
PJ

In general, I see two reasons. First, it looks nicer. Second, it helps see patterns. To stay with the most basic mtcars example possible:

heatmap(cor(mtcars))

heatmap(cor(mtcars), Rowv = NA, Colv = NA)

Created on 2021-12-19 by the reprex package (v2.0.1)

Note that the data is exactly the same: the two heatmaps show identical values, but the first one looks nicer and makes it obvious that there are two groups of variables. Car weight, engine size, and horsepower are correlated with each other and anticorrelated with miles per gallon, etc. On the second heatmap this is not as obvious.

For the first reason, the important part is the row and column ordering: the heatmap wouldn't look much worse if we kept that ordering but didn't draw the dendrogram (although I feel the presence of the dendrogram does signal to the reader that hierarchical clustering was used, and that the nice order is not a coincidence).
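A quick sketch to check this, assuming the same cor(mtcars) example: heatmap() invisibly returns the row and column order it used (rowInd and colInd), so we can apply that order ourselves and then switch the dendrograms off. The names cm, hm and cm_ordered are just illustrative:

```r
cm <- cor(mtcars)

# Clustered heatmap; keep the (invisible) return value to get the ordering
hm <- heatmap(cm)

# Apply the clustered ordering by hand
cm_ordered <- cm[hm$rowInd, hm$colInd]

# Same ordering, but no clustering and no dendrograms drawn
heatmap(cm_ordered, Rowv = NA, Colv = NA)
```

The two blocks of variables should still be visible in the second plot, just without the trees in the margins.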

For the second reason, however, the dendrogram is important: sure, it looks like there are two groups, but how convincing is that? The distances (branch heights) in the dendrogram help us see how well the groups are separated, and this becomes even more important when the structure is more complex.
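If you want to look at those distances as numbers rather than reading them off the plot, here is a rough sketch, again assuming the default dist() and hclust() settings and the cor(mtcars) example. Cutting the tree into k = 2 groups is my choice for illustration, not something heatmap() does for you:

```r
cm <- cor(mtcars)
hc <- hclust(dist(cm))

# Heights at which clusters were merged; a big jump between the last
# merge and the ones before it suggests well-separated top-level groups
round(sort(hc$height, decreasing = TRUE), 2)

# Cut the tree into two groups and list the variables in each
groups <- cutree(hc, k = 2)
split(names(groups), groups)
```

With a more complex structure you could try other values of k and see whether the merge heights support them.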
