I would like help with the following question:
I am using hierarchical clustering on my data. I am running an example with both the hclust function and pvclust (which uses hclust internally).
I ran two tests: the first on a dataset with 8 properties and the second on a dataset with 19 properties. With the first dataset everything worked: both dendrograms were plotted correctly and they are identical, as expected since both approaches use the hclust function. With 19 properties, however, the two dendrograms are different. Could you help me understand and solve this problem?
Thank you!
library(rdist)
library(pvclust)
library(geosphere)
### USING HCLUST
df <- structure(
  list(
    Propertie = c(1, 2, 3, 4, 5, 6, 7, 8),
    Latitude = c(-24.779225, -24.789635, -24.763461, -24.794394, -24.747102, -24.781307, -24.761081, -24.761084),
    Longitude = c(-49.934816, -49.922324, -49.911616, -49.906262, -49.890796, -49.8875254, -49.8875254, -49.922244),
    Waste = c(526, 350, 526, 469, 285, 433, 456, 825)
  ),
  class = "data.frame", row.names = c(NA, -8L)
)
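coordinates <- subset(df, select = c("Latitude", "Longitude"))
# distm() expects (longitude, latitude) and returns a matrix of distances in metres;
# dist() then takes Euclidean distances between the rows of that matrix
d <- dist(distm(coordinates[, 2:1]), method = "euclidean")
fit.average <- hclust(d, method = "average")
plot(fit.average, hang = -1, cex = .8, main = "Average Linkage Clustering")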
### USING PVCLUST
coordinates <- subset(df, select = c("Latitude", "Longitude"))
d <- dist(distm(coordinates[, 2:1]))
mat <- as.matrix(d)
mat <- t(mat)
fit <- pvclust(mat, method.hclust = "average", method.dist = "euclidean", nboot = 10)
plot(fit, hang = -1, cex = .8, cex.pv = .5, print.num = FALSE, print.pv = FALSE,
     main = "Average Linkage Clustering")
### FOR DATABASE DF WITH 19 PROPERTIES
df <- structure(
  list(
    Propertie = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19),
    Latitude = c(-23.8, -23.9, -23.5, -23.4, -23.6, -23.9, -23.2, -23.5, -23.8, -23.7, -23.8, -23.9, -23.4, -23.9, -23.9, -23.2, -23.3, -23.7, -23.8),
    Longitude = c(-49.1, -49.3, -49.4, -49.7, -49.6, -49.6, -49.6, -49.6, -49.6, -49.6, -49.7, -49.2, -49.5, -49.8, -49.5, -49.3, -49.3, -49.2, -49.5),
    Waste = c(526, 350, 526, 469, 285, 175, 175, 350, 350, 175, 350, 175, 175, 364, 175, 175, 350, 45.5, 54.6)
  ),
  class = "data.frame", row.names = c(NA, -19L)
)