Interpretation of the results generated by the pvclust package

Could you help me interpret the results I obtained with the pvclust package? I am using two data frames, df1 and df2, which give me two different scenarios. The executable code is below.

# Database df1
rm(list=ls())
library(rdist)
library(pvclust)
library(geosphere)
           
df1<-structure(list(Propertie = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19), Latitude = c(-23.8, -23.9, -23.5, -23.4, -23.6,-23.9, -23.2, -23.5, -23.8, -23.7, -23.8, -23.9, -23.4, -23.9, 
                                                                                                -23.9, -23.2, -23.3, -23.7, -23.8), 
                   Longitude = c(-49.1, -49.3,-49.4, -49.7, -49.6, -49.6, -49.6, -49.6, -49.6, -49.6, -49.7,-49.2, -49.5, -49.8, -49.5, -49.3, -49.3, -49.2, -49.5), 
                   Waste = c(526,350, 526, 469, 285, 175, 175, 350, 350, 175, 350, 175, 175, 364,175, 175, 350, 45.5, 54.6)), 
              class = "data.frame", row.names = c(NA, -19L)) 

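# PVCLUST
# Pairwise geodesic distances in metres; distm() expects (Longitude, Latitude) order
coordinates <- subset(df1, select = c("Latitude", "Longitude"))
d <- distm(coordinates[, 2:1])
diag(d) <- 1000000   # note: as.dist() below drops the diagonal, so this has no effect
d <- as.dist(d)
mat <- as.matrix(d)
mat <- t(mat)        # the matrix is symmetric, so the transpose leaves it unchanged
# pvclust() clusters the columns of mat, here using Euclidean distance between them
fit <- pvclust(mat, method.hclust = "average", method.dist = "euclidean",
               nboot = 10)
plot(fit, hang = -1, cex = .8, main = "Average Linkage Clustering")

# Highlight clusters whose AU p-value is >= 0.70
pvrect(fit, alpha = .70, pv = "au", type = "geq")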

[Figure: Plot Scenario 1]

# Database df2
rm(list=ls())
library(rdist)
library(pvclust)
library(geosphere)
    
df2<-structure(list(Propertie = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16), Latitude = c(-23.8, -23.9, -23.5, -23.4, -23.6,-23.9, -23.2, -23.5, -23.8, -23.7, -23.8, -23.9, -23.4, -23.9, 
                                                                                                    -23.9, -23.2), 
                       Longitude = c(-49.1, -49.3,-49.4, -49.7, -49.6, -49.6, -49.6, -49.6, -49.6, -49.6, -49.7,-49.2, -49.5, -49.8, -49.5, -49.3), 
                       Waste = c(526,350, 526, 469, 285, 175, 175, 350, 350, 175, 350, 175, 175, 364,175, 175)), 
                  class = "data.frame", row.names = c(NA, -16L)) 

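# PVCLUST
# Pairwise geodesic distances in metres; distm() expects (Longitude, Latitude) order
coordinates <- subset(df2, select = c("Latitude", "Longitude"))
d <- distm(coordinates[, 2:1])
diag(d) <- 1000000   # note: as.dist() below drops the diagonal, so this has no effect
d <- as.dist(d)
mat <- as.matrix(d)
mat <- t(mat)        # the matrix is symmetric, so the transpose leaves it unchanged
# pvclust() clusters the columns of mat, here using Euclidean distance between them
fit <- pvclust(mat, method.hclust = "average", method.dist = "euclidean",
               nboot = 10)
plot(fit, hang = -1, cex = .8, main = "Average Linkage Clustering")

# Highlight clusters whose AU p-value is >= 0.60
pvrect(fit, alpha = .60, pv = "au", type = "geq")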

[Figure: Plot Scenario 2]

I chose to focus on the AU (approximately unbiased) p-value because it gives a better assessment of how strongly each cluster is supported by the data. The table below shows, for each scenario, the number of clusters, the AU value of each cluster, and the mean and standard deviation (SD) of the AU values. However, I am having a hard time interpreting these results.
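For reference, here is a minimal sketch of how per-cluster AU values and their mean and SD could be pulled out of a pvclust fit. It runs on hypothetical toy data (the matrix `x` and its column names are assumptions for illustration, not df1 or df2), and it may not be exactly how the table was built.

```r
library(pvclust)

set.seed(1)
# Hypothetical toy data (not the df1/df2 data above): 20 rows, 6 columns.
# pvclust() clusters the columns; columns a-c are shifted so two groups exist.
x <- matrix(rnorm(120), nrow = 20, ncol = 6,
            dimnames = list(NULL, letters[1:6]))
x[, 1:3] <- x[, 1:3] + 3

fit <- pvclust(x, method.hclust = "average", method.dist = "euclidean",
               nboot = 100)

# One row per merge (edge) in the dendrogram, with its AU and BP p-values
au_table <- fit$edges[, c("au", "bp")]
print(round(au_table, 2))

# Clusters that pvrect() would box at this threshold, and their edge numbers
picked <- pvpick(fit, alpha = 0.70, pv = "au", type = "geq")
print(picked$clusters)

# AU values of the picked clusters, then their mean and SD
picked_au <- fit$edges$au[picked$edges]
au_mean <- mean(picked_au)
au_sd <- sd(picked_au)   # NA if only one cluster is picked
```

The same `fit$edges` and `pvpick()` calls should apply to the `fit` objects from either scenario, with `alpha` set to the value passed to `pvrect()`.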

[Image: table of AU values per cluster, with AU mean and SD, for each scenario]
