 # Correlation Matrix with assigned weights

Hello everyone,

I was wondering if there is a way to assign weights (or something similar like standard deviations (SD)) to individual values of a matrix used for a correlation analysis.

So, for example, we take the mtcars data matrix and use the cor() and ggcorrplot() functions to do the correlation analysis:

``````
data(mtcars)

# do correlation analysis
corr <- cor(mtcars)

# show results
``````
But what I would like to do is assign an individual weight to each value of the mtcars matrix before using it for the correlation analysis. So let's say this is our data: ... then, for example, I would like to give the "mpg" of "Mazda RX4" an SD of 1 and the "disp" of "Mazda RX4" an SD of 0.7, and so on.

The correlation analysis should then take the individual SDs into account (e.g. values with a higher assigned SD should have less impact on the result).
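One common way to express this in code is inverse-variance weighting, where each value gets the weight 1/SD². The following is only a sketch of that conversion: the names `sd_mat` and `w_mat` are made up for illustration, the default SD of 1 for all other cells is an assumption, and whether 1/SD² is the right weighting depends on what the SDs actually represent.

```r
data(mtcars)

# Hypothetical SD matrix with the same shape as mtcars.
# Assumption: every cell defaults to an SD of 1.
sd_mat <- matrix(1, nrow = nrow(mtcars), ncol = ncol(mtcars),
                 dimnames = list(rownames(mtcars), colnames(mtcars)))
sd_mat["Mazda RX4", "mpg"]  <- 1
sd_mat["Mazda RX4", "disp"] <- 0.7

# Inverse-variance weights: a larger SD gives a smaller weight.
w_mat <- 1 / sd_mat^2

w_mat["Mazda RX4", c("mpg", "disp")]
```

The resulting `w_mat` has one weight per cell, which is the shape of input the question is asking about.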

Is there some way to do this? All the correlation functions I was able to find only accept the input matrix as, well, input, with no argument for individual SDs or anything like that.

Any ideas?

Googling 'cran weighted correlation' quickly led to this:
https://cran.r-project.org/web/packages/wCorr/

There are two problems with this:

First, library(wCorr) fails with the error
"Error: package or namespace load failed for ‘wCorr’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘mnormt’"
The mnormt package does not appear in the package installation list...

Second, looking into the documentation of the weightedCorr() function of the wCorr package you suggested, it seems that it also only accepts a vector of weights, not a matrix. I don't want to assign weights to whole rows or columns, but to individual values. The same problem seems to occur when using a weight matrix in the wtd.cor() function. https://www.rdocumentation.org/packages/weights/versions/1.0.1/topics/wtd.cor

mnormt is listed as an import for wCorr. Maybe try installing it explicitly?

I mocked up a demo for you (I assumed it made sense to combine the weights multiplicatively, but you could do otherwise).

``````
library(wCorr)
library(tidyverse)

# one random weight per cell, same shape as the data (demo only)
myweights <- mutate_all(mtcars, runif)
myvalues <- mtcars

# every unordered pair of column names
to_do_list <- combn(names(mtcars), 2, simplify = FALSE)

# weighted Pearson correlation per pair; the weight of each row is the
# product of the two cell weights
purrr::map_dfr(to_do_list,
               ~tibble(var_pair = paste0(.[1], ":", .[2]),
                       wCorr = wCorr::weightedCorr(x = myvalues[[.[1]]],
                                                   y = myvalues[[.[2]]],
                                                   weights = myweights[[.[1]]] * myweights[[.[2]]],
                                                   method = "Pearson")))
# Output (values vary from run to run because the weights are random):
# A tibble: 55 x 2
# var_pair    wCorr
# <chr>       <dbl>
# 1 mpg:cyl   -0.866
# 2 mpg:disp  -0.849
# 3 mpg:hp    -0.676
# 4 mpg:drat   0.598
# 5 mpg:wt    -0.843
# 6 mpg:qsec   0.333
# 7 mpg:vs     0.463
# 8 mpg:am     0.587
# 9 mpg:gear   0.303
# 10 mpg:carb -0.487
# ... with 45 more rows
``````
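To make explicit what a weighted Pearson correlation computes for a single pair of columns, here is a minimal base-R sketch. `weighted_pearson()` is a hypothetical helper written for illustration; it is not part of wCorr and is not guaranteed to match `wCorr::weightedCorr()` in every edge case.

```r
# Hypothetical helper: weighted Pearson correlation of x and y with
# one non-negative weight per observation.
weighted_pearson <- function(x, y, w) {
  stopifnot(length(x) == length(y), length(x) == length(w))
  w  <- w / sum(w)                        # normalise weights to sum to 1
  mx <- sum(w * x)                        # weighted means
  my <- sum(w * y)
  cov_xy <- sum(w * (x - mx) * (y - my))  # weighted covariance
  var_x  <- sum(w * (x - mx)^2)           # weighted variances
  var_y  <- sum(w * (y - my)^2)
  cov_xy / sqrt(var_x * var_y)
}

data(mtcars)
# With equal weights this reduces to the ordinary Pearson correlation.
weighted_pearson(mtcars$mpg, mtcars$wt, rep(1, nrow(mtcars)))
cor(mtcars$mpg, mtcars$wt)
```

Rows with a small weight pull the weighted means and the covariance less, which is the "less impact" behaviour asked for in the question.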

Thank you! Unfortunately, I still fail to install mnormt.

It does not show up in RStudio's package installation list. When I try to install it with the command

``````
install.packages("mnormt")
``````

it gives the message

``````
Warning in install.packages :
package ‘http://cran.rstudio.com/bin/windows/contrib/3.2/mnormt_1.5-3.zip’ is not available (for R version 3.6.1)
``````

Could it be a problem with the R version? Any suggestions?

According to the CRAN page for mnormt, R 4.0 is required and you appear to have 3.6.1 installed.

https://cran.r-project.org/web/packages/mnormt/index.html

https://cloud.r-project.org/
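A quick way to check for this kind of version mismatch from the console is sketched below. The 4.0.0 threshold is taken from the reply above; the variable names `min_r` and `r_ok` are just for illustration.

```r
# Minimum R version stated above for the current mnormt release.
min_r <- "4.0.0"

# getRversion() returns the running R version as a comparable object.
r_ok <- getRversion() >= min_r

if (r_ok) {
  message("R ", as.character(getRversion()), " meets the requirement (>= ", min_r, ").")
} else {
  message("R ", as.character(getRversion()), " is older than ", min_r, "; update R first.")
}

# TRUE if mnormt is already installed in one of the library paths.
requireNamespace("mnormt", quietly = TRUE)
```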


Yup, that worked. Thanks!

@nirgrahamuk: Your mock-up now works just fine as well and seems to do (almost!) what I need. The only remaining issue is the format of the output. I need it as a matrix to plot it with ggcorrplot as described here:

Is there an easy way to change the mock-up output to something like the "corr" data from the example in the link?

There's almost certainly a more elegant solution, but this is what I came up with.

``````
library(wCorr)
library(tidyverse)
library(ggcorrplot)

# one random weight per cell, same shape as the data (demo only)
myweights <- mutate_all(mtcars, runif)
myvalues <- mtcars

# weighted correlation for every unordered pair of columns
to_do_list <- combn(names(mtcars), 2, simplify = FALSE)
my_res <- purrr::map_dfr(to_do_list,
                         ~tibble(var_row = .[1],
                                 var_col = .[2],
                                 wCorr = wCorr::weightedCorr(x = myvalues[[.[1]]],
                                                             y = myvalues[[.[2]]],
                                                             weights = myweights[[.[1]]] * myweights[[.[2]]],
                                                             method = "Pearson")))

# diagonal entries: each variable paired with itself
# (set to 1 here, since a variable correlates perfectly with itself)
v2 <- tibble(var_row = names(mtcars), var_col = names(mtcars), wCorr = 1)
combined <- union_all(my_res, v2) %>% arrange(var_col, var_row)

# reshape to wide form: one row and one column per variable
m <- pivot_wider(combined,
                 names_from = var_col,
                 values_from = wCorr) %>% arrange(var_row) %>% select(-var_row)

mm <- as.matrix(m)
rownames(mm) <- sort(names(mtcars))

# only one triangle is filled so far; mirror it to make the matrix symmetric
for (i in seq_len(nrow(mm))) {
  for (j in seq_len(ncol(mm))) {
    if (is.na(mm[i, j])) {
      mm[i, j] <- mm[j, i]
    }
  }
}

ggcorrplot(mm)
``````
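As a possible alternative to the reshape-and-mirror steps, the symmetric matrix can also be filled in directly. The sketch below uses `stats::cov.wt()` from base R instead of wCorr, so its results are not guaranteed to be identical to those of `weightedCorr()`. `weighted_cor_matrix()` is a hypothetical helper name, and the multiplicative pairing of weights follows the assumption made in the demo.

```r
# Hypothetical helper: weighted correlation matrix from a numeric matrix
# of values and a same-shaped matrix of per-cell weights.
weighted_cor_matrix <- function(values, weights) {
  vars <- colnames(values)
  out <- diag(1, length(vars))          # a variable correlates 1 with itself
  dimnames(out) <- list(vars, vars)
  for (pair in combn(vars, 2, simplify = FALSE)) {
    a <- pair[1]
    b <- pair[2]
    w <- weights[, a] * weights[, b]    # per-row weight for this pair
    r <- cov.wt(values[, c(a, b)], wt = w / sum(w), cor = TRUE)$cor[1, 2]
    out[a, b] <- r                      # fill both triangles at once
    out[b, a] <- r
  }
  out
}

data(mtcars)
set.seed(1)                             # reproducible demo weights
myvalues  <- as.matrix(mtcars)
myweights <- matrix(runif(length(myvalues)), nrow = nrow(myvalues),
                    dimnames = dimnames(myvalues))
mm <- weighted_cor_matrix(myvalues, myweights)
# ggcorrplot::ggcorrplot(mm) would then plot the matrix
```

Because both triangles are written in the same loop, no NA-filling pass is needed, and the columns keep the original order of the data.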


Great, the issue is almost resolved. While using your script on the actual data, one more problem occurred: if a single value of a variable is NA, that entire variable drops out of the correlation analysis. So if I do something like

``````
myvalues[3,3] <- NA
myweights[3,3] <- NA
``````

the result is that the "disp" variable has no values at all. I looked into the documentation of the weightedCorr() function used in your script, but was not able to find any argument that works around this (the cor() function from the stats package, for example, allows this via its "use" argument). Do you know any way around this problem? It should (hopefully) be the last one.

You could omit records with NAs in the dataframe or replace them with some arbitrary value(s)


I guess replacing with arbitrary values and setting their weight to 0 should do the trick.
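That idea can be sketched in base R as follows. This uses `stats::cov.wt()` rather than wCorr, because whether `weightedCorr()` accepts zero weights is not confirmed here. The placeholder value 0, the equal weights for all other cells, and the variable names are arbitrary choices for illustration.

```r
data(mtcars)
vals <- as.matrix(mtcars)
# Assumption: all other cells carry equal weight.
wts  <- matrix(1, nrow = nrow(vals), ncol = ncol(vals),
               dimnames = dimnames(vals))

vals[3, 3] <- NA                  # same cell as in the question ("disp")

# Replace each NA by an arbitrary placeholder and give it zero weight.
na_cells <- is.na(vals)
vals_filled <- vals
vals_filled[na_cells] <- 0
wts[na_cells] <- 0

# Weighted correlation of mpg and disp; the zero-weight row contributes nothing.
w <- wts[, "mpg"] * wts[, "disp"]
r_zero_weight <- cov.wt(vals_filled[, c("mpg", "disp")],
                        wt = w / sum(w), cor = TRUE)$cor[1, 2]

# For comparison: unweighted correlation with the incomplete row dropped.
r_pairwise <- cor(vals[, "mpg"], vals[, "disp"], use = "pairwise.complete.obs")
```

With equal weights everywhere else, `r_zero_weight` and `r_pairwise` agree, which suggests the zero-weight cell is effectively ignored for that pair while the rest of the row is still used for other pairs.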
