I was wondering if there is a way to assign weights (or something similar, such as standard deviations, SD) to the individual values of a matrix used for a correlation analysis.
So, for example, we have the mtcars data matrix and use the cor and ggcorrplot functions to do the correlation analysis:
# get input matrix
data(mtcars)
# do correlation analysis
corr <- cor(mtcars)
# show results
head(corr[, 1:6])
But what I would like to do is assign an individual weight to each value of the mtcars matrix before using it for the correlation analysis. So let's say this is our data:
...then, for example, I would like to give the "mpg" of "Mazda RX4" an SD of 1 and the "disp" of "Mazda RX4" an SD of 0.7. And so on.
The correlation analysis should then take the individual SDs into account (e.g. values with a higher assigned SD should have less impact on the result).
Is there some way to do this? All the correlation functions I was able to find accept only the data matrix as input, with no way to supply individual SDs or anything similar.
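To make the idea concrete, here is a minimal sketch of what a per-value weighting could look like for a single pair of variables. The helper name `weighted_pair_cor` is hypothetical, and two assumptions are baked in that are not from any package: each value gets the weight 1 / SD^2, and the weight of an observation pair is the product of its two value weights. The SDs in the example are made up.

```r
# Hypothetical helper: weighted Pearson correlation of two vectors where each
# value carries its own SD.
# Assumptions: value weight = 1 / SD^2, pair weight = product of the two weights.
weighted_pair_cor <- function(x, y, sd_x, sd_y) {
  w <- (1 / sd_x^2) * (1 / sd_y^2)
  w <- w / sum(w)                      # normalise weights to sum to 1
  mx <- sum(w * x)                     # weighted means
  my <- sum(w * y)
  cov_xy <- sum(w * (x - mx) * (y - my))
  cov_xy / sqrt(sum(w * (x - mx)^2) * sum(w * (y - my)^2))
}

data(mtcars)
# Made-up SDs for illustration only: 1 for every mpg value, 0.7 for every disp value
sd_mpg  <- rep(1,   nrow(mtcars))
sd_disp <- rep(0.7, nrow(mtcars))
r_equal <- weighted_pair_cor(mtcars$mpg, mtcars$disp, sd_mpg, sd_disp)
```

Because the SDs here are constant within each variable, all pairs end up with the same weight and `r_equal` matches the plain `cor(mtcars$mpg, mtcars$disp)`. Only SDs that differ from value to value change the result.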
library(wCorr) gives the following error:
"Error: package or namespace load failed for ‘wCorr’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘mnormt’"
The mnormt package does not appear in the installation list...
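For reference, a small sketch for checking which of the required packages are actually missing before loading wCorr. The helper `missing_packages` is a hypothetical name, and the install line is left commented out because it assumes the packages can be reached from the configured CRAN mirror.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("mnormt", "wCorr"))
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
}
```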
And looking into the documentation of the weightedCorr() function from the wCorr package you suggested, it seems that it also accepts only a vector of weights, not a matrix. I don't want to assign weights to whole rows or columns, but to individual values. The same problem seems to occur when using a weight matrix in the wtd.cor() function. https://www.rdocumentation.org/packages/weights/versions/1.0.1/topics/wtd.cor
@nirgrahamuk: Your mock-up now works just fine and seems (almost!) to do what I need. The only remaining issue is the format of the output. I need it as a matrix to plot it with ggcorrplot, as described here:
Is there an easy way to change the mock-up output to something like the "corr" data from the example in the link?
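A rough sketch of the kind of conversion meant here, assuming the output is a long data frame with one row per variable pair. The column names `var1`, `var2` and `cor` and the numbers are made up for illustration and are not the actual mock-up output.

```r
# Hypothetical long-format result: one row per variable pair (made-up values).
long <- data.frame(
  var1 = c("mpg", "mpg", "cyl"),
  var2 = c("cyl", "disp", "disp"),
  cor  = c(-0.85, -0.85, 0.90),
  stringsAsFactors = FALSE
)

vars <- union(long$var1, long$var2)
corr_mat <- diag(1, length(vars))            # a variable correlates 1 with itself
dimnames(corr_mat) <- list(vars, vars)

idx <- cbind(match(long$var1, vars), match(long$var2, vars))
corr_mat[idx] <- long$cor                    # fill one triangle
corr_mat[idx[, 2:1, drop = FALSE]] <- long$cor  # mirror it to keep the matrix symmetric

# ggcorrplot::ggcorrplot(corr_mat)  # assumes the ggcorrplot package is installed
```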
Great, the issue is almost resolved. While using your script on the actual data, one more problem came up: if one value of an observation is NA, the entire observation is excluded from the correlation analysis. So if I do something like
myvalues[3,3] <- NA
myweights[3,3] <- NA
it results in this:
So the "disp" observation has no values... I looked into the documentation of the weightedCorr() function used in your script, but was not able to find any argument that works around this (the cor function from the stats package, for example, allows it via the "use" argument). Do you know any way around this problem? It should (hopefully) be the last one.
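What I have in mind is something like use = "pairwise.complete.obs" in cor(), but with weights. A minimal base R sketch of that behaviour follows. It uses stats::cov.wt() rather than weightedCorr(), the helper name `weighted_cor_pairwise` is hypothetical, and combining the two value weights by multiplication is an assumption. The weights in the example are made up and all equal.

```r
# Hypothetical helper: weighted correlation matrix with pairwise deletion.
# For each pair of columns, only rows where both values and both weights are
# non-NA are used (analogous to use = "pairwise.complete.obs" in cor()).
weighted_cor_pairwise <- function(values, weights) {
  values  <- as.matrix(values)
  weights <- as.matrix(weights)
  p <- ncol(values)
  out <- matrix(NA_real_, p, p,
                dimnames = list(colnames(values), colnames(values)))
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      ok <- complete.cases(values[, i], values[, j], weights[, i], weights[, j])
      if (sum(ok) < 3) next                   # too few complete pairs, leave NA
      w <- weights[ok, i] * weights[ok, j]    # assumption: pair weight = product
      xy <- cbind(values[ok, i], values[ok, j])
      out[i, j] <- cov.wt(xy, wt = w / sum(w), cor = TRUE)$cor[1, 2]
    }
  }
  out
}

data(mtcars)
myvalues  <- as.matrix(mtcars[, 1:4])
myweights <- matrix(1, nrow(myvalues), ncol(myvalues))  # made-up equal weights
myvalues[3, 3]  <- NA
myweights[3, 3] <- NA
corr <- weighted_cor_pairwise(myvalues, myweights)
```

With equal weights this reproduces cor(myvalues, use = "pairwise.complete.obs"), so "disp" keeps its correlations and only the third row is dropped for pairs that involve it.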