working with correlations

Jason.C · April 18, 2019, 5:58am

Greetings,

I have found a way to calculate correlations < cor() > - but p-values are missing.
I have also found a way to test correlations < cor.mtest > - but r-values are missing.
I have also found ways to plot correlations that flag the significant ones, but they don't report the p-values. Any help in getting a complete set of information would be greatly appreciated!

Cheers,
Jason

#   Reprex correlations
#   modified: "17/04/2019"

library(ggplot2)
library(ez)
library(corrplot)
#> corrplot 0.84 loaded
options(na.action = na.exclude)
library(PerformanceAnalytics)
library(tidyverse)
library(here)
here() # RETURNS THE PROJECT ROOT PATH; IT DOES NOT CHANGE THE WORKING DIRECTORY
dr_here(show_reason = FALSE)

#       RAW DATA ON GITHUB
data <- read.csv(url('https://raw.githubusercontent.com/BrainStormCenter/ASQ_pilot/master/ASQ_pilot_raw_2019_04_17.csv'), header = TRUE)

#       FILTER OUT SUBJECT WHERE Groups3=other
#           LOOK FOR A WAY TO AVOID LOSING THIS SUBJECT ENTIRELY
data2 <- filter(data, Groups3 != "other")

#       CREATING A SMALLER DATASET OF ALL SUBJECTS OF 3 VARIABLES
vars_keep <- select(data2, asq_light, asq_heavy,max.temp)

#       CORRELATIONS
"Pearson"
#> [1] "Pearson"
cor(vars_keep, use = "complete.obs", method = "pearson")
#>            asq_light asq_heavy   max.temp
#> asq_light 1.00000000 0.3636925 0.05498907
#> asq_heavy 0.36369251 1.0000000 0.19284421
#> max.temp  0.05498907 0.1928442 1.00000000
cor(vars_keep, use = "pairwise.complete", method = "pearson")
#>            asq_light asq_heavy   max.temp
#> asq_light 1.00000000 0.4290929 0.03524822
#> asq_heavy 0.42909294 1.0000000 0.19284421
#> max.temp  0.03524822 0.1928442 1.00000000

#       WHICH CORRELATION VALUES GO WITH THE P-VALUES FROM THE TEST BELOW?
#           WHICH METHOD DOES THIS USE?
cor.mtest(vars_keep)
#> $p
#>              [,1]         [,2]      [,3]
#> [1,] 0.000000e+00 7.143542e-05 0.7753652
#> [2,] 7.143542e-05 0.000000e+00 0.1583621
#> [3,] 7.753652e-01 1.583621e-01 0.0000000
#> 
#> $lowCI
#>            [,1]       [,2]       [,3]
#> [1,]  1.0000000  0.2311706 -0.2048988
#> [2,]  0.2311706  1.0000000 -0.0763596
#> [3,] -0.2048988 -0.0763596  1.0000000
#> 
#> $uppCI
#>           [,1]      [,2]      [,3]
#> [1,] 1.0000000 0.5929108 0.2713925
#> [2,] 0.5929108 1.0000000 0.4358432
#> [3,] 0.2713925 0.4358432 1.0000000

#       WHERE ARE THE P-VALUES FOR THE GRAPHS BELOW
chart.Correlation(vars_keep, histogram = TRUE, method = "pearson")


cor_plot2 <- ezCor(
    data = vars_keep[,c(1:3)],
    r_size_lims = c(8,8),
    test_alpha = .05,
    label_size = 3
)
print(cor_plot2)

Created on 2019-04-18 by the reprex package (v0.2.1)

Fer · April 18, 2019, 6:23am

You can get the p-value from cor.test().
Fer
PS. Sorry for brevity, I am on the phone...
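
A minimal sketch of what that looks like for a single pair of variables. The built-in mtcars data is used here only as a stand-in (an assumption for illustration, not the poster's dataset):

```r
# Built-in mtcars data stands in for the real data (illustration only).
result <- cor.test(mtcars$mpg, mtcars$wt, method = "pearson")

r_value <- unname(result$estimate)  # correlation coefficient
p_value <- result$p.value           # p-value for H0: true correlation is 0
ci      <- result$conf.int          # 95% confidence interval (Pearson only)

print(c(r = r_value, p = p_value))
```

Note that cor.test() takes exactly two vectors, so it returns one r and one p-value per call rather than a whole matrix.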

Yarnabrina · April 18, 2019, 7:12am

cor.test can't be applied to a whole data.frame or matrix. See this SO thread.

Following the accepted answer in that thread, here's a solution using the corr.test function from the psych package:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(psych)

complete_dataset <- read.csv(file = 'https://raw.githubusercontent.com/BrainStormCenter/ASQ_pilot/master/ASQ_pilot_raw_2019_04_17.csv')

small_subset <- complete_dataset %>%
  # as far as I can see, there's not a single participant with "other" as Groups3
  filter(Groups3 != "other") %>% 
  select(asq_light, asq_heavy, max.temp)

correlation_matrix <- corr.test(x = small_subset,
                                use = "complete", # available: pairwise, complete
                                method = "pearson", # available: pearson, spearman, kendall
                                adjust = "none") # available: holm, hochberg, hommel, bonferroni, BH, BY, fdr, none

print(x = correlation_matrix,
      short = FALSE)
#> Call:corr.test(x = small_subset, use = "complete", method = "pearson", 
#>     adjust = "none")
#> Correlation matrix 
#>           asq_light asq_heavy max.temp
#> asq_light      1.00      0.36     0.05
#> asq_heavy      0.36      1.00     0.19
#> max.temp       0.05      0.19     1.00
#> Sample Size 
#> [1] 55
#> Probability values (Entries above the diagonal are adjusted for multiple tests.) 
#>           asq_light asq_heavy max.temp
#> asq_light      0.00      0.01     0.69
#> asq_heavy      0.01      0.00     0.16
#> max.temp       0.69      0.16     0.00
#> 
#>  Confidence intervals based upon normal theory.  To get bootstrapped values, try cor.ci
#>             raw.lower raw.r raw.upper raw.p lower.adj upper.adj
#> asq_l-asq_h      0.11  0.36      0.57  0.01      0.05      0.61
#> asq_l-mx.tm     -0.21  0.05      0.32  0.69     -0.27      0.37
#> asq_h-mx.tm     -0.08  0.19      0.44  0.16     -0.14      0.48

Created on 2019-04-18 by the reprex package (v0.2.1)
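
For comparison, matching r and p matrices can also be put together in base R by running cor.test() on every pair of columns. This is only a sketch: cor_with_p is a hypothetical helper written for this example (it is not from any package), and mtcars again stands in for the real data.

```r
# Hypothetical helper (not from any package): builds matching r and p
# matrices by calling cor.test() on each pair of columns.
cor_with_p <- function(df, method = "pearson") {
  vars <- names(df)
  k <- length(vars)
  r <- matrix(1, k, k, dimnames = list(vars, vars))          # diagonal r = 1
  p <- matrix(NA_real_, k, k, dimnames = list(vars, vars))   # diagonal p left as NA
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      test <- cor.test(df[[i]], df[[j]], method = method)
      r[i, j] <- r[j, i] <- unname(test$estimate)
      p[i, j] <- p[j, i] <- test$p.value
    }
  }
  list(r = r, p = p)
}

# Stand-in data for illustration only.
res <- cor_with_p(mtcars[, c("mpg", "wt", "qsec")])
round(res$r, 2)
signif(res$p, 3)
```

Because cor.test() drops incomplete cases separately for each pair, the r-values from this approach line up with cor(..., use = "pairwise.complete.obs") rather than with use = "complete.obs". As far as I can tell, corrplot's cor.mtest() works the same way (pair-by-pair cor.test() with the default Pearson method), which would explain why its p-values in the original post go with the pairwise correlations.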

Jason.C · April 18, 2019, 3:05pm

Thanks. I'll give that a try.

Jason.C · April 18, 2019, 3:11pm

@Yarnabrina Thanks ever so much for the help. I will try this ASAP. I will have to double check the raw data as there used to be just one subject in group "other"...

Two questions:

Am I correct that cor.test can only be used for pairwise estimates?
Would cor.test work on a tibble?

I will read the suggested thread in a few minutes.

Cheers,
Jason

Yarnabrina · April 18, 2019, 5:40pm

Your 1st question is not very clear to me.

If it's about how cor.test handles missing observations, then note that with only two variables, complete and pairwise deletion are essentially the same.
If you are asking whether it is limited to testing the correlation between one pair of variables at a time, then the answer is YES. It cannot be used to test a multiple correlation coefficient.
If the question is about whether it can be applied to a data.frame directly, then the answer is NO. See the reprex below.

Regarding your 2nd question, why don't you try it yourself with some dummy data?

I didn't know the answer, so I tried it as follows. The answer is NO: as you can see below, a data.frame and a tibble both give the same error.

dataset_dataframe <- data.frame(p = 1:5,
                                q = 15:11,
                                r = c(-50, -25, 0, 25, 50))
cor.test(x = dataset_dataframe)
#> Error in cor.test.default(x = dataset_dataframe): argument "y" is missing, with no default

dataset_tibble <- tibble::tibble(p = 1:5,
                                 q = 15:11,
                                 r = c(-50, -25, 0, 25, 50))
cor.test(x = dataset_tibble)
#> Error in cor.test.default(x = dataset_tibble): argument "y" is missing, with no default
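
What does work is handing cor.test() one pair of variables at a time, either as two columns or through its formula interface. A small sketch with made-up numbers (the data frame below is invented purely for illustration):

```r
# Made-up data for illustration only.
dataset <- data.frame(p = c(1, 2, 3, 4, 5, 6),
                      q = c(2.1, 3.9, 6.2, 7.8, 10.1, 11.7),
                      r = c(5, 3, 6, 2, 7, 1))

# Option 1: pass the two columns explicitly.
by_columns <- cor.test(dataset$p, dataset$q)

# Option 2: formula interface, naming the two variables and the data frame.
by_formula <- cor.test(~ p + q, data = dataset)

by_formula$estimate  # correlation coefficient for the pair
by_formula$p.value   # its p-value
```

Both calls test the same pair, so they should agree. The formula form should behave the same way when data is a tibble, since a tibble is also a data.frame, but I have not checked that here.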

Jason.C · April 27, 2019, 10:36pm

Sorry for the confusion. You answered my initial question and two others I didn’t know I had yet.
