# R function for returning different correlation coefficient values?

Hi, I want to write a function that lets me select any variables from different data frames (tibbles) and returns the correlation coefficients as a vector. For example, these are my data frames:

df1 <- tibble(a=rnorm(10), b=rnorm(10), c=rnorm(10), d=rnorm(10))
df2 <- tibble(x1=rnorm(10), x2=rnorm(10), x3=rnorm(10), x4=rnorm(10))
df3 <- tibble(y1=rnorm(10), y2=rnorm(10), y3=rnorm(10), y4=rnorm(10))

And these are the correlations I want to compute (only some of them):
cor(df1$a, df2$x1)
cor(df1$a, df2$x2)
cor(df1$a, df2$x3)
cor(df1$a, df3$y1)
cor(df1$a, df3$y2)
cor(df1$a, df3$y3)
cor(df1$b, df2$x1)
cor(df1$b, df2$x2)
cor(df1$b, df2$x3)
cor(df1$b, df3$y1)
cor(df1$b, df3$y2)
cor(df1$b, df3$y3)

How can I write a function so that I can specify the first group of variables in one vector (say x) and the second group in another vector (say y), and then call cor(x, y) to get all of the correlation coefficients listed above as a vector?

Any alternative solutions would also be appreciated.

Are you looking for something like this?

``````r
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union

set.seed(seed = 27286)

df1 <- tibble(a = rnorm(n = 10),
              b = rnorm(n = 10),
              c = rnorm(n = 10),
              d = rnorm(n = 10))

df2 <- tibble(x1 = rnorm(n = 10),
              x2 = rnorm(n = 10),
              x3 = rnorm(n = 10),
              x4 = rnorm(n = 10))

df3 <- tibble(y1 = rnorm(n = 10),
              y2 = rnorm(n = 10),
              y3 = rnorm(n = 10),
              y4 = rnorm(n = 10))

# method I: extract from combined correlation matrix

R <- cor(x = cbind(df1, df2, df3))

## correlations of df1$a with all columns of df2
R[1, 5:8]
#>         x1         x2         x3         x4
#> -0.2785606 -0.2829311  0.2002700  0.1052528

# method II
correlation <- function(first_dataframe,
                        second_dataframe,
                        column_indices_1,
                        column_indices_2)
{
  return(cor(x = first_dataframe[column_indices_1],
             y = second_dataframe[column_indices_2]))
}

## correlations of df1$a with all columns of df2
correlation(first_dataframe = df1,
            second_dataframe = df2,
            column_indices_1 = 1,
            column_indices_2 = 1:4)
#>           x1         x2      x3        x4
#> a -0.2785606 -0.2829311 0.20027 0.1052528
``````

Created on 2019-03-29 by the reprex package (v0.2.1)
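
Both methods above return a matrix (or a slice of one), while the question asks for a plain vector with the variables chosen by name across several data frames. Here is a minimal sketch of one way to get that. The helper `cor_pairs()` and its `"x~y"` naming scheme are hypothetical, not part of the answer above. It assumes all data frames have the same number of rows and that column names are unique across them. The example uses base `data.frame()` so it runs without extra packages; since `cbind()` also works on tibbles (as in method I), the same call should work with the tibbles from the question.

```r
# Hypothetical helper (not from the answer above): correlate every variable
# named in `x` with every variable named in `y`, looking the names up across
# all data frames passed in `...`.
cor_pairs <- function(x, y, ...) {
  # Column-bind the data frames so variables can be selected by name.
  # Assumes equal row counts and unique column names across data frames.
  combined <- cbind(...)

  # cor() on two sets of columns gives a length(x) by length(y) matrix.
  m <- cor(combined[, x, drop = FALSE], combined[, y, drop = FALSE])

  # Flatten row by row and label each element as "<x name>~<y name>".
  setNames(as.vector(t(m)),
           paste(rep(x, each = length(y)),
                 rep(y, times = length(x)),
                 sep = "~"))
}

# Small self-contained example data (base data frames, arbitrary seed).
set.seed(1)
df1 <- data.frame(a = rnorm(10), b = rnorm(10))
df2 <- data.frame(x1 = rnorm(10), x2 = rnorm(10))
df3 <- data.frame(y1 = rnorm(10))

res <- cor_pairs(x = c("a", "b"), y = c("x1", "x2", "y1"), df1, df2, df3)
print(res)
```

The result is a named numeric vector ordered as all pairs for the first name in `x`, then all pairs for the second, and so on, so `res["b~x2"]` is the same value as `cor(df1$b, df2$x2)`.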
