# Correlation for each unique ID

Hi Statisticians,

Please find the reprex and my crude solution below.
Is there an elegant way to compute the correlation between X and Y for each unique ID in the fruit column?
As extra help, could you confirm whether the approach I used to eliminate NAs is the best one?
Does this method correlate X and Y over 2008:2011 as expected?

``````
library(dplyr)  # provides %>% and filter()

df <- data.frame(fruit = c(rep("Apple",4),rep("Banana",4),rep("Grapes",4)),
                 year = c(rep(2008:2011,3)),
                 X = c(NA,runif(11)),
                 Y = c(NA,runif(10),NA))
# Correlation across all fruits and years; returns NA because X and Y contain NA
cor(df$X, df$Y)
# Drop rows with a missing X or Y before correlating
cor(df$X, df$Y, use = "complete.obs")

# Crude solution: one correlation per unique ID in the fruit column
df.apple <- df %>% filter(fruit == "Apple")
cor(df.apple$X, df.apple$Y, use = "complete.obs")

df.banana <- df %>% filter(fruit == "Banana")
cor(df.banana$X, df.banana$Y, use = "complete.obs")

df.grape <- df %>% filter(fruit == "Grapes")
cor(df.grape$X, df.grape$Y, use = "complete.obs")
``````

Abi

Are you looking for this?

``````
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union

df <- data.frame(fruit = c(rep("Apple",4),rep("Banana",4),rep("Grapes",4)),
                 year = c(rep(2008:2011,3)),
                 X = c(NA,runif(11)),
                 Y = c(NA,runif(10),NA))

df %>%
  group_by(fruit) %>%
  summarise(correlation = cor(x = X,
                              y = Y,
                              use = "complete.obs"))
#> # A tibble: 3 x 2
#>   fruit  correlation
#>   <fct>        <dbl>
#> 1 Apple       -0.720
#> 2 Banana       0.384
#> 3 Grapes      -0.139
``````

Created on 2019-04-10 by the reprex package (v0.2.1)
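
On the NA question: `use = "complete.obs"` drops every row where X or Y is missing before computing the correlation, so a group is only correlated over the years that have both values. With the data above, that should mean Apple uses 2009:2011 and Grapes uses 2008:2010 rather than the full 2008:2011. A minimal sketch to check this is below; the `n_pairs` column name and the `set.seed()` call are my own additions for illustration, not part of the original reprex.

``````r
library(dplyr)

set.seed(42)  # assumption: fixed seed only so repeated runs give the same numbers

df <- data.frame(
  fruit = rep(c("Apple", "Banana", "Grapes"), each = 4),
  year  = rep(2008:2011, times = 3),
  X     = c(NA, runif(11)),
  Y     = c(NA, runif(10), NA)
)

result <- df %>%
  group_by(fruit) %>%
  summarise(
    # number of rows in the group where both X and Y are present
    n_pairs     = sum(complete.cases(X, Y)),
    # correlation computed on those complete rows only
    correlation = cor(X, Y, use = "complete.obs")
  )

result
``````

With only three or four pairs per fruit, the estimates will be very noisy, so it may be worth reporting `n_pairs` alongside each correlation.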


Thanks @Yarnabrina for the quick help and solution.
