Correlation for each unique ID

Hi Statisticians,

Please find a reprex and my crude solution below.
Is there an elegant way to compute the correlation between X and Y for each unique ID in the fruit column?
It would also help if you could confirm whether the approach I used to eliminate NAs is the best one.
Does this method correlate X and Y over 2008:2011 as expected?

library(dplyr)  # needed for %>% and filter() below

df <- data.frame(fruit = c(rep("Apple",4),rep("Banana",4),rep("Grapes",4)), 
                 year = c(rep(2008:2011,3)), 
                 X = c(NA,runif(11)), 
                 Y = c(NA,runif(10),NA))
# Correlation across all fruits and years (returns NA because X and Y contain NAs)
cor(df$X,df$Y)
# Drop the rows where X or Y is NA before correlating
cor(df$X,df$Y, use = "complete.obs")

# Crude solution of obtaining correlation for each unique ID in fruit column
df.apple <- df %>% filter(fruit == "Apple")
cor(df.apple$X,df.apple$Y, use = "complete.obs")

df.banana <- df %>% filter(fruit == "Banana")
cor(df.banana$X,df.banana$Y, use = "complete.obs")

df.grape <- df %>% filter(fruit == "Grapes")
cor(df.grape$X,df.grape$Y, use = "complete.obs")

Thanks in advance,
Abi

Are you looking for this?

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

df <- data.frame(fruit = c(rep("Apple",4),rep("Banana",4),rep("Grapes",4)), 
                 year = c(rep(2008:2011,3)), 
                 X = c(NA,runif(11)), 
                 Y = c(NA,runif(10),NA))

df %>%
  group_by(fruit) %>%
  summarise(correlation = cor(x = X,
                              y = Y,
                              use = "complete.obs"))
#> # A tibble: 3 x 2
#>   fruit  correlation
#>   <fct>        <dbl>
#> 1 Apple       -0.720
#> 2 Banana       0.384
#> 3 Grapes      -0.139

Created on 2019-04-10 by the reprex package (v0.2.1)
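
On the NA part of the question: `use = "complete.obs"` drops every row where either X or Y is missing, so a fruit with an NA is correlated over fewer than four years. The following is only a minimal sketch to check which years actually enter each correlation. The column names `n_complete` and `years_used` and the `set.seed()` call are my own additions for illustration and are not part of the original reprex.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed only so the sketch is repeatable

df <- data.frame(fruit = c(rep("Apple", 4), rep("Banana", 4), rep("Grapes", 4)),
                 year = rep(2008:2011, 3),
                 X = c(NA, runif(11)),
                 Y = c(NA, runif(10), NA))

per_fruit <- df %>%
  group_by(fruit) %>%
  summarise(
    # number of rows where both X and Y are present
    n_complete = sum(complete.cases(X, Y)),
    # the years those rows belong to
    years_used = paste(year[complete.cases(X, Y)], collapse = ", "),
    # same call as above: rows with an NA in X or Y are dropped first
    correlation = cor(X, Y, use = "complete.obs")
  )

per_fruit
```

With this data layout, Apple is correlated over 2009 to 2011 only (the 2008 row is NA), Grapes over 2008 to 2010 (Y is NA in 2011), and only Banana uses all four years. With three or four points per group the estimates are very noisy, so it may be worth reporting `n_complete` alongside each correlation.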

Thanks @Yarnabrina for the quick help and solution.
