How to write a formula as a function in R and apply it

Hi,

I would like to write the following formula in R and apply it to a brief example, if possible. Could you please help me?

The formula is the one in this image:

[Image of the formula. From the replies below, it is rho = 1 - (6 * Σ d_i^2) / (n * (n^2 - 1)).]

For the example, please use:

di<-10

n<-29

I think this would do it for you:

rho <- function(d, n) {
    1 - (6 * (sum(d)^2) / (n * ((n^2) - 1)))
}

rho(10, 29)
#> [1] 0.9753695

rho(d = 3, n = 2)
#> [1] -8

rho(d = 58, n = 10000)
#> [1] 1

rho(d = 1:100, n = 4)
#> [1] -2550249

Created on 2022-03-04 by the reprex package (v1.0.0)

I am confused by your question. The summation notation is used when adding multiple numbers together. You have just one number. If d = 10 then sum(d) = 10.

Normally, you would have a vector d equal to something like (5, 2, 6, 3) and di would refer to the number in the ith position. In this case, n, the number of elements in the vector, would be 4. The summation of di^2, with i = 1 to 4, would be 5^2 + 2^2 + 6^2 + 3^2. Also, the denominator in your equation would be 4 * (4^2 -1).

I suspect that the input to your function should be a vector d, and n=length(d), not an arbitrary number.
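The arithmetic described above can be checked directly in R. This is only a sketch using the made-up vector from the explanation; the values are illustrative and are not real rank differences, and the names sum_sq and denom are hypothetical helpers.

```r
# Hypothetical vector from the explanation above (not real rank differences)
d <- c(5, 2, 6, 3)

n <- length(d)          # number of elements: 4
sum_sq <- sum(d^2)      # 5^2 + 2^2 + 6^2 + 3^2 = 74
denom <- n * (n^2 - 1)  # 4 * (4^2 - 1) = 60
```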

@EconProf, would it be more or less like this?

result <- structure(
  list(
    n = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
          18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29),
    M1 = c(29L, 1L, 28L, 27L, 25L, 26L, 24L, 20L, 21L, 22L, 23L, 15L,
           12L, 17L, 18L, 19L, 16L, 13L, 14L, 5L, 6L, 7L, 8L, 9L, 10L,
           11L, 4L, 2L, 3L),
    M2 = c(1, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 15, 12, 19, 18,
           17, 16, 14, 13, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2),
    M3 = c(1L, 29L, 28L, 27L, 25L, 26L, 24L, 20L, 21L, 22L, 23L, 15L,
           12L, 17L, 18L, 19L, 16L, 13L, 14L, 5L, 6L, 7L, 8L, 9L, 10L,
           11L, 4L, 2L, 3L)
  ),
  class = "data.frame",
  row.names = c(NA, -29L)
)

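library(dplyr)  # needed for %>% and mutate()

# Pairwise differences between the three columns
dif <- result %>%
  mutate(D1 = M1 - M2, D2 = M1 - M3, D3 = M2 - M3)

d <- dif$D1
n <- nrow(dif)

rho <- function(d, n) {
  1 - (6 * (sum(d)^2) / (n * ((n^2) - 1)))
}

rho(d, n)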

Just to clarify: my solution only works if you buy into @EconProf's excellent point about \Sigma being used as a summation operator and d_i referring to the ith value of the vector d. That is why the rho function sums the d argument. It assumes d is a vector, but in R a single number is simply a vector of length one, so a scalar works too. If you did intend d_i to be a scalar value, please throw out my code and we'll try again.
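A quick way to see why the same function accepts both a single number and a vector (a small sketch; the value 10 is just the one from the original question):

```r
x <- 10

is.vector(x)  # TRUE: a single number is a vector
length(x)     # 1
sum(x)        # 10: summing one element returns that element
```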

The code from @dvetsch75 works perfectly on a vector. If you want to skip the step of setting the value of n, you can also calculate it in your function and have just one argument.

  rho <- function(d) {
    1 - (6 * (sum(d)^2) / (length(d) * ((length(d)^2) - 1)))
  }
  
  rho(d)

Thanks for the reply. One more question: should I use (sum(d)^2) or (sum(d^2))? With your function the result of rho is 1, but with the second option the result is 0.5778325, using the result data frame that I posted above.

You are correct that it should be sum(d^2).
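The difference matters because the differences between two rankings of the same items always add up to zero, so sum(d)^2 is 0 and the formula returns 1 no matter what the rankings are. A minimal sketch with two made-up rankings (x and y are hypothetical, with no ties):

```r
# Hypothetical rankings of four items (illustrative only, no ties)
x <- c(1, 2, 3, 4)
y <- c(2, 1, 4, 3)
d <- x - y
n <- length(d)

sum(d)^2  # 0: the differences cancel out before squaring
sum(d^2)  # 4: each difference is squared first, then summed

rho_wrong <- 1 - 6 * sum(d)^2 / (n * (n^2 - 1))  # 1
rho_right <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))  # 0.6

# Cross-check against the built-in Spearman correlation
cor(x, y, method = "spearman")  # 0.6
```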

Assuming that you will also apply the formula to D2 and D3, use across() from {dplyr}:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

result <- structure(
  list(
    n = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
          18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29),
    M1 = c(29L, 1L, 28L, 27L, 25L, 26L, 24L, 20L, 21L, 22L, 23L, 15L,
           12L, 17L, 18L, 19L, 16L, 13L, 14L, 5L, 6L, 7L, 8L, 9L, 10L,
           11L, 4L, 2L, 3L),
    M2 = c(1, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 15, 12, 19, 18,
           17, 16, 14, 13, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2),
    M3 = c(1L, 29L, 28L, 27L, 25L, 26L, 24L, 20L, 21L, 22L, 23L, 15L,
           12L, 17L, 18L, 19L, 16L, 13L, 14L, 5L, 6L, 7L, 8L, 9L, 10L,
           11L, 4L, 2L, 3L)
  ),
  class = "data.frame",
  row.names = c(NA, -29L)
)

dif <- result %>%
  mutate(D1 = M1 - M2, D2 = M1 - M3, D3 = M2 - M3)

rho <- function(d) {
  1 - (6 * (sum(d^2)) / (length(d) * ((length(d)^2) - 1)))
}

dif %>% summarise(across(D1:D3, ~ rho(.x)))
#>          D1        D2        D3
#> 1 0.5778325 0.6137931 0.9640394

Created on 2022-03-05 by the reprex package (v2.0.1)

Excellent, @EconProf. I was trying to figure out a way to do what you did, i.e. compute the result for all the pairs. Thanks for getting me up to speed =)

One option I've seen now is to do:

cor(result[,2:4], method = "spearman")

          M1        M2        M3
M1 1.0000000 0.5778325 0.6137931
M2 0.5778325 1.0000000 0.9640394
M3 0.6137931 0.9640394 1.0000000

However, it correlates every column with every other column and returns the full matrix, rather than just the three pairs you generated in your previous answer.
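If only the distinct pairs are wanted, one possible approach is to keep the upper triangle of the correlation matrix. The sketch below uses a small made-up data frame (ranks and pairwise are hypothetical names, and the values are illustrative). Note that cor() only agrees with the shortcut formula when the rankings contain no ties.

```r
# Hypothetical rank columns (illustrative only, no ties)
ranks <- data.frame(
  M1 = c(1, 2, 3, 4, 5),
  M2 = c(2, 1, 4, 3, 5),
  M3 = c(5, 4, 3, 2, 1)
)

m <- cor(ranks, method = "spearman")

# upper.tri() selects the cells above the diagonal; indexing returns them
# in column-major order: (M1, M2), (M1, M3), (M2, M3)
pairwise <- m[upper.tri(m)]
names(pairwise) <- c("M1-M2", "M1-M3", "M2-M3")
pairwise
```

The order of the extracted values matches the D1, D2, D3 pairs used earlier in the thread, so the same indexing should apply to result[, 2:4].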
