R code - calculating cosine similarity pairwise

Hi, I am not very experienced in programming, so I am having trouble finding a suitable solution to my problem. I would like to calculate cosine similarity in R by comparing each row (a participant) with the "ground truth" (an expert). For example:
I have 5 rows:
t_all_1<- c(0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,1,0,0)
t_all_2<- c(0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,1,0,0)
t_all_3<- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0)
t_all_4<- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0)
t_all_5<- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0)

I would like to compare each of these five rows with the output of an expert:
tadol_all_1<- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0)

How can I do this? I assume I have to store all the rows in a matrix somehow, but then what? How do I set it up so that each of the five rows above is automatically compared to the expert row and the cosine similarity is calculated?
Thanks in advance for your help!

I think you can find an answer from this blog post.

R was created to do statistics. Much is done in the default packages, and finding the methods outside of that can be a challenge. A search of rseek.org, which is an R-tuned front end to Google, turned up the {coop} package as the first hit, and it does exactly what's needed.

library(coop)
t_all_1<- c(0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,1,0,0)
t_all_2<- c(0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,1,0,0)
t_all_3<- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0)
t_all_4<- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0)
t_all_5<- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0)
tadol_all_1<- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0)

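# cosine() compares the columns of a matrix, so bind each vector as a column
m <- cbind(t_all_1,t_all_2,t_all_3,t_all_4,t_all_5,tadol_all_1)
# the last row/column of the result holds each participant's similarity to the expert
cosine(m)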
#>               t_all_1   t_all_2   t_all_3   t_all_4   t_all_5 tadol_all_1
#> t_all_1     1.0000000 1.0000000 0.6546537 0.6546537 0.6546537   0.3779645
#> t_all_2     1.0000000 1.0000000 0.6546537 0.6546537 0.6546537   0.3779645
#> t_all_3     0.6546537 0.6546537 1.0000000 1.0000000 1.0000000   0.5773503
#> t_all_4     0.6546537 0.6546537 1.0000000 1.0000000 1.0000000   0.5773503
#> t_all_5     0.6546537 0.6546537 1.0000000 1.0000000 1.0000000   0.5773503
#> tadol_all_1 0.3779645 0.3779645 0.5773503 0.5773503 0.5773503   1.0000000
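
Since the question only needs each participant compared against the expert, the full pairwise matrix is more than required. As a minimal sketch in base R (no extra package), the same numbers as the last column above can be computed directly. The names `participants`, `expert`, and `sim_to_expert` are made up for this illustration, and it assumes no vector is all zeros (that would give NaN).

```r
t_all_1 <- c(0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,1,0,0)
t_all_2 <- c(0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,1,0,0)
t_all_3 <- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0)
t_all_4 <- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0)
t_all_5 <- c(0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0)
tadol_all_1 <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0)

# One participant per row (hypothetical names, not from the thread)
participants <- rbind(t_all_1, t_all_2, t_all_3, t_all_4, t_all_5)
expert <- tadol_all_1

# Cosine similarity = dot product / (norm of row * norm of expert)
dot_products <- drop(participants %*% expert)
row_norms <- sqrt(rowSums(participants^2))
expert_norm <- sqrt(sum(expert^2))
sim_to_expert <- dot_products / (row_norms * expert_norm)

# t_all_1 and t_all_2 come out at about 0.378, t_all_3 to t_all_5 at about 0.577
print(round(sim_to_expert, 7))
```

The result is a named vector with one value per participant, which is easier to work with afterwards than picking a column out of the 6 x 6 matrix.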

Hi, yes, that's exactly what I was looking for. I also had another solution where I defined a cosine function myself:

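# X must already exist as a numeric matrix with one vector per row;
# ix is a pair of row indices, e.g. c(1, 6)
cos.sim <- function(ix) 
{
    A <- X[ix[1],]   # first row to compare
    B <- X[ix[2],]   # second row to compare
    # dot product divided by the product of the two vector norms
    return( sum(A*B)/sqrt(sum(A^2)*sum(B^2)) )
}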

However, calling just cosine(m) doesn't work unless I write:
coop::cosine(m)

Thanks! L


Just add this to the top of your script:

library(coop)
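
To spell out the difference, here is a small sketch of the two ways to reach the function. It assumes the coop package is installed (the `requireNamespace()` guard skips the example otherwise), and the toy matrix `m_small` is made up for illustration.

```r
# Made-up 3 x 2 matrix; cosine() compares its two columns
m_small <- cbind(a = c(1, 0, 1), b = c(1, 1, 0))

if (requireNamespace("coop", quietly = TRUE)) {
  # Option 1: qualify the call with the package name; no library() needed
  s_qualified <- coop::cosine(m_small)

  # Option 2: attach the package once, then call the function by its bare name
  library(coop)
  s_attached <- cosine(m_small)

  # Both calls reach the same function, so the results match
  stopifnot(isTRUE(all.equal(s_qualified, s_attached)))
}
```

If another attached package also exports a function called `cosine`, the qualified form `coop::cosine()` makes it unambiguous which one is used.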
