What did my colleague do to prepare the data for an ICC?

Hey dear R-Community,

***I want to use my data sample to calculate the intraclass correlation (ICC) of the three raters.***

My colleague sent me the following syntax, which she used to calculate it for a different sample:

```r
library(xlsx)  # provides read.xlsx2()

Ttemp1 <- read.xlsx2("coding_format.xlsx", 1, colIndex = c(1:21),
                     colClasses = c("numeric", "numeric", "character",
                                    rep("numeric", 16), "character", "character"))

T.temp1 <- read.xlsx2("coding_format_draft c_Trier_Study_1.xlsx", 1, colIndex = c(1:21),
                      colClasses = c("numeric", "numeric", "character",
                                     rep("numeric", 16), "character", "character"))

T.temp1 <- T.temp1[order(T.temp1$id, T.temp1$session, T.temp1$rater_EXP, T.temp1$Segment), ]
IDs <- unique(T.temp1$id)

T.temp2 <- data.frame(matrix(nrow = 0, ncol = 37))
names <- names(T.temp1)[-3]
names2 <- names[4:20]
names3 <- paste("b", names2, sep = "_")
names4 <- c(names, names3)
names(T.temp2) <- names4
for (t in IDs) {
  #t<-IDs[1]
  temp1 <- subset(T.temp1, id == t)
  Sessions <- unique(temp1$session)
  for (s in Sessions) {
    #s<-Sessions[1]
    temp2 <- subset(temp1, session == s)
    coders <- unique(temp2$rater_EXP)
    if (length(coders) == 1) {
      temp3 <- subset(temp2, select = -rater_EXP)
      temp3 <- data.frame(temp3, matrix(ncol = 17))
      names(temp3) <- names4
    } else {
      temp3.1 <- subset(temp2, rater_EXP == coders[1], select = -rater_EXP)
      temp3.2 <- subset(temp2, rater_EXP == coders[2], select = -rater_EXP)
      names <- names(temp3.2)[4:20]
      for (i in names) {
        names(temp3.2)[names(temp3.2) == i] <- paste("b", i, sep = "_")
      }
      temp3 <- merge(temp3.1, temp3.2, by = c("id", "session", "Segment"), all = TRUE, sort = TRUE)
      length(temp3[1, ])
    }
    T.temp2 <- rbind(T.temp2, temp3)
  }
}
```

Since I'm very new to R, I can't really follow the steps she took to prepare the data. Can someone please explain the steps (or some of them) and link them to the syntax, so that I'm able to apply it to my own data set?

The calculation of the ICC itself, and the syntax that belongs to it, is rather easy to understand:

```r
T.temp2 <- T.temp2[order(T.temp2$id, T.temp2$session, T.temp2$Segment), ]
#write.table(T.temp2,file="clipboard-12000",sep="\t",col.name=NA)
#summary(T.temp2)
# ICC for mode and peak
library(psych)
mode <- T.temp2[, c(4, 21)]
mode <- mode[complete.cases(mode), ]
mode.icc <- ICC(mode)
mode.icc
#write.table(mode,file="clipboard-12000",sep="\t",col.name=NA)
peak <- T.temp2[, c(5, 22)]
peak <- peak[complete.cases(peak), ]
peak.icc <- ICC(peak)
peak.icc
```

We ask posters here to take care that code is properly formatted. It makes the site easier to read, prevents confusion (unformatted code can mistakenly trigger other types of automatic formatting) and generally is considered the polite thing to do. You can read all the details of how to do it here: FAQ: How to format your code

Or try the easy way :grin: : select any code (whether in blocks or in bits and pieces mixed into your sentences) and click the little </> button at the top of the posting box.


As for your question, we can only guess without the original data (or fake data in the same format). My guess is the code in question restructures the data so that each combination of id/session/segment only has one row. If it has multiple rows, each from a different coder, they're merged into a single row. It assumes there are at most two coders for each id/session/segment.
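To make that guess concrete, here is a minimal base-R sketch of the same reshaping on made-up data. The data frame `long` and its two rating columns (`mode`, `peak`) are assumptions for illustration only; the real sheet has 17 rating columns and I haven't seen it.

```r
# Hypothetical toy data in long layout: one row per rater per segment.
long <- data.frame(
  id        = c(1, 1, 1, 1, 2, 2),
  session   = c(1, 1, 1, 1, 1, 1),
  rater_EXP = c("A", "A", "B", "B", "A", "A"),
  Segment   = c(1, 2, 1, 2, 1, 2),
  mode      = c(3, 4, 3, 5, 2, 2),
  peak      = c(5, 6, 4, 6, 3, 4)
)

# Take one id/session block and split it by rater,
# like the two subset() calls in the posted loop.
block  <- subset(long, id == 1 & session == 1)
coders <- unique(block$rater_EXP)
first  <- subset(block, rater_EXP == coders[1], select = -rater_EXP)
second <- subset(block, rater_EXP == coders[2], select = -rater_EXP)

# Prefix the second rater's rating columns with "b_".
is_rating <- names(second) %in% c("mode", "peak")
names(second)[is_rating] <- paste("b", names(second)[is_rating], sep = "_")

# Merge: one row per id/session/Segment, both raters side by side.
wide <- merge(first, second, by = c("id", "session", "Segment"), all = TRUE)
print(wide)
```

In this toy case `wide` has the columns `id`, `session`, `Segment`, `mode`, `peak`, `b_mode`, `b_peak`, which is the shape the ICC step then picks column pairs from.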

Every row of T.temp2 will have 37 columns: id, session, Segment, and then two each of whatever the other columns are. Of the pairs for the doubled columns, one will have the original name and the other will be named b_ + the original name (e.g., X and b_X).
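The column arithmetic can be checked with a few lines. The names `rating1` to `rating17` are placeholders I made up, since the real headers aren't in the post; only the counts and positions matter here.

```r
# Placeholder names for the 20 columns left after dropping rater_EXP.
original <- c("id", "session", "Segment", paste0("rating", 1:17))

# The second rater's copies of the 17 rating columns get a "b_" prefix.
second_rater <- paste("b", original[4:20], sep = "_")
all_names <- c(original, second_rater)

length(all_names)                  # 37
which(all_names == "b_rating1")    # 21
```

That is why the ICC code later selects columns `c(4, 21)` and `c(5, 22)`: each pair is one rating from the first coder and the same rating from the second.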

If you want to adapt this script for your work, I'd suggest taking the concepts behind it and totally rewriting it using either dplyr or data.table. The choice between the two usually boils down to whatever you prefer to read/write.
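As a rough idea of what such a rewrite might look like, here is a sketch using dplyr together with tidyr's `pivot_wider()`. The toy data, the helper column `rater_slot`, and the `_r1`/`_r2` suffixes are my own assumptions, not your colleague's naming. Unlike the posted loop, this version does not stop at two coders, which may matter since you mention three raters.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: id 1 was coded by two raters, id 2 by one.
long <- data.frame(
  id        = c(1, 1, 1, 1, 2, 2),
  session   = c(1, 1, 1, 1, 1, 1),
  rater_EXP = c("A", "A", "B", "B", "A", "A"),
  Segment   = c(1, 2, 1, 2, 1, 2),
  mode      = c(3, 4, 3, 5, 2, 2),
  peak      = c(5, 6, 4, 6, 3, 4)
)

wide <- long %>%
  group_by(id, session) %>%
  # Number the raters within each id/session (1 = first, 2 = second, ...).
  mutate(rater_slot = match(rater_EXP, sort(unique(rater_EXP)))) %>%
  ungroup() %>%
  select(-rater_EXP) %>%
  # Spread each rating into one column per rater slot.
  pivot_wider(
    names_from  = rater_slot,
    values_from = c(mode, peak),
    names_glue  = "{.value}_r{rater_slot}"
  )

print(wide)
```

Sessions coded by only one rater end up with `NA` in the `_r2` columns, so a `complete.cases()` filter before `ICC()` would still drop them, as in the original script.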

As is, the posted code is not idiomatic R; the purpose isn't immediately clear, bugs are more likely to pop up, and the code's likely inefficient. Despite that, if it works for your colleague, that's great. There's no reason for her to rewrite something that works and she understands. But if you're learning R, take advantage of all the shared knowledge.

Hey!

Thank you so much for your time and your advice. I'll post it again with better formatting.

Best regards
Andreas
