Arranging data sets by ID

Hello guys
I have two data sets; each contains an ID column and a grade column. Both have the same number of rows and the same IDs. I'm trying to calculate a Pearson correlation between them, but I can't figure out how to sort the data so that the two grades for each ID line up.
The IDs are not sorted in the same order, and I can't seem to crack this one using "group by" or "sort".

Thank you for any useful suggestions.

Hi Ryan
Try base::merge (by = 'ID'), or the merge method that data.table provides for data.table objects if your data is quite big (i.e. > 10^6 rows).
Start by typing:
?merge

in your R console.

You can calculate the Pearson correlation via
cor(x, y, method = "pearson")
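
Putting the two steps together, here is a minimal sketch. The data frames `d1` and `d2`, their column names, and their values are made up for illustration; your own data will look different.

```r
# Two hypothetical data sets with the same IDs in a different row order
d1 <- data.frame(ID = c(3, 1, 2, 4), grade = c(70, 90, 80, 60))
d2 <- data.frame(ID = c(2, 4, 1, 3), grade = c(75, 65, 85, 72))

# merge() pairs the rows by ID; the suffixes tell the two grade columns apart
m <- merge(d1, d2, by = "ID", suffixes = c("_1", "_2"))

# Pearson correlation between the matched grades
r <- cor(m$grade_1, m$grade_2, method = "pearson")

print(m)
print(r)
```

After the merge, each row of `m` holds both grades for one ID, so the two vectors passed to cor() are aligned no matter how the original rows were ordered.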

cheers


Thanks buddy, I will definitely try the merge function.

Thank you, it worked! I just needed to change my columns in my original Excel file (still easier than in R at this point).

Until now :)

df <- data.frame(a = runif(10), b = runif(10), d = runif(10))  # three columns of 10 random values
names(df)
# "a" "b" "d"
df <- df[c('b', 'a', 'd')]  # reorder the columns by name
names(df)
# "b" "a" "d"

It's so odd, but when I try to make my own example of this, it says "undefined columns selected":

dff=data.frame(a=1:10, b=21:30, c=41:50)
dff=dff[c('q','t','p')]
Error in `[.data.frame`(dff, c("q", "t", "p")) : 
  undefined columns selected

Look at your colnames!
... you declared a, b, c ...


Got it, so it only rearranges the columns; it doesn't actually rename them.

That's right. If you want to rename your columns, do

names(dff) <- c('new','n2','n3') 
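
To make the difference concrete, here is a small sketch that reuses the `dff` example from above; the variable names `reordered` and `renamed` are only for illustration.

```r
dff <- data.frame(a = 1:10, b = 21:30, c = 41:50)

# Selecting by name only reorders columns; every name must already exist
reordered <- dff[c("c", "a", "b")]

# Assigning to names() renames the columns and keeps their positions
renamed <- dff
names(renamed) <- c("new", "n2", "n3")

print(names(reordered))
print(names(renamed))
```

Selecting with names that are not in the data frame, such as 'q', 't', 'p', is what triggers the "undefined columns selected" error.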

“To understand computations in R, two slogans are helpful: Everything that exists is an object. Everything that happens is a function call.”
– John M. Chambers [https://arxiv.org/abs/1409.3531]
