Hello guys, I have two data sets; each contains an ID column and a grade column. Both have the same number of values and the same IDs. I'm trying to calculate a Pearson correlation between them, but I can't figure out how to sort the data so that the two grades for each ID are matched up. The IDs are not sorted in the same order, and I can't seem to crack this one using "group by" or "sort".
Thank you for any useful suggestions.
Hi Ryan, try base::merge (by='ID'), or even data.table::merge if your data is quite big (i.e. > 10^6 rows). Start by typing
?merge
in your R console.
You can calculate the Pearson correlation via
cor(x, y, method = "pearson")
cheers
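For anyone landing here later, the merge-then-correlate suggestion could look roughly like the sketch below. The data frames `grades_a` and `grades_b`, their values, and the `_a`/`_b` suffixes are made up for illustration; only `merge()` and `cor()` come from the reply above.

```r
# Hypothetical example data: same IDs in both sets, but in a different row order
grades_a <- data.frame(ID = c(3, 1, 2, 5, 4), grade = c(70, 85, 90, 60, 75))
grades_b <- data.frame(ID = c(1, 2, 3, 4, 5), grade = c(80, 95, 65, 70, 55))

# merge() pairs rows by ID regardless of their original order;
# suffixes keep the two grade columns apart in the result
both <- merge(grades_a, grades_b, by = "ID", suffixes = c("_a", "_b"))

# Pearson correlation between the two matched grade columns
r <- cor(both$grade_a, both$grade_b, method = "pearson")

print(both)
print(r)
```

Since `merge()` does the matching on ID, there is no need to sort either data set beforehand.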
Thanks buddy, I will definitely try the merge function.
Thank you, it worked! I just needed to change my columns in my original Excel file (still easier than in R at this point).
Until now:
df = data.frame(a=runif(0:10), b=runif(10:20), d=runif(30:40))
names(df)  # a, b, d
df = df[c('b','a','d')]
names(df)  # b, a, d
It's so odd, but when I try to make my own example of this, it says "undefined columns selected":
dff=data.frame(a=1:10, b=21:30, c=41:50)
dff=dff[c('q','t','p')]
Error in `[.data.frame`(dff, c("q", "t", "p")) : undefined columns selected
Look at your colnames! ... You declared a, b, c ...
Got it, so it's only rearranging the columns, not actually renaming them.
That's right. If you want to rename your columns, do
names(dff) <- c('new','n2','n3')
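To make the difference concrete, here is a small sketch that puts reordering and renaming side by side. It reuses the `dff` example from above; the variable names `reordered` and `renamed` and the new column name `second` are just illustrative.

```r
dff <- data.frame(a = 1:10, b = 21:30, c = 41:50)

# Indexing by name only selects/reorders columns that already exist
reordered <- dff[c("b", "a", "c")]
print(names(reordered))

# Assigning to names() renames the columns; the data stays untouched
renamed <- dff
names(renamed) <- c("q", "t", "p")

# Renaming a single column by matching its current name
names(renamed)[names(renamed) == "t"] <- "second"
print(names(renamed))
```

Indexing with names that do not exist yet is what triggers the "undefined columns selected" error.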
“To understand computations in R, two slogans are helpful: Everything that exists is an object. Everything that happens is a function call.” – John M. Chambers [https://arxiv.org/abs/1409.3531]