Hi. I am a beginner in RStudio. I am not a programmer, but I have tentatively started to take an interest in R and am only just gaining experience, so please forgive my lack of knowledge and any mistakes that sneak into this post.
But to the point.
For the table in the attachment, I want to execute the code below.
for (r in 2:k) {
  if (data[r, 2] == data[r - 1, 2]) {
    data[r, 6] <- data[r, 6] + data[r - 1, 6]
  } else {
    data[r, 6] <- data[r, 6]
  }
}
Here "k" is the number of rows, 1 million, but I can split the dataset into several smaller datasets if necessary.
I want to overwrite the 6th column in each row: it should hold the number of rows above the currently analyzed row that contain the same 'user_id' value (column 2 in the code above).
My code is inefficient even for 100,000 rows, and I have to do this task for several tables of 1 million rows each (16 million rows in total).
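For reference, the loop above can probably be replaced by a vectorized running sum within each run of consecutive identical 'user_id' values. The following is only a minimal base R sketch under some assumptions: the toy data frame and its column names (user_id, n) are hypothetical stand-ins for columns 2 and 6 of the real table, the 6th column is assumed to start at 1 in every row, and 'user_id' is assumed to contain no missing values.

```r
# Hypothetical toy data; the real table has about 1 million rows and more columns.
# user_id stands in for column 2, n for column 6 (assumed to start at 1).
data <- data.frame(
  user_id = c(10, 10, 10, 20, 20, 30),
  n       = c(1, 1, 1, 1, 1, 1)
)

# Give each run of consecutive identical user_id values its own label (1, 2, 3, ...).
# A new run starts at the first row and wherever user_id differs from the row above.
run_id <- cumsum(c(TRUE, data$user_id[-1] != data$user_id[-nrow(data)]))

# Running sum of n within each run; this mirrors what the loop does row by row.
data$n <- ave(data$n, run_id, FUN = cumsum)

print(data)
```

With a starting value of 1 the result counts the current row as well, so subtracting 1 afterwards would give the number of rows strictly above it. Because no row-by-row indexing of the data frame is involved, this should scale much better than the loop, although I have not timed it on the full tables.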