data manipulation

Hi,
I am trying to reshape my data from the 'data have' layout to the 'data want' layout, but I am not sure how. Could anyone help me out?

Thanks in advance.

Khalid

Hi there, welcome to the community.

I'm not clear on the criteria you are using to get from what you have to what you want.

I want to keep each B4 or B5 exam, followed by the earliest B exam after it. As shown in the screenshot, of the two B4s I want to keep the first one, followed by the closest B. Similarly, I want B5 followed by B. I'm not sure I explained that clearly.

I'm not entirely clear on the ask, but it looks like the data is already preformatted and this is a filtering issue.

df <- structure(list(
  ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  exam_date = c("11-Jan-20", "1-Jan-20", "30-Mar-20", "5-Apr-20", "28-May-20",
                "29-May-20", "1-Jun-20", "2-Jun-20", "3-Jun-20"),
  exam = c("B4", "B4", "B", "B5", "B", "B5", "B", "B", "B")
), class = "data.frame", row.names = c(NA, -9L))

library(dplyr) # provides %>%, mutate(), lag(), filter() and select()

df %>%
  mutate(keep_delete = ifelse(exam == lag(exam, default = "YES"), "NO", "YES")) %>% # "NO" when the exam matches the previous row
  filter(keep_delete == "YES") %>% # keep only the first row of each run of repeated exams
  select(-keep_delete) # remove the temporary column
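
For anyone who wants to see the same check without dplyr, here is a minimal base R sketch of the idea: a row is kept when its exam differs from the exam on the row directly above it. The names is_run_start and result are only illustrative, and the sketch assumes, as the pipeline above does, that the rows are already in the order you want to compare them in (nothing is sorted by exam_date).

```r
# Same sample data as posted above.
df <- structure(list(
  ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  exam_date = c("11-Jan-20", "1-Jan-20", "30-Mar-20", "5-Apr-20", "28-May-20",
                "29-May-20", "1-Jun-20", "2-Jun-20", "3-Jun-20"),
  exam = c("B4", "B4", "B", "B5", "B", "B5", "B", "B", "B")
), class = "data.frame", row.names = c(NA, -9L))

# TRUE for the first row, and for every row whose exam differs from the row above it.
is_run_start <- c(TRUE, df$exam[-1] != df$exam[-nrow(df)])

# Keep only the rows that start a new run of exams.
result <- df[is_run_start, ]
print(result)
```

With the sample data this keeps rows 1, 3, 4, 5, 6 and 7, so the second B4 and the last two B rows are dropped.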



And today I learned about the structure() function. Thanks!

2 more things to add.

  1. Try str(x) on any object and you will see its type and structure in more detail.
  2. dput(x), especially on a small data frame, prints the structure() call as plain ASCII text that you can share with others so they can recreate the object.
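
A small sketch of both tips, using a made-up two-row data frame (small_df is hypothetical, not the poster's data). It writes the dput() text to a temporary file and reads it back with dget() to show that the shared text rebuilds the same object.

```r
# Hypothetical small data frame, just for illustration.
small_df <- data.frame(
  ID = c(1L, 1L),
  exam = c("B4", "B"),
  stringsAsFactors = FALSE
)

# str() prints a compact summary: class, dimensions, column types, first values.
str(small_df)

# dput() prints R code (a structure() call) that recreates the object.
dput(small_df)

# Round trip: write that text to a file, then rebuild the object from it.
tmp <- tempfile(fileext = ".R")
dput(small_df, file = tmp)
rebuilt <- dget(tmp)

# Check that the rebuilt copy matches the original.
print(all.equal(small_df, rebuilt))
```

When asking a question, pasting the dput() output of a small sample is usually easier for others to work with than a screenshot.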

Enjoy your journey in R !!

Hi Vinay,
Thank you so much, and sorry for the late reply. Your solution worked for my dataset. Have a wonderful day!

Khalid
