Delete rows based on previous rows

Hello,

I have a question regarding my data.table in R.
Here is a minimal example of my DT (the real DT has more than 80,000 rows):

ID, dosis, days
1, 10, 100
1, 0, 70
1, 10, 100
1, 0, 35
1, 10, 700
2, 10, 50
2, 0, 12

What I need to do is screen the DT so that, within each ID, as soon as a row has dosis = 0 and days >= 30, that row AND all subsequent rows for that ID are deleted.
The same screening should then be applied to the next ID, and so on, for all 80,000 rows (approx. 2,000 IDs).

Here the result should be:

ID, dosis, days
1, 10, 100
2, 10, 50
2, 0, 12

I would appreciate it if anyone can help.

Kindest regards,
Nice

Hi @Nice, could you post a sample of your table, like this?

```
<--- paste output of dput(head(DT, 50)) here, including ```s
```

@dromano
Hi David!

The problem is that the DT is confidential, so in my previous example I tried to write a simple dataset that illustrates what I mean.

Now I have replaced the confidential data with other values so it looks more like the real table:

dput(head(my_data, 10))

gave this output:

structure (list (id = c(1,1,1,2,3,3,4,5,5,5) , dose = c(20,0,20,20,15,20,20,15,20,15), days= structure( c(200,400,100,300,20,300,200,50,50,100), class ="difftime", units ="days")), classe = c("data.table", "data.frame"), row.names = c (NA, -10L), .internal.selfref = <pointer: 0x00000000001f1ef0>)

" ``` " like this?

Thanks, Nour: In general it's easier for folks to help you if you paste the data between triple backticks ``` as I described -- could you click the pencil icon on your post and insert triple backticks before and after the output?

Also, the output you posted is incomplete -- could you re-paste it?

I tried again, but I do not get any " ``` ". I did add it at the beginning and end of my previous post, though. I hope it is correct :slight_smile:

Almost :slight_smile: -- a step in the right direction, but the output from dput() is still incomplete, and make sure you put the ``` by itself on the line before your output, and again by itself on the line after your output.

Ok, I understand!
Now it is corrected!
Thank you so much!

Hope someone can help :slight_smile:

Great -- now a question of clarification:

What do you mean by 'subsequent'? The rows that appear later in the table?

This is becoming a little frustrating to watch. You are supposed to ask your questions by providing a proper REPRoducible EXample (reprex) illustrating your issue.

Like in this answer:

```r
library(tidyverse)

sample_df <- data.frame(
          ID = c(1, 1, 1, 1, 1, 2, 2),
       dosis = c(10, 0, 10, 0, 10, 10, 0),
        days = c(100, 70, 100, 35, 700, 50, 12)
)

sample_df %>% 
    group_by(ID) %>% 
    mutate(flag = if_else(dosis == 0 & days >= 30, TRUE, NA)) %>% 
    fill(flag, .direction = "down") %>% 
    filter(is.na(flag)) %>% 
    select(-flag)
#> # A tibble: 3 x 3
#> # Groups:   ID [2]
#>      ID dosis  days
#>   <dbl> <dbl> <dbl>
#> 1     1    10   100
#> 2     2    10    50
#> 3     2     0    12
```

Created on 2020-03-28 by the reprex package (v0.3.0.9001)
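
Since the original table is a data.table, here is a minimal sketch of the same idea in data.table syntax. It assumes the column names from the sample above (`ID`, `dosis`, `days`) and that rows are already in chronological order within each ID; `DT` and `result` are just illustrative names, and this is one possible approach rather than the only one.

```r
library(data.table)

# Same sample data as above, built as a data.table
DT <- data.table(
  ID    = c(1, 1, 1, 1, 1, 2, 2),
  dosis = c(10, 0, 10, 0, 10, 10, 0),
  days  = c(100, 70, 100, 35, 700, 50, 12)
)

# Within each ID, cumsum() counts how many "pause" rows have been seen so far.
# Keeping only rows where that count is still 0 drops the first pause row
# and everything after it for that ID.
result <- DT[, .SD[cumsum(dosis == 0 & days >= 30) == 0], by = ID]
print(result)
```

The running count replaces the flag-and-fill step: once the condition has been met once within an ID, the count never returns to 0, so all later rows for that ID are excluded.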

@dromano:
I mean the subsequent rows for the same ID, i.e. if there is a pause in medication of 30 days or more, then I want to delete that row and all the rows after it for that patient.
In other words, I want to keep only the observations for each patient before the first pause in medication (a pause is defined as 30 days or more without medicine, i.e. dose "0") and none of the rows afterwards. I hope this makes sense...
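
To make that rule concrete, here is a small base R sketch of "keep everything before the first pause, per patient". It assumes the column names of the first example (`ID`, `dosis`, `days`) and rows in chronological order; the helper names `is_pause` and `pauses_so_far` are made up for illustration.

```r
sample_df <- data.frame(
  ID    = c(1, 1, 1, 1, 1, 2, 2),
  dosis = c(10, 0, 10, 0, 10, 10, 0),
  days  = c(100, 70, 100, 35, 700, 50, 12)
)

# TRUE on rows that count as a pause (no medicine for 30 days or more)
is_pause <- sample_df$dosis == 0 & sample_df$days >= 30

# Running count of pauses within each ID; 0 means "no pause seen yet"
pauses_so_far <- ave(as.integer(is_pause), sample_df$ID, FUN = cumsum)

# Keep only the rows that come before the first pause of each patient
result <- sample_df[pauses_so_far == 0, ]
print(result)
```

A patient with no pause at all keeps every row, because the running count stays at 0 for the whole group.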

@andresrcs:
Thank you so much - and sorry I did not see the help section - this is my first time here…
Thank you for the code, but it does not solve my problem, as I get:
Error in days >=30 :
comparison (5) is possible only for atomic and list types

What am I doing wrong?
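
A note on that message: R raises it when one side of a comparison is a function rather than a vector. One possible cause here, and this is only a guess without seeing the real data, is that the real table has no column called exactly `days`, so R falls back to a function of that name from a loaded package (lubridate exports one, for instance). The sketch below reproduces the message with a made-up function and a hypothetical column name `Days`; neither comes from the actual data.

```r
# Hypothetical table whose column is called `Days`, not `days`
my_data <- data.frame(dosis = c(10, 0), Days = c(100, 70))

# Hypothetical stand-in for a package function that happens to be named `days`
days <- function(x = 1) x

# Comparing the function itself with a number fails with the
# "possible only for atomic and list types" error
msg <- tryCatch(
  {
    days >= 30
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Checking the actual column names shows which name to use instead
print(names(my_data))

# With the real column name the comparison works
print(my_data$Days >= 30)
```

If this is what is happening, renaming the column or using the exact column name in the pipeline should be enough.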

It works with the sample data you have provided. To be able to help you, we need an example that reproduces your issue, so please read the guide at the link I gave you and provide one.

thank you so much...ok...I will read it first and see…..thank you :slight_smile:

So the rows are assumed to be in chronological order then, right?

@dromano
yes, they are :slight_smile:
Now I have also corrected my dput - hope it is correct now!

Do you still need help, or did @andresrcs's solution eventually work for you? If you do still need help, it'll be important to get your dput() output in order, since what you posted still doesn't work.
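
One common reason dput() output from a data.table can't be pasted back into R is the `.internal.selfref = <pointer: ...>` entry, which is not valid R code. A minimal sketch of a workaround, assuming a made-up stand-in table (`my_data` below is not the real data): convert to a plain data.frame before calling dput().

```r
library(data.table)

# Hypothetical stand-in for the confidential table
my_data <- data.table(
  id   = c(1, 1, 1, 2),
  dose = c(20, 0, 20, 20),
  days = as.difftime(c(200, 400, 100, 300), units = "days")
)

# as.data.frame() returns a plain data.frame without the data.table
# self-reference pointer, so the dput() text can be parsed again
txt <- capture.output(dput(as.data.frame(head(my_data, 10))))

# Round trip: the pasted text evaluates back to a usable data frame
restored <- eval(parse(text = txt))
print(restored)
```

Whoever copies the output can turn it back into a data.table with `as.data.table()` if needed.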

@dromano
Thank you so much! My problem is solved :slight_smile:
