dataframe filter on two columns not working

majjac · February 5, 2020, 12:55pm

Hi, I have the following dataframe:

ID= c(1,1,1,2,2,2,2,3,3)
week = c(1,1,2,1,1,2,2,1,2)
X = c(3.3,4.23,5.6,12,3.1,4.3,5.9,6.1,5.3)
Y = c(1.3,2.4,6.8,5.5,4.3,3,6.6,2.6,5.7)
DF = data.frame(ID,week,X,Y)

I would like to remove rows where 'week' is repeated within the same 'ID', keeping only the first row of each 'ID'/'week' combination, i.e. the dataframe should read

ID  week  X      Y
1   1     3.30   1.3
1   2     5.60   6.8
2   1     12.00  5.5
2   2     4.30   3.0
3   1     6.10   2.6
3   2     5.30   5.7

So far I have been trying
DF = DF[!(duplicated(DF$ID) & duplicated(DF$week)), ]
but that doesn't work as expected.

Any ideas would be very welcome. M

FJCC · February 5, 2020, 1:55pm

Here is one way.

library(dplyr)
 
ID= c(1,1,1,2,2,2,2,3,3)
week = c(1,1,2,1,1,2,2,1,2)
X = c(3.3,4.23,5.6,12,3.1,4.3,5.9,6.1,5.3)
Y = c(1.3,2.4,6.8,5.5,4.3,3,6.6,2.6,5.7)
DF = data.frame(ID,week,X,Y)
DF
#>   ID week     X   Y
#> 1  1    1  3.30 1.3
#> 2  1    1  4.23 2.4
#> 3  1    2  5.60 6.8
#> 4  2    1 12.00 5.5
#> 5  2    1  3.10 4.3
#> 6  2    2  4.30 3.0
#> 7  2    2  5.90 6.6
#> 8  3    1  6.10 2.6
#> 9  3    2  5.30 5.7
DF2 <- DF %>% group_by(ID, week) %>% summarize(X = X[1], Y = Y[1])
DF2
#> # A tibble: 6 x 4
#> # Groups:   ID [3]
#>      ID  week     X     Y
#>   <dbl> <dbl> <dbl> <dbl>
#> 1     1     1   3.3   1.3
#> 2     1     2   5.6   6.8
#> 3     2     1  12     5.5
#> 4     2     2   4.3   3  
#> 5     3     1   6.1   2.6
#> 6     3     2   5.3   5.7

Created on 2020-02-05 by the reprex package (v0.3.0)
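
A possible shorter variant of the same idea, offered only as a sketch: dplyr's distinct() with .keep_all = TRUE keeps the first row of each ID/week combination and retains the other columns. The name DF3 is made up here for illustration and is not part of the answer above.

```r
library(dplyr)

ID   <- c(1, 1, 1, 2, 2, 2, 2, 3, 3)
week <- c(1, 1, 2, 1, 1, 2, 2, 1, 2)
X    <- c(3.3, 4.23, 5.6, 12, 3.1, 4.3, 5.9, 6.1, 5.3)
Y    <- c(1.3, 2.4, 6.8, 5.5, 4.3, 3, 6.6, 2.6, 5.7)
DF   <- data.frame(ID, week, X, Y)

# Keep the first row for every distinct (ID, week) pair;
# .keep_all = TRUE carries X and Y along instead of dropping them.
# DF3 is a hypothetical name used only in this sketch.
DF3 <- DF %>% distinct(ID, week, .keep_all = TRUE)
DF3
```

Unlike the group_by() + summarize() version, this returns an ungrouped data frame and does not require listing every remaining column by hand.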

martin.R · February 5, 2020, 2:03pm

This should work:
DF[!duplicated(DF[c("ID", "week")]), ]
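
To illustrate why this behaves differently from the attempt in the question, here is a small side-by-side sketch on the same data. The variable names dup_separate, dup_pair, attempt and result are invented for this comparison. The original attempt checks each column on its own, so a row is dropped as soon as its ID and its week have each appeared somewhere earlier, even if that particular pair is new. Passing both columns to duplicated() as a data frame compares whole rows instead.

```r
ID   <- c(1, 1, 1, 2, 2, 2, 2, 3, 3)
week <- c(1, 1, 2, 1, 1, 2, 2, 1, 2)
X    <- c(3.3, 4.23, 5.6, 12, 3.1, 4.3, 5.9, 6.1, 5.3)
Y    <- c(1.3, 2.4, 6.8, 5.5, 4.3, 3, 6.6, 2.6, 5.7)
DF   <- data.frame(ID, week, X, Y)

# Attempt from the question: each column is tested independently.
# Row 6 (ID 2, week 2) is flagged because ID 2 and week 2 were both
# seen earlier, although the pair (2, 2) appears there for the first time.
dup_separate <- duplicated(DF$ID) & duplicated(DF$week)
attempt <- DF[!dup_separate, ]

# Row-wise test: a row is a duplicate only if the (ID, week) pair
# has already occurred in an earlier row.
dup_pair <- duplicated(DF[c("ID", "week")])
result <- DF[!dup_pair, ]

rownames(attempt)  # rows kept by the original attempt
rownames(result)   # rows kept by the row-wise test
```

On this data the original attempt keeps only rows 1, 3, 4 and 8, while the row-wise test keeps rows 1, 3, 4, 6, 8 and 9, which matches the desired output.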
