Filter subjects having a particular value in another column

Hi, my data set looks like this:

id  <- c(1,1,2,2,3,3,4,4,5,5)
amt <- c(250,NA,750,NA,750,NA,500,NA,750,NA)
dv  <- c(NA,1,NA,2,NA,1,NA,5,NA,4)
df  <- data.frame(id,amt,dv)

Each id has two rows. Now I want to select the subjects whose amt == 750.

filter(df,amt == 750)

The above code gives me a result, but it returns only one row for each id that has amt == 750.
I want both rows of every id that has amt == 750 (the second amt value for that id will be NA).

Hello,

So let's take a step back. NA is not a value that we can compare against, because R treats it as missing (unknown). As you can see in the example below, the comparison returns NA and not FALSE. Your easiest fix would be to replace NA with an appropriate value such as 0 and then proceed with the filtering.

> NA == 750
[1] NA
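
To make the point about NA comparisons concrete, here is a minimal base R sketch. The short vector `amt` is a cut-down copy of the one in the question, and the names `hits` and `in_750` are only illustrative. Using `which()` or `%in%` is one possible way to sidestep the NA results, not the only one.

```r
amt <- c(250, NA, 750, NA)

# Comparing against NA yields NA, not FALSE
cmp <- amt == 750

# is.na() is the reliable way to test for missing values
missing <- is.na(amt)

# which() returns only the positions where the comparison is TRUE,
# so the NA results are dropped
hits <- which(amt == 750)

# %in% never returns NA, so its result can be used directly as a logical index
in_750 <- amt %in% 750
```

Either `hits` or `in_750` can then be used for subsetting without carrying NA rows into the result.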

Data I have:

id  <- c(1,1,2,2,3,3,4,4,5,5)
amt <- c(250,NA,750,NA,750,NA,500,NA,750,NA)
dv  <- c(NA,1,NA,2,NA,1,NA,5,NA,4)
df  <- data.frame(id,amt,dv)

Now I want to select the ids that have an amt of 750.

Output I need:

id  <- c(2,2,3,3,5,5)
amt <- c(750,NA,750,NA,750,NA)
dv  <- c(NA,2,NA,1,NA,4)
df  <- data.frame(id,amt,dv)

Please help me do this.
Thanks

Do you have a particular reason for having your data in this untidy format? I think it would make more sense to have one observation per row instead of two rows per observation. Consider this example:

library(dplyr)

# Sample data
df <- data.frame(
          id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
         amt = c(250, NA, 750, NA, 750, NA, 500, NA, 750, NA),
          dv = c(NA, 1, NA, 2, NA, 1, NA, 5, NA, 4)
)

# Relevant code
df %>% 
    group_by(id) %>% 
    summarise_all(sum, na.rm = TRUE) %>% 
    filter(amt == 750)
#> # A tibble: 3 x 3
#>      id   amt    dv
#>   <dbl> <dbl> <dbl>
#> 1     2   750     2
#> 2     3   750     1
#> 3     5   750     4

Created on 2020-10-21 by the reprex package (v0.3.0)

If you need the original two-rows-per-id format, you can do something like this:

df[df$id %in% df[df$amt==750,]$id,]
#>    id amt dv
#> 3   2 750 NA
#> 4   2  NA  2
#> 5   3 750 NA
#> 6   3  NA  1
#> 9   5 750 NA
#> 10  5  NA  4
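
For anyone who prefers to stay in dplyr, the same idea can be sketched with a grouped filter. This is an alternative to the base R one-liner, not a replacement for it. It assumes dplyr is installed, and the name `res` is just for illustration.

```r
library(dplyr)

df <- data.frame(
  id  = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  amt = c(250, NA, 750, NA, 750, NA, 500, NA, 750, NA),
  dv  = c(NA, 1, NA, 2, NA, 1, NA, 5, NA, 4)
)

# Keep every row of an id if any of its amt values equals 750;
# na.rm = TRUE makes any() ignore the NA comparisons within each group
res <- df %>%
  group_by(id) %>%
  filter(any(amt == 750, na.rm = TRUE)) %>%
  ungroup()
```

Because the condition is evaluated once per group, both rows of ids 2, 3 and 5 are kept, including the rows where amt is NA.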

Thank you. Yes, I have a reason to keep the data in that format.

I have one more question: how can I apply multiple conditions in this code?

df[df$id %in% df[df$amt==750,]$id,]

I want all the rows of each particular id whose amt == 750 and dv == 1.
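
One possible way to combine the two conditions, sketched in base R. It assumes the conditions may be met on different rows of the same id, which is the case in this data because amt and dv are never filled in on the same row. The names `ids_amt`, `ids_dv` and `both` are only illustrative.

```r
df <- data.frame(
  id  = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  amt = c(250, NA, 750, NA, 750, NA, 500, NA, 750, NA),
  dv  = c(NA, 1, NA, 2, NA, 1, NA, 5, NA, 4)
)

# ids with at least one row where amt == 750; which() drops the NA comparisons
ids_amt <- df$id[which(df$amt == 750)]

# ids with at least one row where dv == 1
ids_dv <- df$id[which(df$dv == 1)]

# Keep all rows of the ids that satisfy both conditions
both <- df[df$id %in% intersect(ids_amt, ids_dv), ]
```

With the sample data only id 3 meets both conditions, so `both` holds the two rows for id 3. Replacing `intersect()` with `union()` would give the ids that meet either condition instead.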