Delete whole when it has all "NA" S in selected column

sai_matcha · November 28, 2020, 4:21pm

data i have

id <- c(1,1,1,2,2,3,3,3,4)
a <- c(3,6,7,9,3,1,4,6,4)
b <- c(1,NA,5,NA,NA,1,2,NA,NA)
df <- data.frame(id,a,b)

data i want

id <- c(1,1,1,3,3,3)
a <- c(3,6,7,1,4,6)
b<- c(1,NA,5,1,2,NA)
df2 <- data.frame(id,a,b)

i just want to delete the ID'S whose b column is full of NA's

Please help me to do this.
Thank you

FJCC · November 28, 2020, 4:42pm

Here is one method.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
id <- c(1,1,1,2,2,3,3,3,4)
a <- c(3,6,7,9,3,1,4,6,4)
b <- c(1,NA,5,NA,NA,1,2,NA,NA)
df <- data.frame(id,a,b)

df2 <- df %>% group_by(id) %>% summarize(Flag=any(!is.na(b))) %>% 
  inner_join(df, by = "id") %>% 
  filter(Flag == TRUE) %>% 
  select(id, a, b)
#> `summarise()` ungrouping output (override with `.groups` argument)
df2
#> # A tibble: 6 x 3
#>      id     a     b
#>   <dbl> <dbl> <dbl>
#> 1     1     3     1
#> 2     1     6    NA
#> 3     1     7     5
#> 4     3     1     1
#> 5     3     4     2
#> 6     3     6    NA

^{Created on 2020-11-28 by the reprex package (v0.3.0)}

vinaychuri · December 1, 2020, 3:04pm

Here is an alternative as well since the "b" column is numeric.

df%>%
group_by(id)%>%
mutate(check_col= sum(b,na.rm=TRUE))%>% # create a temp column, later will be removed
filter(check_col>0)%>% # remove rows with b column having only 0
select(-check_col)%>% # remove the temp column
ungroup()

system · December 8, 2020, 3:04pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.