How do I use the filter function from the dplyr package in RStudio to remove messy data? My csv file has a column that is supposed to be binary, but it contains some non-binary values greater than 1 (values of 2). How do I filter these rows out of my dataset?
Welcome to the R community.
You may start by reading how to post a question here for faster responses:
However, since this is your first post we will help you straight off.
Your problem is straightforward.
library(tidyverse)

# Assume you have this data frame, where only some values of y are 0 or 1,
# and you want to keep just those rows.
df <- tribble(
  ~x, ~y,
  "A", 1,
  "B", 2,
  "C", 0,
  "D", -1,
  "E", NA
)
df
#> # A tibble: 5 × 2
#>   x         y
#>   <chr> <dbl>
#> 1 A         1
#> 2 B         2
#> 3 C         0
#> 4 D        -1
#> 5 E        NA

# Keep only the rows where y is in the set of allowed values
df %>% filter(y %in% c(0, 1))
#> # A tibble: 2 × 2
#>   x         y
#>   <chr> <dbl>
#> 1 A         1
#> 2 C         0

Created on 2022-05-07 by the reprex package (v2.0.1)
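If it helps, here is a rough sketch of the same idea applied to data read from a CSV. The CSV text and the column names `id` and `flag` are made up for illustration; your own file and column names will differ. Note that `%in%` returns FALSE for NA, so the filter above also drops rows with missing values. If you would rather keep those rows, you can add an `is.na()` condition.

```r
library(dplyr)

# Hypothetical CSV contents standing in for your real file.
# `id` and `flag` are assumed column names, not taken from your data.
csv_text <- "id,flag
1,0
2,1
3,2
4,1
5,2
6,NA"

# For a file on disk you would pass the file path to read.csv() instead of text =.
raw <- read.csv(text = csv_text)

# Check which values actually occur in the column before dropping anything.
raw %>% count(flag)

# Keep only rows where flag is 0 or 1 (rows with NA are dropped as well).
clean <- raw %>% filter(flag %in% c(0, 1))

# Variant: keep rows where flag is 0 or 1, and also keep rows where it is missing.
clean_keep_na <- raw %>% filter(flag %in% c(0, 1) | is.na(flag))
```

Running `count()` first is optional, but it shows how many rows each unexpected value accounts for before you remove them.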
Happy learning R