In my dataset, I'm looking for values that occur more than once in a column. I first grouped my dataset by that column, let's call it column X. I now want to get rid of all the groups in column X that consist of only one entry.
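Here is one solution: count the rows for each value of X, keep the values that occur only once, and remove their rows from the original data with anti_join(). I stored your data in a file named Dummy.csv.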
library(dplyr)
DF <- read.csv("~/R/Play/Dummy.csv", stringsAsFactors = FALSE)
Singles <- DF %>% group_by(X) %>%
  summarize(COUNT = n()) %>%
  filter(COUNT == 1)
Singles
#> # A tibble: 1 x 2
#>       X COUNT
#>   <int> <int>
#> 1     2     1
DFnew <- anti_join(DF, Singles, by = "X")
DFnew
#>   X Y Z
#> 1 1 A B
#> 2 1 A C
#> 3 1 G I
#> 4 3 A E
#> 5 3 A O
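Since Dummy.csv is not attached to the thread, here is a sketch of the same steps that runs without the file. The inline data frame is an assumption: it is rebuilt from the rows shown in this thread, not read from the original file.

```r
library(dplyr)

# Assumed stand-in for Dummy.csv, rebuilt from the rows shown in this thread
DF <- data.frame(
  X = c(1L, 1L, 1L, 2L, 3L, 3L),
  Y = c("A", "A", "G", "R", "A", "A"),
  Z = c("B", "C", "I", "D", "E", "O"),
  stringsAsFactors = FALSE
)

# Values of X that occur exactly once
Singles <- DF %>%
  group_by(X) %>%
  summarize(COUNT = n()) %>%
  filter(COUNT == 1)

# Keep only the rows of DF whose X has no match in Singles
DFnew <- anti_join(DF, Singles, by = "X")
DFnew
```

anti_join() keeps the rows of its first argument that have no match in the second, so every row whose X appears in Singles is dropped.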
A related approach to @FJCC's answer is to filter on the group size directly, which avoids the intermediate table of counts:
library(tidyverse)
df <- tribble(
  ~X, ~Y, ~Z,
  1, "A", "B",
  1, "A", "C",
  1, "G", "I",
  2, "R", "D",
  3, "A", "E",
  3, "A", "O"
)
df %>%
  group_by(X) %>%
  filter(n() > 1) %>%
  ungroup()
#> # A tibble: 5 x 3
#>       X Y     Z
#>   <dbl> <chr> <chr>
#> 1     1 A     B
#> 2     1 A     C
#> 3     1 G     I
#> 4     3 A     E
#> 5     3 A     O
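If you would rather not group and ungroup, one possible variant is dplyr's add_count(), which attaches the group size as an ordinary column. This is only a sketch: the column name n_in_group is made up for illustration, and the data is the same example as above.

```r
library(dplyr)

df <- tibble::tribble(
  ~X, ~Y, ~Z,
  1, "A", "B",
  1, "A", "C",
  1, "G", "I",
  2, "R", "D",
  3, "A", "E",
  3, "A", "O"
)

result <- df %>%
  add_count(X, name = "n_in_group") %>%  # hypothetical column name: rows per value of X
  filter(n_in_group > 1) %>%             # drop values of X that occur only once
  select(-n_in_group)                    # remove the helper column again

result
```

The helper column can also be kept if you want to see how often each value of X occurs.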