Find ids of rows whose column value is not in a vector using which()

Reprex: identify the rows of data frame d2 whose c3 value does not appear in column c1 of data frame d1

c1 <- c("A", "B", "C", "D", "E")
c2 <- c("a", "b", "c", "d", "e")

c3 <- c("A", "z", "C", "z", "E", "F")
c4 <- c("a", "x", "x", "d", "e", "f")

d1 <- data.frame(c1, c2, stringsAsFactors = FALSE)
d2 <- data.frame(c3, c4, stringsAsFactors = FALSE)

x <- unique(d1["c1"])
y <- d2[,"c3"]

id <- which(!(y %in% x) )

I am trying to find the ids of the rows where y (column c3 of d2) holds a value that is not in x.

Your problem is in defining x:

x <- unique(d1["c1"])
x
#>   c1
#> 1  A
#> 2  B
#> 3  C
#> 4  D
#> 5  E

class(x)
#> [1] "data.frame"

You wanted to extract one column of d1, but single-bracket indexing with d1["c1"] returns a one-column data.frame, not a vector, so %in% is not comparing y against the individual values. Extract the column itself so that the result is a vector:

x <- unique(d1[["c1"]])
# or x <- unique(d1[,"c1"])
x
#> [1] "A" "B" "C" "D" "E"
class(x)
#> [1] "character"

Then it'll work: with x as a character vector, which(!(y %in% x)) returns 2, 4 and 6. See sections 20.5.2 and 20.5.3 of the r4ds book for more visual explanations.
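
Putting the fix back into the original reprex, here is a minimal sketch. It uses the same data as the question; the final d2[id, ] line is an addition, only there to show which rows the ids point to.

```r
# Data from the question
d1 <- data.frame(c1 = c("A", "B", "C", "D", "E"),
                 c2 = c("a", "b", "c", "d", "e"),
                 stringsAsFactors = FALSE)
d2 <- data.frame(c3 = c("A", "z", "C", "z", "E", "F"),
                 c4 = c("a", "x", "x", "d", "e", "f"),
                 stringsAsFactors = FALSE)

# [[ ]] returns the column as a character vector, not a data.frame
x <- unique(d1[["c1"]])
y <- d2[["c3"]]

# Positions in y whose value does not occur in x
id <- which(!(y %in% x))
id
#> [1] 2 4 6

# Look up the corresponding rows of d2
d2[id, ]
#>   c3 c4
#> 2  z  x
#> 4  z  d
#> 6  F  f
```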
