How to get the households that have at least one member of ethnic 11?

I have a data frame that I need to subset by household, using the ETHNIC and PERSNUM columns.
I have to get all households that have a member of a given ethnic group (here, ethnic 11).

Besides Village, there are five columns: Number, Number_1, Number_3, PERSNUM, and ETHNIC. The three Number columns together identify a household, so they are the columns to group by.

df <- data.frame(
        Village = rep("1", 30),
        Number = c(33, 33, 33, 33, 33, 33, 33, 1, 1, 30, 30, 30, 30, 30, 30, 30,
                   31, 31, 31, 31, 36, 36, 36, 36, 62, 62, 62, 62, 69, 69),
        Number_1 = c(183, 183, 183, 183, 183, 183, 183, 151, 151, 255, 255, 255, 255, 255, 255,
                     255, 31, 31, 31, 31, 111, 111, 111, 111, 287, 287, 287, 287, 219, 219),
        Number_3 = c(137, 137, 137, 137, 137, 137, 137, 113, 113, 191, 191, 191, 191, 191, 191,
                     191, 23, 23, 23, 23, 83, 83, 83, 83, 215, 215, 215, 215, 164, 164),
        PERSNUM = c(1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 6,
                    1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2),
        ETHNIC = c(33, 33, 33, 33, 33, 33, 33, 1, 1, 1, 1, 1, 1, 0, 11,
                   11, 11, 11, 11, 11, 0, 0, 11, 11, 11, 11, 11, 11, 11, 11))

d <- data.frame(
  Village = rep("1", 30),
  Number = c(
    33, 33, 33, 33, 33, 33, 33, 1, 1, 30, 30, 30, 30, 30, 30, 30,
    31, 31, 31, 31, 36, 36, 36, 36, 62, 62, 62, 62, 69, 69
  ),
  Number_1 = c(
    183, 183, 183, 183, 183, 183, 183, 151, 151, 255, 255, 255, 255, 255, 255,
    255, 31, 31, 31, 31, 111, 111, 111, 111, 287, 287, 287, 287, 219, 219
  ),
  Number_3 = c(
    137, 137, 137, 137, 137, 137, 137, 113, 113, 191, 191, 191, 191, 191, 191,
    191, 23, 23, 23, 23, 83, 83, 83, 83, 215, 215, 215, 215, 164, 164
  ),
  PERSNUM = c(
    1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 6,
    1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2
  ),
  ETHNIC = c(
    33, 33, 33, 33, 33, 33, 33, 1, 1, 1, 1, 1, 1, 0, 11,
    11, 11, 11, 11, 11, 0, 0, 11, 11, 11, 11, 11, 11, 11, 11
  )
)

d[which(d$ETHNIC == 11),]
#>    Village Number Number_1 Number_3 PERSNUM ETHNIC
#> 15       1     30      255      191       5     11
#> 16       1     30      255      191       6     11
#> 17       1     31       31       23       1     11
#> 18       1     31       31       23       2     11
#> 19       1     31       31       23       3     11
#> 20       1     31       31       23       1     11
#> 23       1     36      111       83       4     11
#> 24       1     36      111       83       5     11
#> 25       1     62      287      215       1     11
#> 26       1     62      287      215       2     11
#> 27       1     62      287      215       3     11
#> 28       1     62      287      215       4     11
#> 29       1     69      219      164       1     11
#> 30       1     69      219      164       2     11

Created on 2023-05-19 with reprex v2.0.2

I need to subset by household, not by ethnic. For example, I have to get all households that contain ethnic 11; a household may have only one or two members of ethnic 11.
I have to subset all households that have ethnic 11.
The three Number columns (Number, Number_1, Number_3) identify the household and are the columns to group by.

The dplyr package handles this kind of question well.
Abstracting your question to football clubs, and supposing you are interested in Arsenal, you could do something like:

suppressPackageStartupMessages(library(dplyr))

df1 <- data.frame(
        household = c(1,1,1,2,3,4,4,4),
        club      = c("Arsenal","Arsenal","ManCity","ManUnited",
                      "Arsenal","Westham","ManCity","WBA")
        )

print(df1)
#>   household      club
#> 1         1   Arsenal
#> 2         1   Arsenal
#> 3         1   ManCity
#> 4         2 ManUnited
#> 5         3   Arsenal
#> 6         4   Westham
#> 7         4   ManCity
#> 8         4       WBA

df1 |>
  mutate(fanArsenal= case_when(
    club == "Arsenal" ~ 1,
    TRUE ~ 0
  )) |>
  group_by(household) |>
  summarise(n=sum(fanArsenal)) |>
  filter(n>0)
#> # A tibble: 2 × 2
#>   household     n
#>       <dbl> <dbl>
#> 1         1     2
#> 2         3     1
Created on 2023-05-19 with reprex v2.0.2
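
If the aim is to keep the household members themselves rather than a count per household, one possible variant is to filter the grouped data with any(). This is only a sketch on the same toy data; arsenal_households is an illustrative name, not something from the posts above.

```r
library(dplyr)

df1 <- data.frame(
  household = c(1, 1, 1, 2, 3, 4, 4, 4),
  club      = c("Arsenal", "Arsenal", "ManCity", "ManUnited",
                "Arsenal", "Westham", "ManCity", "WBA")
)

# Keep every row of any household that has at least one Arsenal fan.
# any() is evaluated once per group, so the whole group is kept or dropped.
arsenal_households <- df1 |>
  group_by(household) |>
  filter(any(club == "Arsenal")) |>
  ungroup()

print(arsenal_households)
```

This should keep all three rows of household 1 (including the ManCity fan) and the single row of household 3.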

I think I couldn't explain my question.

I don't want to sum or calculate anything. I just want to subset and get all households that have a member of ethnic 11.

As you can see in the PERSNUM column, 1 is the husband, 2 is the wife, 3 is the first child, 4 is the second child, 5 is the father, 6 is the mother, and so on.
If only the wife in a household is from ethnic 11, the household has to be included in the subset; the same applies if every member is from ethnic 11.
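
For the data in this thread, here is a minimal sketch using dplyr, which was suggested above. It assumes that the three Number columns together identify a household. The data frame is rebuilt compactly with rep() so the example runs on its own, and households_with_11 is just an illustrative name.

```r
library(dplyr)

# Same values as the data frame posted above, written compactly.
sizes <- c(7, 2, 7, 4, 4, 4, 2)  # rows per household, in order
d <- data.frame(
  Village  = "1",
  Number   = rep(c(33, 1, 30, 31, 36, 62, 69), times = sizes),
  Number_1 = rep(c(183, 151, 255, 31, 111, 287, 219), times = sizes),
  Number_3 = rep(c(137, 113, 191, 23, 83, 215, 164), times = sizes),
  PERSNUM  = c(1:7, 1:2, 3, 1:6, 1:3, 1, 2:5, 1:4, 1:2),
  ETHNIC   = rep(c(33, 1, 0, 11, 0, 11), times = c(7, 6, 1, 6, 2, 8))
)

# Assumption: a household is one combination of the three Number columns.
# Keep all members of a household if any member has ETHNIC == 11,
# whatever that member's PERSNUM is.
households_with_11 <- d |>
  group_by(Number, Number_1, Number_3) |>
  filter(any(ETHNIC == 11)) |>
  ungroup()

print(households_with_11, n = Inf)
```

On this data the result should contain the 21 rows of households 30, 31, 36, 62 and 69, including the members of those households whose own ETHNIC is 0 or 1. Nothing is summed; rows are only kept or dropped.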
