How to categorize by value

gvkey <- c(1004, 1072, 1082, 1111, 1173, 1183, 1210, 1240, 1360, 1393, 1468, 1554, 1562, 1618, 1655, 1681, 1686)
SIC2 <- c(50, 36, 87, 73, 38, 78, 45, 54, 73, 75, 27, 38, 73, 73, 72, 99, 32)
data_1 <- data.frame(gvkey, SIC2)

Hello everyone,

I currently have a data set that contains many individual values for the variable SIC2. I am trying to group these codes into categories, since I am only interested in the sector each code belongs to.

I tried to run the following code:
data_1 <- data_1 %>%
  group_by(SIC2) %>%
  mutate(SIC2_new = if_else(any(SIC2 < 100, na.rm = TRUE) & any(SIC2 >= 99, na.rm = TRUE), "99",
    if_else(any(SIC2 < 99, na.rm = TRUE) & any(SIC2 >= 90, na.rm = TRUE), "90-98",
    if_else(any(SIC2 < 90, na.rm = TRUE) & any(SIC2 >= 70, na.rm = TRUE), "70-89",
    if_else(any(SIC2 < 70, na.rm = TRUE) & any(SIC2 >= 60, na.rm = TRUE), "60-67",
    if_else(any(SIC2 < 60, na.rm = TRUE) & any(SIC2 >= 52, na.rm = TRUE), "52-59",
    if_else(any(SIC2 < 52, na.rm = TRUE) & any(SIC2 >= 50, na.rm = TRUE), "50-51",
    if_else(any(SIC2 < 50, na.rm = TRUE) & any(SIC2 >= 40, na.rm = TRUE), "40-49",
    if_else(any(SIC2 < 40, na.rm = TRUE) & any(SIC2 >= 20, na.rm = TRUE), "20-39",
    if_else(any(SIC2 < 18, na.rm = TRUE) & any(SIC2 >= 15, na.rm = TRUE), "15-17",
    if_else(any(SIC2 < 15, na.rm = TRUE) & any(SIC2 >= 10, na.rm = TRUE), "10-14",
    if_else(any(SIC2 < 10, na.rm = TRUE) & any(SIC2 >= 0, na.rm = TRUE), "0-9", "no"))))))))))))

But all the observations in my new variable "SIC2_new" are now "99", instead of the correct category based on the value in SIC2.

Can anyone help me identify the problem, or suggest an easier way to get the result I am after?

Thank you in advance!
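
On the "everything is 99" symptom, one possible explanation (an assumption, since it depends on whether the group_by() step is actually in effect when mutate() runs): any() collapses a whole vector into a single TRUE or FALSE. If it is evaluated over the full SIC2 column, the very first condition is TRUE as soon as the column contains a 99, and that single TRUE is then applied to every row. A minimal base R sketch of the difference, where first_test and row_test are illustrative names only:

```r
SIC2 <- c(50, 36, 87, 73, 38, 78, 45, 54, 73, 75, 27, 38, 73, 73, 72, 99, 32)

# any() reduces the whole vector to ONE logical value:
# "is there some value below 100?" and "is there some value of 99 or more?"
first_test <- any(SIC2 < 100, na.rm = TRUE) & any(SIC2 >= 99, na.rm = TRUE)
length(first_test)  # a single value, recycled to every row inside mutate()

# Without any(), the comparison is element-wise: one logical per row
row_test <- SIC2 < 100 & SIC2 >= 99
length(row_test)    # same length as SIC2
sum(row_test)       # number of rows that really fall in the "99" category
```

Dropping any() (and the grouping) makes each row get tested on its own value.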

I don't see why you'd necessarily need the group_by() or any() functions.
Wouldn't a simpler approach work?

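library(dplyr)

# case_when() checks the conditions in order, row by row;
# the first one that is TRUE decides the label
data_2 <- data_1 %>% mutate(
  sicnew=
  case_when(SIC2<10 ~ "  0-9",
            SIC2<50 ~ "10-49",
            SIC2<99 ~ "50-98",
            TRUE    ~ "  99+")
)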
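
The same pattern could be extended to the finer categories from the question. The following is only a sketch: it assumes the labels in the original code ("0-9", "10-14", ..., "99") are the intended inclusive ranges, so codes that fall between them (for example 18, 19, 68 or 69) end up in the fallback "no". The name data_3 is just illustrative.

```r
library(dplyr)

gvkey <- c(1004, 1072, 1082, 1111, 1173, 1183, 1210, 1240, 1360, 1393, 1468, 1554, 1562, 1618, 1655, 1681, 1686)
SIC2 <- c(50, 36, 87, 73, 38, 78, 45, 54, 73, 75, 27, 38, 73, 73, 72, 99, 32)
data_1 <- data.frame(gvkey, SIC2)

# Each condition is evaluated per row; bounds are assumed from the labels
data_3 <- data_1 %>%
  mutate(SIC2_new = case_when(
    SIC2 >= 0  & SIC2 <= 9  ~ "0-9",
    SIC2 >= 10 & SIC2 <= 14 ~ "10-14",
    SIC2 >= 15 & SIC2 <= 17 ~ "15-17",
    SIC2 >= 20 & SIC2 <= 39 ~ "20-39",
    SIC2 >= 40 & SIC2 <= 49 ~ "40-49",
    SIC2 >= 50 & SIC2 <= 51 ~ "50-51",
    SIC2 >= 52 & SIC2 <= 59 ~ "52-59",
    SIC2 >= 60 & SIC2 <= 67 ~ "60-67",
    SIC2 >= 70 & SIC2 <= 89 ~ "70-89",
    SIC2 >= 90 & SIC2 <= 98 ~ "90-98",
    SIC2 == 99              ~ "99",
    TRUE                    ~ "no"   # anything outside the listed ranges, and NA
  ))

# Quick check of how many rows landed in each category
table(data_3$SIC2_new)
```

Writing both bounds explicitly is more verbose than relying on the order of the conditions, but it makes the gaps between categories visible.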

Thank you very much,
I wasn't aware of the case_when function and didn't know you could apply it like this.

Your formula worked!