Problem when creating a variable based on 2 variables.

I have a problem when creating a variable based on 2 variables.

One variable is base$a8, of class numeric. The other is base$a12, which holds numbers except for some observations that contain "NC", so the variable is of class character.

df<- base %>%
  mutate(ocup_hor= case_when(a8==1 & a12>45 ~ "test1",
                             a8==1 & a12 %in% c(35:45) ~ "test2",
                             a8==1 & a12<35 ~ "test3",
                             a8==2 ~ "test4",
                             TRUE ~ "error"))

I use this code to create the variable, but for some reason 2 observations that should be test3 are coded as "error". I don't understand why.

df %>%
  select(ocup_hor, a8,a12) %>% 
  filter(ocup_hor == "error") 

# A tibble: 2 × 3
  ocup_hor    a8 a12  
  <chr>    <dbl> <chr>
1 error        1 4    
2 error        1 4 

If a8 is 1 and a12 is 4, shouldn't it be recoded as test3?

I also ran this code and, surprisingly, I get FALSE.

df %>%
  select(ocup_hor, a8,a12) %>% 
  filter(ocup_hor == "error") %>%
  mutate(a8==1 & a12<35)

# A tibble: 2 × 4
  ocup_hor    a8 a12   `a8 == 1 & a12 < 35`
  <chr>    <dbl> <chr> <lgl>               
1 error        1 4     FALSE               
2 error        1 4     FALSE 

But if instead of a12<35 I write a12<40 I get the correct result.

df %>%
  select(ocup_hor, a8,a12) %>% 
  filter(ocup_hor == "error") %>%
  mutate(a8==1 & a12<40)

# A tibble: 2 × 4
  ocup_hor    a8 a12   `a8 == 1 & a12 < 40`
  <chr>    <dbl> <chr> <lgl>               
1 error        1 4     TRUE                
2 error        1 4     TRUE  

Does anyone know why?

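You are comparing the text "4" to the numeric values 35 and 40. R coerces the numbers to characters and compares "4" < "35" and "4" < "40" as strings, character by character. "4" sorts after "3", so the first comparison is FALSE, while "4" is the start of "40", so the second is TRUE.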

"4" < 35
[1] FALSE
"4" < 40
[1] TRUE
"40" == 40
[1] TRUE

If all of the other comparisons in the data are working as you expect, you are lucky. You probably need to change the "NC" values to NA and then change all the characters to numeric values, but I can't be sure of your needs.
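
A minimal base R sketch of that conversion, using a made-up vector in place of base$a12 (the names a12_clean and a12_num are only for illustration):

```r
# Hypothetical stand-in for base$a12: numbers stored as text, plus "NC"
a12 <- c("4", "40", "NC", "50")

# Replace "NC" with NA first, then convert. Doing it in this order avoids
# the "NAs introduced by coercion" warning that as.numeric("NC") would give.
a12_clean <- a12
a12_clean[a12_clean == "NC"] <- NA
a12_num <- as.numeric(a12_clean)

a12_num        # 4 40 NA 50
a12_num < 35   # TRUE FALSE NA FALSE
```

The comparison is now numeric, so 4 < 35 is TRUE, and the former "NC" entries give NA instead of a misleading TRUE or FALSE.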

Thanks. So from what I understand, there is no way to use the values numerically without first converting the NCs to NAs. Too bad, because I wanted to keep both kinds of values in the column.

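I am not sure if this will work for your data, but you could convert a12 inside the comparisons only. The "NC" values become NA there (R warns about the coercion), while the a12 column itself stays unchanged.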

df<- base %>%
  mutate(ocup_hor= case_when(a8==1 & as.numeric(a12)>45 ~ "test1",
                             a8==1 & as.numeric(a12) %in% c(35:45) ~ "test2",
                             a8==1 & as.numeric(a12)<35 ~ "test3",
                             a8==2 ~ "test4",
                             TRUE ~ "error"))
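
If you want to keep "NC" in a12, another option is to leave the column as it is and do the comparisons on a numeric helper column. This is only a sketch, assuming dplyr is installed. The sample data, the a12_num column name, and the "not coded" label are made up for illustration, so adapt them to your data:

```r
library(dplyr)

# Hypothetical stand-in for `base`
base <- tibble(
  a8  = c(1, 1, 1, 1, 2, 1),
  a12 = c("50", "40", "4", "NC", "NC", "35")
)

df <- base %>%
  # a12 keeps its "NC" values; a12_num holds NA where a12 is "NC"
  mutate(a12_num = as.numeric(na_if(a12, "NC"))) %>%
  mutate(ocup_hor = case_when(
    a8 == 1 & a12 == "NC"        ~ "not coded",  # hypothetical label
    a8 == 1 & a12_num > 45       ~ "test1",
    a8 == 1 & a12_num %in% 35:45 ~ "test2",
    a8 == 1 & a12_num < 35       ~ "test3",
    a8 == 2                      ~ "test4",
    TRUE                         ~ "error"
  ))

df
```

Without the first branch, rows with a8 == 1 and a12 == "NC" would fall through to "error", because a comparison with NA is never TRUE.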