Create a new variable using a case_when statement

Hi everyone,
I would like to know why the third code chunk below does not work.
I would be happy if someone could show me how to make it work with the paste or paste0 function.
Thank you!

# data frame code chunks
my_basket = data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"), 
                       ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
                       Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,NA),
                       Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,NA))

# This code chunk works: it creates a new column called **Price_band** based on the existing column
my_basket %>% 
  mutate(Price_band = case_when(Price>=50 & Price <=70   ~ "Medium", 
                                Price > 70 ~ "High", TRUE ~ "Low"))

# But when I use the paste0 function to create the new column (**Price_band**) based on the existing column, it doesn't work.
my_basket %>% 
  mutate(Price_band = case_when(ITEM_GROUP %in% c("Fruit","Dairy") ~ paste0(c("Fruit","Dairy")), 
                                Price > 70 ~ "High", TRUE ~ "Low"))

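paste0() is giving you a vector of length 2. case_when() only knows whether the condition is TRUE for each row, and the right-hand side has to be either a single value or one value per row, so it doesn't know which of the two to choose.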
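
To make the length point concrete, here is a minimal base R sketch. The names `groups`, `rhs`, `item_group` and `labelled` are made up for illustration and are not from the original post. It only shows how long the result of paste0() is in each case, which is what case_when() cares about on the right-hand side.

```r
groups <- c("Fruit", "Dairy")

# With nothing to join, paste0() returns one string per input element,
# so the result has the same length as `groups`
rhs <- paste0(groups)
length(rhs)

# paste0() works element-wise: building the label from a per-row vector
# gives one value per row, which is a length case_when() can use
item_group <- c("Fruit", "Vegetable", "Dairy", "Dairy")
labelled <- paste0("Group: ", item_group)
length(labelled)
```

So a right-hand side built from the column itself lines up with the rows, while one built from a fixed two-element vector does not.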

@startz,
I thought this should work, but I got an error like

Error in if (my_basket$Price_band %in% my) { : the condition has length > 1

my = c("Fruit", "Dairy")
for (i in 1:nrow(my_basket)){
  if(my_basket$Price_band %in% my){
    for(j in 1:length(my)){
     my_basket$Price_band = paste(my[j])
    }
  }
}
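
For anyone hitting the same error: if() needs a single TRUE or FALSE, but the loop above tests the whole column on every pass and then overwrites the whole column. A rough sketch of a row-by-row version follows. It assumes the test was meant to be on ITEM_GROUP rather than Price_band, and it uses a cut-down three-row data frame invented for illustration.

```r
# Small stand-in for my_basket (illustrative data, not the full table)
my_basket <- data.frame(
  ITEM_GROUP = c("Fruit", "Vegetable", "Dairy"),
  Price = c(100, 70, 60),
  stringsAsFactors = FALSE
)
my <- c("Fruit", "Dairy")

# Start every row with a default label
my_basket$Price_band <- "Low"

for (i in seq_len(nrow(my_basket))) {
  # Index with [i] so the condition is a single TRUE/FALSE
  if (my_basket$ITEM_GROUP[i] %in% my) {
    # Assign to row i only, not to the whole column
    my_basket$Price_band[i] <- my_basket$ITEM_GROUP[i]
  }
}

my_basket$Price_band
```

The inner loop over `my` is not needed here, because %in% already checks the row's value against every element of `my`.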

Anyway, thanks for your feedback.

Does this code do what you want?

library(dplyr)

my_basket = data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"), 
                       ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
                       Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,NA),
                       Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,NA))

my_basket %>% 
  mutate(Price_band = case_when(ITEM_GROUP %in% c("Fruit","Dairy") ~ paste0(c("Fruit","Dairy"), collapse = ""), 
                                Price > 70 ~ "High", TRUE ~ "Low"))
#>    ITEM_GROUP ITEM_NAME Price Tax Price_band
#> 1       Fruit     Apple   100   2 FruitDairy
#> 2       Fruit    Banana    80   4 FruitDairy
#> 3       Fruit    Orange    80   5 FruitDairy
#> 4       Fruit     Mango    90   6 FruitDairy
#> 5       Fruit    Papaya    65   2 FruitDairy
#> 6   Vegetable    Carrot    70   3        Low
#> 7   Vegetable    Potato    60   5        Low
#> 8   Vegetable   Brinjal    70   1        Low
#> 9   Vegetable   Raddish    25   3        Low
#> 10      Dairy      Milk    60   4 FruitDairy
#> 11      Dairy      Curd    40   5 FruitDairy
#> 12      Dairy    Cheese    35   6 FruitDairy
#> 13      Dairy      Milk    50   4 FruitDairy
#> 14      Dairy    Paneer    NA  NA FruitDairy

Created on 2022-11-19 with reprex v2.0.2

@FJCC

Thanks for the reply, but I want Fruit to be Fruit and Dairy to be Dairy, not as FruitDairy in the new column.

That would be

my_basket %>% 
  mutate(Price_band = case_when(ITEM_GROUP %in% c("Fruit","Dairy") ~ ifelse(ITEM_GROUP == "Fruit", "Fruit","Dairy"), 
                                Price > 70 ~ "High", TRUE ~ "Low"))

@FJCC,

Yeah, this could be a great solution, but what if you have more than two values, e.g. c("Fruit", "Dairy", "egg", "cereals", "milk")? Then I would have to type out a condition for each one, which I would rather not do.

I made the answer more complicated than it needs to be.

my_basket %>% 
  mutate(Price_band = case_when(ITEM_GROUP %in% c("Fruit","Dairy") ~ ITEM_GROUP, 
                                Price > 70 ~ "High", TRUE ~ "Low"))
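
The same pattern should extend to any number of groups, since only the vector on the right of %in% changes. Below is a small sketch under a few assumptions: `keep_groups` is a hypothetical name, the four-row data frame is made up for illustration, and the as.character() wrapper is a defensive choice in case ITEM_GROUP is stored as a factor.

```r
library(dplyr)

# Cut-down example data (illustrative, not the full my_basket)
my_basket <- data.frame(
  ITEM_GROUP = c("Fruit", "Vegetable", "Dairy", "Vegetable"),
  Price = c(100, 80, 40, 25),
  stringsAsFactors = FALSE
)

# Hypothetical list of groups to carry over unchanged; extend as needed
keep_groups <- c("Fruit", "Dairy", "Egg", "Cereals")

result <- my_basket %>%
  mutate(Price_band = case_when(
    # Rows in a kept group take their own group name, one value per row
    ITEM_GROUP %in% keep_groups ~ as.character(ITEM_GROUP),
    Price > 70 ~ "High",
    TRUE ~ "Low"
  ))

result$Price_band
```

Adding another group then means editing `keep_groups` only, with no extra condition per value.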

@FJCC ,

Thanks a lot. Perfect!
