Concatenate multiple character values in a vector under a condition & apply computation on it

Hello!

I'm hoping you can help me with this problem. I have a large data table containing vegetable data coded as character strings.

I want to perform computations on it with a constraint.

Here is the data structure:
[image: tidyverse_wickham_pres]

My problem is that the table contains several values that I would like to combine. For example, there are two types of "Salade" ("Salade - laitue" and "Salade plein air ou abris bas"), and I would like to merge them by adding up the values in each column, ending up with a single "Salade" category.

My dataset is very large, so I would like to find an easy way to do this cleanly.

Roughly speaking, this would amount to "Salade" = "Salade - laitue" + "Salade plein air ou abris bas".

Here's what I've already started:

# The computation I want, and it works: count the rows per CODE,
# keeping only the vegetables I am interested in

Merge_table %>%
  group_by(CODE) %>%
  filter(CODE %in% c("Concombre plein air ou abri bas",
                     "Concombre sous serre ou abri haut",
                     "Courgette plein air ou abri bas",
                     "Courgette sous serre ou abri haut",
                     "Salade - laitue",
                     "Melon plein air ou abri bas",
                     "Melon sous serre ou abri haut",
                     "Salade plein air ou abris bas")) %>%
  summarise(nb = n()) %>%
  as.data.frame()

I saw that it is possible to do something with the {stringr} package, but I did not manage to express the condition I want.

Thanks in advance!

example:

library(tidyverse)
df_0 <- data.frame(
  stringsAsFactors = FALSE,
  CODE = c(
    "Courgette sous serre ou abri haut", "Concombre sous serre ou abri haut",
    "Courgette sous serre ou abri haut", "Salade plein air ou abris bas",
    "Courgette plein air ou abri bas",
    "Concombre sous serre ou abri haut", "Concombre plein air ou abri bas",
    "Concombre sous serre ou abri haut",
    "Salade plein air ou abris bas", "Melon plein air ou abri bas",
    "Courgette plein air ou abri bas", "Melon sous serre ou abri haut",
    "Melon plein air ou abri bas", "Concombre sous serre ou abri haut",
    "Salade plein air ou abris bas",
    "Courgette sous serre ou abri haut", "Courgette sous serre ou abri haut",
    "Melon plein air ou abri bas",
    "Concombre sous serre ou abri haut", "Salade - laitue",
    "Courgette sous serre ou abri haut", "Salade - laitue",
    "Courgette sous serre ou abri haut", "Concombre sous serre ou abri haut",
    "Salade plein air ou abris bas", "Concombre sous serre ou abri haut",
    "Courgette plein air ou abri bas",
    "Salade plein air ou abris bas", "Melon sous serre ou abri haut",
    "Concombre plein air ou abri bas"
  )
)
group_by(df_0,
  grp = case_when(
    stringr::str_detect(CODE,
      pattern = "Salade"
    ) ~ "Salade",
    TRUE ~ CODE
  )
) |> summarise(n = n())
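
To see what the `case_when()` call does on its own, here is a minimal sketch on a three-element vector (the vector `codes` is a made-up illustration, not part of the original data). Conditions are checked in order, the first one that matches wins, and `TRUE ~ codes` acts as the fallback that leaves all other values unchanged.

```r
library(dplyr)
library(stringr)

# Hypothetical mini-example: two "Salade" variants and one other vegetable
codes <- c("Salade - laitue",
           "Salade plein air ou abris bas",
           "Melon plein air ou abri bas")

grp <- case_when(
  str_detect(codes, "Salade") ~ "Salade",  # any value containing "Salade"
  TRUE ~ codes                             # everything else stays as it is
)

print(grp)
# "Salade" "Salade" "Melon plein air ou abri bas"
```

Grouping by this new label instead of `CODE` is what makes the two salad rows collapse into one line in the summary.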

Wow, wonderful, it works!

And what if I wanted to do it for multiple products at once, for example rows containing "courgette" and rows containing "salade"? That would be just perfect!

You can repeat this construction. On the left of the ~ symbol is the pattern to detect, and on the right is the value to assign to the matching rows.

group_by(df_0,
  grp = case_when(
    stringr::str_detect(CODE,
      pattern = "Salade"
    ) ~ "Salade",
    stringr::str_detect(CODE,
      pattern = "Courgette"
    ) ~ "courgette",
    TRUE ~ CODE
  )
) |> summarise(n = n())
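
If the list of products grows, repeating one `str_detect()` branch per product can get long. One possible alternative, sketched here under the assumption that each keyword should simply become its own group label, is to build a single regular expression from a keyword vector and pull out the matching keyword with `stringr::str_extract()`. The names `keywords`, `pattern`, `df_small` and `result` are hypothetical and only used for illustration.

```r
library(dplyr)
library(stringr)

# Hypothetical keyword list; extend it with any other product names
keywords <- c("Salade", "Courgette")
pattern  <- str_c(keywords, collapse = "|")   # "Salade|Courgette"

# Small made-up sample in the same shape as df_0
df_small <- data.frame(
  stringsAsFactors = FALSE,
  CODE = c("Salade - laitue",
           "Salade plein air ou abris bas",
           "Courgette sous serre ou abri haut",
           "Courgette plein air ou abri bas",
           "Melon plein air ou abri bas")
)

result <- df_small |>
  # str_extract() returns the matched keyword, or NA when nothing matches;
  # coalesce() then falls back to the original CODE for those rows
  mutate(grp = coalesce(str_extract(CODE, pattern), CODE)) |>
  count(grp, name = "n")

print(result)
```

Note that with this approach the group label is the keyword exactly as written in `keywords` (so "Courgette" rather than "courgette"), and matching is case-sensitive, just as in the `case_when()` version.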

Thank you chief, it works!
