How to subset my dataset based on multiple string filters?

My data set is very large, so I copied the first 15 rows into a new example Excel file. But I don't know how to post my data here, so let's just use the "mtcars" data. (By the way, can anyone suggest how to share my dataset here?)

Based on the “mtcars” example, I want to subset the dataset to the rows where the car model is Mazda or Merc. Note that "Mazda" is only a substring of the original name, e.g. "Mazda RX4".

Ideally the code would work like a small program whose inputs are these substrings: it should extract every row that contains any of them and store the result as a new dataset.

Hi, you are asking two things:
a) how to make a reprex, see:
FAQ: How to do a minimal reproducible example (reprex) for beginners
b) how to filter on strings:

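library(tidyverse)

# Move the row names (car models) into a regular column called "carname";
# the outer parentheses print the result as well as assigning it
(mtcars_df <- as_tibble(mtcars, rownames = "carname"))

# Regular expression: "|" means "or", so this matches Mazda or Merc
patterns <- "Mazda|Merc"

# Keep only the rows whose carname contains one of the patterns
(mtcars_df %>% filter(
  str_detect(
    carname,
    patterns
  )))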
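
If the substrings should be inputs rather than a hand-written pattern, the same idea can be wrapped in a small function. This is only a sketch: `filter_by_substrings` is a made-up helper name, not something from a package, and it assumes the substrings contain no regex special characters (such as `.` or `(`), since they are pasted together into a regular expression.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical helper: keep the rows of `data` where `column`
# contains at least one of the given substrings.
filter_by_substrings <- function(data, column, substrings) {
  # c("Mazda", "Merc") becomes the regular expression "Mazda|Merc"
  pattern <- str_c(substrings, collapse = "|")
  filter(data, str_detect({{ column }}, pattern))
}

mtcars_df <- as_tibble(mtcars, rownames = "carname")

# The substrings are now plain inputs to the function
mazda_merc <- filter_by_substrings(mtcars_df, carname, c("Mazda", "Merc"))
mazda_merc
```

Adding another car make is then just a matter of adding one more string to the vector.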

Thanks. Let me try to paste my sample data here (first 15 rows).

data.frame(
  stringsAsFactors = FALSE,
           KNUMBER = c("K991216","K032099","K073284",
                       "K991588","K991874","K992504","K971443","K973899",
                       "K982275","K983045","K983974","K983994","K993385",
                       "K150547","K061440"),
         APPLICANT = c("(ARMS) HOFFMAN SURGICAL EQUIPMENT, IVF DIVISION","(EMS SA) ELECTRO MEDICAL SYSTEMS",
                       "(EMS SA) ELECTRO MEDICAL SYSTEMS",
                       "(EMS SA) ELECTRO MEDICAL SYSTEMS","(EMS SA) ELECTRO MEDICAL SYSTEMS",
                       "(EMS SA) ELECTRO MEDICAL SYSTEMS","*OWNE MUMFORD USA, INC.**",
                       "*OWNE MUMFORD USA, INC.**","*OWNE MUMFORD USA, INC.**",
                       "*OWNE MUMFORD USA, INC.**",
                       "*OWNE MUMFORD USA, INC.**","*OWNE MUMFORD USA, INC.**",
                       "*OWNE MUMFORD USA, INC.**",".decimal, inc",".DECIMAL, INC."),
        DEVICENAME = c("HOFFMAN IVF-1",
                       "SWISS MASTER LIGHT","AIR-FLOW MASTER STANDARD",
                       "EMS SWISS ORTHOCLAST","EMS SURFACE MOUNT SCALER","EMS KERMIT",
                       "RAPPORT V.T.D.","UNIFINE PENTIPS","RAPPORT V.T.D.","AMIELLE",
                       "OWEN MUMFORD 3ML AUTOPEN","UNIFINE",
                       "AUTOJET 2 (NON-FIXED NEEDLE TYPE)",".decimal Astroid Dosimetry App","P.D")
)

Can you show me again with this data? For example, I want to subset my dataset to the rows where the applicant is decimal or DECIMAL. Note that "decimal" is only a substring of the original ".decimal, inc".
As before, the code should work like a small program whose inputs are these substrings: it should extract every row that contains any of them and store the result as a new dataset. One caveat is that matching should be case-insensitive. In other words, Boston and boston should be treated as the same, and I want to find both.
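
One possible way to do this, as a sketch rather than a definitive answer: stringr's `regex()` accepts `ignore_case = TRUE`, so the pattern can be built from a vector of substrings and matched regardless of case. The example below uses a shortened copy of the sample data (five of the fifteen rows); the names `devices`, `substrings` and `decimal_rows` are just illustrative, and it again assumes the substrings contain no regex special characters.

```r
library(dplyr)
library(stringr)

# Shortened copy of the sample data posted above
devices <- data.frame(
  stringsAsFactors = FALSE,
  KNUMBER = c("K991216", "K032099", "K971443", "K150547", "K061440"),
  APPLICANT = c("(ARMS) HOFFMAN SURGICAL EQUIPMENT, IVF DIVISION",
                "(EMS SA) ELECTRO MEDICAL SYSTEMS",
                "*OWNE MUMFORD USA, INC.**",
                ".decimal, inc",
                ".DECIMAL, INC.")
)

# Inputs: one or more substrings to look for
substrings <- c("decimal")

# Join the substrings with "|" (or) and switch off case sensitivity
pattern <- regex(str_c(substrings, collapse = "|"), ignore_case = TRUE)

# New dataset with only the matching rows
decimal_rows <- devices %>% filter(str_detect(APPLICANT, pattern))
decimal_rows
```

With the input `"decimal"` this keeps both the ".decimal, inc" and the ".DECIMAL, INC." rows; more substrings can be added to the `substrings` vector.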
