Adding values to a column based on regular expressions found in the values of another column

Goal

I am trying to build a data frame that matches each variable to the file needed to populate it. I hope to pass this object further down the workflow to control a wrangling function.

I take the variable names of the object X100samples and put them into a data frame. From there I hope to use mutate to check each name against a regular expression such as "source". Where the expression matches, I want to populate the new column with a particular file name.

Simplified Reprex

library(dplyr)
library(tibble)
library(stringr)

stead_data <-tibble(c(
  "network_code","receiver_code",
  "receiver_type",
  "receiver_latitude",
  "receiver_longitude",
  "receiver_elevation_m",
  "p_arrival_sample",
  "p_status","p_weight",
  "p_travel_sec",
  "s_arrival_sample",
  "s_status",
  "s_weight",
  "source_id",
  "source_origin_time",
  "source_origin_uncertainty_sec"
))

nmtso_files <- c("eigen.dat", 
                 "hypo71.dat", 
                 "jparam.dat", 
                 "newloc.dat",
                 "picfil.dat", 
                 "refloc.dat", 
                 "stacrd.dat", 
                 "temp_picfil.dat", 
                 "velmod.dat")

matched_data <- stead_data %>%
  mutate(
    nmtso_files = if (stead_data == regex(pattern = "source*")) {
      nmtso_files[1]
    } else {
      nmtso_files[4]
    }
  )

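You are not using valid dplyr syntax, and I'm not sure I understand what you are trying to accomplish. The if statement expects a single TRUE or FALSE rather than a whole column, and regex() only describes a pattern; it does not test for a match. Is this close to the output you are looking for?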

library(dplyr)
library(stringr)

stead_data <-data.frame( names = c(
    "network_code","receiver_code",
    "receiver_type",
    "receiver_latitude",
    "receiver_longitude",
    "receiver_elevation_m",
    "p_arrival_sample",
    "p_status","p_weight",
    "p_travel_sec",
    "s_arrival_sample",
    "s_status",
    "s_weight",
    "source_id",
    "source_origin_time",
    "source_origin_uncertainty_sec"
))

nmtso_files <- c("eigen.dat", 
                 "hypo71.dat", 
                 "jparam.dat", 
                 "newloc.dat",
                 "picfil.dat", 
                 "refloc.dat", 
                 "stacrd.dat", 
                 "temp_picfil.dat", 
                 "velmod.dat")

stead_data %>%
    mutate( nmtso_files = if_else(str_detect(names, "source.*"),
                                  nmtso_files[1],
                                  nmtso_files[4]))
#>                            names nmtso_files
#> 1                   network_code  newloc.dat
#> 2                  receiver_code  newloc.dat
#> 3                  receiver_type  newloc.dat
#> 4              receiver_latitude  newloc.dat
#> 5             receiver_longitude  newloc.dat
#> 6           receiver_elevation_m  newloc.dat
#> 7               p_arrival_sample  newloc.dat
#> 8                       p_status  newloc.dat
#> 9                       p_weight  newloc.dat
#> 10                  p_travel_sec  newloc.dat
#> 11              s_arrival_sample  newloc.dat
#> 12                      s_status  newloc.dat
#> 13                      s_weight  newloc.dat
#> 14                     source_id   eigen.dat
#> 15            source_origin_time   eigen.dat
#> 16 source_origin_uncertainty_sec   eigen.dat

Created on 2021-02-20 by the reprex package (v1.0.0)
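
If you later need more than two outcomes, the same idea should extend to case_when(), which tests one str_detect() condition per file and uses the first one that matches. The sketch below is only an illustration: the pairing of prefixes with files is made up, since the thread does not say which file feeds which variable, and the new column is named nmtso_file (my choice) so it does not share a name with the vector of file names.

```r
library(dplyr)
library(stringr)

# Shortened version of the variable names from the reprex above
stead_data <- data.frame(names = c(
  "network_code",
  "receiver_code",
  "p_status",
  "s_weight",
  "source_id"
))

nmtso_files <- c("eigen.dat", "hypo71.dat", "jparam.dat", "newloc.dat",
                 "picfil.dat", "refloc.dat", "stacrd.dat",
                 "temp_picfil.dat", "velmod.dat")

# ASSUMPTION: the prefix-to-file pairing below is hypothetical and only
# shows the mechanics; replace it with the real mapping.
matched_data <- stead_data %>%
  mutate(
    nmtso_file = case_when(
      str_detect(names, "^source_")   ~ nmtso_files[1],  # eigen.dat
      str_detect(names, "^receiver_") ~ nmtso_files[7],  # stacrd.dat
      str_detect(names, "^[ps]_")     ~ nmtso_files[5],  # picfil.dat
      TRUE                            ~ nmtso_files[4]   # fallback: newloc.dat
    )
  )

matched_data
```

The conditions are checked top to bottom, so put the more specific patterns first. Anchoring with ^ keeps a pattern from matching in the middle of a name.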

That's exactly what I was looking to do! Thank you kindly.
