Fill column based on presence of string in another column

Hi,

I'm trying to clean a dataframe that looks like this:

df <- data.frame(First = c("Mark", "John", "Anthony"),
                  Last = c("Joshua", "Wellberg", "Kennedy"),
                  Notes = c("DIS# 430477541 Plan Manager: Susan Long McArthur Community Care susan.long@mcarthur.com.au DOB: 19/04/1963 NDIS# 430477541 Start - 15/11/2018 Finish - 15/11/2019",
                            "Plan managed – national disability support partners – invoices@ndsp.com.au",
                            "Self managed
Natalia O/T
NDIS number - 431141456"),
                  NDIS = c(NA,NA,NA),
                  Col = c(NA, NA, NA))

df$Notes <- tolower(df$Notes)

I want to fill the Plan column based on the presence of certain strings in the Notes column. For example, if the Notes column contains the string "self manag", I want to fill the Plan column with "S". I've tried the following code:

for (row in 1:df) {
        if (grep("self manag", df$Notes)) {
                Plan == "S"
        }
}

When I try this, I get an error saying:
Error in 1:df : NA/NaN argument
In addition: Warning message:
In 1:df : numerical expression has 5 elements: only the first used

What am I doing wrong here?

Thank you kindly for your help.

Hello, your issue is that if() is not vectorised, so R provides ifelse() for that requirement.
There is a good tutorial to look at here:
5 Control flow | Advanced R (hadley.nz)
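
As a rough sketch of what that could look like in base R (the shortened Notes values below are made up for illustration, not the original data): 1:df fails because df is a data frame rather than a number, so a loop needs seq_len(nrow(df)) instead. Also, grep() returns the positions of the matches, whereas grepl() returns one TRUE/FALSE per element, which is what ifelse() and if() expect. Finally, == compares and <- assigns.

```r
# Hypothetical, shortened notes for illustration (already lower-cased)
df <- data.frame(
  Notes = c("plan manager: susan long", "plan managed - ndsp", "self managed\nnatalia o/t"),
  stringsAsFactors = FALSE
)

# Vectorised: grepl() gives one TRUE/FALSE per row, ifelse() picks a value per row
df$Plan <- ifelse(grepl("self manag", df$Notes, fixed = TRUE), "S", NA_character_)

# Equivalent explicit loop: iterate over row indices, test one row at a time,
# and assign with <- rather than compare with ==
df$Plan_loop <- NA_character_
for (row in seq_len(nrow(df))) {
  if (grepl("self manag", df$Notes[row], fixed = TRUE)) {
    df$Plan_loop[row] <- "S"
  }
}
```

Both columns should end up the same; the ifelse() version is usually the more idiomatic one.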

Hi Ibrahim,

To build on the previous answer, something like this might work well for you:

library(tidyverse)

df <- data.frame(First = c("Mark", "John", "Anthony"),
                 Last = c("Joshua", "Wellberg", "Kennedy"),
                 Notes = c("DIS# 430477541 Plan Manager: Susan Long McArthur Community Care susan.long@mcarthur.com.au DOB: 19/04/1963 NDIS# 430477541 Start - 15/11/2018 Finish - 15/11/2019",
                           "Plan managed – national disability support partners – invoices@ndsp.com.au",
                           "Self managed
Natalia O/T
NDIS number - 431141456"),
                 NDIS = c(NA,NA,NA),
                 Col = c(NA, NA, NA))

df$Notes <- tolower(df$Notes)

df
#>     First     Last
#> 1    Mark   Joshua
#> 2    John Wellberg
#> 3 Anthony  Kennedy
#>                                                                                                                                                               Notes
#> 1 dis# 430477541 plan manager: susan long mcarthur community care susan.long@mcarthur.com.au dob: 19/04/1963 ndis# 430477541 start - 15/11/2018 finish - 15/11/2019
#> 2                                                                                        plan managed – national disability support partners – invoices@ndsp.com.au
#> 3                                                                                                                self managed\nnatalia o/t\nndis number - 431141456
#>   NDIS Col
#> 1   NA  NA
#> 2   NA  NA
#> 3   NA  NA

df %>%
  mutate(Plan = case_when(
    # When "self manag" is found, fill Plan with "S"
    grepl(pattern = "self manag", x = Notes) ~ "S",
    # Same, with "P"
    grepl(pattern = "plan manag", x = Notes) ~ "P"
    
  ))
#>     First     Last
#> 1    Mark   Joshua
#> 2    John Wellberg
#> 3 Anthony  Kennedy
#>                                                                                                                                                               Notes
#> 1 dis# 430477541 plan manager: susan long mcarthur community care susan.long@mcarthur.com.au dob: 19/04/1963 ndis# 430477541 start - 15/11/2018 finish - 15/11/2019
#> 2                                                                                        plan managed – national disability support partners – invoices@ndsp.com.au
#> 3                                                                                                                self managed\nnatalia o/t\nndis number - 431141456
#>   NDIS Col Plan
#> 1   NA  NA    P
#> 2   NA  NA    P
#> 3   NA  NA    S

Created on 2021-03-23 by the reprex package (v1.0.0)
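
One thing to keep in mind with case_when(): conditions are checked in order and the first match wins, and rows that match none of them get NA. If you would rather have an explicit label for those rows, a fallback condition can be added at the end. A minimal sketch, assuming made-up Notes values and an arbitrary "Unknown" label (neither comes from the original data):

```r
library(dplyr)

# Hypothetical, shortened notes for illustration (already lower-cased)
df <- data.frame(
  Notes = c("plan managed - ndsp", "self managed", "agency managed"),
  stringsAsFactors = FALSE
)

df <- df %>%
  mutate(Plan = case_when(
    grepl("self manag", Notes, fixed = TRUE) ~ "S",
    grepl("plan manag", Notes, fixed = TRUE) ~ "P",
    # Fallback for rows matching neither pattern; "Unknown" is an arbitrary label
    TRUE ~ "Unknown"
  ))
```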

Thank you! Much appreciated.
