Creating a new column in R (by looking at what a different column says)

Hi! I'm very new to R and I had a quick question and I was hoping someone may know the answer. I am trying to create a new column in a data set and I was hoping to look within a different column. For example, if the already existing Date column has "a certain date", I want "Yes" to be printed in my new column. This is what I have so far but I'm getting an error message that the object Date is not found. I've tried a couple of different things but to no avail. Thank you in advance if anyone is able to provide any advice or direction!

my_data_inflow$SourceFile <- c("Inflow.csv")
my_data_inflow$Month <- c(my_data_inflow, if("7-Oct-19" %in% Date) {
print("Yes")
} else {
print("No")
})

You can use the vectorized function if_else() from the dplyr package directly:

my_data_inflow$Month <- if_else("7-Oct-19" %in% my_data_inflow$Date, "yes", "no")

As a side-note, depending on the next steps, it may not be ideal to store the result as text "yes"/"no". If you simply write:

my_data_inflow$Month <- "7-Oct-19" %in% my_data_inflow$Date

You get a logical (TRUE/FALSE) variable that can easily be reused for other operations.

EDIT: I forgot to explain the problem. Date is indeed not found, because it is a column of the data frame and not an object of its own. You need to tell R where to find it by writing my_data_inflow$Date. It is different if you use the tidyverse function mutate(), which is perhaps what you had in mind:

my_data_inflow <- my_data_inflow %>%
  mutate(Month = if_else("7-Oct-19" %in% Date, "yes", "no"))

But that is because mutate() and other dplyr functions use data masking, which lets you refer to columns by their bare names while the data frame is implied. This only works inside those functions.
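To make the difference concrete, here is a minimal sketch using only base R. The toy data frame and its values are made up for illustration. with() is a base R function that evaluates an expression with the data frame's columns visible by name. It is shown only as an analogy for how mutate() lets you write Date on its own, not as a description of what dplyr does internally.

```r
# Hypothetical toy data; not the original Inflow.csv.
toy <- data.frame(Date = c("7-Oct-19", "8-Oct-19"), stringsAsFactors = FALSE)

# A bare column name is looked up as an ordinary object, so this fails.
# tryCatch() captures the error message instead of stopping the script.
bare_lookup <- tryCatch(
  Date == "7-Oct-19",
  error = function(e) conditionMessage(e)
)
print(bare_lookup)

# Qualifying the column with `$` works anywhere.
qualified <- toy$Date == "7-Oct-19"

# with() makes the columns visible by name inside its second argument.
masked <- with(toy, Date == "7-Oct-19")

# Both approaches give the same per-row logical vector.
print(identical(qualified, masked))
```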

Wow, thank you so much for your help @AlexisW! I did that and it's almost perfect however it prints yes for each row in the new Month column even though the date isn't "7-Oct-19" each time. Do I maybe have the Date column assigned to the wrong variable type? For some reason, the if_else command isn't reading into the existing Date column. Do you have a suggestion for this? Thank you for your thoughtful and descriptive answer above, I really appreciate it!

Oh no, I totally missed that, sorry! It actually makes sense: "7-Oct-19" %in% my_data_inflow$Date tests whether that date appears anywhere in the column. It does, so we get a single TRUE, and every row ends up as yes.

The correct question to ask is whether the Date in each particular row is == "7-Oct-19":

my_data_inflow$Month <- if_else(my_data_inflow$Date == "7-Oct-19", "yes", "no")
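
For anyone who wants to see the two tests side by side, here is a small sketch on made-up data. The toy data frame is an assumption for illustration. Base R's ifelse() is used so the snippet runs without loading dplyr; for this simple case dplyr's if_else() is called with the same three arguments.

```r
# Hypothetical toy data; the dates are made up for illustration.
toy <- data.frame(
  Date = c("7-Oct-19", "8-Oct-19", "7-Oct-19", "9-Oct-19"),
  stringsAsFactors = FALSE
)

# `%in%` with the target on the left asks one question about the whole
# column, so it returns a single TRUE or FALSE.
in_whole_column <- "7-Oct-19" %in% toy$Date

# `==` compares every row to the target, so it returns one value per row.
per_row <- toy$Date == "7-Oct-19"

# Turn the per-row logical vector into text labels.
toy$Month <- ifelse(per_row, "yes", "no")
print(toy)

# A logical vector can be reused directly, e.g. to count matching rows.
n_matches <- sum(per_row)
print(n_matches)
```

Only the rows whose Date equals "7-Oct-19" get yes, and summing the logical vector counts them.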

Thank you so so much! You have been so helpful!
