Create a new column based on conditions

Hello Everyone, I am working on two datasets. Below are the details -

Dataset 1:

Date toyCategory
01-01-2020 toy1
25-08-2017 toy2
04-04-2016 toy2
04-01-2015 toy3
15-09-2016 toy1

Dataset 2:

toyCategory Rate StartDate EndDate
toy3 0.9 01-01-2015 31-12-2015
toy2 1.1 16-08-2016 31-08-2017
toy1 1.2 01-01-2016 15-10-2016
toy1 1 16-10-2016 31-12-2999
toy2 1.24 15-02-2015 15-08-2016

I want to create a new column in 1st dataset, which will contain the "Rate" from the 2nd dataset. I want to pull the rates for each toy category present in dataset 1 based on a condition -

Date in dataset 1 should fall between the Start Date & End Date given in dataset 2.

Accordingly, the "Rate" should be filled in the new column in Dataset 1.

Please provide a solution to this.

Thank a lot in advance.

Fuzzyjoin package Tidy way to range join tables, on an interval of dates

Thank you for the response. But I'm not exactly able to create a new column. Can you please replicate this on the above provided datasets.

I can provide an example using fuzzyjoin if you first turn your example into a reprex.
Here is a guide for how to do that. FAQ: How to do a minimal reproducible example ( reprex ) for beginners

1 Like

Here you go!
Dataset 1:
data.frame(
stringsAsFactors = FALSE,
Key = c(1L, 2L, 3L, 4L, 5L),
Date = c("2020-01-01",
"2017-08-25","2016-04-04","2015-01-04","2016-09-15"),
toyCategory= c("toy1", "toy2", "toy2", "toy3", "toy1")
)

data.frame(
stringsAsFactors = FALSE,
Description = c("toy3", "toy2", "toy1", "toy1", "toy2"),
Num_Rate = c(0.9, 1.1, 1.2, 1, 1.24),
EffectiveDate = c("2015-01-01","2016-08-16",
"2016-01-01","2016-10-16","2015-02-15"),
ExpiryDate = c("2015-12-31","2017-08-31",
"2016-10-15","2999-12-31","2016-08-15")
)

(dfa <- data.frame(
  stringsAsFactors = FALSE,
  Key = c(1L, 2L, 3L, 4L, 5L),
  Date = c("2020-01-01",
           "2017-08-25","2016-04-04","2015-01-04","2016-09-15"),
  toyCategory= c("toy1", "toy2", "toy2", "toy3", "toy1")
))

(dfb <- data.frame(
  stringsAsFactors = FALSE,
  Description = c("toy3", "toy2", "toy1", "toy1", "toy2"),
  Num_Rate = c(0.9, 1.1, 1.2, 1, 1.24),
  EffectiveDate = c("2015-01-01","2016-08-16",
                    "2016-01-01","2016-1a-16","2015-02-15"),
  ExpiryDate = c("2015-12-31","2017-08-31",
                 "2016-10-15","2999-12-31","2016-08-15")
))

library(fuzzyjoin)

fuzzy_left_join(dfa,
                dfb,
                by=c("toyCategory"="Description",
                     "Date"="EffectiveDate",
                     "Date"="ExpiryDate"),
                match_fun=list(`==`,`>=`,`<=`)
                )

Thanks a lot. The solution works.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.