I have a data-frame that shows the most recent flight purchases of a product (denoted by purchase_date ) AND the total # of returned bookings of the time period before that denoted by (return_date).
I am trying to create a variable that shows the percentage of re-bookings based on the most recent purchases and the total most recent returns in a given time period.
I have two data-frames:
DF1 - shows the total bookings of a product based on the product's return date.
product return_date total_bookings <fct> <date> <dbl> # product_A 2018-05-26 4 # product_A 2019-01-24 9 # product_A 2019-10-25 5 # product_B 2018-09-18 12 # product_B 2018-10-25 30 # product_B 2019-03-16 4
DF2 - shows the total most recent bookings of a product for those who recently returned from a trip (i.e. date_last_return). The idea is that it shows the number of people who purchased a product again since their last return.
product date_last_return purchase_date bookings <fct> <date> <date> <dbl> # product_A 2018-10-25 2019-02-12 2 # product_A 2019-01-24 2019-06-14 2 # product_A 2018-05-26 2019-06-28 2 # product_B 2018-09-18 2018-11-01 2 # product_B 2018-10-25 2019-02-06 1 # product_B 2019-03-16 2019-06-13 2
What I'm trying to do is to aggregate all the total "returned bookings" from DF1 based on if the return date was 6 months before the purchase_date.
product purchase_date re-bookings total_6month_return 1 product_A 2019-02-12 2 9 2 product_A 2019-06-14 2 9 3 product_A 2019-06-28 2 9 4 product_B 2018-11-01 2 42 5 product_B 2019-02-06 1 42 6 product_B 2019-06-13 2 4
I am having great difficultly creating a way to aggregate DF1 by the purchase_date in DF2 and can't seem to find a solution on how to accomplish this.
If you have any coding suggestions in R on how I could do this, I would greatly appreciate it. I'm thinking a for loop would be appropriate but I simply do not know