I am trying to join two data frames to create a tidy dataset. One table is a time series of days, the other table has date ranges and corresponding prices. Ideally the output table would be a column of dates, and then prices for that day on different indices. I am pretty certain the fuzzyjoin package is the best way to do this, however it's resulting in multiple rows in my dataset for each day. Here's a reprex of what I am seeing:
library(tidyverse)
library(fuzzyjoin)
#df1 is a vector of dates
dates <- as_date(c(as_date("2020-01-01"):as_date("2020-12-31")))
dates <- tibble(
OPR_DATE = dates)
#df2 is a tibble of start dates, end dates, index names, and prices
prices <- tibble(startdate = as_date(c("2020-01-01", "2020-01-01", "2020-01-10")),
enddate = as_date(c("2020-01-31", "2020-01-09", "2020-01-31")),
indexname = c("A", "B", "B"),
price = c(2.25, 2.10, 2.15))
#pivoting and joining the tables to create a tidy dataset
wide_prices <- prices %>%
pivot_wider(id_cols = c(startdate, enddate),
names_from = indexname,
values_from = price) %>%
arrange(startdate)
#Join the now wide price table to the dates vector. This should create a 3 column tibble with OPR_DATE, Index A, Index B with prices as the values
join <- dates %>%
fuzzy_inner_join(y = wide_prices,
by = c("OPR_DATE" = "startdate", "OPR_DATE" = "enddate"),
match_fun = list(`>=`, `<=`)) %>%
select(OPR_DATE, A, B)
It's similar to this link, but I want a nice tibble of 3 columns with a price value for each date covered. Instead I get a group method error.
Feeling pretty embarrassed I can't crack this nut.