Hi guys,
I am fairly new to R. I am trying to replicate some code that merges two financial datasets, one at annual frequency (data.comp.a) and one at monthly frequency (data.crsp.cln).
I have downloaded both datasets and would like to merge them.
The problem is that in the merged data frame the monthly variables are populated, but the annual ones are all NA.
Could you please help me with this?
Here is the code I am using:
data.both.m <- data.comp.a %>%
  mutate(Date = datadate + (18-month(datadate))/12) %>% # map to next year June period when data is known (must occur in previous year)
  merge(data.crsp.cln, ., by=c("Date", "PERMNO"), all.x=TRUE, allow.cartesian=TRUE) %>%
  arrange(PERMNO, Date, desc(datadate)) %>%
  distinct(PERMNO, Date, .keep_all = TRUE) %>% # drop older datadates (must sort by desc(datadate))
  mutate(datadate = yearmon(datadate)) %>%
  arrange(Date, PERMNO)
I suspect the cause is the following, but I have not been able to fix it: the date variable in data.comp.a, "datadate", is formatted like "2004-01-31", whereas the date variable in data.crsp.cln, "Date", is formatted like "Jan 2004".
I tried converting datadate to the same format, but mutate returns an error:
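To illustrate the mismatch I suspect, here is a minimal sketch with made-up values. It assumes the "Jan 2004" column is a yearmon from the zoo package and that datadate is a plain Date; I have not confirmed either for the real data. The variable names d and ym are only for illustration.

```r
library(zoo)  # assumption: yearmon comes from the zoo package

d  <- as.Date("2004-01-31")   # same form as datadate in data.comp.a
ym <- as.yearmon("2004-01")   # January 2004 as a yearmon, like Date in data.crsp.cln

class(d)     # "Date"
class(ym)    # "yearmon"
unclass(d)   # stored as days since 1970-01-01
unclass(ym)  # stored as year plus a fraction of a year

# The shift used in my mutate(), applied to a January date (month = 1):
d_shifted  <- d + (18 - 1) / 12    # still a Date, moved by a fraction of a day
ym_shifted <- ym + (18 - 1) / 12   # a yearmon 17 months later

class(d_shifted)
format(ym_shifted, "%Y-%m")
```

If this is right, the Date key built from datadate would never equal any yearmon value in data.crsp.cln, which would explain why every annual column comes back as NA.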
Error in mutate_impl(.data, dots) :
Evaluation error: * not defined for "Date" objects.
This is the code I used:
mutate(datadate = yearmon(datadate))
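The error can be reproduced outside mutate with a single made-up date. This sketch assumes yearmon() here is zoo::yearmon, which as far as I can tell expects a numeric year rather than a Date. It also shows as.yearmon() for comparison, which does accept a Date, though I have not verified that using it makes the merge work.

```r
library(zoo)  # assumption: yearmon() and as.yearmon() come from zoo

d <- as.Date("2004-01-31")  # made-up value in the same form as datadate

# Capture the error message instead of stopping the script
msg <- tryCatch(yearmon(d), error = function(e) conditionMessage(e))
msg

# as.yearmon() converts a Date to the year-month that contains it
converted <- as.yearmon(d)
class(converted)
as.numeric(converted)
```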
The PERMNO (id) variable is common to both datasets.
Thank you for your support!