Hello, I am quite new to R.
I have a question about joins. I have two data frames, each with 3 columns and exactly the same number of rows:
steps_hour - Id, ActivityHour, TotalSteps
cals_hour - Id, ActivityHour, Calories
Here is an example (although I have also tried specifying the join using by = "Id" and "ActivityHour"):
step_cal_2 <- steps_hour %>% full_join(cals_hour)
I have tried joining them using all of the join functions (inner, left, right, full) on both the Id column and the ActivityHour column. I have also tried converting ActivityHour to datetime using lubridate. However, whichever way I do it, the first column always duplicates itself multiple times.
The dplyr reference material online says: "For all joins, rows will be duplicated if one or more rows in the x matches with multiple rows in y". I can't think of a way around this apart from deleting the ActivityHours rows, but then the data would be all but useless.
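A minimal sketch of what that quoted sentence describes, using made-up toy values (hypothetical, not the real data). It assumes each Id appears once per ActivityHour in both tables. Joining on Id alone matches every row for an Id against every row for the same Id in the other table, while joining on both columns together with by = c("Id", "ActivityHour") matches one row to one row. The count() check at the end is one way to see whether a key pair is repeated within a table.

```r
library(dplyr)

# Toy data (hypothetical values): two Ids, two hours each
hours <- c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM")

steps_hour <- tibble(
  Id = c(1, 1, 2, 2),
  ActivityHour = rep(hours, times = 2),
  TotalSteps = c(373, 160, 0, 25)
)

cals_hour <- tibble(
  Id = c(1, 1, 2, 2),
  ActivityHour = rep(hours, times = 2),
  Calories = c(81, 61, 50, 55)
)

# Join on Id only: each Id has 2 rows in both tables, so 2 x 2 rows per Id.
# (Recent dplyr versions also warn about a many-to-many relationship here.)
by_id <- full_join(steps_hour, cals_hour, by = "Id")
nrow(by_id)    # 8

# Join on both key columns at once: one match per (Id, ActivityHour) pair.
by_both <- full_join(steps_hour, cals_hour, by = c("Id", "ActivityHour"))
nrow(by_both)  # 4

# Diagnostic: list any (Id, ActivityHour) pairs that occur more than once.
dup_keys <- steps_hour %>%
  count(Id, ActivityHour) %>%
  filter(n > 1)
nrow(dup_keys) # 0
```

If the real data behave like this toy case, joining on both columns at once may avoid the duplication. If the count() check returns rows on the real tables, the repeats are in the key pairs themselves and would need to be resolved before joining.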
I have spent over 12 hours on this now and will just have to use copy and paste in Excel, as I have a deadline. But I am curious how to solve this, so thanks in advance.