Could you tell me why the dates are coming off weird? I will insert an executable code below. You may notice that date1 and date2 have the wrong dates.
df<-read_excel('C:/Users/Desktop/df.xlsx')
df<-subset(df,df$date2<df$date1)
dim_data<-dim(df)
day<-c(seq.Date(from = as.Date(df$date2[1]),
to = as.Date(df$date2[dim_data[1]]),
by = "1 day"))
df_grouped<-matrix(nrow = 9,ncol=6)
colnames(df_grouped)<-c("Id","date1","date2","Week","D","DR01")
for (k in 1:9) {
df_grouped[k,]<-c(subset(df,date2==day[k])$Id[1],
subset(df,date2==day[k])$date1[1],
subset(df,date2==day[k])$date2[1],
subset(df,date2==day[k])$Week[1],
subset(df,date2==day[k])$D[1],
sum(subset(df,date2==day[k])$DR01))
}
> df_grouped
Id date1 date2 Week D DR01
[1,] "1" "1626739200" "1617321600" "Friday" NA "21"
[2,] "1" "1626739200" "1617408000" "Saturday" NA "24"
[3,] NA NA NA NA NA "0"
[4,] NA NA NA NA NA "0"
[5,] NA NA NA NA NA "0"
[6,] NA NA NA NA NA "0"
[7,] "1" "1626739200" "1617840000" "Thursday" "Hol" "21"
[8,] "1" "1626739200" "1617926400" "Friday" NA "11"
[9,] "1" "1626739200" "1618012800" "Saturday" NA "14
df_grouped is not a data frame but a matrix and matrices can't contain mixed class variables so you get that values when date-time variables get coerced into text. Maybe you are trying to do something like this?
I need the order to be shown in the sequence: id, date1, date2, Week, D and DR01. As it appeared, it is date 2 before id and date
is it possible to leave the date formats in dd-mm-yyyy?
I need to create a variable, that can be df_grouped for this you created, because I'll need to call this variable in another part of the code, how does it look then?
The code below is a modified version of @andresrcs's code that rearranges the columns and stores the result in the variable df_grouped. It also shows how to change the date columns into character columns with a particular format. Since the formatting results in characters, those columns cannot be used in calculations as dates without transforming them back. If you want the dates shown in a particular format, I suggest you do that at the last step after all calculations are complete.
By the way, the original spreadsheet had the dates in the m/d/Y format when I opened it, I suppose because Google detected my US locale. That shows the dangers of using ambiguous date formats.
df <- readxl::read_excel('~/R/Play/df.xlsx')
df_grouped <- df %>%
mutate(across(starts_with("date"), as.Date)) %>%
group_by(date2) %>%
summarise(Id = first(Id),
date1 = first(date1),
Week = first(Week),
D = first(D),
DR01 = sum(DR01)) %>%
select(Id,date1,date2,Week,D,DR01)
`summarise()` ungrouping output (override with `.groups` argument)
df_grouped
# A tibble: 6 x 6
Id date1 date2 Week D DR01
<dbl> <date> <date> <chr> <chr> <dbl>
1 1 2021-07-20 2021-04-02 Friday NA 21
2 1 2021-07-20 2021-04-03 Saturday NA 24
3 1 2021-07-20 2021-04-08 Thursday Hol 21
4 1 2021-07-20 2021-04-09 Friday NA 11
5 1 2021-07-20 2021-04-10 Saturday NA 14
6 1 2021-07-20 2021-08-01 Thursday NA 13
df_grouped <- df_grouped %>% mutate(date1=format(date1,"%d/%m/%Y"),
date2=format(date2,"%d/%m/%Y"))
df_grouped
# A tibble: 6 x 6
Id date1 date2 Week D DR01
<dbl> <chr> <chr> <chr> <chr> <dbl>
1 1 20/07/2021 02/04/2021 Friday NA 21
2 1 20/07/2021 03/04/2021 Saturday NA 24
3 1 20/07/2021 08/04/2021 Thursday Hol 21
4 1 20/07/2021 09/04/2021 Friday NA 11
5 1 20/07/2021 10/04/2021 Saturday NA 14
6 1 20/07/2021 01/08/2021 Thursday NA 13