dplyr lead and lag vs. min and max by group

Hi everyone,
I am very new to R (coming from many years of using SAS), and I am trying to understand some of its functions.
I am confused by the lead and lag functions in dplyr. In the following code, I expected them to return the same results as the first and last visit dates (min and max), since I am grouping by ID and then arranging by visit date:

library(dplyr)

# 10 IDs with 4 visits each; one random visit date per row
df <- data.frame(ID = rep(1:10, each = 4),
                 Visit_Date = as.Date(sample(1:365, 40, replace = TRUE), origin = "1970-01-01"),
                 stringsAsFactors = TRUE)

df2 <- data.frame(df %>%
  group_by(ID) %>%
  arrange(Visit_Date) %>%
  mutate(last_visit = lag(Visit_Date)) %>%
  mutate(first_visit = lead(Visit_Date)) %>%
  mutate(min_visit = min(Visit_Date)) %>%
  mutate(max_visit = max(Visit_Date)) %>%
  ungroup() %>%
  arrange(ID))

Could anyone explain this to me, please?

OK, I now get it. Please disregard my previous question.

If you found a solution, could you please post it here too, for anyone who runs into a similar problem in the future?
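
For anyone landing here later, the explanation is most likely this: lag() and lead() do not summarise a group. They shift the column by one row, so each row gets the value from the row before it (lag) or after it (lead) within its ID. min()/max(), or first()/last() on sorted data, instead return one value per group that is repeated on every row. Below is a minimal sketch on made-up data. The data frame `visits` and the column names `previous_visit`, `next_visit`, `first_visit` and `last_visit` are my own illustrative names, not from the original post.

```r
library(dplyr)

# Hypothetical toy data: two IDs with three visits each, deliberately unsorted
visits <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2),
  Visit_Date = as.Date(c("2020-03-01", "2020-01-15", "2020-02-10",
                         "2020-06-05", "2020-04-20", "2020-05-01"))
)

result <- visits %>%
  group_by(ID) %>%
  # Sort by date within each ID
  arrange(Visit_Date, .by_group = TRUE) %>%
  mutate(
    previous_visit = lag(Visit_Date),    # value from the row above; NA on the first row of each ID
    next_visit     = lead(Visit_Date),   # value from the row below; NA on the last row of each ID
    first_visit    = first(Visit_Date),  # earliest date, repeated on every row of the ID
    last_visit     = last(Visit_Date)    # latest date, repeated on every row of the ID
  ) %>%
  ungroup()

print(result)
```

On sorted data, first_visit and last_visit match min(Visit_Date) and max(Visit_Date), while previous_visit and next_visit change from row to row. So lag/lead are the right tools for things like "days since the previous visit", and min/max (or first/last) are the right tools for the first and last visit per ID.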
