I know there should be a simple solution to this, but I can't seem to figure this bit out. What I want to is to try to summarize or group by the date in a data frame and having the total values in each count column to be aligned to each of the dates. See code below.
data <- data.frame(letter = c("a", "b", "c", "d", "e"),
date = c("2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03", "2020-01-03"),
count = c(1,NA, 3, 2, NA),
count2 = c(2, 1, NA, NA, 2)
)
print(data)
# letter date count count2
# 1 a 2020-01-01 1 2
# 2 b 2020-01-02 NA 1
# 3 c 2020-01-02 3 NA
# 4 d 2020-01-03 2 NA
# 5 e 2020-01-03 NA 2
data2 <- data%>%
distinct(date, .keep_all = TRUE) %>%
arrange(date)
print(data2)
# letter date count count2
# 1 a 2020-01-01 1 2
# 2 b 2020-01-02 NA 1
# 3 d 2020-01-03 2 NA
#What it should look like
# letter date count count2
# 1 a 2020-01-01 1 2
# 2 b 2020-01-02 3 1
# 3 d 2020-01-03 2 2
It looks like what distinct is doing is selecting the first date row and only using that value in the count fields, but I can't seem to figure out where to get the summation. Thanks for your help.