Hi,
Thanks for your reply.
I've tried my best to simplify the ask and use code formatting.
So the Raw data I receive only has two headers and would look similar to the below.
df <- tibble(
Report_date = c("01-01-2022", "01-01-2022", "01-01-2022", "01-01-2022", "01-01-2022", "02-01-2022", "02-01-2022", "02-01-2022", "02-01-2022", "02-02-2022", "03-01-2022", "03-01-2022", "03-01-2022", "03-01-2022", "03-01-2022"),
City = c("London","Edinburgh","Edinburgh","Chester", "Wick", "London", "London", "Brighton", "Chester", "Swindon", "Edinburgh", "Wick", "Wick", "Brighton", "Swindon")
)
df
Report_date City
1 01-01-2022 London
2 01-01-2022 Edinburgh
3 01-01-2022 Edinburgh
4 01-01-2022 Chester
5 01-01-2022 Wick
6 02-01-2022 London
7 02-01-2022 London
8 02-01-2022 Brighton
9 02-01-2022 Chester
10 02-02-2022 Swindon
11 03-01-2022 Edinburgh
12 03-01-2022 Wick
13 03-01-2022 Wick
14 03-01-2022 Brighton
15 03-01-2022 Swindon
What i am looking to achieve is have the rows as the cities, the columns as the dates, and the values as the running cumulative total of City. So it will look something like this
City 01-01-2022 02-01-2022 03-01-2022
1 Brighton NA 1 2
2 Chester 1 2 2
3 Edinburgh 2 3 4
4 London 1 3 3
5 Swindon NA 1 2
6 Wick 1 1 4
My attempt to get this to work is
df3 <- df %>%
mutate(numeric_count = 1)%>%
group_by(City)%>%
arrange(Report_date)%>%
mutate(CumulativeSum = cumsum(numeric_count))%>%
group_by(City, Report_date)%>%
summarise(n = n(),
Cumulativetotal = sum(CumulativeSum))%>%
group_by(City)%>%
arrange(Report_date)%>%
mutate(TotalSum = cumsum(n))%>%
select(-n, -Cumulativetotal)%>%
group_by(Report_date)%>%
pivot_wider(names_from=Report_date, values_from=TotalSum)%>%
arrange(City)
This gives me the following output
City 01-01-2022 02-01-2022 02-02-2022 03-01-2022
1 Brighton NA 1 NA 2
2 Chester 1 2 NA NA
3 Edinburgh 2 NA NA 3
4 London 1 3 NA NA
5 Swindon NA NA 1 2
6 Wick 1 NA NA 3
But I essentially need the NA's to be filled with the value to the left. its ok if the NA's before the first value remain though