Add and mutate columns in data frame reflecting cumulative data of existing variables

Hi
I'm still very new to R so apologies for tedious questions.
I have the dataframe:

  Code   Country     Date.x     Continent NewCases NewDeaths
1  ARG Argentina 2020-03-14 South America        3         1
2  ARG Argentina 2020-03-14 South America        3         1
3  ARG Argentina 2020-03-14 South America        3         1
4  ARG Argentina 2020-03-14 South America        3         1
5  ARG Argentina 2020-03-14 South America        3         1
6  ARG Argentina 2020-03-14 South America        3         1
  Recovered     Date.y NewTests Population    GDP GDPCapita
1         1 2020-03-04        0   44494502 637486     14400
2         1 2020-03-06        0   44494502 637486     14400
3         1 2020-03-07        0   44494502 637486     14400
4         1 2020-03-08        0   44494502 637486     14400
5         1 2020-03-09        0   44494502 637486     14400
6         1 2020-03-16        0   44494502 637486     14400

and I'm looking to add 4 new cumulative variables, "Cumulative_Cases", "Cumulative_Deaths", "Cumulative_Recovered" & "Cumulative_Tests", for each country up to the date of the observation.

My understanding is that I first need to arrange the dataframe by country and date, then group_by country and mutate each new variable using the cumsum function. Is that right?

Yes, you're on the right track.
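
A minimal sketch of that approach, assuming dplyr is installed. The data frame `covid` and its values are made up for illustration, and the single `Date` column is an assumption rather than your real data:

```r
library(dplyr)

# Made-up example data: column names follow the question, values are invented
covid <- data.frame(
  Country   = c("Argentina", "Argentina", "Argentina", "Brazil", "Brazil"),
  Date      = as.Date(c("2020-03-16", "2020-03-14", "2020-03-15",
                        "2020-03-14", "2020-03-15")),
  NewCases  = c(5, 3, 4, 10, 20),
  NewDeaths = c(1, 0, 2, 1, 3),
  Recovered = c(2, 0, 1, 0, 5),
  NewTests  = c(50, 30, 40, 100, 200)
)

covid_cum <- covid %>%
  arrange(Country, Date) %>%   # sort rows so each country's dates are in order
  group_by(Country) %>%        # restart the running totals for every country
  mutate(
    Cumulative_Cases     = cumsum(NewCases),
    Cumulative_Deaths    = cumsum(NewDeaths),
    Cumulative_Recovered = cumsum(Recovered),
    Cumulative_Tests     = cumsum(NewTests)
  ) %>%
  ungroup()                    # drop the grouping once the totals are computed

covid_cum
```

Note that cumsum() propagates missing values: if a column contains NA, every later running total within that country will also be NA, so you may want to decide how to handle NAs first.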

P.S. You had a good instinct to share a sample of the data, and the formatting is appreciated! However, data pasted as printed console output is hard to copy into an R session, so it is difficult for others to reproduce your problem.

You can read this guide to see how to share data in a copy-and-paste friendly way, using the datapasta package or base::dput():
FAQ: How to do a minimal reproducible example (reprex) for beginners
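
As a rough illustration of the base::dput() route, here is a small sketch. The data frame `df` is a made-up stand-in, not the real data; in practice you would call dput() on a small slice of your own data frame, for example dput(head(MASTER_COVID)):

```r
# Made-up stand-in for a few rows of the real data frame
df <- data.frame(
  Code     = c("ARG", "ARG"),
  Country  = c("Argentina", "Argentina"),
  Date     = as.Date(c("2020-03-14", "2020-03-15")),
  NewCases = c(3, 4)
)

# dput() prints R code that recreates the object; paste that output into your post
dput(df)

# Round trip: evaluating the printed code rebuilds an equivalent data frame,
# which is what a reader does when they paste it into their own session
txt     <- capture.output(dput(df))
rebuilt <- eval(parse(text = txt))
all.equal(rebuilt, df)
```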

Thanks @nirgrahamuk, I really appreciate the tips on how to provide a better example. I'll definitely keep them in mind for next time!
I've tried using this code to arrange my dataframe:

group_by(MASTER_COVID, Country) %>%
               arrange(date, .by_group = TRUE)

But I am receiving the error

Error: arrange() failed at implicit mutate() step. 
x Could not create a temporary column for `..2`.
i `..2` is `date`.

Could you enlighten me as to why?

I thought you had Date.x and Date.y, but now you reference date?

Yes, sorry. I decided to streamline my data by putting all the dates into one column.

In principle there's no problem with the approach:

library(tidyverse)
# Same pattern on a built-in dataset: sort by mpg within each gear/cyl group
mtcars %>%
  group_by(gear, cyl) %>%
  arrange(mpg, .by_group = TRUE)

Therefore if you have problems with your script, it must be data related.
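
One possible data-related cause, offered as a guess because the thread never confirms it: R is case-sensitive, so if the combined column is actually named `Date` (or anything other than `date`), then `date` inside arrange() does not refer to a column and can instead resolve to base R's date() function. A quick check, using a made-up stand-in for MASTER_COVID whose column names are assumptions:

```r
library(dplyr)

# Hypothetical stand-in for MASTER_COVID; the real column names may differ
master_covid <- data.frame(
  Country  = c("Brazil", "Argentina", "Argentina"),
  Date     = as.Date(c("2020-03-14", "2020-03-15", "2020-03-14")),
  NewCases = c(10, 4, 3)
)

names(master_covid)              # list the actual column names
"date" %in% names(master_covid)  # is there a column spelled exactly `date`?
is.function(date)                # `date` also exists as a base R function

# Using the exact column name sorts within each country as intended
sorted <- master_covid %>%
  group_by(Country) %>%
  arrange(Date, .by_group = TRUE)

sorted
```

If names() shows a different spelling in your data, use that exact name in arrange(), and assign the result to an object so the sorted data frame is kept.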
