Loop for cumsum

Hello,
I have a dataframe with values for Ages from 0 to 110 and I would like to create a new dataframe or to add a column in my existing one summing the values of one column.

For example: for age = 2 I want the value to be the sum of the values for age = 0, 1 and 2 of my existing data, and for age = 110 I want it to be the sum of all the values of my existing data.

I tried this code but all I get is an empty vector.

i <- 0
repeat {
p_lifetable_1955 <-cumsum (data1955$Fem_px[0 : i])
  i <- i + 1
  if(i == 111) break
}

Could someone tell me what is going wrong?

Thanks in advance

The tidy way of computing a running sum can be coded as follows:

library(tidyverse)

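# Example data: ages 0 to 110 (111 rows), as in the question
data1955 <- tibble(age = 0:110,
                   Fem_px = sample(1:10, 111, replace = TRUE))

# cumsum() returns the running total at every row
data1955 %>%
  mutate(running_sum = cumsum(Fem_px))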

A quick addendum to @melih_guven's terrific answer - make sure you sort your rows by age before summing, otherwise the running sum will follow the row order rather than the age order and you will get unexpected results.
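
To illustrate the sorting point, here is a minimal sketch on made-up data. The tibble `toy` and its values are hypothetical; only the column names `age` and `Fem_px` come from the question.

```r
library(dplyr)

# Hypothetical toy data: ages deliberately out of order
toy <- tibble(age = c(2, 0, 1), Fem_px = c(30, 10, 20))

# Without sorting, the running sum follows row order, not age order
unsorted <- toy %>%
  mutate(running_sum = cumsum(Fem_px))

# Sorting by age first gives, for each age, the total over all younger ages too
sorted <- toy %>%
  arrange(age) %>%
  mutate(running_sum = cumsum(Fem_px))

unsorted
sorted
```

In `sorted`, the row for age 2 holds the sum over ages 0, 1 and 2, which is what the question asks for.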

Also, for what it's worth, running a loop over a dataframe is typically not a good idea. Loops can be slow in R, and running them over dataframes can lead to errors that are difficult to debug. Using vectorized functions, such as cumsum() or those provided by dplyr or data.table, is almost always a better choice.
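
For what it's worth, the vectorized version does not need any package. The sketch below uses base R only; the data frame is a hypothetical stand-in for the real `data1955`, with random values.

```r
# Hypothetical stand-in for data1955: ages 0 to 110 with made-up values
set.seed(1)
data1955 <- data.frame(age = 0:110,
                       Fem_px = sample(1:10, 111, replace = TRUE))

# One vectorized call replaces the whole loop
data1955$running_sum <- cumsum(data1955$Fem_px)

# Check: the value at age 2 equals the sum over ages 0, 1 and 2
at_age_2 <- data1955$running_sum[data1955$age == 2]
by_hand  <- sum(data1955$Fem_px[data1955$age <= 2])
at_age_2 == by_hand
```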

Loops are not necessarily bad; the problem is more often how they are written, because people do not fully understand what the loop is doing. That is why base functions in the apply() family or {purrr} are more often suggested. Additionally, {data.table} and {dplyr} make vectorized functions easier to use (e.g. case_when()). Also see the following SO post about the differences between {data.table} and {dplyr} and when one might be more advantageous than the other.
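
As a rough sketch of the functional style mentioned here, the running sum can also be written with `sapply()` or with base R's `Reduce()` (strictly a relative of the apply family rather than a member of it). The vector `x` is hypothetical.

```r
# Hypothetical values standing in for one column of the data
x <- c(10, 20, 30, 40)

# sapply(): compute each prefix sum explicitly
via_sapply <- sapply(seq_along(x), function(i) sum(x[1:i]))

# Reduce() with accumulate = TRUE keeps every intermediate total
via_reduce <- Reduce(`+`, x, accumulate = TRUE)

# Both should agree with the built-in vectorized function
all.equal(via_sapply, cumsum(x))
all.equal(via_reduce, cumsum(x))
```

For this particular task `cumsum()` is still the simplest choice; the alternatives are mainly useful when the accumulating step is something other than a plain sum.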

To directly answer @Diego17's question about why the loop does not give the expected result: p_lifetable_1955 is overwritten on every pass, so nothing is accumulated and only the last assignment before break survives. Note also that R indexes from 1, so on the first pass (i = 0) the subset Fem_px[0:0] is empty and cumsum() returns an empty vector. If you create an empty container first, for example results_list <- list(), or a copy of the dataframe, and store each iteration's result in it, you will end up with a vector of values. I hope this helps.
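
If you do want to keep a loop, one possible way to write it is sketched below. The vector `Fem_px` is a hypothetical stand-in for `data1955$Fem_px`, and `running_sum` is a name chosen for illustration.

```r
# Hypothetical stand-in for data1955$Fem_px (ages 0 to 110, so 111 values)
set.seed(1)
Fem_px <- sample(1:10, 111, replace = TRUE)

# Pre-allocate the result so every iteration has its own slot to write to
running_sum <- numeric(length(Fem_px))

for (i in seq_along(Fem_px)) {
  # R indexes from 1, so position i corresponds to age i - 1
  running_sum[i] <- sum(Fem_px[1:i])
}

# The loop should reproduce what cumsum() returns in a single call
all.equal(running_sum, as.numeric(cumsum(Fem_px)))
```

Writing into `running_sum[i]` instead of reassigning the whole object is what keeps the earlier results from being lost.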
