Problems with a function: adding a calculated field

Hello, I am having a problem adding a column inside a for loop.
I am using a for loop to download data and create 5 data frames, and inside the loop I call a function to add a calculated column.

If anyone could help, I would appreciate it.

My code is below:

require("Quandl")
require("dygraphs")
require("magrittr")
require("PerformanceAnalytics")
require("quantmod")
require("lubridate")

# Get the current system date
hoje <- Sys.Date()
hoje 

# Format the date for use in the file name
ano <- format(hoje, format = "%Y")
ano
mes <- format(hoje, format = "%m")
mes
dia <- format(hoje, format = "%d")
dia
anomes <- paste(ano, mes)
anomes
gsub("\\s", "", anomes)

# Build the file name
data_arquivo <- paste(ano, mes, dia)
data_arquivo <- gsub("\\s", "", data_arquivo)
data_arquivo

endereco <- sprintf("dados/ipca_%s.csv", data_arquivo)
endereco

# Set your API key
Quandl.api_key('XXXXXXXXXXXXX')

# Function for accumulated inflation


ano1 = format(df$data[1], format = "%Y")
ano1 = as.integer(ano1)
anomes <- paste(ano1, mes)
anomes
jur1 = 0
acum_jur = 0
Acum <- function(){
  ano1 = format(df$data[1], format = "%Y")
  ano1 = as.integer(ano1)
  anomes <- paste(ano1, mes)
  anomes
  jur1 = 0
  acum_jur = 0
  
  for (x in df$data){
    dia3 = as.Date(x)
    ano2 = (format(dia3, format = "%Y"))
    ano2 <- as.integer(ano2)
    mes <- format(dia3, format = "%m")
    jur = df$tx_jur[df$data == dia3]
    if (ano2 == ano1)
    {
      anomes <- paste(ano2, mes)
      acum_jur1 = (((1 + (jur/100)) * acum_jur) + jur)
      acum_jur <- acum_jur1
      df$acum_jur[df$data == dia3] <- round(acum_jur1, 2)
    }
    else
    {
      ano1 = ano1 + 1
      anomes <- paste(ano2, mes)
      acum_jur1 = df$tx_jur[df$data == dia3]
      acum_jur <- acum_jur1
      df$acum_jur[df$data == dia3] <- round(acum_jur1, 2)
    }
  }
}

# Collect data

tab <- read.csv("dados/tabelas.csv", sep = ";" )
View(tab)

count = 1
for (i in tab[, 1]){
  ident <- tab[count, 1]
  tabela1 <- tab[count, 2]
  tabela2 <- tab[count, 3]
  df1 <- tab[count, 4]
  tabela <- Quandl(ident)
  tabela3 <- Quandl(ident, transform = "diff")
  df <- merge(tabela, tabela3, by.x = "Date", by.y = "Date" , all = TRUE)
  colnames(df) <- c("data", "tx_jur", "var_jur")
  Acum()
  assign(tabela1, tabela)
  assign(tabela2, tabela3)
  assign(df1, df)
  count = count + 1
}

Hi!

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

I hope this example illustrates my question. I would like to add a cumsum column.

lista <- list(a = (1 : 5), b = c(0.5, 0.6, 0.8, 1.2, 1.4))
lista

dt <- function(){
for(x in lista$a){
x.cum <- cumsum(lista$b)
}
}
dt()
lista

cumsum is a vectorized function, so you do not need a for loop, if I understand correctly what you want to do. I'm not sure where you want to store the result of the cumsum, so I made a new element of lista.

lista <- list(a = (1 : 5), b = c(0.5, 0.6, 0.8, 1.2, 1.4))
lista
$a
[1] 1 2 3 4 5

$b
[1] 0.5 0.6 0.8 1.2 1.4

lista$c <- cumsum(lista$b)
lista
$a
[1] 1 2 3 4 5

$b
[1] 0.5 0.6 0.8 1.2 1.4

$c
[1] 0.5 1.1 1.9 3.1 4.5

Your code loops over every item of lista$a (1, 2, 3, 4, 5) and each time computes the cumulative sum of lista$b. But the result is stored in a local variable, x.cum, which is not returned by the function and so has no effect on the global environment.
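
To illustrate the scoping point, here is a minimal sketch. The function name add_cumsum is hypothetical and not from the original post. The function receives the list as an argument, modifies its own local copy, and returns it; the caller has to assign the result back to keep it.

```r
lista <- list(a = 1:5, b = c(0.5, 0.6, 0.8, 1.2, 1.4))

# Hypothetical helper: works on a local copy of x and returns it
add_cumsum <- function(x) {
  x$c <- cumsum(x$b)  # changes only the local copy
  x                   # return the modified copy to the caller
}

add_cumsum(lista)            # the result is printed, but lista is unchanged
is.null(lista$c)             # still TRUE: the global lista has no element c
lista <- add_cumsum(lista)   # assign the result back to keep it
lista$c
```

The same pattern would apply to the Acum() function in the first post: pass df in as an argument, return it at the end, and write df <- Acum(df) in the loop.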

Are you trying to produce

[[1]]
[1] 0.5 1.1 1.9 3.1 4.5

[[2]]
[1] 0.5 1.1 1.9 3.1 4.5

[[3]]
[1] 0.5 1.1 1.9 3.1 4.5

[[4]]
[1] 0.5 1.1 1.9 3.1 4.5

[[5]]
[1] 0.5 1.1 1.9 3.1 4.5

?
That can be done without a for loop at all.

purrr::map(lista$a,
           ~cumsum(lista$b))

But in a loop:


dt <- function(){
  x.cum <- vector("list", length(lista$a))
  i <- 1
  for(x in lista$a){
    x.cum[[i]] <- cumsum(lista$b)
    i <- i+1
  }
  x.cum
}

dt()

I am using the for loop because column a holds dates, and each time a new year starts the cumsum needs to restart. The sum is over inflation, so it needs to be a sum of interest.

This example is better:
lista <- list(a = c('1991-10-01', '1991-11-01', '1991-12-01', '1992-01-01'), b = c(0.5, 0.6, 0.8, 1.2, 1.4))
lista

Between the dates '1991-12-01' and '1992-01-01' I need to restart the cumsum.

If I understand what you want, I think the simplest thing is to change lista into a data frame. I had to add a date to the data you provided so that the a and b elements of lista are the same length.

lista <- list(a = c('1991-10-01', '1991-11-01', '1991-12-01', '1992-01-01','1992-02-01'), 
              b = c(0.5, 0.6, 0.8, 1.2, 1.4))
listaDF <- as.data.frame(lista)
listaDF
#>            a   b
#> 1 1991-10-01 0.5
#> 2 1991-11-01 0.6
#> 3 1991-12-01 0.8
#> 4 1992-01-01 1.2
#> 5 1992-02-01 1.4
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(dplyr,warn.conflicts = FALSE)
listaDF <- listaDF %>% mutate(a=ymd(a),
                              Year=year(a))
listaDF
#>            a   b Year
#> 1 1991-10-01 0.5 1991
#> 2 1991-11-01 0.6 1991
#> 3 1991-12-01 0.8 1991
#> 4 1992-01-01 1.2 1992
#> 5 1992-02-01 1.4 1992
listaDF <- listaDF %>% group_by(Year) %>% 
  mutate(CumSum=cumsum(b))
listaDF
#> # A tibble: 5 x 4
#> # Groups:   Year [2]
#>   a              b  Year CumSum
#>   <date>     <dbl> <dbl>  <dbl>
#> 1 1991-10-01   0.5  1991   0.5 
#> 2 1991-11-01   0.6  1991   1.1 
#> 3 1991-12-01   0.8  1991   1.9 
#> 4 1992-01-01   1.2  1992   1.2 
#> 5 1992-02-01   1.4  1992   2.60

Created on 2021-03-14 by the reprex package (v0.3.0)
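
For comparison, here is a base R sketch of the same grouping idea, using the column names from the first post (data, tx_jur, acum_jur). It assumes the loop in the first post was meant to compound the monthly rates within each year, since (1 + jur/100) * acum_jur + jur is algebraically the same as (cumprod(1 + jur/100) - 1) * 100. The function name acum_por_ano and the sample data are hypothetical, and the rows are assumed to be sorted by date.

```r
# Hypothetical sample data using the column names from the first post
df <- data.frame(
  data   = as.Date(c("1991-10-01", "1991-11-01", "1991-12-01",
                     "1992-01-01", "1992-02-01")),
  tx_jur = c(0.5, 0.6, 0.8, 1.2, 1.4)
)

# Takes the data frame as an argument and returns it with the new column,
# instead of relying on a global df
acum_por_ano <- function(df) {
  ano <- format(df$data, "%Y")  # year of each row, as a character vector
  # ave() applies FUN within each year, so the accumulation restarts yearly
  acum <- ave(df$tx_jur, ano,
              FUN = function(j) (cumprod(1 + j / 100) - 1) * 100)
  df$acum_jur <- round(acum, 2)
  df
}

df <- acum_por_ano(df)
df
```

If a plain running sum per year is what is needed, as in the dplyr answer, replacing the body of FUN with cumsum(j) should give that instead.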

Thanks, my friend.
I am new to R. Thanks for the help, everybody.