Problems with a function: adding a calculated field

Hummel · March 14, 2021, 4:10pm

Hello, I am having a problem adding a column inside a for loop.
I am using a for loop to download data and create 5 data frames, and inside the loop I call a function to add a calculated column.

If anyone could help, I would appreciate it.

My code is below:

require("Quandl")
require("dygraphs")
require("magrittr")
require("PerformanceAnalytics")
require("quantmod")
require("lubridate")

# Get the current system date
hoje <- Sys.Date()
hoje 

# Format the date for use in the file name
ano <- format(hoje, format = "%Y")
ano
mes <- format(hoje, format = "%m")
mes
dia <- format(hoje, format = "%d")
dia
anomes <- paste(ano, mes)
anomes
gsub("\\s", "", anomes)

# Build the file name
data_arquivo <- paste(ano, mes, dia)
data_arquivo <- gsub("\\s", "", data_arquivo)
data_arquivo

endereco <- sprintf("dados/ipca_%s.csv", data_arquivo)
endereco

# Set your API key
Quandl.api_key('XXXXXXXXXXXXX')

# Function for accumulated inflation


ano1 = format(df$data[1], format = "%Y")
ano1 = as.integer(ano1)
anomes <- paste(ano1, mes)
anomes
jur1 = 0
acum_jur = 0
Acum <- function(){
  ano1 = format(df$data[1], format = "%Y")
  ano1 = as.integer(ano1)
  anomes <- paste(ano1, mes)
  anomes
  jur1 = 0
  acum_jur = 0
  
  for (x in df$data){
    dia3 = as.Date(x)
    ano2 = (format(dia3, format = "%Y"))
    ano2 <- as.integer(ano2)
    mes <- format(dia3, format = "%m")
    jur = df$tx_jur[df$data == dia3]
    if (ano2 == ano1)
    {
      anomes <- paste(ano2, mes)
      acum_jur1 = (((1 + (jur/100)) * acum_jur) + jur)
      acum_jur <- acum_jur1
      df$acum_jur[df$data == dia3] <- round(acum_jur1, 2)
    }
    else
    {
      ano1 = ano1 + 1
      anomes <- paste(ano2, mes)
      acum_jur1 = df$tx_jur[df$data == dia3]
      acum_jur <- acum_jur1
      df$acum_jur[df$data == dia3] <- round(acum_jur1, 2)
    }
  }
}

# Collect data

tab <- read.csv("dados/tabelas.csv", sep = ";" )
View(tab)

count = 1
for (i in tab[, 1]){
  ident <- tab[count, 1]
  tabela1 <- tab[count, 2]
  tabela2 <- tab[count, 3]
  df1 <- tab[count, 4]
  tabela <- Quandl(ident)
  tabela3 <- Quandl(ident, transform = "diff")
  df <- merge(tabela, tabela3, by.x = "Date", by.y = "Date" , all = TRUE)
  colnames(df) <- c("data", "tx_jur", "var_jur")
  Acum()
  assign(tabela1, tabela)
  assign(tabela2, tabela3)
  assign(df1, df)
  count = count + 1
}

nirgrahamuk · March 14, 2021, 5:34pm

Hi!

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

FAQ: How to do a minimal reproducible example (reprex) for beginners


Hummel · March 14, 2021, 6:02pm

I hope this example illustrates my question. I would like to add a cumsum column.

lista <- list(a = (1 : 5), b = c(0.5, 0.6, 0.8, 1.2, 1.4))
lista

dt <- function(){
for(x in lista$a){
x.cum <- cumsum(lista$b)
}
}
dt()
lista

FJCC · March 14, 2021, 6:15pm

cumsum is a vectorized function, so you do not need a for loop, if I understand correctly what you want to do. I'm not sure where you want to store the result of the cumsum, so I made a new element of lista.

lista <- list(a = (1 : 5), b = c(0.5, 0.6, 0.8, 1.2, 1.4))
lista
$a
[1] 1 2 3 4 5

$b
[1] 0.5 0.6 0.8 1.2 1.4

lista$c <- cumsum(lista$b)
lista
$a
[1] 1 2 3 4 5

$b
[1] 0.5 0.6 0.8 1.2 1.4

$c
[1] 0.5 1.1 1.9 3.1 4.5

nirgrahamuk · March 14, 2021, 6:16pm

Your code says to loop through every item of the list element named a (which is 1, 2, 3, 4, 5) and each time make a cumulative sum vector of the contents of lista$b. But these are put into a local variable x.cum, which is not returned by the function and has no effect on the global environment.

Are you trying to produce

[[1]]
[1] 0.5 1.1 1.9 3.1 4.5

[[2]]
[1] 0.5 1.1 1.9 3.1 4.5

[[3]]
[1] 0.5 1.1 1.9 3.1 4.5

[[4]]
[1] 0.5 1.1 1.9 3.1 4.5

[[5]]
[1] 0.5 1.1 1.9 3.1 4.5

?
That can be done without a for loop at all.

purrr::map(lista$a,
           ~cumsum(lista$b))

But in a loop:


dt <- function(){
  x.cum <- vector("list", length(lista$a))
  i <- 1
  for(x in lista$a){
    x.cum[[i]] <- cumsum(lista$b)
    i <- i+1
  }
  x.cum
}

dt()
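
The scoping point above (a local variable that is never returned has no effect on the global environment) can be sketched in base R as follows. The helper name add_cumsum and the element name cum are hypothetical, chosen only for illustration; the idea is that the function returns the modified object and the caller assigns the result.

```r
lista <- list(a = 1:5, b = c(0.5, 0.6, 0.8, 1.2, 1.4))

# Hypothetical helper: receives the list as an argument, modifies its
# local copy, and returns that copy.
add_cumsum <- function(x) {
  x$cum <- cumsum(x$b)
  x
}

add_cumsum(lista)           # the result is printed, but lista is unchanged
names(lista)                # still "a" "b"

lista <- add_cumsum(lista)  # assign the returned value to keep the change
names(lista)                # now "a" "b" "cum"
```

The same pattern would presumably apply to the Acum() function in the first post: pass df in as an argument, return it at the end, and write df <- Acum(df) inside the loop.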

Hummel · March 14, 2021, 6:18pm

I am using the for loop because column a holds dates, and each time a new year starts the cumsum needs to restart. The sum is over inflation, so it needs to be a sum of interest.

Hummel · March 14, 2021, 6:27pm

This example is better:
lista <- list(a = c('1991-10-01', '1991-11-01', '1991-12-01', '1992-01-01'), b = c(0.5, 0.6, 0.8, 1.2, 1.4))
lista

Between the dates '1991-12-01' and '1992-01-01' I need to restart the cumsum.

FJCC · March 14, 2021, 7:51pm

If I understand what you want, I think the simplest thing is to change lista into a data frame. I had to add a date to the data you provided so that the a and b elements of lista are the same length.

lista <- list(a = c('1991-10-01', '1991-11-01', '1991-12-01', '1992-01-01','1992-02-01'), 
              b = c(0.5, 0.6, 0.8, 1.2, 1.4))
listaDF <- as.data.frame(lista)
listaDF
#>            a   b
#> 1 1991-10-01 0.5
#> 2 1991-11-01 0.6
#> 3 1991-12-01 0.8
#> 4 1992-01-01 1.2
#> 5 1992-02-01 1.4
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(dplyr,warn.conflicts = FALSE)
listaDF <- listaDF %>% mutate(a=ymd(a),
                              Year=year(a))
listaDF
#>            a   b Year
#> 1 1991-10-01 0.5 1991
#> 2 1991-11-01 0.6 1991
#> 3 1991-12-01 0.8 1991
#> 4 1992-01-01 1.2 1992
#> 5 1992-02-01 1.4 1992
listaDF <- listaDF %>% group_by(Year) %>% 
  mutate(CumSum=cumsum(b))
listaDF
#> # A tibble: 5 x 4
#> # Groups:   Year [2]
#>   a              b  Year CumSum
#>   <date>     <dbl> <dbl>  <dbl>
#> 1 1991-10-01   0.5  1991   0.5 
#> 2 1991-11-01   0.6  1991   1.1 
#> 3 1991-12-01   0.8  1991   1.9 
#> 4 1992-01-01   1.2  1992   1.2 
#> 5 1992-02-01   1.4  1992   2.60

Created on 2021-03-14 by the reprex package (v0.3.0)
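
The Acum() function in the first post updates its running total as (1 + jur/100) * acum_jur + jur, which compounds the rates instead of simply adding them. If that is what is wanted, a cumprod restarted each year gives the same numbers as that recurrence. This is only a sketch in base R, using ave() instead of dplyr; the function name compound and the column name CumComp are assumptions made for illustration.

```r
listaDF <- data.frame(
  a = as.Date(c("1991-10-01", "1991-11-01", "1991-12-01",
                "1992-01-01", "1992-02-01")),
  b = c(0.5, 0.6, 0.8, 1.2, 1.4)
)

# Year of each date, used as the grouping key
listaDF$Year <- as.integer(format(listaDF$a, "%Y"))

# Compound percentage rates: (1 + b1/100) * (1 + b2/100) * ... - 1,
# expressed again in percent
compound <- function(rate) (cumprod(1 + rate / 100) - 1) * 100

# ave() applies the function within each year and keeps the row order
listaDF$CumComp <- ave(listaDF$b, listaDF$Year, FUN = compound)
listaDF
```

The original code rounds each value to two decimals; round(listaDF$CumComp, 2) would do the same here.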

Hummel · March 14, 2021, 9:40pm

Thanks, my friend.
I am new to R. Thanks for the help, everybody.
