Error: Internal error: Trace data is not square. ->How to calculate year on year growth

Dear Everyone,

I am currently trying to calculate a new variable (both dummy and absolute) called Bath in R, but I seem to get the following error:

Error: Internal error: Trace data is not square.

Does anyone know how I can fix this error, or is there another method that lets me define this variable?

I have a reprex down below:

library(dplyr)

gvkey <- c(1, 1, 1, 1, 2,2,2, 4, 4 )
fyear <- c(2005,2006,2007,2008, 2007,2008,2009 , 2011,2012)
nibi <- c(100, 110, 120, 130, 500, 550, 600, 50, 60)
lagAT <- c(1000,1500,1300,1200, 300,500, 800, 70, 40)

Thesis <- data.frame(gvkey, fyear, nibi, lagAT)

Thesis <- Thesis%>%
group_by(gvkey) %>%
arrange(fyear) %>%
mutate(Bath= Thesis %>% filter((nibi -lag(nibi)) /lagAT<0) %>%
mutate(BATH = if_else(((nibi -lag(nibi)) /lagAT)< median((nibi -lag(nibi)) /lagAT,
na.rm = TRUE), (nibi-lag(nibi))/lagAT, 0),
dBATH = if_else(((nibi -lag(nibi)) /lagAT)< median((nibi -lag(nibi)) /lagAT,
na.rm = TRUE), 1, 0)) %>%
select(gvkey, fyear,BATH, dBATH, nibi, lagAT, nibi-lag(nibi)))

Hello, I'm afraid I don't see how it's possible to help you 'fix' your code when you haven't explained the intent of your code, i.e. the calculation you wish to perform.

Thank you very much for your reply.
So my code needs to calculate the following:

Bath = the change in the firm's NIBI from t - 1 to t, deflated by total assets at the end of t - 1, when this change is below the median of the non-zero negative values of this variable, and 0 otherwise.

So I want to add two variables: one dummy variable, which is 1 when the change in NIBI is lower than the median of the non-zero negative values, and 0 otherwise
(which I tried with : dBATH = if_else(((nibi -lag(nibi)) /lagAT)< median((nibi -lag(nibi)) /lagAT,
na.rm = TRUE), 1, 0)) %>%) ,

and one variable holding the actual value of the change in NIBI when it is lower than the median of the non-zero negative values, and 0 otherwise (which I tried with
mutate(BATH = if_else(((nibi -lag(nibi)) /lagAT)< median((nibi -lag(nibi)) /lagAT,
na.rm = TRUE), (nibi-lag(nibi))/lagAT, 0)

I added the following condition to make sure only non-zero negative values were taken into account for the calculation:
mutate(Bath= Thesis %>% filter((nibi -lag(nibi)) /lagAT<0) %>%

I hope this helps, if you need any further info, please ask me.

Thank you in advance!

library(tidyverse)

gvkey <- c(1, 1, 1, 1, 2,2,2, 4, 4 )
fyear <- c(2005,2006,2007,2008, 2007,2008,2009 , 2011,2012)
nibi <- c(100, 110, 120, 130, 500, 550, 600, 50, 60)
lagAT <- c(1000,1500,1300,1200, 300,500, 800, 70, 40)

(Thesis <- data.frame(gvkey, fyear, nibi, lagAT) %>% arrange(gvkey,fyear))
(Thesis2 <- arrange(Thesis,
                   gvkey,
                   fyear) %>% group_by(gvkey) %>% mutate(nibi_change = (nibi-lag(nibi))/lagAT))

# calculate median of positive values of nibi_change per gvkey

(median_nibis <- Thesis2 %>% filter(nibi_change>0) %>% summarise(med_nibi_change = median(nibi_change)))

(Thesis3 <- left_join(Thesis2 , median_nibis , by = "gvkey") %>% mutate(
  dBath = ifelse(nibi_change < med_nibi_change,1,0),
   Bath = ifelse(nibi_change < med_nibi_change,nibi_change,0
)))

# A tibble: 9 x 8
# Groups:   gvkey [3]
  gvkey fyear  nibi lagAT nibi_change med_nibi_change dBath     Bath
  <dbl> <dbl> <dbl> <dbl>       <dbl>           <dbl> <dbl>    <dbl>
1     1  2005   100  1000    NA               0.00769    NA NA      
2     1  2006   110  1500     0.00667         0.00769     1  0.00667
3     1  2007   120  1300     0.00769         0.00769     0  0      
4     1  2008   130  1200     0.00833         0.00769     0  0      
5     2  2007   500   300    NA               0.0812     NA NA      
6     2  2008   550   500     0.1             0.0812      0  0      
7     2  2009   600   800     0.0625          0.0812      1  0.0625 
8     4  2011    50    70    NA               0.25       NA NA      
9     4  2012    60    40     0.25            0.25        0  0
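
The reprex has no negative changes, so the code above takes the median of the positive values. For the definition given in the question (median of the non-zero negative changes), a minimal sketch could look like the following. The `toy` data frame and the `med_neg` column are made up for illustration, and computing the median per gvkey rather than across all firms is an assumption that mirrors the answer above.

```r
library(dplyr)

# Hypothetical data (not from the thread) that contains negative changes in nibi
toy <- data.frame(
  gvkey = c(1, 1, 1, 1, 1, 2, 2, 2),
  fyear = c(2005, 2006, 2007, 2008, 2009, 2007, 2008, 2009),
  nibi  = c(100, 80, 90, 40, 35, 500, 450, 300),
  lagAT = c(1000, 1000, 1000, 1000, 1000, 500, 500, 500)
)

bath <- toy %>%
  arrange(gvkey, fyear) %>%
  group_by(gvkey) %>%
  mutate(
    # change in nibi from t - 1 to t, deflated by lagged total assets
    nibi_change = (nibi - lag(nibi)) / lagAT,
    # median of the strictly negative changes within each firm
    med_neg = median(nibi_change[nibi_change < 0], na.rm = TRUE),
    # dummy and value versions; both stay NA where nibi_change is NA
    dBath = if_else(nibi_change < med_neg, 1, 0),
    Bath  = if_else(nibi_change < med_neg, nibi_change, 0)
  ) %>%
  ungroup()

bath
```

The first year of each firm stays NA because there is no lagged nibi, and a firm with no negative changes at all would get NA throughout, so those cases may need separate handling.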

Thank you very much!
I can see that your code gives me the answer that I am looking for, however, when I try to copy your code I do seem to get the following error:

Error: by can't contain join column gvkey which is missing from RHS
Backtrace:

Do you know what this error means or how I can fix this?

When you try to copy the code, or when you try to adapt it?
I specify that the join should happen on gvkey. Seems ira missing from your version of the median_nibis frame
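
A quick way to check this is sketched below. The thread does not confirm the cause, so treat the following as an assumption: the key column can disappear if the `group_by(gvkey)` step was dropped, or if another package loaded after dplyr (plyr, for example) masks `summarise()`. The small `Thesis2` here is a made-up stand-in for the real one.

```r
library(dplyr)

# Hypothetical stand-in for Thesis2, grouped by the join key
Thesis2 <- data.frame(
  gvkey = c(1, 1, 2, 2),
  nibi_change = c(0.01, 0.02, 0.1, 0.06)
) %>%
  group_by(gvkey)

# Calling dplyr::summarise explicitly rules out masking by another package
median_nibis <- Thesis2 %>%
  filter(nibi_change > 0) %>%
  dplyr::summarise(med_nibi_change = median(nibi_change))

# The join on gvkey can only work if the key column survived the summary
stopifnot("gvkey" %in% names(median_nibis))
names(median_nibis)
```

If `names(median_nibis)` shows only `med_nibi_change`, the grouping was lost before the summary step.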


I get the error when I try to copy your code.
Now I have changed the by argument to

Thesis3 <- left_join(Thesis2 , median_nibis , by = c("gvkey") %>% mutate(
dBath = ifelse(nibi_change < med_nibi_change,1,0),
Bath = ifelse(nibi_change < med_nibi_change,nibi_change,0))

But R is taking a long time to create Thesis3 (don't get an error but also no results yet)

Since I am still quite a nooby to R, could you please tell me what the missing IRA means? And how I could fix it?
Thank you in advance

It was a typo of the word is
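
As for the apparent hang: in the modified call as posted, `by = c("gvkey")` is followed directly by `%>% mutate(`, and the parenthesis opened by `left_join(` is never closed. The console is therefore most likely waiting for more input (a `+` prompt) rather than computing. A sketch with balanced parentheses, using small made-up versions of `Thesis2` and `median_nibis`:

```r
library(dplyr)

# Hypothetical miniature versions of Thesis2 and median_nibis
Thesis2 <- data.frame(gvkey = c(1, 1, 2), nibi_change = c(0.01, 0.03, 0.1))
median_nibis <- data.frame(gvkey = c(1, 2), med_nibi_change = c(0.02, 0.1))

# left_join( ... ) is closed before the pipe into mutate( ... )
Thesis3 <- left_join(Thesis2, median_nibis, by = "gvkey") %>%
  mutate(
    dBath = ifelse(nibi_change < med_nibi_change, 1, 0),
    Bath  = ifelse(nibi_change < med_nibi_change, nibi_change, 0)
  )

Thesis3
```

Pressing Esc in the console cancels the unfinished statement so the corrected one can be run.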
