Error: Internal error: Trace data is not square. ->How to calculate year on year growth

Dear Everyone,

I am currently trying to calculate a new variable (both dummy and absolute) called Bath in R, but I seem to get the following error:

Error: Internal error: Trace data is not square.

Does anyone know how I can fix this error, or know of another method that lets me define this variable?

I have a reprex down below:

library(dplyr)

gvkey <- c(1, 1, 1, 1, 2,2,2, 4, 4 )
fyear <- c(2005,2006,2007,2008, 2007,2008,2009 , 2011,2012)
nibi <- c(100, 110, 120, 130, 500, 550, 600, 50, 60)
lagAT <- c(1000,1500,1300,1200, 300,500, 800, 70, 40)

Thesis <- data.frame(gvkey, fyear, nibi, lagAT)

Thesis <- Thesis%>%
group_by(gvkey) %>%
arrange(fyear) %>%
mutate(Bath= Thesis %>% filter((nibi -lag(nibi)) /lagAT<0) %>%
mutate(BATH = if_else(((nibi -lag(nibi)) /lagAT)< median((nibi -lag(nibi)) /lagAT,
na.rm = TRUE), (nibi-lag(nibi))/lagAT, 0),
dBATH = if_else(((nibi -lag(nibi)) /lagAT)< median((nibi -lag(nibi)) /lagAT,
na.rm = TRUE), 1, 0)) %>%
select(gvkey, fyear,BATH, dBATH, nibi, lagAT, nibi-lag(nibi)))

Hello, I'm afraid I don't see how it's possible to help you 'fix' your code when you haven't explained the intent of your code, i.e. the calculation you wish to perform.

It was a typo for the word "is".


library(tidyverse)

gvkey <- c(1, 1, 1, 1, 2,2,2, 4, 4 )
fyear <- c(2005,2006,2007,2008, 2007,2008,2009 , 2011,2012)
nibi <- c(100, 110, 120, 130, 500, 550, 600, 50, 60)
lagAT <- c(1000,1500,1300,1200, 300,500, 800, 70, 40)

(Thesis <- data.frame(gvkey, fyear, nibi, lagAT) %>% arrange(gvkey,fyear))
(Thesis2 <- arrange(Thesis,
                   gvkey,
                   fyear) %>% group_by(gvkey) %>% mutate(nibi_change = (nibi-lag(nibi))/lagAT))

# calculate median of positive values of nibi_change per gvkey

(median_nibis <- Thesis2 %>% filter(nibi_change>0) %>% summarise(med_nibi_change = median(nibi_change)))

(Thesis3 <- left_join(Thesis2 , median_nibis , by = "gvkey") %>% mutate(
  dBath = ifelse(nibi_change < med_nibi_change,1,0),
   Bath = ifelse(nibi_change < med_nibi_change,nibi_change,0
)))

# A tibble: 9 x 8
# Groups:   gvkey [3]
  gvkey fyear  nibi lagAT nibi_change med_nibi_change dBath     Bath
  <dbl> <dbl> <dbl> <dbl>       <dbl>           <dbl> <dbl>    <dbl>
1     1  2005   100  1000    NA               0.00769    NA NA      
2     1  2006   110  1500     0.00667         0.00769     1  0.00667
3     1  2007   120  1300     0.00769         0.00769     0  0      
4     1  2008   130  1200     0.00833         0.00769     0  0      
5     2  2007   500   300    NA               0.0812     NA NA      
6     2  2008   550   500     0.1             0.0812      0  0      
7     2  2009   600   800     0.0625          0.0812      1  0.0625 
8     4  2011    50    70    NA               0.25       NA NA      
9     4  2012    60    40     0.25            0.25        0  0

Thank you very much for your reply.
So my code needs to calculate the following:

Bath = the change in the firm's NIBI from t - 1 to t, deflated by total assets at the end of t - 1, when this change is below the median of non-zero negative values of this variable, and 0 otherwise.

So I want to add two variables: one dummy variable, in which 1 indicates that the change in NIBI is lower than the median of non-zero negative values, and 0 otherwise
(which I tried with : dBATH = if_else(((nibi -lag(nibi)) /lagAT)< median((nibi -lag(nibi)) /lagAT,
na.rm = TRUE), 1, 0)) %>%) ,

and one variable with the absolute value if the change in NIBI is lower than the median of non-zero negative values, and 0 otherwise (which I tried with
mutate(BATH = if_else(((nibi -lag(nibi)) /lagAT)< median((nibi -lag(nibi)) /lagAT,
na.rm = TRUE), (nibi-lag(nibi))/lagAT, 0)

I added the following condition to make sure only non-zero negative values were taken into account for the calculation:
mutate(Bath= Thesis %>% filter((nibi -lag(nibi)) /lagAT<0) %>%

I hope this helps, if you need any further info, please ask me.

Thank you in advance!
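
For readers following along, the definition above can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy` is made-up data (the sample data in this thread contains no declines in nibi, so it has no negative changes to take a median of), the median is assumed to be taken over the whole sample rather than per firm, and the column name `med_neg` is hypothetical.

```r
library(dplyr)

# Hypothetical toy data (NOT the poster's data): it includes declines in nibi
toy <- data.frame(
  gvkey = c(1, 1, 1, 1, 2, 2, 2, 2),
  fyear = c(2005, 2006, 2007, 2008, 2005, 2006, 2007, 2008),
  nibi  = c(100, 90, 95, 60, 500, 480, 470, 300),
  lagAT = c(1000, 1000, 1000, 1000, 2000, 2000, 2000, 2000)
)

bath <- toy %>%
  arrange(gvkey, fyear) %>%
  group_by(gvkey) %>%
  # change in nibi from t - 1 to t within each firm, deflated by lagged assets
  mutate(nibi_change = (nibi - lag(nibi)) / lagAT) %>%
  ungroup() %>%
  mutate(
    # assumption: median of the strictly negative changes, pooled over all firms
    med_neg = median(nibi_change[nibi_change < 0], na.rm = TRUE),
    # dummy: 1 when the change is below that median, 0 otherwise
    dBath = if_else(nibi_change < med_neg, 1, 0),
    # value: the change itself when below that median, 0 otherwise
    Bath = if_else(nibi_change < med_neg, nibi_change, 0)
  )

print(bath)
```

The first year of each firm has no lagged nibi, so its change, dBath and Bath are NA. If the median should instead be computed per firm, the `ungroup()` call could be moved to the end of the pipeline.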

Thank you very much!
I can see that your code gives me the answer that I am looking for, however, when I try to copy your code I do seem to get the following error:

Error: by can't contain join column gvkey which is missing from RHS
Backtrace:

Do you know what this error means or how I can fix this?

When you try to copy the code, or when you try to adapt it?
I specify that the join should happen on gvkey. Seems ira missing from your version of the median_nibis frame
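
A quick way to check this is to look at the column names of the summary table before joining. The sketch below uses a small made-up data frame (`demo` is hypothetical) and shows one possible cause, assumed here rather than confirmed for the poster's session: if `summarise()` runs on data that is not grouped by gvkey, the result has no gvkey column to join on.

```r
library(dplyr)

# Hypothetical example data, only to illustrate the column check
demo <- data.frame(
  gvkey       = c(1, 1, 2, 2),
  nibi_change = c(0.1, 0.3, 0.2, 0.4)
)

# Without group_by(): a single row, and the gvkey column is gone
ungrouped <- demo %>%
  summarise(med_nibi_change = median(nibi_change))

# With group_by(gvkey): one row per firm, and gvkey is kept
grouped <- demo %>%
  group_by(gvkey) %>%
  summarise(med_nibi_change = median(nibi_change))

names(ungrouped)
names(grouped)

# The join key must be present in both tables
"gvkey" %in% names(ungrouped)
"gvkey" %in% names(grouped)
```

Running `names(median_nibis)` in the session that produces the error should show whether gvkey is really missing there.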


I get the error when I try to copy your code.
Now I have changed the by argument to

Thesis3 <- left_join(Thesis2 , median_nibis , by = c("gvkey") %>% mutate(
dBath = ifelse(nibi_change < med_nibi_change,1,0),
Bath = ifelse(nibi_change < med_nibi_change,nibi_change,0))

But R is taking a long time to create Thesis3 (don't get an error but also no results yet)

Since I am still quite new to R, could you please tell me what the missing IRA means, and how I could fix it?
Thank you in advance
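
A note on the `left_join()` snippet in the post above: as written, its parentheses do not balance, because the call opened by `left_join(` is never closed after `by = c("gvkey")`. In an interactive R session an incomplete expression produces a `+` continuation prompt and R waits for more input, which could explain why nothing appears to happen. A balanced version, rebuilt here from the sample data earlier in the thread so that it runs on its own, might look like this:

```r
library(dplyr)

# Sample data from the thread
gvkey <- c(1, 1, 1, 1, 2, 2, 2, 4, 4)
fyear <- c(2005, 2006, 2007, 2008, 2007, 2008, 2009, 2011, 2012)
nibi  <- c(100, 110, 120, 130, 500, 550, 600, 50, 60)
lagAT <- c(1000, 1500, 1300, 1200, 300, 500, 800, 70, 40)

Thesis2 <- data.frame(gvkey, fyear, nibi, lagAT) %>%
  arrange(gvkey, fyear) %>%
  group_by(gvkey) %>%
  mutate(nibi_change = (nibi - lag(nibi)) / lagAT)

median_nibis <- Thesis2 %>%
  filter(nibi_change > 0) %>%
  summarise(med_nibi_change = median(nibi_change))

# left_join( ... ) is closed BEFORE the pipe into mutate()
Thesis3 <- left_join(Thesis2, median_nibis, by = "gvkey") %>%
  mutate(
    dBath = ifelse(nibi_change < med_nibi_change, 1, 0),
    Bath  = ifelse(nibi_change < med_nibi_change, nibi_change, 0)
  )

print(Thesis3)
```

If the console is stuck at the `+` prompt, pressing Esc (in RStudio) cancels the incomplete expression so the corrected code can be run.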