conditional lagging variables

Dear all, I have a question in R studio regarding conditional lagging variables. The data structure is as following: paired variables of S&P1500 CEO characteristics according to a company key and financial year. For one company you can have multiple values of the same financial year (multiple CEO's in that year). I would like to lookup the value of a third variable (called AT) of the last value of the previous financial year within the same key (same company). Any idea how I can best solve this problem in Rstudio? Best, Hossein

Hi,

Welcome to the RStudio community!

In order for us to help you with your question, please provide us a minimal reproducible example where you provide a minimal (dummy) dataset and code that can recreate the issue.
A screenshot is not useful as we cannot extract any data from it (unless be manually recreating the whole table).

Once we have a reprex, we can go from there. For help on creating a Reprex, see this guide:

Good luck!
PJ

Thank you for your reply Pieterjanvc. Attached a sample dataset.
I would like to lookup the value of the third variable (called AT) of the last value of the previous financial year within the same key (same company).
Any help would be appreciated.
Thank you.

Kind regards,
Hossein

|GVKEYNEW|YEARNEW|at|
|10884|2009|20.448.260|
|10884|2009|20.448.260|
|10884|2009|20.448.260|
|10884|2009|20.448.260|
|10884|2009|20.448.260|
|10884|2009|20.448.260|
|10884|2010|20.671.460|
|10884|2010|20.671.460|
|10884|2010|20.671.460|
|10884|2010|20.671.460|
|10884|2010|20.671.460|
|10884|2011|19.379.590|
|10884|2011|19.379.590|
|10884|2011|19.379.590|
|10884|2011|19.379.590|
|10884|2011|19.379.590|
|10884|2011|19.379.590|
|10884|2012|16.977.620|
|10884|2012|16.977.620|
|10884|2012|16.977.620|
|10884|2012|16.977.620|
|10884|2012|16.977.620|
|10884|2012|16.977.620|
|10884|2012|16.977.620|
|10884|2013|16.591.380|
|10884|2013|16.591.380|
|10884|2013|16.591.380|
|10884|2013|16.591.380|
|10884|2013|16.591.380|
|140760|2009|178.770.000|
|140760|2009|178.770.000|
|140760|2009|178.770.000|
|140760|2009|178.770.000|
|140760|2009|178.770.000|
|140760|2009|178.770.000|
|140760|2010|178.520.000|
|140760|2010|178.520.000|
|140760|2010|178.520.000|
|140760|2010|178.520.000|
|140760|2010|178.520.000|
|140760|2011|198.440.000|
|140760|2011|198.440.000|
|140760|2011|198.440.000|
|140760|2011|198.440.000|
|140760|2011|198.440.000|
|140760|2011|198.440.000|
|140760|2012|202.240.000|
|140760|2012|202.240.000|
|140760|2012|202.240.000|
|140760|2012|202.240.000|
|140760|2012|202.240.000|
|140760|2013|206.640.000|
|140760|2013|206.640.000|
|140760|2013|206.640.000|
|140760|2013|206.640.000|
|140760|2013|206.640.000|
|140760|2013|206.640.000|
|140760|2014|219.810.000|
|140760|2014|219.810.000|
|140760|2014|219.810.000|
|140760|2014|219.810.000|
|140760|2014|219.810.000|
|140760|2014|219.810.000|
|140760|2015|219.200.000|
|140760|2015|219.200.000|
|140760|2015|219.200.000|
|140760|2015|219.200.000|
|140760|2015|219.200.000|
|140760|2016|197.360.000|
|140760|2016|197.360.000|
|140760|2016|197.360.000|
|140760|2016|197.360.000|
|140760|2016|197.360.000|
|140760|2016|197.360.000|
|140760|2016|197.360.000|
|140760|2017|213.330.000|
|140760|2017|213.330.000|
|140760|2017|213.330.000|
|140760|2017|213.330.000|
|140760|2017|213.330.000|
|24539|2009|16.274.040|
|24539|2009|16.274.040|
|24539|2009|16.274.040|
|24539|2009|16.274.040|
|24539|2009|16.274.040|
|24539|2010|17.664.220|
|24539|2010|17.664.220|
|24539|2010|17.664.220|
|24539|2010|17.664.220|
|24539|2010|17.664.220|
|24539|2011|19.953.680|
|24539|2011|19.953.680|
|24539|2011|19.953.680|
|24539|2011|19.953.680|
|24539|2011|19.953.680|
|24539|2012|20.915.800|
|24539|2012|20.915.800|
|24539|2012|20.915.800|
|24539|2012|20.915.800|
|24539|2012|20.915.800|
|24539|2013|22.377.810|
|24539|2013|22.377.810|
|24539|2013|22.377.810|
|24539|2013|22.377.810|
|24539|2013|22.377.810|
|24539|2014|23.320.200|
|24539|2014|23.320.200|
|24539|2014|23.320.200|
|24539|2014|23.320.200|
|24539|2014|23.320.200|
|24539|2015|25.072.650|
|24539|2015|25.072.650|
|24539|2015|25.072.650|
|24539|2015|25.072.650|
|24539|2015|25.072.650|
|24539|2016|33.733.280|
|24539|2016|33.733.280|
|24539|2016|33.733.280|
|24539|2016|33.733.280|
|24539|2016|33.733.280|

Very rudimentary tidyverse solution using the slice_ functions in conjunction with groups. The example data above was saved to a data.frame called 'df' in my code below.

The max_year column is just a helper column to make the logic a little clearer; get the max year for each key so that you can get the previous year as max_year - 1, then use a filter to only keep rows showing the previous year for that key.

Assuming your actual data is sorted in a meaningful order (newest dates at the top, oldest at bottom), then slice_head will return the most recent row from the prev year and the at along with it.

df %>% 
    group_by(GVKEYNEW) %>% 
    mutate(max_year = max(YEARNEW)) %>% 
    filter(YEARNEW == max_year-1) %>% 
    slice_head(n=1)
1 Like

Thank you very much for your help. I will try it and let you know how it went.
Kind regards,
Hossein

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.