How can I do this in R? Excel equivalent of nested index match...

Hi All,

I'm an Excel guy trying to convert over to R for my investment analysis. Here is where I'm stuck (random data below... not actual performance numbers):

|date |voo1_return|agg1_return|tsla1_return|voo5_return|agg5_return|tsla5_return|
|2020-08-15| (0.00808) | (0.01336) | (0.00402) | 0.07177 | 0.06700 | 0.01008 |
|2020-08-16| 0.01370 | 0.01037 | 0.01946 | 0.08099 | 0.06400 | 0.02741 |

I have a data frame with dates in the first column and different stock returns over different trailing periods in each column thereafter (rows may be the appropriate semantics?). Within my data frame, I want to add a column called "max1_return" that is based on a logic:

If the lag of the "5_return" from the tickers above is greater than the lag of the "voo5_return", then I want the corresponding (not lagged) "1_return". If the lag of the "5_return" is less than "voo5_return", then I want to know which lag "5_return" is greatest, and use that corresponding ticker (not lagged) for the "1_return".

Using the data I typed above as an example, my new column should be labeled "max1_return" and the results should be 0.01370 on 8/16 because the voo5_return from 8/15 was the largest.

|date |voo1_return|agg1_return|tsla1_return|voo5_return|agg5_return|tsla5_return|max1_return|
|2020-08-15| (0.00808) | (0.01336) | (0.00402) | 0.07177 | 0.06700 | 0.01008 |
|2020-08-16| 0.01370 | 0.01037 | 0.01946 | 0.08099 | 0.06400 | 0.02741 |0.1370

I run into trouble writing the correct formula or embedding it into the mutate function. Any insight would be greatly appreciated!

Thank you!

  • Jeremy

This seems like a good place to use case_when() but I do not understand your explanation of the logic. Can you show an explicit calculation of the 'lag of the "5_return" from the tickers above' and the "lag of the v005_return"?

        date voo1_return agg1_return tsla1_return voo5_return agg5_return tsla5_return
  2020-08-15   (0.00808)   (0.01336)    (0.00402)     0.07177       0.067      0.01008
  2020-08-16     0.01370     0.01037      0.01946     0.08099       0.064      0.02741


Thanks for the response! I'll try to explain:

voo is the benchmark
all others (agg, tsla) are the other stock tickers
1_return is trailing 1 day return
5_return is trailing 5 day return

Assume today is 8/16 - the most recent data

If the lag of 5_day return (5 day return as of 8/15) of a ticker is greater than that of the benchmark, then I want the ticker's 1_day return. Hypothetical: If the agg5_return of 0.067 on 8/15 was higher than the voo5_return on 8/15 (it's obviously not), then I would want the code to return the agg1_return on 8/16 in the new "max1_return" column

If the lag 5_day return (5 day return as of 8/15) of a ticker is less than that of the benchmark, then I want to determine which ticker (including the benchmark) has the highest lag 5_day return and then use that corresponding ticker's 1_day return. In this example, all 5_return tickers are below the benchmark on 8/15, so the code would just return the benchmark 1_day of 0.01370 on 8/16

Does this make more sense? If not, I will try to upload my code with the backup datafile.

Thanks again!

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.