Using data(FANG), say I know that there is a smoothed relationship between volume and opening price. I also know that the length of the most predictive rolling mean varies by stock: for some it is short, say 2 days; for others it is 10. I'd like to create multiple rolling means, with window lengths from 2 to 10 days, for each stock.
So far I have tried the tibbletime package and made a start: I can calculate the multiple rolling means for one stock.
However, I can't figure out how to make this work when grouping by multiple stocks.
I tried adding:
FANG_with_multiroll <- FANG %>%
  group_by(symbol) %>%
  bind_cols(FANG, invoke_map(rollers, x = FANG$volume))
But that didn't work. Any ideas would be appreciated. Once I get it to work, I plan on finding the window with the highest correlation or R-squared for each symbol. If you have ideas about better ways to do that too, I'm interested.
Can you please provide a minimal reprex (reproducible example)? The goal of a reprex is to make it as easy as possible for me to recreate your problem so that we can fix it: please help us help you!
If you've never heard of a reprex before, start by reading "What is a reprex", and follow the advice further down that page.
Here is one way: use rlang's !!! to splice the rollers into funs(), and pass the result to mutate_at(). It makes for a neat solution. Making sure that the rollers are named is important here, because those names become the names of the new columns!
library(tibbletime)
library(dplyr)
library(purrr)
data(FANG)
# Create the column names
col_names <- map_chr(2:10, ~paste0("adjusted_", .x))
# Create the rolling functions and assign them names
rollers <- map(2:10, ~rollify(mean, window = .x)) %>%
  set_names(nm = col_names)
# We can create our named function list with funs() and splicing
funs(!!!rollers)
#> <fun_calls>
#> $ adjusted_2 : (function (...) ...
#> $ adjusted_3 : (function (...) ...
#> $ adjusted_4 : (function (...) ...
#> $ adjusted_5 : (function (...) ...
#> $ adjusted_6 : (function (...) ...
#> $ adjusted_7 : (function (...) ...
#> $ adjusted_8 : (function (...) ...
#> $ adjusted_9 : (function (...) ...
#> $ adjusted_10: (function (...) ...
# And then use it in mutate_at()
FANG %>%
  # Group by symbol
  group_by(symbol) %>%
  # Splice in the rollers, and call them on the adjusted column
  mutate_at("adjusted", funs(!!!rollers))
#> Warning: package 'bindrcpp' was built under R version 3.4.4
#> # A tibble: 4,032 x 17
#> # Groups:   symbol [4]
#>    symbol date        open  high   low close    volume adjusted adjusted_2
#>    <chr>  <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>      <dbl>
#>  1 FB     2013-01-02  27.4  28.2  27.4  28    69846400     28         NA
#>  2 FB     2013-01-03  27.9  28.5  27.6  27.8  63140600     27.8       27.9
#>  3 FB     2013-01-04  28.0  28.9  27.8  28.8  72715400     28.8       28.3
#>  4 FB     2013-01-07  28.7  29.8  28.6  29.4  83781800     29.4       29.1
#>  5 FB     2013-01-08  29.5  29.6  28.9  29.1  45871300     29.1       29.2
#>  6 FB     2013-01-09  29.7  30.6  29.5  30.6 104787700     30.6       29.8
#>  7 FB     2013-01-10  30.6  31.5  30.3  31.3  95316400     31.3       30.9
#>  8 FB     2013-01-11  31.3  32.0  31.1  31.7  89598000     31.7       31.5
#>  9 FB     2013-01-14  32.1  32.2  30.6  31.0  98892800     31.0       31.3
#> 10 FB     2013-01-15  30.6  31.7  29.9  30.1 173242600     30.1       30.5
#> # ... with 4,022 more rows, and 8 more variables: adjusted_3 <dbl>,
#> #   adjusted_4 <dbl>, adjusted_5 <dbl>, adjusted_6 <dbl>,
#> #   adjusted_7 <dbl>, adjusted_8 <dbl>, adjusted_9 <dbl>,
#> #   adjusted_10 <dbl>
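For the follow-up goal in the question (picking the most predictive window per symbol), here is a rough sketch rather than a definitive recipe. It assumes dplyr >= 1.0, where funs() is deprecated and across() accepts a named list of functions directly. To keep it self-contained it does not use FANG or rollify(): `toy` is made-up data with known "true" windows, and `roll_mean()` and `make_stock()` are hypothetical helpers written just for this example.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical helper: trailing rolling mean in base R (NA until the window is full)
roll_mean <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

# Made-up toy data: `open` is built from an n-day rolling mean of `volume`,
# so the "true" window for each symbol is known in advance
set.seed(1)
make_stock <- function(symbol, true_window, n_days = 60) {
  volume <- runif(n_days, 1e6, 5e6)
  tibble(symbol = symbol, volume = volume,
         open = roll_mean(volume, true_window) / 1e5)
}
toy <- bind_rows(make_stock("A", 3), make_stock("B", 5))

# One named rolling function per window length
windows <- 2:10
rollers <- map(windows, function(n) { force(n); function(x) roll_mean(x, n) }) %>%
  set_names(paste0("volume_", windows))

best_window <- toy %>%
  group_by(symbol) %>%
  # across() takes the named list; "{.fn}" names each new column after its list name
  mutate(across(volume, rollers, .names = "{.fn}")) %>%
  ungroup() %>%
  # Long format: one row per symbol, day, and window length
  pivot_longer(starts_with("volume_"), names_to = "window",
               names_prefix = "volume_", values_to = "rolled") %>%
  mutate(window = as.integer(window)) %>%
  group_by(symbol, window) %>%
  # Correlation between each rolling mean and the opening price, skipping NAs
  summarise(r = cor(rolled, open, use = "complete.obs"), .groups = "drop") %>%
  group_by(symbol) %>%
  # Keep the window with the strongest correlation for each symbol
  slice_max(abs(r), n = 1) %>%
  ungroup()
```

With real data, the same pipeline should work after swapping `toy` for FANG and the helper functions for the rollify() rollers above. Squaring `r` gives the R-squared of a simple one-predictor linear fit, so ranking by `abs(r)` and by R-squared picks the same window.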
That is great... I used your gist a few months back to calculate multiple rolling means for a single set of observations. I am now trying to optimize the window length of a rolling mean of precipitation deviation so that it best predicts groundwater levels at monitoring sites.
If you have any good tutorials on using the !!! operator, it would be appreciated.
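As a starting point, here is a minimal sketch of what !!! does, using only rlang functions (list2() and exec()). The names `fns`, `spliced`, `args`, and `joined` are made up for illustration. The idea is that !!! unpacks a list so that each element is passed as its own (possibly named) argument, which is how the named rollers become separate named functions inside funs() above.

```r
library(rlang)

fns <- list(avg = mean, total = sum)

# Without !!!, the whole list arrives as one unnamed argument
not_spliced <- list2(fns, n = length)
length(not_spliced)
#> [1] 2

# With !!!, each element is spliced in as its own named argument
spliced <- list2(!!!fns, n = length)
names(spliced)
#> [1] "avg"   "total" "n"

# exec() calls a function with spliced arguments, much like do.call()
args <- list("a", "b", sep = "-")
joined <- exec(paste, !!!args)
joined
#> [1] "a-b"
```

Note that !!! only works in functions that support it, such as rlang's own functions and the tidyverse verbs built on them; it does nothing special in base R's list().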