How to calculate moving averages with weighted values

Hello,

I'm having trouble calculating a moving average of values that should be weighted to account for different sample sizes. Any feedback is greatly appreciated. Here is an example:

 suppressWarnings(suppressPackageStartupMessages({
   library(tidyverse)
   library(slider)
 }))

 # data
 df <- tibble(resp_date = seq(as.Date("2020-01-01"), 
                              as.Date("2020-04-09"), 
                              by = "day"),
              n = round(runif(100, min = 100, max = 200), 0),
              percent = round(runif(100, min = 75, max = 100), 2))
 
 # I am able to get a simple moving average of percent with the slider package
 df %>% 
   mutate(`7 day avg` = slide_index_dbl(.i = resp_date,
                                        .x = percent,
                                        .f = mean,
                                        .before = 6))
#> # A tibble: 100 x 4
#>    resp_date      n percent `7 day avg`
#>    <date>     <dbl>   <dbl>       <dbl>
#>  1 2020-01-01   159    86.7        86.7
#>  2 2020-01-02   103    75.1        80.9
#>  3 2020-01-03   114    86.3        82.7
#>  4 2020-01-04   109    87.4        83.9
#>  5 2020-01-05   106    94.2        85.9
#>  6 2020-01-06   129    89.4        86.5
#>  7 2020-01-07   157    93.3        87.5
#>  8 2020-01-08   180    86.4        87.4
#>  9 2020-01-09   169    83.5        88.6
#> 10 2020-01-10   154    85.6        88.5
#> # ... with 90 more rows
#> # i Use `print(n = ...)` to see more rows
 
 # the n is different for each day though, so I'd like to weight the moving
 # average by n. I've tried weighted.mean(), but get the following error message.
 
 df %>% 
   mutate(`7 day avg` = slide_index_dbl(.i = resp_date,
                                        .x = percent,
                                        .f = weighted.mean(percent, n),
                                        .before = 6))
#> Error in `mutate()`:
#> ! Problem while computing `7 day avg = slide_index_dbl(...)`.
#> Caused by error in `slide_index_impl()`:
#> ! Can't convert `.f`, a number, to a function.

Created on 2022-08-03 by the reprex package (v2.0.1)

I think it would be

df %>%
  mutate(`7 day avg` = slide_index_dbl(
    .x = .,
    .i = resp_date,
    .f = ~ weighted.mean(
      x = .x$percent,
      w = .x$n
    ), .before = 6
  ))
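
One way to sanity-check the numbers is to compute the same weighted trailing average by hand in base R. This is only a sketch: `manual_wavg` is a made-up helper, not part of slider, and it assumes one row per day with no gaps, so row positions can stand in for dates.

```r
# Hypothetical helper: weighted average over the current row and the
# `before` rows preceding it. Assumes rows are consecutive days with no
# gaps, so a row offset is equivalent to a date offset.
manual_wavg <- function(x, w, before = 6) {
  vapply(seq_along(x), function(i) {
    idx <- max(1, i - before):i          # rows that fall inside the window
    sum(x[idx] * w[idx]) / sum(w[idx])   # same quantity as weighted.mean(x[idx], w[idx])
  }, numeric(1))
}

# Small made-up example: three days, the last one with twice the sample size
percent <- c(80, 90, 100)
n <- c(100, 100, 200)
manual_wavg(percent, n, before = 1)
```

With the data from the question, `manual_wavg(df$percent, df$n)` should agree with the `7 day avg` column from the slider call, as long as the dates really are consecutive.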

Yes, that seems to work. Thank you, thank you. If possible, can you explain a bit what is going on? In particular, what does .x = . do? And then, what is the meaning of .x in .x$percent?

Kind regards
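
As far as I understand it, there are two different things named `.x` in that call. Inside a `%>%` pipeline, `.` is the placeholder for whatever is on the left-hand side, so `.x = .` passes the whole data frame to `slide_index_dbl()` as its `.x` argument. slider treats a data frame as a vector of rows, so each window is a smaller data frame holding the rows for those days. The `~` creates a purrr-style lambda, and inside it `.x` means "the current window", which is why `.x$percent` and `.x$n` are the columns for just those rows. A small sketch with made-up data (`toy` is only for illustration):

```r
library(slider)

# Made-up data: three consecutive days
toy <- data.frame(
  resp_date = as.Date("2020-01-01") + 0:2,
  n = c(100, 100, 200),
  percent = c(80, 90, 100)
)

# Return each window unchanged to see what the lambda's `.x` looks like.
# `.before = 1` means "the current day plus one day before".
windows <- slide_index(toy, toy$resp_date, ~ .x, .before = 1)
windows[[2]]  # a 2-row data frame: 2020-01-01 and 2020-01-02

# Same idea as the answer above, without the pipe placeholder
wavg <- slide_index_dbl(
  toy, toy$resp_date,
  ~ weighted.mean(x = .x$percent, w = .x$n),
  .before = 1
)
wavg
```

The first attempt failed because `.f = weighted.mean(percent, n)` is evaluated immediately and yields a single number, whereas `.f` has to be a function (or a `~` formula) that slider can call once per window.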
