Calculating and Plotting Weekly/Monthly Averages Based on Variable SelectInput

Hello Everyone,

I have the following data frame example:

```
dates <- seq(as.Date('2018-12-29'), as.Date('2020-01-01'), by = 'days')
Variable1 <- sample.int(20, 369, replace = TRUE)
Variable2 <- sample(20, 369, replace = TRUE)

df <- data.frame(dates, Variable1, Variable2)
```

My Shiny app has a selectInput (ID = "selectvariable") with choices "Variable1" and "Variable2", and a second selectInput (ID = "selectduration") with choices "daily", "weekly", and "monthly".

My goal is to create a ggplot with dates on the x axis and the input of "selectvariable" on the y axis. If the input of "selectduration" == "daily" it should plot the selected variable as it appears in the data frame.
If the input of "selectduration" == "weekly" it should plot THE WEEKLY AVERAGE for each week across the data frame for the selected variable.
If the input of "selectduration" == "monthly" it should plot THE MONTHLY AVERAGE for each month across the data frame for the selected variable.

It is a fairly simple app; however, I am stuck on how to create these "weekly" and "monthly" averages and plot them. I do not think that adding extra columns of weekly/monthly averages to the data frame is an appropriate choice, because in reality there are many other variables. I want this to be as automated as possible.

I would love to hear what everyone has to say! Thank you!

I recommend studying tidy evaluation.
Chapter 12 Tidy evaluation | Mastering Shiny (mastering-shiny.org)
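
For readers who have not seen it, the core idea from that chapter is the `.data` pronoun, which lets a column name held in a string (such as the value of a selectInput) be used inside dplyr and ggplot2 calls. A minimal sketch, where the helper name `mean_of` is made up for illustration and is not part of any package:

```r
library(dplyr)

set.seed(1)  # make the random example data reproducible
df <- data.frame(
  dates = seq(as.Date("2018-12-29"), as.Date("2020-01-01"), by = "days"),
  Variable1 = sample.int(20, 369, replace = TRUE),
  Variable2 = sample.int(20, 369, replace = TRUE)
)

# `var` is a character string, as input$selectvariable would be in the app.
# .data[[var]] looks up the column with that name inside the data frame.
mean_of <- function(data, var) {
  data %>% summarise(avg = mean(.data[[var]], na.rm = TRUE))
}

res <- mean_of(df, "Variable1")
res
```

The same `.data[[...]]` form works inside `aes()`, which is what allows the y axis to follow the selected variable.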


Thank you for the suggestion! I have studied this article (it actually helped me answer an unrelated question, so that was nice!). However, the question I have in this post remains. Perhaps I was unclear, but I am wondering how you can create time-based averages across the data set. I have run into apply.weekly(), but this has been unsuccessful for me so far.

Do you mean something like this?

library(tsibble)
library(dplyr)

df <- data.frame(dates = seq(as.Date('2018-12-29'), as.Date('2020-01-01'), by = 'days'),
                 Variable1 = sample.int(20, 369, replace = TRUE),
                 Variable2 = sample (20,369, replace = TRUE))

df %>% 
    as_tsibble(index = dates) %>% 
    index_by(year_week = ~ yearweek(.)) %>% # weekly aggregates
    summarise(across(where(is.integer), ~ mean(.x, na.rm = TRUE)))
#> # A tsibble: 54 x 3 [1W]
#>    year_week Variable1 Variable2
#>       <week>     <dbl>     <dbl>
#>  1  2018 W52     18         8.5 
#>  2  2019 W01     13.6      12.9 
#>  3  2019 W02     11.6       9.86
#>  4  2019 W03      9.71     12.3 
#>  5  2019 W04      9.43     14   
#>  6  2019 W05      8.57     11.9 
#>  7  2019 W06     11         6.14
#>  8  2019 W07     10.6       9.43
#>  9  2019 W08     12.3       8.14
#> 10  2019 W09     11.1       7.71
#> # … with 44 more rows

Created on 2021-03-14 by the reprex package (v1.0.0.9002)


Yes, I think this is very close! However, it currently produces an error for me. Would you mind briefly reviewing the relevant part of my code? I have included it below.

```
output$plot1 <- renderPlot({

  # As you can see, this data set also has a date range filter and a client
  # filter (I left those out of my original post to avoid overcomplicating it).
  df_filtered <- filter(df, dates >= format(input$daterange[1]) & dates <= format(input$daterange[2])) %>%
    filter(client == input$clientInput)

  if (input$DurationInput == "Daily") {
    df_filtered1 <- df_filtered
  }

  if (input$DurationInput == "Weekly") {
    df_filtered1 <- df_filtered %>%
      as_tsibble(index = dates) %>%
      index_by(year_week = ~ yearweek(.)) %>%
      summarise(across(where(is.integer), ~ mean(.x, na.rm = TRUE)))
  }

  # Here I will also include a monthly average (if (input$DurationInput == "Monthly"))
  # and a 'season' average (an average from August to August each year).

  ggplot(df_filtered1, aes(dates, .data[[input$VariableInput]])) +
    geom_point(size = 3) +
    geom_line(color = "Red") +
    theme_bw() +
    theme(axis.line = element_line(colour = "black"))
})
```

The error I am currently getting is "Warning: Error in FUN: object 'dates' not found" (I am unsure whether there are additional errors behind this one).

Thank you very much for your time and knowledge.

You can see from andresrcs' example that the dates column gets dropped in favour of the year_week variable that is created when summarising. You need to account for that.
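
To make the dropped column concrete, here is a minimal sketch. It swaps tsibble for lubridate's `floor_date()` purely for illustration (the name `year_week` is kept from the example above), so treat it as an approximation of what happens rather than the exact tsibble output. It also uses `where(is.numeric)` instead of `where(is.integer)`, which is an assumption about what the real data may need.

```r
library(dplyr)
library(lubridate)

set.seed(1)  # make the random example data reproducible
df <- data.frame(
  dates = seq(as.Date("2018-12-29"), as.Date("2020-01-01"), by = "days"),
  Variable1 = sample.int(20, 369, replace = TRUE),
  Variable2 = sample.int(20, 369, replace = TRUE)
)

# Group by a new weekly key; summarise() keeps only the grouping column
# and the summarised columns, so the original `dates` column is gone.
weekly <- df %>%
  mutate(year_week = floor_date(dates, unit = "week")) %>%
  group_by(year_week) %>%
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))

names(weekly)
"dates" %in% names(weekly)
```

Any later call that still refers to `dates`, such as `aes(dates, ...)`, then fails because that column no longer exists in the summarised data.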


Oh right! This makes complete sense. However, that seems to create a difficulty when attempting to plot these values dynamically. Is there an easy way around it?

'year_week' is just a name the programmer chose; you could choose a generic name like newdates or dates if you wanted.

There are many approaches you can take. For example, you could also keep the dates column; it is all up to you.

library(lubridate)
library(dplyr)

df <- data.frame(dates = seq(as.Date('2018-12-29'), as.Date('2020-01-01'), by = 'days'),
                 Variable1 = sample.int(20, 369, replace = TRUE),
                 Variable2 = sample (20,369, replace = TRUE))

df %>% 
    mutate(dates = ceiling_date(dates, unit = "weeks")) %>%
    group_by(dates) %>% 
    summarise(across(where(is.integer), ~ mean(.x, na.rm = TRUE)))
#> # A tibble: 54 x 3
#>    dates      Variable1 Variable2
#>    <date>         <dbl>     <dbl>
#>  1 2018-12-30      9        12   
#>  2 2019-01-06      9        10   
#>  3 2019-01-13     12.7       7.71
#>  4 2019-01-20      9.57     10.6 
#>  5 2019-01-27      9.43      8   
#>  6 2019-02-03      5.14      8.71
#>  7 2019-02-10     10.1      13.7 
#>  8 2019-02-17      9.57     11.1 
#>  9 2019-02-24     14.6      11.1 
#> 10 2019-03-03     10.9      14   
#> # … with 44 more rows

Created on 2021-03-14 by the reprex package (v1.0.0.9002)
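
Building on the same idea, the daily, weekly, and monthly cases could be folded into one helper so the plotting code never changes. This is only a sketch under a few assumptions: the helper name `aggregate_by_duration` is made up, the duration labels "Daily", "Weekly", and "Monthly" are taken from the app code earlier in the thread, and `floor_date()` is used instead of `ceiling_date()` so that each period is labelled by its first day.

```r
library(dplyr)
library(lubridate)

set.seed(42)  # make the random example data reproducible
df <- data.frame(
  dates = seq(as.Date("2018-12-29"), as.Date("2020-01-01"), by = "days"),
  Variable1 = sample.int(20, 369, replace = TRUE),
  Variable2 = sample.int(20, 369, replace = TRUE)
)

# Hypothetical helper: maps the selected duration to a lubridate unit,
# rounds `dates` down to that unit, and averages every numeric column.
# The result still has a column called `dates`, so aes(dates, ...) works.
aggregate_by_duration <- function(data, duration) {
  unit <- switch(duration,
    Daily   = "day",
    Weekly  = "week",
    Monthly = "month",
    stop("Unknown duration: ", duration)
  )
  data %>%
    mutate(dates = floor_date(dates, unit = unit)) %>%
    group_by(dates) %>%
    summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)),
              .groups = "drop")
}

daily   <- aggregate_by_duration(df, "Daily")
weekly  <- aggregate_by_duration(df, "Weekly")
monthly <- aggregate_by_duration(df, "Monthly")
```

Inside `renderPlot()`, the filtered data could then be passed through `aggregate_by_duration(df_filtered, input$DurationInput)` before the existing `ggplot()` call. Non-numeric columns such as `client` are dropped by the summary, so any filtering on them should happen first.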


Thank you both for your help. This solution works almost perfectly! I appreciate all the time you have put into this!
