Sourcing helper file with function in Shiny app file

Hi All,

I'm trying to source a helper file, containing a function, within a reactive expression in my Shiny app file. However, when I arrange the data in descending order in the helper file, the sort order does not carry over to the reactive expression output.

Please see below for the arrange code within the helper file,

  mathData <- mathData %>%
    mutate(showsMathSum = sum(showsMath, na.rm =TRUE)) %>%
    group_by(showsMathTopic) %>%
    arrange(desc(showsMathSum))

Please see below for the server assignment and reactive expression assignment within the app file,

server <- function(input, output) {
  
  testMathInput <- reactive({
    allTheMath(testMathDF,
               yearMath_select,
               channelMath_select)
  })
  
}

All files can be found here,

https://drive.google.com/drive/folders/1kBJNJ0B2iy601q393aQU6R5M1XwNP22Q?usp=sharing

Many thanks for your insight, and happy to hear your thoughts,

Andrew

Hi @ajmasnyj, I think you might get a bit more help in the future if you post all the code in your post here or host it in a GitHub gist (https://gist.github.com/). Having to download the files from a Google Drive is a bit sketchy from a security standpoint and is a barrier to people quickly seeing if they can help.

I have a bit of time to kill and am bad at following security advice though so let's take a look!

In your helper file allTheMath() function you have:

mathData <- testMathDF %>%
 group_by(showsMathTopic) %>%
 summarise(showsMathSum = sum(showsMath, na.rm = TRUE))

mathData <- mathData %>%
 mutate(showsMathSum = sum(showsMath, na.rm =TRUE)) %>%
 group_by(showsMathTopic) %>%
 arrange(desc(showsMathSum))

The second chunk is overwriting what the first chunk does and seems to just add all of the showsMath numbers together, so they can't really be arranged since they all evaluate to the same amount (63 in this case). Could this be the root of the problem?
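To illustrate the point, here is a minimal sketch with made-up data (toyDF and its values are hypothetical, not taken from the shared files). It assumes dplyr is installed. An ungrouped mutate() with sum() writes the same grand total into every row, so a later arrange() has nothing to tell the rows apart.

```r
library(dplyr)

# Hypothetical stand-in for testMathDF; the values are invented for illustration.
toyDF <- data.frame(
  showsMathTopic = c("algebra", "algebra", "geometry", "calculus"),
  showsMath      = c(3, 4, 10, 1)
)

# mutate() runs before group_by(), so sum() sees the whole column
# and every row receives the same grand total.
constantSum <- toyDF %>%
  mutate(showsMathSum = sum(showsMath, na.rm = TRUE)) %>%
  group_by(showsMathTopic) %>%
  arrange(desc(showsMathSum))

# Number of distinct values in the column that arrange() was asked to sort by.
n_distinct(constantSum$showsMathSum)
```

With a single distinct value in the sort column, the row order coming out of arrange() is effectively the order going in.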

I think removing those two blocks and changing it to something more like:

mathData <- testMathDF %>%
 group_by(showsMathTopic) %>%
 summarise(showsMathSum = sum(showsMath, na.rm = TRUE)) %>%
 arrange(desc(showsMathSum))

May return the results more how you're expecting them. Good luck!

Thanks for the advice, @scottbrenstuhl ! Haha, that totally makes sense about not downloading from my google drive.

Also, thank you for your reply - and I see now how the second chunk is overwriting the first. What I hoped to achieve was an arranged data frame or tibble, that uses the result of the summarize() function as the sort order.

Apologies for asking again, but if you wouldn't mind, here's simpler code that may illustrate my question,

xA <- c(0.5, 0.25, 0.75, 0.1, 0.15, 0.20, 0.80)
xB <- c("CFA", "CFA", "MBA", "CFA", "MBA", "CFA", "MBA")
xC <- c(2017, 2016, 2017, 2015, 2016, 2014, 2015)
xABC <- data.frame(xB, xC, xA)

xABCpipe1 <- xABC %>%
  group_by(xB) %>%
  summarise(xAMean = mean(xA, na.rm = TRUE)) %>%
  arrange(desc(xAMean))

Returning xABCpipe1, gives me,

# A tibble: 2 x 2
  xB     xAMean
  <fctr>  <dbl>
1 MBA     0.567
2 CFA     0.262

However, I'd like to have it return the contents of xABC, but sorted based on the descending grouping of xABCpipe1 (by xB, then by xC), and then pass this into the Shiny app,
MBA 2017 0.75
MBA 2016 0.15
MBA 2015 0.80
CFA 2017 0.50
CFA 2016 0.25
CFA 2015 0.10
CFA 2014 0.20

Again, many thanks for your time, and happy to hear your thoughts,

Andrew

I think I see what you're getting at.

xA <- c(0.5, 0.25, 0.75, 0.1, 0.15, 0.20, 0.80)
xB <- c("CFA", "CFA", "MBA", "CFA", "MBA", "CFA", "MBA")
xC <- c(2017, 2016, 2017, 2015, 2016, 2014, 2015)
xABC <- data.frame(xB, xC, xA)

xABCpipe1 <- xABC %>%
  group_by(xB, xC) %>%
  summarise(xAMean = mean(xA, na.rm = TRUE)) %>%
  arrange(desc(xB), desc(xC))

xABCpipe1

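will return the rows sorted by xB and then by xC, both in descending order, which matches the output you listed.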

I think maybe the confusion is coming from not grouping by all of the columns that you want to persist, and then needing to add each of the important sorting columns from the output of the group_by/summarise to the arrange call. Does this help with what you're trying to accomplish?

Ah, that makes sense and we're on the right track, @scottbrenstuhl , thank you!

Just to clarify, "arrange(desc(xB), desc(xC))" ensures that the years in xC are always listed in descending order, however, it would also have to ensure that the averages of each xB group will always be sorted in descending order as well.

If another xB group, "JD", is added, the result gets thrown off,

xA <- c(0.5, 0.25, 0.75, 0.1, 0.15, 0.20, 0.80, 0.55, 0.75, 0.9)
xB <- c("CFA", "CFA", "MBA", "CFA", "MBA", "CFA", "MBA", "JD", "JD", "JD")
xC <- c(2017, 2016, 2017, 2015, 2016, 2014, 2015, 2017, 2016, 2015)
xABC <- data.frame(xB, xC, xA)

xABCpipe1 <- xABC %>%
  group_by(xB, xC) %>%
  summarise(xAMean = mean(xA, na.rm = TRUE)) %>%
  arrange(desc(xB), desc(xC))

Which results in,

# A tibble: 10 x 3
# Groups:   xB [3]
       xB    xC xAMean
   <fctr> <dbl>  <dbl>
 1    MBA  2017   0.75
 2    MBA  2016   0.15
 3    MBA  2015   0.80
 4     JD  2017   0.55
 5     JD  2016   0.75
 6     JD  2015   0.90
 7    CFA  2017   0.50
 8    CFA  2016   0.25
 9    CFA  2015   0.10
10    CFA  2014   0.20

The "MBA" average, ((0.75 + 0.15 + 0.8)/3) = 0.567, while the "JD" average, ((0.55 + 0.75 + 0.90)/3) = 0.733, but in the resultant xABCpipe1, the "JD" rows appear below the "MBA" rows, alphabetically, but not by descending "group average".
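The arithmetic above can be double-checked with a short base R sketch. It reuses the xA and xB vectors from the example; groupMeans is just an illustrative name.

```r
# Values copied from the xABC example above.
xA <- c(0.5, 0.25, 0.75, 0.1, 0.15, 0.20, 0.80, 0.55, 0.75, 0.9)
xB <- c("CFA", "CFA", "MBA", "CFA", "MBA", "CFA", "MBA", "JD", "JD", "JD")

# Mean of xA within each xB group, largest first.
groupMeans <- sort(tapply(xA, xB, mean), decreasing = TRUE)
round(groupMeans, 3)
```

The names of groupMeans give the group order a sort by descending group average should produce, with "JD" ahead of "MBA".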

Apologies for complicating this, and I really do appreciate your help thus far! Also, I'm open to other suggestions for ordering by groups, as my end goal is a tibble or data frame that's ready for visualizing with plotly and shiny.

Again, many thanks!

Andrew

Maybe someone can provide a better answer, but here's one possible hack you could incorporate:

xA <- c(0.5, 0.25, 0.75, 0.1, 0.15, 0.20, 0.80, 0.55, 0.75, 0.9)
xB <- c("CFA", "CFA", "MBA", "CFA", "MBA", "CFA", "MBA", "JD", "JD", "JD")
xC <- c(2017, 2016, 2017, 2015, 2016, 2014, 2015, 2017, 2016, 2015)
xABC <- data.frame(xB, xC, xA)

xABC %>%
  group_by(xB, xC) %>%
  summarise(xAMean = mean(xA, na.rm = TRUE)) %>%
  mutate(group_xAMean = mean(xAMean, na.rm = TRUE)) %>% # Calculate the group average of the average
  arrange(desc(group_xAMean), desc(xB), desc(xC)) %>% # Order by this variable initially
  select(-group_xAMean) # Remove variable if you want to keep the original structure
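A possible variation, in case the goal is to keep every original row of xABC rather than a summarised table: compute the group mean with a grouped mutate() and sort on it. This is only a sketch under that assumption. groupMean and sortedRows are illustrative names, and xB is added to arrange() as a tie-breaker so that two groups with equal means would not interleave.

```r
library(dplyr)

xA <- c(0.5, 0.25, 0.75, 0.1, 0.15, 0.20, 0.80, 0.55, 0.75, 0.9)
xB <- c("CFA", "CFA", "MBA", "CFA", "MBA", "CFA", "MBA", "JD", "JD", "JD")
xC <- c(2017, 2016, 2017, 2015, 2016, 2014, 2015, 2017, 2016, 2015)
xABC <- data.frame(xB, xC, xA)

sortedRows <- xABC %>%
  group_by(xB) %>%
  mutate(groupMean = mean(xA, na.rm = TRUE)) %>%  # per-xB mean, repeated on each row
  ungroup() %>%                                   # drop grouping before sorting
  arrange(desc(groupMean), xB, desc(xC)) %>%      # group mean first, then year
  select(-groupMean)                              # restore the original columns

sortedRows
```

The result is an ungrouped tibble with the same three columns as xABC, which may be easier to hand to a plotting function than a grouped one.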

You certainly don't need to apologize for complicating things :slight_smile: starting with a simple version and adding complexity is a fantastic way to learn!

I think perhaps using more descriptive column names would have made this more clear, but the reason that xAMean isn't in descending order is that we aren't sorting by it. We have been sorting by the column xC (year) in descending order. To instead sort them by the mean, you can swap out xC in the arrange call for xAMean:

> xABCpipe1 <- xABC %>%
+   group_by(xB, xC) %>%
+   summarise(xAMean = mean(xA, na.rm = TRUE)) %>%
+   arrange(desc(xB), desc(xAMean))
> xABCpipe1
# A tibble: 10 x 3
# Groups:   xB [3]
       xB    xC xAMean
   <fctr> <dbl>  <dbl>
 1    MBA  2015   0.80
 2    MBA  2017   0.75
 3    MBA  2016   0.15
 4     JD  2015   0.90
 5     JD  2016   0.75
 6     JD  2017   0.55
 7    CFA  2017   0.50
 8    CFA  2016   0.25
 9    CFA  2014   0.20
10    CFA  2015   0.10

Thank you, @scottbrenstuhl and @jdb, for all of your time on your responses. I completely agree about being more descriptive in my column naming, and so I renamed the columns - and complicated things again haha!

My data frame now looks like,

credName <- c("CFA", "CFA", "MBA", "CFA", "MBA", "CFA", "MBA", "JD", "JD", "JD", "FINRA", "FINRA")
credBiz <- c("Bank", "NA", "Consulting", "NA", "Bank", "Consulting", "Law", "Law", "Law", "Law", "Bank", "Consulting")
credYear <- c(2017, 2016, 2017, 2015, 2016, 2014, 2015, 2017, 2016, 2015, 2016, 2017)
credPercent <- c(0.5, 0.25, 0.75, 0.1, 0.15, 0.20, 0.80, 0.55, 0.75, 0.9, 0.8, 0.2)

credDF <- data.frame(credName, credYear, credBiz, credPercent)

Ultimately, using mutate to create a new column, and then arranging by this new column, solved my initial problem. Thank you, @jdb !

  credpipe <- credDF %>%
    group_by(credName, credYear) %>%
    summarise(percentMean = mean(credPercent, na.rm = TRUE)) %>%
    mutate(group_percentMean = mean(percentMean, na.rm = TRUE)) %>%
    arrange(desc(group_percentMean), desc(credName), desc(credYear)) %>%
    select(-group_percentMean)

Which results in the groups sorted by the group average, as well as by year,

> credpipe
# A tibble: 12 x 3
# Groups:   credName [4]
   credName credYear percentMean
     <fctr>    <dbl>       <dbl>
 1       JD     2017        0.55
 2       JD     2016        0.75
 3       JD     2015        0.90
 4      MBA     2017        0.75
 5      MBA     2016        0.15
 6      MBA     2015        0.80
 7    FINRA     2017        0.20
 8    FINRA     2016        0.80
 9      CFA     2017        0.50
10      CFA     2016        0.25
11      CFA     2015        0.10
12      CFA     2014        0.20

Next question is how to carry this sorted tibble into my Shiny app. I have a function in a helper file whose output is the summarise/mutate/arrange pipeline from above. I call that function inside a reactive and assume that its output gets stored in credpipeInput(). However, when I plot the data, as "p", with renderPlotly(), the data is not in the sorted order from above.

server <- function(input, output) {
  
  credpipeInput <- reactive({
    credFunction(credDF,
                 credName_select,
                 credBiz_select,
                 credYear_select)
  })
  
  output$credtestplot <- renderPlotly(
    p <- credpipeInput() %>%
      plot_ly(y = ~percentMean, x = ~credName, color = ~factor(credYear),
              type = "bar")
  )
  
}

Happy to hear your thoughts - I can save the helper and app files to GitHub if needed.

Again, many thanks!

Andrew

It seems that plotly orders the axis alphabetically rather than by row order. Your best bet would be to keep the group_percentMean column and reorder credName by its values:

credDF %>%
    group_by(credName, credYear) %>%
    summarise(percentMean = mean(credPercent, na.rm = TRUE)) %>%
    mutate(group_percentMean = mean(percentMean, na.rm = TRUE)) %>%
    arrange(desc(group_percentMean), desc(credName), desc(credYear)) %>%
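    ungroup() %>% # credName is still a grouping variable after summarise(), and mutate() cannot modify a grouping variable
    mutate(credName = forcats::fct_reorder(credName, desc(group_percentMean))) # Use fct_reorder() from the forcats package to reorder credName based on the descending group_percentMean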

This should keep the order from the original DF
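One way to check that the reordering took effect is to look at the factor levels afterwards. The sketch below assumes dplyr (1.0 or later, for the .groups argument) and forcats are installed. It uses fct_reorder()'s .desc = TRUE argument instead of wrapping the value in desc(), which is a stylistic swap rather than the code from the post, and credOrdered is an illustrative name.

```r
library(dplyr)

credName <- c("CFA", "CFA", "MBA", "CFA", "MBA", "CFA", "MBA", "JD", "JD", "JD", "FINRA", "FINRA")
credYear <- c(2017, 2016, 2017, 2015, 2016, 2014, 2015, 2017, 2016, 2015, 2016, 2017)
credPercent <- c(0.5, 0.25, 0.75, 0.1, 0.15, 0.20, 0.80, 0.55, 0.75, 0.9, 0.8, 0.2)
credDF <- data.frame(credName, credYear, credPercent)

credOrdered <- credDF %>%
  group_by(credName, credYear) %>%
  # Keep the credName grouping so the next mutate() averages within each credential.
  summarise(percentMean = mean(credPercent, na.rm = TRUE), .groups = "drop_last") %>%
  mutate(group_percentMean = mean(percentMean, na.rm = TRUE)) %>%
  ungroup() %>%  # a grouping variable cannot be modified by mutate()
  mutate(credName = forcats::fct_reorder(credName, group_percentMean, .desc = TRUE))

# The level order is what a factor-aware plotting function would follow.
levels(credOrdered$credName)
```

For this data the levels should come out as JD, MBA, FINRA, CFA, which is the same group order as in the sorted tibble.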

That works, @jdb , thank you so much!

The forcats::fct_reorder() function was a great call - it kept the order of the groups based on descending group average, while also keeping the years in descending order.

Any idea how to label each bar in the grouped bar chart with its respective percentage? Currently, all of the percentages are lumped together - I've tried numerous iterations of Plotly's yref, y, and yanchor without much success.

  output$credentialsChart <- renderPlotly({
    
    p <- credInput() %>%
      group_by(CredentialQ) %>%
      plot_ly(x = ~vizCredPrcnt,
              y = ~forcats::fct_reorder(CredentialQ, group_vizCredPrcntMean),
              color = ~factor(year, ordered = TRUE),
              colors = "RdYlBu",
              type = "bar",
              orientation = "h",
              text = ~paste(round(vizCredPrcnt*100, 2), "%")) %>%
      layout(xaxis = list(title = "Percent Held"),
             yaxis = list(title = "Credentials"),
             annotations = list(x = credInput()$vizCredPrcnt,
                                y = forcats::fct_reorder(credInput()$CredentialQ, credInput()$group_vizCredPrcntMean),
                                text = paste(round(credInput()$vizCredPrcnt * 100, 2), "%"),
                                xref = "x",
                                x = 0,
                                xanchor = "auto",
                                yref = "y",
                                y = 0,
                                yanchor = "auto",
                                showarrow = FALSE),
             #margin parameter to show all y label text
             margin = list(l = 160))
  })

Happy to hear your thoughts, and again, many thanks for your time!

Andrew