Passing ggplot2 options to a custom function

dplyr
ggplot2
rstudio

#1

Hi,

I have a local function that calculates counts by grouping variables and plots them with ggplot2.
Here is the function definition:

plot_by_two_groups <- function(comments_df, # should be a data frame
                        group_var_y, #variable to be plotted on y axis
                        group_var_x,
                        get_all_word_counts = F,
                        ...){

    group_var_y <- enquo(group_var_y)
    group_var_x <- enquo(group_var_x)
    
  
total_comments_by_group_var_y <- comments_df%>% 
                                    count(!! group_var_y) %>% rename(total_comments_by_group_var_y = n)

comments_df%>%
            count(!! group_var_x,!! group_var_y)%>%
            inner_join(total_comments_by_group_var_y)%>%
            mutate(perc_of_comments_by_group_var_y = n/total_comments_by_group_var_y)%>%
            ggplot(aes(x=!! group_var_y,y=perc_of_comments_by_group_var_y,fill=!! group_var_x))+
                  geom_bar(stat = "identity")+
                  geom_text(aes(label = paste0("(",n,")")),position = position_stack(vjust = 0.5),
                            fontface="bold",colour = "white") +
                  geom_text(aes(label=sprintf("%1.0f%%", 100*perc_of_comments_by_group_var_y)),
                                 position = position_stack(vjust = 0.2),fontface="bold",colour = "white")+
                  theme_BT()+
                    scale_y_continuous(labels=percent)+
               #    scale_fill_manual(values = c("Passive"="#E60050","Detractor"="#00A0D6","Promoter"="#6400AA",
               #                                 "other"="#DDDDDD"),
               #     guide = guide_legend(reverse = FALSE))+
                    coord_flip()+
                  theme(axis.title.y = element_blank()) +
                  labs(title = "Rating Distribution by CP",
                       y = "% of customers",
                      subtitle =paste("% of customers by", quo_name(group_var_y),
                                       "Total verbatim = ",nrow(comments_df)),
                       fill = NULL)

}

I have two challenges:

1- I want to pass all the labels via the function call, e.g. labs(x = "example text")
2- How do I pass fill colours depending on the number of data levels (see the commented-out scale_fill_manual lines)?

Thanks

When I call the function like this:

#debugonce(plot_by_two_groups)
verbatims%>% 
                                    filter(between(visit_date, as.Date("2018-01-01"),as.Date("2018-09-17"))) %>%
plot_by_two_groups(., # should be a data frame
                        cp_code, #variable to be plotted on y axis
                        NPS_RATING,
                        get_all_word_counts = F)

it works fine.

But when I do this:

#debugonce(plot_by_two_groups)
verbatims%>% 
                                    filter(between(visit_date, as.Date("2018-01-01"),as.Date("2018-09-17"))) %>%
plot_by_two_groups(., # should be a data frame
                        cp_code, #variable to be plotted on y axis
                        NPS_RATING,
                        get_all_word_counts = F,
                        xlab = "example xlab")

and change the labs() call inside the function to labs(y = xlab), I get the following error:

Error in as.character(x$label) : cannot coerce type 'closure' to vector of type 'character'


#2

Could you please turn this into a self-contained reprex (short for reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

install.packages("reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

What to do if you run into clipboard problems

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ.


#3
plot_by_two_groups <- function(comments_df, # should be a data frame
                        group_var_y, #variable to be plotted on y axis
                        group_var_x,
                        get_all_word_counts = F,
                        ...){

    group_var_y <- enquo(group_var_y)
    group_var_x <- enquo(group_var_x)
    
  
total_comments_by_group_var_y <- comments_df%>% 
                                    count(!! group_var_y) %>% rename(total_comments_by_group_var_y = n)

comments_df%>%
            count(!! group_var_x,!! group_var_y)%>%
            inner_join(total_comments_by_group_var_y)%>%
            mutate(perc_of_comments_by_group_var_y = n/total_comments_by_group_var_y)%>%
            ggplot(aes(x=!! group_var_y,y=perc_of_comments_by_group_var_y,fill=!! group_var_x))+
                  geom_bar(stat = "identity")+
                  geom_text(aes(label = paste0("(",n,")")),position = position_stack(vjust = 0.5),
                            fontface="bold",colour = "white") +
                  geom_text(aes(label=sprintf("%1.0f%%", 100*perc_of_comments_by_group_var_y)),
                                 position = position_stack(vjust = 0.2),fontface="bold",colour = "white")+
                  theme_BT()+
                    scale_y_continuous(labels=percent)+
                   scale_fill_manual(values = c("Passive"="#E60050","Detractor"="#00A0D6","Promoter"="#6400AA",
                                                "other"="#DDDDDD"),
                    guide = guide_legend(reverse = FALSE))+
                    coord_flip()+
                ylab(ylab)+
                #  theme(axis.title.y = element_blank()) +
                  labs(title = paste("Rating Distribution by", quo_name(group_var_y)),
                     #  y = "% of customers",
                      subtitle =paste("% of customers by", quo_name(group_var_y),
                                       "Total verbatim = ",nrow(comments_df)),
                       fill = NULL)

}


verbatims%>% 
                                    filter(between(visit_date, as.Date("2018-01-01"),as.Date("2018-09-17"))) %>%
plot_by_two_groups(., # should be a data frame
                        cp_code, #variable to be plotted on y axis
                        NPS_RATING,
                        get_all_word_counts = F,
                   ylab="custom y label")
#> Error in verbatims %>% filter(between(visit_date, as.Date("2018-01-01"), : could not find function "%>%"

Created on 2018-07-30 by the reprex package (v0.2.0).


#4

You're almost there! Your reprex is missing a couple of critical pieces:

  1. library() calls: reprex runs the code you give it in a separate R session, so you need to include your library() calls in the code you pass to reprex.
  2. Data. Helpers will not be able to run your code and thereby understand your problem unless you include code creating your verbatims data frame in the reprex — or, better, code creating a small toy version of the real data frame.
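
As a rough sketch of what those two pieces could look like for this question, a reprex might start like this. The toy rows are invented purely for illustration and are not your real data; only the column names are taken from your calls.

```r
library(dplyr)

# Invented toy stand-in for the real `verbatims` data frame
verbatims <- data.frame(
  visit_date = as.Date(c("2018-01-15", "2018-03-02", "2018-06-20",
                         "2018-08-11", "2018-10-05")),
  cp_code    = c("A", "A", "B", "B", "A"),
  NPS_RATING = c("Promoter", "Detractor", "Passive", "Promoter", "Promoter"),
  stringsAsFactors = FALSE
)

# Same date filter as in the original call
in_range <- verbatims %>%
  filter(between(visit_date, as.Date("2018-01-01"), as.Date("2018-09-17")))

# Counts per group, the first step inside plot_by_two_groups()
counts <- in_range %>%
  count(cp_code, NPS_RATING)
```

With something like this at the top, the function definition and the call that fails can follow, and anyone can run the whole thing as-is.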

That said, I think I can help with a couple of your problems:

  1. You are passing arguments via ... but the code never accesses those arguments. That's the source of the error about coercing type "closure" — instead of the argument named xlab which you passed via ..., the code is accessing the function xlab. You might want to read up on how to use ... inside functions: https://adv-r.hadley.nz/functions.html#fun-dot-dot-dot
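
A minimal sketch of one way to use the dots, assuming all of the extra arguments are meant for labs(). The name plot_with_dots is made up, and cars is just a built-in stand-in data set:

```r
library(ggplot2)

# With only `...` in the signature there is no local variable called `xlab`,
# so the bare name resolves to ggplot2's xlab() function (a closure).
is.function(xlab)

# Hypothetical helper: forward every extra named argument to labs()
plot_with_dots <- function(data, ...) {
  ggplot(data, aes(speed, dist)) +
    geom_point() +
    labs(...)
}

p_dots <- plot_with_dots(
  cars,
  x = "Speed (mph)",
  y = "Stopping distance (ft)",
  title = "Labels passed through the dots"
)
```

Printing p_dots draws the plot. The alternative is to give the function explicit arguments (say, a hypothetical x_label) and refer to those by name inside labs().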

On passing all the labels via the function call: I'm not quite sure what you mean here, but labs() just takes a list with appropriately named elements. You can pass this list in with whatever argument name you like. A rather silly example:

library(ggplot2)

plot_with_labs <- function(my_labs) {
  ggplot(cars, aes(speed, dist)) +
    geom_point() +
    labs(my_labs)
}

plot_with_labs(
  my_labs = list(
    x = "The x axis!", 
    y = "The y axis!", 
    title = "The title!"
  )
)

Created on 2018-07-31 by the reprex package (v0.2.0).
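
One caveat on the snippet above: I believe newer ggplot2 releases collect the arguments to labs() as name-value pairs, so a bare list may not be unpacked there. A sketch that should not depend on that behaviour uses base R's do.call() to spread the list into named arguments (plot_with_labs_list is a made-up name):

```r
library(ggplot2)

# Hypothetical variant: do.call() turns the list elements into named arguments
plot_with_labs_list <- function(my_labs) {
  ggplot(cars, aes(speed, dist)) +
    geom_point() +
    do.call(labs, my_labs)
}

p_labs <- plot_with_labs_list(
  list(
    x = "The x axis!",
    y = "The y axis!",
    title = "The title!"
  )
)
```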

On passing fill colours depending on the number of data levels: again, I'm not quite sure what you're imagining here. An example might help. You can pass in an appropriately named vector to be used in the call to scale_fill_manual(), but maybe you want a more general solution?

The simplest option would be to supply a vector of colors at least as long as your greatest anticipated number of levels, and let the levels be matched to colors by position:

library(ggplot2)

plot_with_colors <- function(color_var) {
  color_var <- enquo(color_var)
  
  ggplot(mtcars, aes(hp, mpg, color = factor(!!color_var))) +
    geom_point() +
    scale_color_manual(
      values = c(
        "#5DA5DA",
        "#FAA43A",
        "#60BD68",
        "#F17CB0",
        "#B2912F",
        "#B276B2",
        "#DECF3F",
        "#F15854"
      )
    )
}

plot_with_colors(cyl)


plot_with_colors(carb)

Created on 2018-07-31 by the reprex package (v0.2.0).

If you want more control than that, it would help to have a more detailed example of how you want to match colors to levels.
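
In the meantime, here is a sketch of the named-vector route mentioned earlier, assuming the level names are known in advance. The toy data frame, the fill_values argument and the function name are made up for illustration; the colours are the ones from your commented-out scale_fill_manual() call. Names that don't occur in the data (here "other") are simply left unused:

```r
library(ggplot2)

# Invented toy data; the real data frame is not available here
toy <- data.frame(
  cp_code = c("A", "A", "A", "B", "B"),
  rating  = c("Promoter", "Passive", "Promoter", "Detractor", "Promoter"),
  stringsAsFactors = FALSE
)

nps_colours <- c(
  Passive   = "#E60050",
  Detractor = "#00A0D6",
  Promoter  = "#6400AA",
  other     = "#DDDDDD"
)

# Hypothetical helper: the caller supplies the colours as a named vector,
# and scale_fill_manual() matches them to levels by name
plot_fill_by_name <- function(data, x_var, fill_var, fill_values) {
  x_var <- enquo(x_var)
  fill_var <- enquo(fill_var)

  ggplot(data, aes(x = !!x_var, fill = !!fill_var)) +
    geom_bar() +
    scale_fill_manual(values = fill_values)
}

p_fill <- plot_fill_by_name(toy, cp_code, rating, nps_colours)
```

Because matching is by name, the same vector can be reused whether a given subset of the data has two, three or four of the levels present.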


#5

You just fixed my issues! Thank you so much.