How to use a function effectively with different variable names

I have the following code:

Split_data %>% 
  filter(data_availability_6m == 1) %>% 
  select(Total_cost_m_6m, Total_cost_ou_6m, Total_cost_in_6m) %>% 
  mutate(Total_cost = (Total_cost_m_6m + Total_cost_ou_6m + Total_cost_in_6m))

The "6m" at the end of each variable name stands for 6 months. I want to run the same code for 1yr, 18m, and 2yr. Instead of repeating the code for each period, I want to write a function. How can I do that?

I tried creating a function:

Cost_summary <- function(timeX)
  {
  Split_data %>% 
    filter(data_availability_{{ timeX }} == 1) %>% 
    select(Total_cost_m_{{timeX}}, Total_cost_ou_{{timeX}}, Total_cost_in_{{timeX}}) %>% 
    mutate(Total_cost = (Total_cost_m_{{timeX}} + Total_cost_ou_{{timeX}} + 
    Total_cost_in_{{timeX}}))

But when I run the function, I get the following error:

Error: unexpected '{' in:
" Split_data %>%
filter(data_availability_{"

I know that we need "{{ }}" brackets in functions that use tidyverse verbs, but I am not sure what I am doing wrong.
Thanks in advance for the help.

You cannot build variable names by placing text and variables next to each other. Code like

Var <- "Tuesday"
New_Var <- 6

will not make a new variable named New_Tuesday, no matter how you wrap Var. You can build a variable name from text, but I suspect that is not the right approach. Can you post a little of your data? I suspect making it tidy will make your problem much easier. If your data frame is named DF, you can post the first 15 rows by posting the output of

dput(head(DF, 15))

If there are more columns than are needed to demonstrate the problem, you can make a reduced data frame with the select() function and use that in dput().
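
For completeness, here is a minimal sketch of what building the column names from text could look like. It is not necessarily the recommended route, only an illustration. It assumes the suffix is passed as a string such as "6m", uses the .data pronoun and all_of() from dplyr/tidyselect, and takes the data frame as an argument. The toy Split_data below is made up for the example; the real data may look different.

```r
library(dplyr)

# Hypothetical toy data standing in for the real Split_data.
Split_data <- tibble(
  data_availability_6m = c(1, 0, 1),
  Total_cost_m_6m      = c(10, 20, 30),
  Total_cost_ou_6m     = c(1, 2, 3),
  Total_cost_in_6m     = c(100, 200, 300)
)

Cost_summary <- function(data, timeX) {
  # Build the column names as ordinary character strings.
  avail <- paste0("data_availability_", timeX)
  costs <- paste0("Total_cost_", c("m", "ou", "in"), "_", timeX)

  data %>%
    filter(.data[[avail]] == 1) %>%   # look up a column by its string name
    select(all_of(costs)) %>%         # select columns named in a character vector
    mutate(Total_cost = rowSums(across(all_of(costs))))
}

result_6m <- Cost_summary(Split_data, "6m")
```

Calling Cost_summary(Split_data, "1yr") would work the same way, provided the matching "_1yr" columns exist. The {{ }} operator does not help here because it forwards a whole column passed as an argument; it cannot be glued onto part of a name.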

You will inevitably struggle to write elegant tidyverse code directly against data that is untidy.
The first goal should be to tidy your data; after that, using the tidyverse is as straightforward as can be.
You likely need to reshape your data with pivot_longer(), separating the timeX component of the variable names into its own variable.
12 Tidy data | R for Data Science (had.co.nz)
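
As a rough sketch of that reshaping, assuming the column names follow the pattern in the question and that each row has some identifier: the id column, the toy values, and the regular expression below are assumptions made for illustration, not taken from the real data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with two of the four periods; id is an assumed row identifier.
Split_data <- tibble(
  id = 1:3,
  data_availability_6m  = c(1, 0, 1),
  Total_cost_m_6m       = c(10, 20, 30),
  Total_cost_ou_6m      = c(1, 2, 3),
  Total_cost_in_6m      = c(100, 200, 300),
  data_availability_1yr = c(1, 1, 0),
  Total_cost_m_1yr      = c(11, 21, 31),
  Total_cost_ou_1yr     = c(2, 3, 4),
  Total_cost_in_1yr     = c(110, 210, 310)
)

# Split each column name into a measure (kept as a column via ".value")
# and a period suffix, which becomes a regular variable.
long_data <- Split_data %>%
  pivot_longer(
    cols = -id,
    names_to = c(".value", "period"),
    names_pattern = "(.*)_(6m|1yr|18m|2yr)$"
  )

# One pipeline now covers every period at once; no function is needed.
cost_summary <- long_data %>%
  filter(data_availability == 1) %>%
  mutate(Total_cost = Total_cost_m + Total_cost_ou + Total_cost_in)
```

With the data in this shape, a single period can be picked out with filter(period == "6m"), and per-period summaries can be computed with group_by(period).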

Thanks, that was my guess too. I will clean the data further by separating the variables and then re-run the code.

Sincerely,
Srini
