How to do calculated columns with user-defined functions?

I know I am missing something fundamental to R and dataframes but I cannot for the life of me figure out what it is. I have been working on this for 4 or 5 hours and have read hundreds of web pages but still don't understand what I'm doing wrong. If someone can put me out of my misery I would be eternally grateful.

I have data in several dataframes and, given the source of the data, some values are occasionally missing. I have another table that has values that should be "plugged in" for those missing values, and in some cases I can derive the missing value by looking at the value that came before it in that column (using the lag() function). The dataframes and functions are pretty complex, so I made up a silly example to show where my lack of knowledge is killing me.

library(dplyr)  # provides tibble(), %>%, mutate(), filter(), select()

df <- tibble(
  a = round(rnorm(10) * 100, 0),
  b = round(rnorm(10) * 100, 0),
  c = c(14, NA, 38, 19, 23, 38, NA, 29, 77, 98)
)

lookup_df <- tibble(
  key = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
  val = c(27, 13, 98, 12, 87, 66, 29, 34, 12, 98)
)

silly_func <- function(alpha, bravo, charlie = 5) {
  if (alpha > 50) {
    return(bravo + charlie)
  } else {
    # first character of alpha, used as the lookup key
    l_char <- substr(as.character(alpha), 1, 1)

    r <- as.numeric(lookup_df %>%
                      filter(key == as.numeric(l_char)) %>%
                      select(val))
    
    return(r)
  }
}

df <- df %>%
  mutate(c = ifelse(is.na(c),
                    silly_func(a, b),
                    c))

df

Not surprisingly, this gives an error when run:

Error in `mutate()`:
! Problem while computing `c = ifelse(is.na(c), silly_func(a, b), c)`.
Caused by error in `if (alpha > 50) ...`:
! the condition has length > 1
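
For what it's worth, the error comes from `if`, which expects a single TRUE or FALSE, whereas inside `mutate()` the argument `alpha` is the whole column `a`. Below is a minimal sketch of one vectorized alternative. It assumes that a negative `a` (whose first character is "-") should simply give NA. The names `silly_func_vec` and `result` and the small example `df` are made up for illustration, and `match()` is swapped in for the `filter()`/`select()` lookup.

```r
library(dplyr)

lookup_df <- tibble(
  key = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
  val = c(27, 13, 98, 12, 87, 66, 29, 34, 12, 98)
)

# Hypothetical vectorized rewrite: operates on whole columns at once.
silly_func_vec <- function(alpha, bravo, charlie = 5) {
  # First character of each value, converted back to a number.
  # A leading "-" cannot be converted and becomes NA (warning suppressed).
  l_num <- suppressWarnings(as.numeric(substr(as.character(alpha), 1, 1)))
  # match() returns the lookup row for each element, or NA when no key matches.
  looked_up <- lookup_df$val[match(l_num, lookup_df$key)]
  # if_else() chooses element by element instead of testing one condition.
  if_else(alpha > 50, bravo + charlie, looked_up)
}

# Small made-up data set (assumption, not the original random data).
df <- tibble(
  a = c(75, 12, -30, 51, 8),
  b = c(10, 20, 30, 40, 50),
  c = c(14, NA, NA, NA, 23)
)

result <- df %>%
  mutate(c = if_else(is.na(c), silly_func_vec(a, b), c))

result
```

With this example data the filled column `c` should come out as 14, 13, NA, 45, 23: row 2 uses the lookup for key 1, row 3 has no usable key, and row 4 takes `b + charlie`.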

I've tried iterating using for loops, purrr::map, pmap, map_df, etc. I've tried everything I can think of to get this to work and have come up short.
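
For comparison, here is a rough sketch of how the row-by-row route might look with `purrr::map2_dbl()`, which calls the function once per pair of values so that `if` only ever sees a single number. It is an illustration under assumptions, not necessarily what was tried above: the example `df` is made up, `pull()` replaces `select()` so the result is a plain number, and a missing lookup key is assumed to mean NA.

```r
library(dplyr)
library(purrr)

lookup_df <- tibble(
  key = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
  val = c(27, 13, 98, 12, 87, 66, 29, 34, 12, 98)
)

# Scalar function: expects ONE alpha and ONE bravo per call.
silly_func <- function(alpha, bravo, charlie = 5) {
  if (alpha > 50) {
    return(bravo + charlie)
  }
  l_char <- substr(as.character(alpha), 1, 1)
  r <- lookup_df %>%
    filter(key == suppressWarnings(as.numeric(l_char))) %>%
    pull(val)
  # No matching key (e.g. a negative alpha starting with "-") gives NA.
  if (length(r) == 0) NA_real_ else r[[1]]
}

# Small made-up data set (assumption, not the original random data).
df <- tibble(
  a = c(75, 12, -30, 51, 8),
  b = c(10, 20, 30, 40, 50),
  c = c(14, NA, NA, NA, 23)
)

# map2_dbl() walks a and b in parallel and returns one double per row.
result <- df %>%
  mutate(c = if_else(is.na(c), map2_dbl(a, b, silly_func), c))

result
```

Note that `map2_dbl()` still evaluates the function for every row, including rows where `c` is not missing; `if_else()` then keeps the existing value for those rows.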

Thank you in advance for any clarity you may be able to provide.

David

Before I get started on this, what is charlie in silly_func()?

Sorry about that. I've edited the post to include the charlie parameter in the function definition.

I posted a better example in a new topic: "Calling user defined function to calculate column values" (General, RStudio Community).
