Help with dplyr, ifelse and custom function

jb_3901 · September 9, 2020, 4:22am

I have a multi-part question about building on some functions I've been able to cobble together.

Here's my dataframe:

dat1 <- data.frame(
  stringsAsFactors = FALSE,
              var1 = c("know", "see", "know", "hear",  "hear", "see",  "see", "know"),
              var2 = c(1, 2, 3, 4, 5, 6, 7, 8),
  var3 = c(1, 1, 2, 2, 3, 3, 4, 4),
  subCat1 = c("0", "0", "0", "0", "0", "0", "0", "0"))

Here are the inputs I want to use in my functions:

myID <- c(1, 2, 4)
myCat <- c("remembering", "learning", "forgetting")

Part One
I was able to get the myFunA function to work:

library(dplyr)

myFunA <- function(df, x, y) {
  df %>%
    mutate(subCat1 = ifelse(var2 %in% x, y, subCat1))
}

dat2 <- dat1 %>%
  myFunA(myID, myCat)

I'm not exactly sure why it works, because this function doesn't include the argument for "False", but it does work.

But when I try to have the user input the colName for subCat1 ("z" in the function below) I get errors with everything I try that include variations of the following:

myFunB <- function (df, x, y, z) {
df %>%
mutate(z = ifelse(var2 %in% x, y, z)
}

datA <- dat1 %>% 
  myFunB(myID, myCat, subCat1)

I get a result, but not the one I want. The result I'm expecting is "dat2".

I don't understand why when z is added to the function, ifelse no longer works.
I don't know how to add a "do nothing" branch to the function.
And when I try to change the function to if without else, I get another error.

myFunB <- function (df, x, y, z) {
df %>%
mutate(z = if (var2 %in% x, y))
}

Part Two

I've also tried figuring out how to write variations of this function for multi conditions

myFunMulti <- function (df, x, y, z) {
df %>%
mutate(subCat1 = ifelse(var2 %in% x, && var3 %in% z), y, subCat1)
}

This is the result I'm expecting:


MultiExpected <- data.frame(
  stringsAsFactors = FALSE,
              var1 = c("know", "see", "know", "hear",  "hear", "see",  "see", "know"),
              var2 = c(1, 2, 3, 4, 5, 6, 7, 8),
  var3 = c(1, 1, 2, 2, 3, 3, 4, 4),
  subCat1 = c("remembering", "0", "learning", "0", "0", "0", "0", "forgetting"))

Thank you.

nirgrahamuk · September 9, 2020, 8:45am

The syntax for ifelse is ifelse(test, yes, no). I assume that by the argument for "False" you mean no. It is present in your example: it's subCat1, which comes after the second comma.
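
To see the three arguments at work outside of mutate, here is a small base R sketch using the values from the question. One thing worth noticing: ifelse recycles yes to the length of test and picks values by position, so the row with var2 == 4 receives "remembering" (the fourth recycled value), not "forgetting". The match() line is only an assumption about what might be wanted if each ID should be paired with its own category.

```r
var2    <- c(1, 2, 3, 4, 5, 6, 7, 8)
subCat1 <- rep("0", 8)
myID    <- c(1, 2, 4)
myCat   <- c("remembering", "learning", "forgetting")

# test is a logical vector, one element per row
test <- var2 %in% myID

# yes (myCat) is recycled to length 8 and selected by position where test is TRUE;
# no (subCat1) is used where test is FALSE
res <- ifelse(test, myCat, subCat1)
res

# Assumption: pair each ID with its own category by looking up its position in myID
res_paired <- ifelse(test, myCat[match(var2, myID)], subCat1)
res_paired
```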

Unfortunately, there is another typo that causes myFunB to error out: a missing closing bracket on the mutate.
The difference between myFunA and myFunB is that in myFunA you are passing existing objects to your function for it to use as they are, whereas in myFunB you also pass a symbol that represents the name of a variable in df. This is quite a different thing. The easiest way to make it work is to use the rlang library, which has a convenient double-curly-brace syntax ({{ }}) that makes simple cases relatively easy. Look at:


library(rlang)
myFunB <- function (df, x, y, z) {
  df %>%
    mutate(z = ifelse(var2 %in% x, y, {{z}}))
}

datA <- dat1 %>% 
  myFunB(myID, myCat, subCat1)

jb_3901:

I've also tried figuring out how to write variations of this function for multi conditions
myFunMulti <- function (df, x, y, z) {
df %>%
mutate(subCat1 = ifelse(var2 %in% x, && var3 %in% z), y, subCat1)
}

your brackets are again confused and you have an errant comma
fix:

datA <- dat1 %>% 
  myFunB(myID, myCat, subCat1)

myFunMulti <- function (df, x, y, z) {
  df %>%
    mutate(subCat1 = ifelse(var2 %in% x & var3 %in% {{z}}, y, {{z}}))
}

datX <- dat1 %>% 
  myFunMulti(myID, myCat, subCat1)
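
Note that with subCat1 passed as z, var3 %in% {{z}} compares var3 against the values of the subCat1 column (all "0"), so the combined condition is never TRUE for this data. A minimal sketch of the two-condition version, assuming z is instead meant to be a vector of allowed var3 values as in the original myFunMulti, could look like this. The vector myVar3 is a hypothetical input that does not appear in the question. The element-wise & is used because && is meant for single logical values, not for whole columns.

```r
library(dplyr)

dat1 <- data.frame(
  stringsAsFactors = FALSE,
  var1 = c("know", "see", "know", "hear", "hear", "see", "see", "know"),
  var2 = c(1, 2, 3, 4, 5, 6, 7, 8),
  var3 = c(1, 1, 2, 2, 3, 3, 4, 4),
  subCat1 = c("0", "0", "0", "0", "0", "0", "0", "0"))

# x: allowed var2 values, z: allowed var3 values, y: replacement values
myFunMulti <- function(df, x, y, z) {
  df %>%
    # a row is updated only when both conditions hold
    mutate(subCat1 = ifelse(var2 %in% x & var3 %in% z, y, subCat1))
}

myID   <- c(1, 2, 4)
myCat  <- c("remembering", "learning", "forgetting")
myVar3 <- c(1)  # hypothetical: the var3 values that should qualify

datX <- dat1 %>%
  myFunMulti(myID, myCat, myVar3)

datX$subCat1
```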

jb_3901 · September 9, 2020, 3:33pm

Thank you for your help with these, but I'm still not getting the results I'm expecting.

For datA, the correct information is being entered, but it's going into a new column "z", not subCat1. This gets me partway to where I want to go! Thank you!

And I don't think the multi conditions work for the 2nd problem.

datA <- data.frame(
  stringsAsFactors = FALSE,
              var1 = c("know","see","know","hear",
                       "hear","see","see","know"),
              var2 = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L),
              var3 = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L),
           subCat1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
                 z = c("remembering","learning",
                       "subCat1","remembering","subCat1","subCat1","subCat1",
                       "subCat1")
)



datX <- data.frame(
  stringsAsFactors = FALSE,
              var1 = c("know","see","know","hear",
                       "hear","see","see","know"),
              var2 = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L),
              var3 = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L),
           subCat1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)
)

nirgrahamuk · September 9, 2020, 4:28pm

library(rlang)
myFunB <- function (df, x, y, z) {
  z2 <- as.character(ensym(z))
  df %>%
    mutate(z = ifelse(var2 %in% x, y, z2))
}

but then I'm not sure what relevance having a subCat1 column in your input data was...

jb_3901 · September 9, 2020, 5:23pm

Thank you, but I still can't seem to get this to work.

x should = myID
y should = myCat
z should = subCat1

dat1 <- data.frame(
  stringsAsFactors = FALSE,
              var1 = c("know", "see", "know", "hear",  "hear", "see",  "see", "know"),
              var2 = c(1, 2, 3, 4, 5, 6, 7, 8),
  var3 = c(1, 1, 2, 2, 3, 3, 4, 4),
  subCat1 = c("0", "0", "0", "0", "0", "0", "0", "0"))

myID <- c(1, 2, 4)
myCat <- c("remembering", "learning", "forgetting")

df1 <- df %>% 
    myFunB(myID, myCat, subCat1)

I get this error:

Error in UseMethod("mutate_") :
  no applicable method for 'mutate_' applied to an object of class "function"

I tried running this again with mutate_ instead of mutate, but that didn't work either.

nirgrahamuk · September 9, 2020, 5:33pm

You are passing df to myFunB. df is a function in base R unless you override it with an object of your own. In the code we exchanged, df is used as a parameter name for a function, and so within that function it represents the data frame passed in, but this has no relation to anything outside the function. You probably meant to use dat1 %>% myFunB etc.
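
A quick way to check this in a fresh session, as a small sketch: unless you have assigned something to df yourself, the name resolves to stats::df (the density function of the F distribution), which is why mutate complains about an object of class "function".

```r
# In a fresh session, df is found in the stats package, not in your workspace
df_class <- class(df)
df_class
is.function(df)

# A data frame you created yourself has the class mutate expects
dat1 <- data.frame(var2 = c(1, 2, 3))
dat1_class <- class(dat1)
dat1_class
```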

jb_3901 · September 9, 2020, 6:04pm

Oh, yes, you're right. That was a typo on my part. So here's what I ran, but it still doesn't work.

dat1 <- data.frame(
  stringsAsFactors = FALSE,
              var1 = c("know", "see", "know", "hear",  "hear", "see",  "see", "know"),
              var2 = c(1, 2, 3, 4, 5, 6, 7, 8),
  var3 = c(1, 1, 2, 2, 3, 3, 4, 4),
  subCat1 = c("0", "0", "0", "0", "0", "0", "0", "0"))

myFunB <- function (df, x, y, z) {
  z2 <- as.character(ensym(z))
  df %>%
    mutate(z = ifelse(var2 %in% x, y, z2))
}

myID <- c(1, 2, 4)
myCat <- c("remembering", "learning", "forgetting")

dat2 <- dat1 %>% 
    myFunB(myID, myCat, subCat1)

result <- data.frame(
  stringsAsFactors = FALSE,
              var1 = c("know","see","know","hear",
                       "hear","see","see","know"),
              var2 = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L),
              var3 = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L),
           subCat1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
                 z = c("remembering","learning",
                       "subCat1","remembering","subCat1","subCat1","subCat1",
                       "subCat1")
)

This almost does what I want, but instead of the data being entered into the "subCat1" column it's being entered into a new column named "z" and subCat1 is being entered as the "no" part of the ifelse statement.

The ifelse statement works when the column name is hard-coded into the function (as in myFunA).
I want the user to be able to enter the column name rather than have it hard-coded. This function still hard-codes the name of the column as "z", and it overwrites all of the "no" values. I don't want any of the existing data to be overwritten (as in myFunA), but I do want the user to be able to enter the column name, so that I get this result:

expected <- data.frame(
  stringsAsFactors = FALSE,
              var1 = c("know","see","know","hear",
                       "hear","see","see","know"),
              var2 = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L),
              var3 = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L),
           subCat1 = c("remembering","learning","0",
                       "remembering","0","0","0","0")
)

I'm trying to wrap my brain around why the ifelse statement works with myFunA, but then behaves differently in myFunB.

nirgrahamuk · September 9, 2020, 6:17pm

Do you want this?


myFunB <- function (df, x, y, z) {
  df %>%
    mutate({{z}} := ifelse(var2 %in% x, y, {{z}}))
}
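
For reference, here is a self-contained run of that function on the data from the question. On the right-hand side, {{z}} refers to the column the caller named. On the left-hand side, it sets the name of the output column, and := is used there because R's syntax does not allow an expression such as {{z}} as an argument name before =. This sketch assumes a dplyr and rlang version recent enough to support {{ }}.

```r
library(dplyr)

dat1 <- data.frame(
  stringsAsFactors = FALSE,
  var1 = c("know", "see", "know", "hear", "hear", "see", "see", "know"),
  var2 = c(1, 2, 3, 4, 5, 6, 7, 8),
  var3 = c(1, 1, 2, 2, 3, 3, 4, 4),
  subCat1 = c("0", "0", "0", "0", "0", "0", "0", "0"))

myFunB <- function(df, x, y, z) {
  df %>%
    # left side: write into the column named by the caller
    # right side: keep that column's existing value where the test is FALSE
    mutate({{ z }} := ifelse(var2 %in% x, y, {{ z }}))
}

myID  <- c(1, 2, 4)
myCat <- c("remembering", "learning", "forgetting")

dat2 <- dat1 %>%
  myFunB(myID, myCat, subCat1)

dat2$subCat1
```

The column is passed unquoted (subCat1, not "subCat1"), and no new column named z is created.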

jb_3901 · September 9, 2020, 8:00pm

Yes! That is it exactly! Thank you so much!

I'm not quite sure what := does to make this work (but I can search for that!). This is exactly what I needed, and from here, I can figure out other ways to use this function. Many thanks!
