Could you please explain why case_when() executes the own() function even though the condition isn't met? I know that the output is correct (NA), but in some cases, when the condition isn't met, the function shouldn't be executed at all, to avoid errors. In the example below, "own!" is printed four times; my expectation was no prints.
case_when() evaluates every LHS and RHS expression first, then keeps values based on the conditions. So your print statement is executed once for each group of a during the RHS evaluation. This is how case_when() works.
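To make the eager evaluation visible without relying on printed output, here is a minimal sketch that counts calls instead. The `calls` counter is a hypothetical helper added only for illustration, and `own()` is a stand-in for the function from the question.

```r
library(dplyr)

# Hypothetical counter, used only to observe how often own() runs
calls <- 0

own <- function(x) {
  calls <<- calls + 1
  1
}

# The condition is never TRUE, yet the RHS own(1) is still evaluated
res <- case_when(
  1 == 2 ~ own(1),
  TRUE ~ NA_real_
)
```

After this runs, `res` is NA, but `calls` has been incremented, which shows that own() ran even though its condition was FALSE.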
Why do you need to print inside the own function?
If it is only for logging purposes, you can use message() and then suppressMessages() when you don't want the messages printed.
own <- function(x){
message("own!")
return(1)
}
library(magrittr)
dplyr::tibble(a = 1:4) %>%
dplyr::group_by(a) %>%
dplyr::mutate(x =
dplyr::case_when(
1 == 2 ~ suppressMessages(own(a)),
TRUE ~ NA_real_
)
)
#> # A tibble: 4 x 2
#> # Groups: a [4]
#> a x
#> <int> <dbl>
#> 1 1 NA
#> 2 2 NA
#> 3 3 NA
#> 4 4 NA
own <- function(x) {
print("own!")
return(1)
}
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
dplyr::tibble(a = 1:4) %>%
dplyr::group_by(a) %>%
mutate(x = if (1 == 2) own(a) else NA_real_)
#> # A tibble: 4 x 2
#> # Groups: a [4]
#> a x
#> <int> <dbl>
#> 1 1 NA
#> 2 2 NA
#> 3 3 NA
#> 4 4 NA
@mara and @cderv - I used print() only as an example to show that this part is executed every time; the real code is more complicated (it doesn't print anything, it just calculates). In my case, the mutate() with case_when() is part of a longer dplyr flow (with pipes), and I've been wondering whether there is a way to control the execution of a particular mutate(). My flow is not a single-case script; it has to handle a range of cases. One of them throws an error, which is fine in the sense that not all of the needed data is provided, so I wanted to skip that step by adding a case_when() condition. Now (thanks to @cderv) I know that the RHS is always evaluated even when the LHS condition is not met, so the RHS causes an error because own() runs in a case where it is not supposed to.
@hughparsonage This is indeed the solution I was looking for. I am not sure whether using if ... else within dplyr::mutate() is the "tidyverse way" of doing this?
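For reference, the difference between the two approaches can be sketched with a function that always fails. Here `own()` is a hypothetical stand-in that calls stop(), mimicking the case where the needed data is not provided; the names `eager` and `lazy` are only for illustration.

```r
library(dplyr)

# Hypothetical stand-in: fails whenever it is called
own <- function(x) stop("required data is missing")

df <- tibble(a = 1:4)

# case_when() evaluates the RHS even though 1 == 2 is FALSE,
# so the error from own() surfaces and is caught here
eager <- tryCatch(
  mutate(df, x = case_when(1 == 2 ~ own(a), TRUE ~ NA_real_)),
  error = function(e) "error"
)

# if ... else only evaluates the branch that is chosen,
# so own() is never called
lazy <- mutate(df, x = if (1 == 2) own(a) else NA_real_)
```

Note that this only works because the condition is a single TRUE/FALSE value; if ... else is not vectorised, so it cannot replace case_when() for row-wise conditions.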
In this case, I would recommend refactoring the step that needs to be controlled into a separate function. That way, your logic around controlling execution is clearly separated and the pipe chain can still be a more or less linear flow of steps.
I usually do something similar to the following when I run into this sort of setup.
library(tidyverse)
# Write a tidy, pipe-compliant function that
# takes and returns a `tibble` or `data.frame`
choose_forking_path <- function(df, condition) {
if (condition) {
mutate(df, x = "own")
} else {
mutate(df, x = NA_real_)
}
}
Then, if an external_condition (or function argument, etc) is set, this function handles the logic around the specifics of the call to mutate().
external_condition <- FALSE
tibble(a = 1:4) %>%
group_by(a) %>%
choose_forking_path(external_condition)
#> # A tibble: 4 x 2
#> # Groups: a [4]
#> a x
#> <int> <dbl>
#> 1 1 NA
#> 2 2 NA
#> 3 3 NA
#> 4 4 NA
external_condition <- TRUE
tibble(a = 1:4) %>%
group_by(a) %>%
choose_forking_path(external_condition)
#> # A tibble: 4 x 2
#> # Groups: a [4]
#> a x
#> <int> <chr>
#> 1 1 own
#> 2 2 own
#> 3 3 own
#> 4 4 own
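Applied to the original question, the same pattern might look like the sketch below. It assumes a hypothetical own() that doubles its input and a hypothetical wrapper name add_x(); the point is only that own() is called when the condition is TRUE and skipped otherwise.

```r
library(dplyr)

# Hypothetical stand-in for the real calculation
own <- function(x) x * 2

# Hypothetical wrapper: own() is only evaluated when condition is TRUE
add_x <- function(df, condition) {
  if (condition) {
    mutate(df, x = own(a))
  } else {
    mutate(df, x = NA_real_)
  }
}

skipped <- tibble(a = 1:4) %>% add_x(FALSE)
computed <- tibble(a = 1:4) %>% add_x(TRUE)
```

Because both branches return a double column here, x keeps the same type whichever path is taken, which tends to make later steps in the pipe easier to write.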
For completeness, you can even embed the if … else into your pipe chain, but I think this style gets messy and difficult to follow very quickly.
tibble(a = 1:4) %>%
group_by(a) %>%
{
if (external_condition) {
mutate(., x = "own")
} else {
mutate(., x = NA_real_)
}
}
#> # A tibble: 4 x 2
#> # Groups: a [4]
#> a x
#> <int> <chr>
#> 1 1 own
#> 2 2 own
#> 3 3 own
#> 4 4 own