Filtering across columns

dplyr

#1

Using some random data with 5 observations and columns x, y and z.
tibble(x = runif(5), y = runif(5), z = runif(5))

How would I compute a value row-wise and return it as a new column, e.g. the max()?
In other words, for each row, add a new column "k" holding the maximum of columns x, y and z.

Thank you!


#2

If I understand correctly, the base R function pmax (parallel maximum) should get the job done neatly:

library(tidyverse)
set.seed(123)
df <- tibble(x = runif(5), y = runif(5), z = runif(5))
  
df %>% mutate(k = pmax(x, y, z))
#> # A tibble: 5 x 4
#>       x      y     z     k
#>   <dbl>  <dbl> <dbl> <dbl>
#> 1 0.288 0.0456 0.957 0.957
#> 2 0.788 0.528  0.453 0.788
#> 3 0.409 0.892  0.678 0.892
#> 4 0.883 0.551  0.573 0.883
#> 5 0.940 0.457  0.103 0.940

Created on 2018-08-06 by the reprex package (v0.2.0).
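
A small follow-up sketch in base R, in case the data has missing values or more columns than you want to type out. The NA placed in column y is a made-up value added purely for illustration; pmax's na.rm argument and do.call are standard base R.

```r
# Base R only: pmax() is vectorised across its arguments.
set.seed(123)
df <- data.frame(x = runif(5), y = runif(5), z = runif(5))
df$y[2] <- NA  # hypothetical missing value, added for illustration

# Default: any NA in a row makes that row's result NA
k_default <- pmax(df$x, df$y, df$z)

# na.rm = TRUE ignores missing values within each row
k_narm <- pmax(df$x, df$y, df$z, na.rm = TRUE)

# do.call() passes every column as a separate argument,
# so the columns do not have to be listed by name
k_all <- do.call(pmax, c(df, na.rm = TRUE))
```

The same call should drop into mutate unchanged, e.g. mutate(k = pmax(x, y, z, na.rm = TRUE)).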


#3

Indeed it does. Thank you!


#4

For those who find this post while looking for a more general solution: you can use purrr::pmap from the tidyverse to apply any function row-wise to a set of columns. Some examples:

library(tidyverse)
set.seed(123)
df <- tibble(x = runif(5), y = runif(5), z = runif(5))

# with max
df %>%
  mutate(new_col = pmap_dbl(., max))
#> Warning: package 'bindrcpp' was built under R version 3.4.4
#> # A tibble: 5 x 4
#>       x      y     z new_col
#>   <dbl>  <dbl> <dbl>   <dbl>
#> 1 0.288 0.0456 0.957   0.957
#> 2 0.788 0.528  0.453   0.788
#> 3 0.409 0.892  0.678   0.892
#> 4 0.883 0.551  0.573   0.883
#> 5 0.940 0.457  0.103   0.940

# with a custom anonymous function
# (use ... to pass the whole row; .x alone is only the first column, x)
df %>%
  mutate(new_col = pmap_dbl(., ~ 1 + max(...)))
#> # A tibble: 5 x 4
#>       x      y     z new_col
#>   <dbl>  <dbl> <dbl>   <dbl>
#> 1 0.288 0.0456 0.957    1.96
#> 2 0.788 0.528  0.453    1.79
#> 3 0.409 0.892  0.678    1.89
#> 4 0.883 0.551  0.573    1.88
#> 5 0.940 0.457  0.103    1.94

# with a custom function whose argument names match the column names
my_function <- function(x, y, z) {
  x + 2*y + 3*z
}
df %>%
  mutate(new_col = pmap_dbl(., my_function))
#> # A tibble: 5 x 4
#>       x      y     z new_col
#>   <dbl>  <dbl> <dbl>   <dbl>
#> 1 0.288 0.0456 0.957    3.25
#> 2 0.788 0.528  0.453    3.20
#> 3 0.409 0.892  0.678    4.23
#> 4 0.883 0.551  0.573    3.70
#> 5 0.940 0.457  0.103    2.16

Created on 2018-08-06 by the reprex package (v0.2.0).
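
To make the argument passing a bit more concrete, here is a minimal sketch of how pmap hands each row to the function. It assumes only purrr is loaded and uses a plain data.frame; the object names first_only, row_max and weighted are just illustrative.

```r
library(purrr)

set.seed(123)
df <- data.frame(x = runif(5), y = runif(5), z = runif(5))

# pmap() calls the function once per row, passing each column's value
# as a separate named argument: f(x = x[i], y = y[i], z = z[i]).

# In a formula lambda, .x refers to the first argument only (column x here)
first_only <- pmap_dbl(df, ~ .x)

# `...` captures all arguments, so max(...) sees the whole row
row_max <- pmap_dbl(df, ~ max(...))

# Named function arguments are matched by column name, not by position,
# so reordering the columns does not change the result
weighted <- pmap_dbl(df[c("z", "y", "x")],
                     function(x, y, z) x + 2 * y + 3 * z)
```

If the data frame also holds columns the function should not see (an id or a character column, say), one option is to pass only the relevant ones, e.g. mutate(k = pmap_dbl(select(., x, y, z), max)).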

See the help page for pmap (?purrr::pmap) for more on how it works.