Difference between binary operators and functions inside `mutate()`

Maybe this is a rookie's question, but I just noticed the following behavior:

  1. When using the + (binary) operator to compute a total across columns, the operation is done row by row.
  2. That is not the case when you use a function, say the sum() function. In this case the function computes one grand total over all values in all the columns, and that same value is repeated on every row.

Please find below an example (reprex):

library(dplyr)

df <- tibble(x = 1:2, y = 3:4, z = 5:6)

df |> 
  mutate(total = sum(x,y,z))
# A tibble: 2 × 4
      x     y     z  total
  <int> <int> <int> <int>
1     1     3     5    21
2     2     4     6    21

df |>
  mutate(total = x + y + z)
# A tibble: 2 × 4
      x     y     z  total
  <int> <int> <int> <int>
1     1     3     5     9
2     2     4     6    12

See how the column total is different in each case?

My question is: why is this the case? What should I know about the R programming language that I am missing? Was this designed on purpose?

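This is because of the underlying functions: `+` is vectorised (it works element by element and returns one result per position), whereas sum() is a summary function that collapses all of its arguments into a single value. mutate() then recycles that single value to every row.
You can see that without dplyr: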


x <- 1:2
y <- 3:4

x + y      # vectorised: one result for each entry along x (and y)
# 4 6
sum(x, y)  # not vectorised: a single result across all values
# 10
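If you do want a function-based total per row, here is a minimal sketch of two common approaches. It assumes dplyr 1.0.0 or later (for across() and c_across()) and R 4.1 or later (for the native pipe). The object names grand, by_rowsums and by_rowwise are only illustrative.

```r
library(dplyr)

df <- tibble(x = 1:2, y = 3:4, z = 5:6)

# sum() collapses all values to one number (21);
# mutate() recycles that length-1 result to every row.
grand <- df |>
  mutate(total = sum(x, y, z))

# Option 1: rowSums() is a summary that works per row of a data frame.
by_rowsums <- df |>
  mutate(total = rowSums(across(x:z)))

# Option 2: rowwise() treats each row as its own group,
# so sum() only ever sees the values of one row at a time.
by_rowwise <- df |>
  rowwise() |>
  mutate(total = sum(c_across(x:z))) |>
  ungroup()
```

Both options should give totals of 9 and 12, matching x + y + z. rowSums() is usually the faster of the two on large data, while rowwise() works with any summary function, not just sums.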

I cannot believe I never thought of that. Thank you very much!!! :star_struck:
