Why was % chosen as the operator delimiter?


#1

In the same vein as this question regarding head, why was the percent sign % chosen as the delimiter for user-defined operators? And while we’re at it, why was a delimiter required at all?


#2

Does R even support Operator Overloading? I’m guessing it doesn’t or there probably wouldn’t be the need for a delimiter. I’m curious about this answer as well.

EDIT: Looking into it, R does seem to support operator overloading, but it’s considered bad taste to overload predefined operators (+, >, <, =, etc.). Still, the question remains: why was % chosen?
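For anyone who wants to see what that overloading looks like, here is a minimal sketch using S3 dispatch. The class name money and its constructor are made up for illustration; the point is only that `+` can get a class-specific method without touching its behaviour for ordinary numbers.

```r
# Hypothetical S3 class "money"; the name and constructor are illustrative only.
money <- function(amount) structure(list(amount = amount), class = "money")

# S3 method picked up by the built-in `+` when its operands have class "money".
`+.money` <- function(e1, e2) money(e1$amount + e2$amount)

total <- money(5) + money(7)
total$amount
# [1] 12

# Plain numbers still use the default `+`.
1 + 2
# [1] 3
```

Because the method is attached to a class rather than replacing `+` globally, this is the less risky way to "overload" a predefined operator.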


#3

I believe := is an example of overloading exploited by data.table and subsequently rlang.
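As far as I can tell, this works because the parser already accepts := as an infix operator even though base R gives it no definition, so a package is free to supply one. The sketch below assumes that behaviour; the function body is a made-up illustration, not what data.table or rlang actually do.

```r
# The parser turns `a := 1` into a call to `:=` without any package loaded.
expr <- quote(a := 1)
is.call(expr)
# [1] TRUE
identical(expr[[1]], as.name(":="))
# [1] TRUE

# Hypothetical definition, for illustration only: capture the name on the
# left-hand side unevaluated and pair it with the value on the right.
`:=` <- function(lhs, rhs) {
  list(name = deparse(substitute(lhs)), value = rhs)
}

pair <- (score := 42)
pair$name
# [1] "score"
pair$value
# [1] 42
```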


#4

No idea on why %, but addressing your last question:

Operators are treated in a special way when parsing (official docs), so R probably wants to be sure it knows what’s an operator. Remember, R allows a symbol to be shared by a function and a non-function:

"%in%" <- 1
`%in%`
# [1] 1
`%in%`(2, 1:4)
# [1] TRUE

Now imagine no delimiters were required. And let’s say package foo exports an operator named bar, which is basically just an identity check:

bar <- function(x, y) identical(x, y)

Consider this code:

library(foo)
bar <- NULL
bar bar NULL

Should it check if the newly-defined bar is NULL (which it is)? Or should it check if the function foo::bar is NULL (which it isn’t)? And this is just with single-symbol expressions on either side. Imagine trying to parse this:

bar bar bar NULL
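For contrast, here is the same idea written with the % delimiters, which is valid R today. This is just a sketch reusing the hypothetical bar from above (defined locally rather than exported from a package foo); the delimiters let the parser tell the operator apart from the variable of the same name.

```r
# Hypothetical operator: the same identity check, spelled with % delimiters.
`%bar%` <- function(x, y) identical(x, y)

# A non-function variable that shares the name "bar".
bar <- NULL

# Unambiguous: the variable bar, the operator %bar%, and NULL.
bar %bar% NULL
# [1] TRUE

# Comparing the operator function itself needs backticks, so there is still
# no confusion about which token is the operator.
`%bar%` %bar% NULL
# [1] FALSE
```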

#5

From @hadley’s Advanced R:

It is possible to override the definitions of these special functions [+, for, [, etc.], but this is almost certainly a bad idea. However, there are occasions when it might be useful: it allows you to do something that would have otherwise been impossible. For example, this feature makes it possible for the dplyr package to translate R expressions into SQL expressions. Domain specific languages uses this idea to create domain specific languages that allow you to concisely express new concepts using existing R constructs.

http://adv-r.had.co.nz/Functions.html


#6

“Old” S had %% and %/. “New” S had %%, %/% and %*%. When additional infix operators were needed, I suspect it seemed like a natural extension to allow anything between the two % signs.


#7

You can see that the parser must’ve supported it early in the evolution of R with the commit that implemented %in%: https://github.com/wch/r-source/commit/5f581abd52b10e3b1ac6d3de3bcbe5853fbd6e00


#8

Maybe this is just a reflection of the languages I’ve been in contact with, but operator overloading is something I tend to associate with object-oriented languages. I mean, R has OO and class systems, but I’m not sure that those systems are really tied into type safety at the parser level the way that, say, C++'s or even Python’s classes are.

JavaScript is an interesting comparison. It goes even further than R by not having a class system at all, and it doesn’t allow operator overloading. I couldn’t imagine operator overloading even making sense in that language.