R wants a scalar type. Or does it?

Like most R users, I try to write code that is both vectorized and follows functional-programming paradigms. But in plenty of cases we are simply left with various scalar values (e.g. flags, inputs in Shiny applications, and plenty of data returned from external sources).

As my R programs have become more 'mission-critical' at work -- specifically because I've advocated for using R in production -- I find that I now write A LOT of vector-length checking code, because bugs along a code path can quite silently turn my length-1 vector into a length-n vector. A common pattern, for example, is wrapping a comparison in isTRUE() (or isFALSE(), since its introduction in base R):

if (isTRUE(my_scalar == 1)) { do_something() }
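To make the guard concrete, here is a small sketch of how isTRUE() responds when the "scalar" is not actually a scalar. The values are made up for illustration; the point is that anything other than a single, non-missing TRUE collapses to FALSE, so the if() never sees a length-n or NA condition (which recent R versions reject with an error and older ones only warn about).

```r
# Illustrative inputs only: each comparison stands in for `my_scalar == 1`
# after some upstream bug has changed what `my_scalar` holds.
checks <- c(
  scalar_match = isTRUE(1 == 1),            # length-1 TRUE
  too_long     = isTRUE(c(1, 2) == 1),      # comparison result has length 2
  missing      = isTRUE(NA_real_ == 1),     # comparison result is NA
  empty        = isTRUE(numeric(0) == 1)    # comparison result is logical(0)
)
print(checks)
```

Only the first entry is TRUE. The trade-off is that the three failure cases are indistinguishable from an ordinary FALSE, so the bug is contained rather than reported.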

Another pattern I've found myself using pretty often is using the length-1 vector selector:

some_function(my_scalar[1])  ## use [1] because some_function() doesn't do length-validation

(Situation-dependent, of course, but [1] is often preferable to [[1]] for atomic vectors because it (probably) 'does the right thing' for length-0 vectors: it returns a single typed NA value, whereas [[1]] throws an error.)
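A quick sketch of that difference, using an empty character vector as a stand-in for a value that failed to arrive from an external source (the variable names are illustrative):

```r
empty <- character(0)

# `[1]` on a length-0 vector yields one NA of the vector's own type.
first <- empty[1]
first_is_typed_na <- is.na(first) && identical(typeof(first), "character")

# `[[1]]` on the same vector is an error, captured here so the script continues.
double_bracket <- tryCatch(
  empty[[1]],
  error = function(e) "error"
)

# `[1]` also silently drops everything after the first element.
truncated <- c("a", "b", "c")[1]

print(first_is_typed_na)
print(double_bracket)
print(truncated)
```

Whether a quiet NA or a loud error is the better outcome depends on the caller; the silent truncation in the last case is the part most likely to hide a bug.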

All of these patterns increase specificity in code, which is great. And to a seasoned R programmer reading code, perhaps they provide useful clues. But for most programmers, they decrease readability by leaking abstraction implementation details. Likewise, too much inclusion of type-checking assertions at the 'high-level' of a useful script/program detracts from understanding the overall architecture of code.

In my own code, I've begun to create my own type-checking (at runtime, of course) classes (sometimes S3, sometimes R6), but this is way too large a barrier for introductory programmers, or even seasoned programmers in other languages making the hop over to R.
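For readers wondering what such a runtime check might look like in its simplest form, here is a minimal base-R sketch. It is a plain function rather than the S3 or R6 classes mentioned above, and `as_scalar()` with its `type` and `allow_na` arguments is a hypothetical helper invented for this illustration, not something from my codebase or from any package.

```r
# Hypothetical helper: returns `x` unchanged if it is a length-1 atomic
# vector (optionally of a given typeof() and non-missing), otherwise stops.
as_scalar <- function(x, type = NULL, allow_na = FALSE) {
  name <- deparse(substitute(x))[1]
  if (!is.atomic(x) || length(x) != 1L) {
    stop(sprintf("`%s` must be a length-1 atomic vector (got length %d)",
                 name, length(x)), call. = FALSE)
  }
  if (!is.null(type) && !identical(typeof(x), type)) {
    stop(sprintf("`%s` must be of type '%s', not '%s'",
                 name, type, typeof(x)), call. = FALSE)
  }
  if (!allow_na && is.na(x)) {
    stop(sprintf("`%s` must not be NA", name), call. = FALSE)
  }
  x
}

# Usage: validate once at the boundary, then use the value freely.
threshold <- as_scalar(0.5, type = "double")

# A length-2 input fails loudly instead of flowing downstream.
bad_length <- tryCatch(as_scalar(c(1, 2)), error = function(e) conditionMessage(e))
bad_type   <- tryCatch(as_scalar("1", type = "double"), error = function(e) conditionMessage(e))
print(bad_length)
print(bad_type)
```

Even this small version shows the cost: every call site still needs an explicit wrapper, which is exactly the clutter I'd like to avoid.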

I'm curious to hear about what other patterns/toolkits R programmers have used to help fight overly-abundant type-checking/forcing in their programs.

Tangentially related: I've been increasingly using TypeScript for my various JavaScript projects, and now I find myself really really really wanting something like TS for R, i.e. a strongly-typed language extension that quickly transpiles to 'pure' R. Ideally this would avoid littering code with assertions/casting everywhere and also offer a performance benefit, as this checking could be done before code execution (though perhaps also at run-time, e.g. in a non-'strict' mode).
