I'm unclear how R works "under the hood" and that makes me wonder if I really should write some of the code that I can write.
For example, take the following. The intention is to use a dataframe as the source of the data for a processing chain if it exists, and otherwise to load the data from a database. For this, I've written:
But does this build the processing chain efficiently, or am I introducing inefficiencies by doing this?
Also, there seems to be no way of closing the database connection with this approach. That gives me some pause, though it doesn't seem to have caused any issues (yet).
Why can't you close it after the mutate(ToTpf = as.logical(ToTpf)) step?
I would go with naming the function, because it is big enough that it's not easy to read. So an explicit name makes reading the whole code a lot easier.
In terms of performance, I don't see how this function would be less performant than other approaches. But note that execution speed is always hard to predict just by reading the code. The best approach is to try the easiest way first, then see whether you have performance issues: if it's fast enough, there is no need to do more work; if it isn't, use profiling to find out which part is too slow.
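To make "try it, then measure" concrete, here is a minimal base R sketch. `slow_step` is a made-up placeholder for whatever stage of the chain you suspect; only `system.time()` is the point here.

```r
# Hypothetical stand-in for one stage of the processing chain.
slow_step <- function(n) {
  x <- numeric(n)
  for (i in seq_len(n)) x[i] <- sqrt(i)
  sum(x)
}

# system.time() evaluates the expression and reports user, system and elapsed seconds.
timing <- system.time(result <- slow_step(1e5))
print(timing[["elapsed"]])
```

If the elapsed time turns out to matter, `Rprof()` and `summaryRprof()` from the utils package can then break it down by function.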
Finally, I don't think you actually need a function: you could just put the if/else directly in the pipe. For example, this code works as expected:
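As a minimal sketch of the idea (not necessarily the exact snippet): `load_from_db` and `cached_df` are hypothetical names, the database read is faked so it runs anywhere, and base R's `transform()` stands in for dplyr's `mutate()` to avoid extra packages. It assumes R 4.1 or later for the native pipe.

```r
# Hypothetical stand-in for the real database query.
load_from_db <- function() {
  data.frame(id = 1:3, ToTpf = c("TRUE", "FALSE", "TRUE"))
}

# No data frame called cached_df exists yet, so the else branch is used.
# The braces matter: without them the pipe would attach to the else branch only.
result <- {
  if (exists("cached_df")) cached_df else load_from_db()
} |>
  transform(ToTpf = as.logical(ToTpf))

# Once the data frame exists, the same chain uses it and skips the load.
cached_df <- data.frame(id = 10L, ToTpf = "FALSE")
result_cached <- {
  if (exists("cached_df")) cached_df else load_from_db()
} |>
  transform(ToTpf = as.logical(ToTpf))
```

Both branches just hand a data frame to the next step, so the rest of the chain does not care where the data came from.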
That last one is especially powerful if you need to recover different datasets, change the db name, etc., as you can add all of that as parameters. But even if you stay with the simple case, just by reading the code you know in a second what that line is for, rather than having to read the whole if/else block.
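A named, parameterised loader along those lines might look like the sketch below. Everything specific in it is an assumption for illustration: the name `get_source_data`, its arguments, the SQLite backend (via the DBI and RSQLite packages) and the `flags` table. It also shows one way to deal with the open connection: `on.exit()` closes it when the function returns, even on error.

```r
library(DBI)

# Hypothetical helper: return `df` if one is supplied, otherwise read `table`
# from the SQLite file at `db_path`.
get_source_data <- function(df = NULL, db_path, table) {
  if (!is.null(df)) return(df)
  con <- dbConnect(RSQLite::SQLite(), db_path)
  on.exit(dbDisconnect(con), add = TRUE)  # always close the connection
  dbReadTable(con, table)                 # eager read: data is in memory before closing
}

# Demo setup: a throwaway database with one small table.
db_path <- tempfile(fileext = ".sqlite")
con <- dbConnect(RSQLite::SQLite(), db_path)
dbWriteTable(con, "flags", data.frame(id = 1:2, ToTpf = c(1L, 0L)))
dbDisconnect(con)

# The pipeline reads the same whichever source is used.
from_db <- get_source_data(db_path = db_path, table = "flags") |>
  transform(ToTpf = as.logical(ToTpf))
from_memory <- get_source_data(df = data.frame(id = 9L, ToTpf = TRUE))
```

One caveat: this closes the connection safely only because `dbReadTable()` pulls the data into memory. With a lazy dbplyr table you would need to `collect()` before disconnecting.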