Local imports, roxygen, and NAMESPACE

KenWilliams · November 29, 2017, 4:19pm

I've been thinking about managing imports when developing a package. I use roxygen2 to manage my imports, and I like to keep the import declarations close to the places where the functions/methods are used, so my code often ends up looking like this:

#' @importFrom PackageA func1 func2
foo <- function(...) {
  ...
}

#' @importFrom PackageA func1 func3
bar <- function(...) {
  ...
}

Of course, the scope of the import isn't actually limited to the functions foo and bar; the functions are imported globally for the package I'm developing. So it's sort of a false sense of carefulness, and things can easily get out of sync.
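To illustrate, assuming roxygen2's usual output format, the two @importFrom tags above would be merged into one package-wide set of directives in the generated NAMESPACE file, with nothing tying them back to foo or bar (PackageA and its functions are the placeholder names from the example above):

```r
# Generated by roxygen2: do not edit by hand

importFrom(PackageA,func1)
importFrom(PackageA,func2)
importFrom(PackageA,func3)
```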

One solution is to simply use double-colon syntax everywhere instead of importing, e.g. always use PackageA::func1, etc. IMO that's kind of an ugly solution, since it goes all the way to the other extreme of importing locally to a single function call.

A great compromise would be what most other languages do - provide a local/lexical import mechanism so that the import is scoped to the code that actually needs it. We can fake it by doing something like:

foo <- function(...) {
  func1 <- PackageA::func1
  func2 <- PackageA::func2
  ...
}

bar <- function(...) {
  func1 <- PackageA::func1
  func3 <- PackageA::func3
  ...
}

but of course that doesn't use the namespace import mechanism; it just binds the functions to new local variables every time foo or bar is called, and it's also a little ugly IMO.
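A possible variation on the workaround above, sketched with tools::file_ext standing in for PackageA::func1: wrapping the function definition in local() creates the binding once, in the closure's enclosing environment, rather than on every call. One assumption to flag: in a package, top-level code like this runs when the package is installed, so the copy would be frozen at that point.

```r
# Bind the dependency once, in an environment private to foo.
foo <- local({
  file_ext <- tools::file_ext   # stand-in for PackageA::func1
  function(path) {
    file_ext(path)              # resolved via foo's enclosing environment
  }
})

foo("report.csv")
# The binding lives in foo's enclosure, not in the calling environment.
ls(environment(foo))
```

This is still not a namespace import, so the NAMESPACE file records nothing about the dependency.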

Has anyone thought about this problem more than I have and come up with a solution that plays nice with roxygen2 and the NAMESPACE file? Have there been any rumblings in the R-core community about creating localized imports as part of the language?

KenWilliams · November 29, 2017, 4:24pm

I should add - a really common desire would be to limit an import to file scope, not necessarily just function scope. AFAIK R doesn't have the concept of a file scope (or file lexical scope) at all, right?

Gabor · November 30, 2017, 2:01am

You might want to check this out: https://github.com/smbache/import

As for file scope, there is no such thing. When a package is installed, all R code of the package is lumped together in a single file, executed, and the result is stored in the installed package.
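A rough simulation of that point (not what the installer literally does): sourcing two separate files into one shared environment shows that a function defined in one file can see a helper defined in the other. The environment and file names here are made up for the illustration.

```r
# One shared environment standing in for the package namespace.
pkg_env <- new.env()

# Two separate source files, as they might sit in a package's R/ directory.
file_a <- tempfile(fileext = ".R")
file_b <- tempfile(fileext = ".R")
writeLines("helper <- function() 'defined in file A'", file_a)
writeLines("caller <- function() helper()", file_b)

# Evaluate both files into the same environment.
for (f in c(file_a, file_b)) {
  sys.source(f, envir = pkg_env)
}

# caller() came from file B but finds helper() from file A.
pkg_env$caller()
```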

nwerth · November 30, 2017, 4:33pm

I didn't write Roxygen, but here's my take:

The value of listing each object's dependencies with @import and @importFrom in the comments isn't just to make sure the required objects are loaded with your package. It also means the NAMESPACE file can be totally automated. If you remove a dependency from a function, you just remove the @import line. With Roxygen, you won't have to check everything else in your package before editing the NAMESPACE file.

You're right, in that you still have to be careful about accidentally calling objects loaded through dependencies elsewhere in your code.

I don't think the double-colon syntax is too ugly (except with long, unwieldy package names). I do this a lot for two reasons:

Avoiding namespace conflicts, which is the intended purpose you mention
Acting as inline documentation for which package the object came from. For example, there are a lot of spatial packages that work together, so it's hard to keep track of which package has which function.

No importing is done by using pkg::object. From the official documentation:

If a package only needs a few objects from another package it can use a fully qualified variable reference in the code instead of a formal import. A fully qualified reference to the function f in package foo is of the form foo::f. This is slightly less efficient than a formal import and also loses the advantage of recording all dependencies in the NAMESPACE file (but they still need to be recorded in the DESCRIPTION file). Evaluating foo::f will cause package foo to be loaded, but not attached, if it was not loaded already—this can be an advantage in delaying the loading of a rarely used package.

So relying on foo::f has its uses, but it's not really what you'd like.
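The "loaded, but not attached" part of that passage can be checked in a session. A small sketch, using tools::file_ext as an arbitrary example:

```r
# A fully qualified call: no import and no library() call needed.
ext <- tools::file_ext("notes.txt")
ext

# The namespace is now loaded ...
isNamespaceLoaded("tools")       # TRUE

# ... but the package is not on the search path unless it was attached
# separately, so a bare file_ext() call would still fail.
"package:tools" %in% search()
```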

One "solution" I can think of is working with namespace objects. I made a package named testing which only has the following file in R:

#' Use colon-notation to call unit() in the grid package
#' @export
colon_call <- function() {
  grid::unit(1, "mm")
}


#' Assume unit() is in the namespace
#' @export
assume_available <- function() {
  unit(2, "mm")
}


#' Crazy attempts
#' @export
using_with <- function() {
  grid <- asNamespace("grid")
  with(grid, {
    unit(3, "mm")
  })
}

I let Roxygen handle the NAMESPACE file. After doing a build-reload in RStudio:

library(testing)

colon_call()
# [1] 1mm
assume_available()
# Error in unit(2, "mm") : could not find function "unit"
using_with()
# [1] 3mm

Those are the results, no matter the order of function calls.
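One caveat with the with() approach, as far as I can tell: when with() is handed an environment, the expression is evaluated inside that environment, so the calling function's own arguments and local variables are not visible. Handing it a list of objects instead keeps the calling frame as the enclosure. A sketch using the tools package; the function names and my_path are made up for the demonstration:

```r
# Evaluates inside the tools namespace itself: the argument is not visible.
with_namespace <- function(my_path) {
  with(asNamespace("tools"), file_ext(my_path))
}

# Evaluates in a list of selected objects, enclosed by the calling frame.
with_list <- function(my_path) {
  with(mget("file_ext", envir = getNamespace("tools")), file_ext(my_path))
}

# Capture the error message instead of stopping the script.
res_ns <- tryCatch(with_namespace("notes.txt"),
                   error = function(e) conditionMessage(e))
res_list <- with_list("notes.txt")

res_ns    # an "object not found" error mentioning my_path
res_list
```

The using_with() example above avoids the problem only because it doesn't refer to any local variables.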

KenWilliams · November 30, 2017, 5:57pm

Totally agree. I know some people don't like their NAMESPACE or DESCRIPTION to be touched by automated hands, but I find it very valuable.

True, it's not technically an import (i.e. the namespace hoops behind the scenes are a little different), but the practical effect in terms of code management is very much the same as if there were a temporary import mechanism local to a single lexical invocation. IOW, if I wanted such a mechanism, I couldn't imagine a better syntax for it.

That's a nice proof of concept for localizing the scope, but of course it has some drawbacks: to do this with several packages you'd have to nest several with calls, and the asNamespace function is documented as one of the "internal namespace support functions. Not intended to be called directly."

We could use the getNamespace function instead, though, and it looks like it's also relatively simple to (pseudo-)import only a subset of the items in the namespace:

with(mget(c('unit'), getNamespace('grid')),
     unit.c(unit(3, "mm")))
# Error in unit.c(unit(3, "mm")) : could not find function "unit.c"
with(mget(c('unit', 'unit.c'), getNamespace('grid')),
     unit.c(unit(3, "mm")))
# [1] 3mm

So perhaps this, or something like it, could provide a basis for some code that helps with import management. For multiple packages imported, the lists (results of mget) could even be concatenated, to avoid multiple with calls.
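A minimal sketch of that concatenation idea, picking one function each from tools and grid (an arbitrary choice for illustration). Name clashes between packages are not handled here and would need some thought.

```r
# Pick selected objects from two namespaces and merge them into one list.
imports <- c(
  mget("file_ext", envir = getNamespace("tools")),
  mget("unit", envir = getNamespace("grid"))
)
names(imports)

# A single with() call now sees objects from both packages.
len <- with(imports, unit(3, file_ext("length.mm")))
len
```

Note that mget on a namespace environment can also fetch non-exported objects, so this is looser than a formal import.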

The drawback would still be that RStudio can't easily get the introspection right, since the lookup happens at runtime instead of in a static NAMESPACE file, so it can't tell which names are undefined and warn you with little squiggly lines.

nwerth · November 30, 2017, 9:49pm

Just for fun, here's a way to locally "attach" a namespace:

#' Load objects from a namespace into an environment
#' @param package String naming the package/namespace to load
#' @param object_names Character vector naming the objects in the package to
#'   load. If NULL (default), all objects exported by the package are loaded.
#' @param envir Environment to load the objects to. Defaults to the current
#'   environment.
#' @return Invisibly, the altered environment
scope_namespace <- function(package,
                            object_names = NULL,
                            envir = parent.frame()) {
  ns <- getNamespace(package)
  if (is.null(object_names)) {
    object_names <- getNamespaceExports(ns)
  }
  for (obj in object_names) {
    assign(x = obj, value = getExportedValue(ns, obj), envir = envir)
  }
  invisible(envir)
}

And a demonstration:

secretive_fun <- function() {
  scope_namespace("tools")
  file_ext("my.txt")
}


print(.packages())
# [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
# [7] "base"

secretive_fun()
# [1] "txt"

print(.packages())
# [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
# [7] "base"

file_ext("your.txt")
# Error in file_ext("your.txt") : could not find function "file_ext"

I'm sure it's far from efficient, but it's a step up in ease-of-use from nested with calls.
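For completeness, a sketch of the object_names argument in use, which brings in only the named objects rather than every export. The function is repeated in compact form so the example runs on its own; picky_fun is a made-up name.

```r
# Compact copy of scope_namespace() from above.
scope_namespace <- function(package,
                            object_names = NULL,
                            envir = parent.frame()) {
  ns <- getNamespace(package)
  if (is.null(object_names)) {
    object_names <- getNamespaceExports(ns)
  }
  for (obj in object_names) {
    assign(x = obj, value = getExportedValue(ns, obj), envir = envir)
  }
  invisible(envir)
}

picky_fun <- function(path) {
  # Bring in only file_ext, not the rest of the tools exports.
  scope_namespace("tools", object_names = "file_ext")
  list(
    ext = file_ext(path),
    # Another tools export was not copied into this function's frame.
    other_visible = exists("file_path_sans_ext",
                           envir = environment(), inherits = FALSE)
  )
}

res <- picky_fun("my.txt")
res$ext
res$other_visible
```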