Metapackage structure -- best practices

First post, so I hope this respects the local norms.

We have a metapackage that works like tidyverse or statnet: its only purpose is to make it easy to install and load a set of packages (for network analysis) that share a common data structure and API. We've been struggling a bit over the years with how to structure it with Depends, Imports, and Suggests -- and I turned to the tidyverse package for guidance.

I found myself perplexed by the way the packages tidyverse refers to are included (or not) in three places: NAMESPACE, DESCRIPTION, and attach.R. If you recall, tidyverse lists all of its dependencies under Imports (it Depends only on R), identifies a subset of the tidyverse pkgs as "core", and attaches these.

Schematically the allocations are:

I'm assuming the orange pkgs (imported in DESCRIPTION only) are used by the tidyverse package itself; they aren't intended for the end user.

The pkgs (in gray) are all listed in DESCRIPTION under Imports, to make sure they are installed, and I'm assuming the token importFrom in NAMESPACE is just there to avoid an R CMD check warning, is that right? And the goal here is to allow the user to access functions in these pkgs using the pkg:: syntax?

The core pkgs (in green and blue) are not all handled the same way, and this I'd like some help to understand. All are listed under Imports in DESCRIPTION (makes sense). But some also have an importFrom in NAMESPACE, while others do not. Can anyone explain the reason for this difference?

Many thanks in advance


The orange packages are a mix of packages that the tidyverse meta-package is designed to install and some packages that are used internally. In fact, I think reprex may be the only "orange" package that is there purely for the purpose of getting it installed.

Yes, some individual function must (appear to be) used in order to justify the inclusion of each package in Imports. It is true that the user could access functions in these packages using pkg::function() syntax, but it might be more common for the user to call library(pkg) and then function(). The gray packages are here to help get them installed, but the user will still need to attach them via library() or use pkg::function().

I think the core packages that don't have a specific function imported via an @importFrom directive all have at least one instance of direct usage within the tidyverse package itself, of the form corepackage::function(), which justifies their presence.
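To make the two patterns concrete, here is a minimal sketch of how each one might look in a metapackage's R source. It is not tidyverse's actual code: the file name, the metapackage name "netverse", and the function median_degree() are hypothetical, and the base packages utils and stats stand in for member packages so the file can be sourced as written.

```r
# R/netverse.R -- "netverse" is a hypothetical metapackage name; utils and
# stats are stand-ins for member packages listed under Imports in DESCRIPTION.

# Pattern A: token import. When roxygen2 processes this tag it writes
# `importFrom(utils, head)` into NAMESPACE, which counts as a use of utils.
#' @importFrom utils head
NULL

# Pattern B: direct use via `::`. Nothing is written to NAMESPACE; the
# call in the package's own code is what justifies the Imports entry.
median_degree <- function(degrees) {
  stats::median(degrees)
}

median_degree(c(1, 3, 3, 7))
```

Either pattern gives R CMD check evidence that the imported package is used; which one applies to a given member package just depends on whether the metapackage happens to call one of its functions internally.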


Thanks a lot Jenny. That's very clear.

A couple of additional questions then (I'm not sure how to quote your response the way you quoted mine above).

  • Both orange and gray packages are imported in DESCRIPTION to ensure installs, but only the gray have the token importFrom in NAMESPACE. If the user uses the library(pkg) function to load one of the gray packages, all of its exported functions will be added to the NAMESPACE, right? So what is the use case that requires the importFrom?

  • There's a comment at the top of the NAMESPACE file, "# Generated by roxygen2: do not edit by hand", so all of these "decisions" are automated elsewhere (in tidyverse.R I think) by the documentation process. I took a look at the documentation for roxygen2, which led me to this line:

If you are using just a few functions from another package, the recommended option is to note the package name in the Imports: field of the DESCRIPTION file and call the function(s) explicitly using ::, e.g., pkg::fun(). Alternatively, though no longer recommended due to its poorer readability, use @importFrom, e.g., @importFrom pkg fun, and call the function(s) without ::.

Since we're aiming to revamp our metapackage using "best current practices", and we would like to attach our core functions the way tidyverse does, do you think we should avoid the importFrom mechanism? Or perhaps the better question is, under what conditions would we still need to use the importFrom mechanism?
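For the attaching part of this question, one possible shape is an .onAttach hook that calls library() on each core package. The sketch below is only an illustration under stated assumptions, not a copy of tidyverse's attach.R: the `core` vector and the attach_core() helper are hypothetical names, and the base packages splines and tools stand in for real member packages so the code runs in a plain R session.

```r
# Hypothetical core set; base R packages stand in for real member packages.
core <- c("splines", "tools")

attach_core <- function(pkgs = core) {
  # Skip anything that is already on the search path.
  to_attach <- pkgs[!paste0("package:", pkgs) %in% search()]
  for (p in to_attach) {
    suppressPackageStartupMessages(library(p, character.only = TRUE))
  }
  # Return (invisibly) the packages that were newly attached.
  invisible(to_attach)
}

# Inside a package, R calls this hook when the metapackage itself is attached,
# so library(metapackage) ends up attaching the core set as well.
.onAttach <- function(libname, pkgname) {
  attach_core()
}

# Outside a package the hook is never triggered, so call the helper directly.
attach_core()
```

With this layout the core packages can stay under Imports in DESCRIPTION, and attaching them is handled in code rather than through Depends.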

@martina,

Thanks for bringing this up. You are not alone in finding this confusing:

The package namespace (as recorded in the NAMESPACE file) is one of the more confusing parts of building a package

-- R Packages / NAMESPACE

Here are some pointers that helped me understand this a little better. I hope they help you too.

  • It really depends on how you expect your package to be used. If it is a 'true' meta-package and really has no additional functionality itself I think just using Depends is the best option, so users can just install.packages("metapackage"); library(metapackage) and have all the child packages attached.

-- Jim Hester

  • Listing a package in either Depends or Imports ensures that it’s installed when needed. [W]here Imports just loads the package, Depends attaches it. There are no other differences.

-- R Packages / NAMESPACE
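The load-versus-attach distinction in that last quote can be checked directly in a fresh R session. This is a small sketch, assuming the base package splines (used here only as a stand-in for a member package) is not yet attached; is_attached() is a helper name introduced for illustration.

```r
pkg <- "splines"  # ships with R but is not attached by default

is_attached <- function(p) paste0("package:", p) %in% search()

# Roughly what Imports gives you: the namespace is loaded,
# but the package is not placed on the search path.
loadNamespace(pkg)
loaded_only <- c(loaded = isNamespaceLoaded(pkg), attached = is_attached(pkg))

# Roughly what Depends gives you: the package is attached as well,
# just as if the user had called library() on it.
library(pkg, character.only = TRUE)
after_attach <- c(loaded = isNamespaceLoaded(pkg), attached = is_attached(pkg))

print(loaded_only)
print(after_attach)
```

After the loadNamespace() step, functions are reachable only as splines::fun(); after library(), they can be called without the prefix, which is the behaviour a user gets for every package a metapackage lists under Depends.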
