Find the minimum required version of a package dependency

When I was using some more recent(ish) dplyr functions in my package, I realized I should probably adjust the dependency in my DESCRIPTION, but I wasn't sure which version of dplyr I needed to depend on now.

In this case, it was about the scoped variants of mutate and rename – I remember reading about them in update blog posts not too long ago, so I scoured through the NEWS.md and found them introduced in 0.7.0.

But then I realized that I have no way of actually checking my other package dependencies for specific requirements like that – the only thing I can tell you is that my package works with the current (and presumably the recent) versions of its dependencies, but that's about it.

I guess this is somewhat similar to my previous question about which version of R I actually need to depend on, but in this case I feel like there's great potential for automation.
My idea was simple:

  • Parse my NAMESPACE file for importFroms (I've been very explicit about documenting the external functions I use; this might not be the case for everyone) (EDIT: I remembered that r-lib/itdepends exists, and itdepends::dep_usage_pkg(pkg = "mypkg") is probably the better choice for this)
  • The result is a list of packages + functions
  • Search a versioned reference of packages for the first appearance of that function
  • Profit (in the sense that I now know when that function arrived, so I know that I need at least version x.y.z of a package)
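
For the first step, a minimal sketch in base R could look like the following. It assumes the NAMESPACE only uses plain `importFrom(pkg, fun, ...)` directives (as roxygen2 writes them); `namespace_imports` is a made-up helper name, not something from an existing package.

```r
# Hypothetical helper: collect importFrom() directives from a NAMESPACE file
# into a data frame with one row per package/function pair.
namespace_imports <- function(path = "NAMESPACE") {
  # A NAMESPACE file is syntactically R code, so parse() can read it.
  directives <- parse(path, keep.source = FALSE)

  is_import_from <- vapply(
    directives,
    function(d) is.call(d) && identical(as.character(d[[1]]), "importFrom"),
    logical(1)
  )

  rows <- lapply(directives[is_import_from], function(d) {
    # First argument is the package, the rest are the imported functions.
    args <- vapply(as.list(d)[-1], as.character, character(1))
    data.frame(package = args[1], fun = args[-1], stringsAsFactors = FALSE)
  })

  if (length(rows) == 0) {
    return(data.frame(package = character(), fun = character(),
                      stringsAsFactors = FALSE))
  }

  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}
```

Calling `namespace_imports("NAMESPACE")` from the package root should then give the "list of packages + functions" from the second step.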

At least in theory. But then I realized that I have no such reference. I was banking on rdrr.io for a list of functions, but there's no version history as far as I can tell, so scraping that won't do me any good. I thought the crandb package could help, but it doesn't look like it has function information.
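
If such a reference did exist, the lookup itself would be short. The sketch below assumes a purely hypothetical `index` data frame with columns `package`, `version` and `fun` (one row per exported function per release); both the data frame and the `first_version` helper are made up for illustration.

```r
# Hypothetical lookup: earliest version of `package` in which `fun` appears.
# `index` stands in for the versioned reference that does not exist (yet).
first_version <- function(index, package, fun) {
  hits <- index$version[index$package == package & index$fun == fun]
  if (length(hits) == 0) {
    return(NA_character_)
  }
  # Compare as version numbers, not strings, so that 0.10.0 > 0.9.1.
  as.character(min(numeric_version(hits)))
}

# Toy data with invented package names and versions.
index <- data.frame(
  package = c("somepkg", "somepkg", "somepkg", "otherpkg"),
  version = c("0.10.0", "0.9.1", "1.0.0", "2.0.0"),
  fun     = c("new_fun", "new_fun", "new_fun", "thing"),
  stringsAsFactors = FALSE
)

first_version(index, "somepkg", "new_fun")
```

Going through `numeric_version()` matters here because a plain string comparison would sort "0.10.0" before "0.9.1".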

And now I'm wondering if it's even worth the effort.
On one hand, I still think that this should be easy to automate, the only missing piece is an index of package name + version + exported functions.
On the other hand, I don't think most other package authors are going this far (at least my googling hasn't led to much on the topic), so I'm doubting whether this is even something the community cares or should care about.


My advice would be to set up a continuous integration (CI) job that uses the oldest supported versions of all the dependencies. This will take some upfront work (which it already appears you are willing to do), but afterwards it will be easier to maintain. As long as the CI build continues to pass, you don't have to worry about changing your minimum dependencies. And if it fails, you can read the CI logs to figure out which package failed, and increase its minimum required version.
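
To know which versions such a job should install, one option is to read the minimums already declared in DESCRIPTION. This is only a rough sketch: `declared_minimums` is a made-up helper name, and it assumes the usual `pkg (>= x.y.z)` syntax in the `Imports` field.

```r
# Hypothetical helper: read the minimum versions declared in DESCRIPTION.
# Packages without a ">=" constraint get NA as their minimum version.
declared_minimums <- function(path = "DESCRIPTION", field = "Imports") {
  raw <- read.dcf(path, fields = field)[1, 1]
  if (is.na(raw)) {
    return(data.frame(package = character(), min_version = character(),
                      stringsAsFactors = FALSE))
  }

  # Entries are comma-separated and may span several lines.
  entries <- trimws(strsplit(raw, ",")[[1]])
  entries <- entries[nzchar(entries)]

  package <- sub("\\s*\\(.*$", "", entries)
  has_min <- grepl("\\(\\s*>=", entries)
  min_version <- ifelse(
    has_min,
    sub("^.*\\(\\s*>=\\s*([0-9.-]+)\\s*\\).*$", "\\1", entries),
    NA_character_
  )

  data.frame(package = package, min_version = min_version,
             stringsAsFactors = FALSE)
}
```

In the CI job, each row with a non-missing `min_version` could then be installed with something like `remotes::install_version(package, version = min_version)`, assuming the remotes package is available on the build machine.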

Alternatively, a quicker method to find older versions of dependencies would be to choose an MRAN snapshot from a few years ago and set it as the CRAN repository used on Travis. The example .travis.yml configuration file below runs two jobs:

  1. R 3.5.3 with the versions of R packages available two years ago (2017-10-14)
  2. R 3.6.1 with the latest versions of the R packages

```yaml
language: r
sudo: false
cache: packages

matrix:
  include:
    - os: linux
      r: oldrel
      cran: https://mran.microsoft.com/snapshot/2017-10-14
    - os: linux
      r: release
```

A quick follow-up to my previous advice. I tested my package in the Docker image rocker/r-ver:3.2.5, which uses the MRAN snapshot from 2016-05-03. Installing the package versions from over 3 years ago quickly identified any exported functions that my package required but that were not yet available in 2016. Then, for each of those packages, I searched the NEWS.md file for the missing function and set the minimum version to the version where that function was first introduced.
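
A small sketch of how that check could be scripted inside the old environment, rather than waiting for the build to fail. It assumes you already have a data frame of package/function pairs your package imports (columns `package` and `fun`); the `missing_imports` helper is a made-up name for illustration.

```r
# Hypothetical helper: which imported functions are NOT exported by the
# versions of the dependencies installed in the current library?
missing_imports <- function(imports) {
  available <- vapply(seq_len(nrow(imports)), function(i) {
    pkg <- imports$package[i]
    # A package that is not installed at all counts as missing too.
    requireNamespace(pkg, quietly = TRUE) &&
      imports$fun[i] %in% getNamespaceExports(pkg)
  }, logical(1))

  imports[!available, , drop = FALSE]
}

# Example with base packages, so it runs anywhere.
imports <- data.frame(
  package = c("stats", "utils"),
  fun     = c("median", "definitely_not_exported"),
  stringsAsFactors = FALSE
)
missing_imports(imports)
```

Every row returned is a function to look up in the corresponding NEWS.md.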

👋 here! We at rOpenSci Software Peer Review are thinking of adding guidance related to that. Here's the relevant issue if you're interested: Advice on specifying dependency minimum versions · Issue #185 · ropensci/dev_guide · GitHub (I hope I'm not breaking guidelines by merely posting a link)
