Splitting existing package into two separate ones

jacob · August 12, 2018, 4:46pm

I am the developer of the package jtools (https://cran.r-project.org/web/packages/jtools/index.html). The package has such a bland name because when I started working on it, I didn't know what it was for beyond collecting functions that I was reusing across my own projects and that I thought others might be interested in.

Over time, there have been two different types of things that have attracted users to the package. One is summarizing regression models of various types, via a replacement for the summary function, some methods of visualization, and some methods for exporting summaries to external documents. The other is a fairly comprehensive suite of tools for probing interaction terms in regression models.

Since I started working on the package, I've learned to appreciate the idea that while one can have a package that is too narrowly focused, it is far easier to have a package that has too much going on. Moreover, because of the name I chose for the package, people looking for this stuff may not know that my package has it. I have been thinking for some time about splitting off the interaction-related functions into a new, informatively-named package but I'm not sure of the best way to go about it.

So here are my questions:

1. Is this kind of thing a good idea?
2. How would you start the process of putting things into a new package?
3. How would you handle the process of deprecating and informing users?

For whatever reason, I've gotten the impression that a larger-than-normal share of my package's users are social scientists who are not especially comfortable with R. That may be partly because the package does a number of things that are doable on one's own but would require a lot of programming, so people end up with it because it holds their hand through those tasks (that was my goal). With that in mind, I don't want to confuse people, and I'd prefer not to break old code for a while if I can help it.

nwerth · August 14, 2018, 1:47pm

I've only read your GitHub's README, but the package seems pretty cohesive: tools for interactive modeling. It could definitely use a better name; I often search for new packages by using Ctrl + F on CRAN's package listing page.

The only odd feature I noticed was the APA ggplot theme. Maybe you could submit it as a pull request for the ggthemes package.

For your questions:

In some cases, definitely. With your package, I envision the same person using it for the same tasks (though at different points along the pipeline). But if only half a package applies to a user's work, that user shouldn't have to bother updating the package for the unused parts.

As you worried, "miscellaneous" packages often bloat and need to be divided. And sometimes, even in packages that started with a clear goal, some of the underlying "workhorse" features would be useful in other packages. Splitting them off prevents copy-pasting code.

If you've been writing well-contained functions, it should be a breeze to create a new package project and move the appropriate files. Make sure to update the DESCRIPTION file if the old package needs functions from the new one. If each package depends on the other, that's a sign you need to refactor.
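
A rough sketch of those two steps in base R, in case it helps. The function names `move_sources()` and `add_import()`, the paths, and the package name "interactions" are all made up for illustration. Note that rewriting DESCRIPTION with `write.dcf()` may change the whitespace layout of other fields, so check the diff before committing.

```r
# Sketch only: paths, file names, and the package name "interactions" are
# hypothetical placeholders.

# Move selected source files from the old package's R/ directory to the new one.
move_sources <- function(old_pkg, new_pkg, files) {
  from <- file.path(old_pkg, "R", files)
  to <- file.path(new_pkg, "R", files)
  stopifnot(all(file.exists(from)))
  dir.create(file.path(new_pkg, "R"), recursive = TRUE, showWarnings = FALSE)
  copied <- file.copy(from, to)
  file.remove(from[copied])  # delete originals only after a successful copy
  invisible(to[copied])
}

# Add a package to the Imports field of a DESCRIPTION file,
# creating the field if it does not exist yet.
add_import <- function(description_path, pkg) {
  desc <- read.dcf(description_path)
  if ("Imports" %in% colnames(desc)) {
    current <- trimws(strsplit(desc[1, "Imports"], ",")[[1]])
    desc[1, "Imports"] <- paste(union(current, pkg), collapse = ", ")
  } else {
    desc <- cbind(desc, Imports = pkg)
  }
  write.dcf(desc, description_path)
  invisible(desc)
}
```

Usage would look something like `move_sources("jtools", "interactions", "sim_slopes.R")` followed by `add_import("jtools/DESCRIPTION", "interactions")`, again with placeholder names.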

As far as Git goes, I have no idea how best to split a project. My instinct is to clone the original for each of the new projects and keep their future histories independent. But having two unlinked projects with a shared history is raising the "there's a better way" flag in my head.

Have a deprecation warning print when somebody loads the to-be-abandoned package (see ?.onLoad). Point them to the new packages.
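
Something like the following could work as a starting point. It is only a sketch: "interactions" is a hypothetical name for the new package, and `forward_deprecated()` is a helper invented here. The same help page as `.onLoad` also documents `.onAttach`, which is where R's documentation suggests putting startup messages (via `packageStartupMessage()`), so the sketch uses that hook. The second part keeps old code running by warning with `.Deprecated()` and then forwarding the call to the new package.

```r
# Sketch only. "interactions" is a hypothetical name for the new package.

# Startup note shown when the old package is attached with library().
.onAttach <- function(libname, pkgname) {
  packageStartupMessage(
    "The interaction functions in '", pkgname, "' are moving to the ",
    "'interactions' package and will eventually be removed from here."
  )
}

# Build a wrapper that warns, then forwards the call to the new package,
# so old scripts keep working during the transition.
forward_deprecated <- function(name, new_pkg) {
  function(...) {
    .Deprecated(
      new = name,
      package = new_pkg,
      msg = sprintf("%s() has moved; use %s::%s() instead.",
                    name, new_pkg, name)
    )
    getExportedValue(new_pkg, name)(...)
  }
}

# In the old package, each moved function would then be defined as, e.g.:
# sim_slopes <- forward_deprecated("sim_slopes", "interactions")
```

Users who rely on the old function names would see a warning on each call but get the same results, which gives less experienced users time to update their scripts.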