README.md vs package vignette vs ?package documentation

What do you guys think a README.md should look like as compared to a basic package vignette or the package documentation?

I find package vignettes useful because they can be accessed directly from R via vignette(), and thus also be linked in other documentation. But I also find an extensive README.md useful because it's a nice, quick overview on GitHub.

I ended up going the following route:

  • Create a basic package vignette with usage examples etc.
  • Create a README.Rmd that includes the package vignette via the knitr chunk option child= (so the README.md ends up being exactly the same as the package vignette)
  • The ?package documentation contains only a very basic description of the package (similar to what goes into the DESCRIPTION file) and a suggestion to look at vignette("mypackage")
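A minimal sketch of the README.Rmd described in the list above, assuming the vignette lives at vignettes/mypackage.Rmd (a placeholder name) and that README.md is rendered with the github_document output format:

````markdown
---
output: github_document
---

```{r child = "vignettes/mypackage.Rmd"}
```
````

Knitting this file from the package root should produce a README.md with the vignette's content. Relative image or link paths inside the vignette may need checking, since they are written relative to vignettes/.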

What are your takes on this issue?


I only put a usage guide in the README if there is no vignette (shame on me). Each of my README files has:

  • Summary of the package's purpose
  • Directions for building the package from source
  • Guidelines for contributors (though I'm starting to put this in a CONTRIBUTING file)

I haven't been making ?package documentation files. Are they often used? I figured detailed descriptions deserve a vignette, and summary stuff can be found with packageDescription("package").
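For reference, a short sketch of what packageDescription() gives you. It uses the base package stats only so that it runs on any installation; substitute your own package name.

```r
# packageDescription() reads the DESCRIPTION file of an installed package.
desc <- utils::packageDescription("stats")

# Individual fields are available as list elements.
desc$Title
desc$Description

# The fields argument restricts the result to the entries you ask for.
utils::packageDescription("stats", fields = "Version")
```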

This is indeed something I also always found a bit tricky and had a hard time finding a solution I liked.

The following is the solution I came up with when I started developing packages, and I've stuck with it since (there are other ways to handle it, but this is what I found works for me and my users). I'm open to changing my workflow if I see other good ideas in this discussion.

Overview

My philosophy is to have a fairly detailed README because many people find a package through GitHub, and the README is the first thing you see there. It's very possible that I'm biased because I spend so much time in the R twitterverse, which is very GitHub friendly, and I've been heavily influenced by people like @jennybryan, so I always gravitated towards making the package documentation very GitHub friendly. It's important for me to have the README act as a good intro to the package. The README also gets shown on other websites like CRAN, rdocumentation.org, METACRAN, etc. So I really care about a good README!

My second thought is that I also want to make sure every package has a main vignette. This main vignette should be very similar to the README. In fact, it should be essentially identical. So that's what I did - I wanted to be able to create both my main vignette and my README from the same source.

Next, if the package needs more documentation for advanced or specific use cases, those topics can be separate vignettes, and I make sure to link to them from the main vignette/README.

Finally, you mention the ?package documentation. I personally never use it, but I did want to make sure my documentation is complete, so I include a short description in there and include a link to the full README.
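A sketch of such a package-level help page with roxygen2, assuming a hypothetical package called mypackage and a file R/mypackage.R. It uses the "_PACKAGE" sentinel that roxygen2 supports for package documentation; the title, description and URL are placeholders.

```r
#' mypackage: A Placeholder Title
#'
#' A short description of what the package does, typically the same text
#' as the Description field in the DESCRIPTION file.
#'
#' For a full introduction see \code{vignette("mypackage")} or the README
#' at \url{https://example.com/mypackage}.
"_PACKAGE"
```

After running roxygen2, this should make ?mypackage show the short description and point readers to the longer documents.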

Implementation

I'll show an example of how I do my documentation workflow with my package shinyjs.

  • I create the main vignette and name it <packagename>.Rmd, and I make it fairly detailed to explain the package, its use cases, and show examples. shinyjs example

  • I want to create my README from the same vignette, but I don't want to have to copy it manually because then I run into the possibility of forgetting to update either the vignette or the README. I wanted a solution that makes sure they're both in sync. Another small problem is that the file paths in this vignette would not work properly in the README because they're not in the same folder (image paths, links to other vignettes, etc.). So I created a Makefile that automatically creates my README. shinyjs example

  • When I want to update my README, I only update the shinyjs.Rmd vignette, and then I run make in a shell. It's super easy. I literally open up a shell in the package's root directory, run make (this works on both Unix and Windows), and that's it, it'll create a README file from my vignette and correct all the paths.

  • If I need to have a few more vignettes, I make sure to link to them from the README/main vignette so that they're easy to find, because most people wouldn't know to search for vignettes. shinyjs example

  • I also create a file named <packagename>.R in the R/ folder, and make a @docType package to document the package. I personally think the value of this is fairly low (I might be wrong), but I do it anyway. I copy the description from the DESCRIPTION file, and at the end I add a link to the README. shinyjs example
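The vignette-to-README step described in the list above can also be sketched in R. This is not the shinyjs Makefile, just an illustration of the same idea under simple assumptions: the helper names rewrite_paths() and sync_readme() are made up, and the only path fix applied is dropping a leading "../" from markdown link and image targets.

```r
# Hypothetical helper: make paths written relative to vignettes/ work
# from the package root by dropping the leading "../" in link targets.
rewrite_paths <- function(lines) {
  # "](../inst/img/demo.png)" becomes "](inst/img/demo.png)"
  gsub("](../", "](", lines, fixed = TRUE)
}

# Hypothetical helper: copy the vignette source to README.Rmd with the
# paths corrected, so the two documents cannot drift apart.
sync_readme <- function(vignette, readme = "README.Rmd") {
  lines <- readLines(vignette, warn = FALSE)
  writeLines(rewrite_paths(lines), readme)
  invisible(readme)
}

# Usage from the package root (file names are placeholders):
# sync_readme("vignettes/mypackage.Rmd")
# rmarkdown::render("README.Rmd", output_format = "github_document")
```

Keeping the path rewriting in its own function makes it easy to add further rules, for example for links to other vignettes.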

I've actually been meaning to write a blog post about this soon so this was helpful to get down on paper. Now people can also critique my method and tell me how needlessly complicated or silly it is :slight_smile:

I hope this helps!


Here's my take that uses all of these documents in slightly different ways:

README.md

This shows who the package is for, what it does, its health (test coverage, continuous integration), a short example of one of the main features of the package, and information on how users can get help or contribute.

Vignettes

This should be more than just a simple example of the package; it should combine several features to give a more complete view of what the package does.

package?pkg documentation

I've seen these as pithy descriptions of what the package does, but I think this is better used to sort and highlight functions that are available to users since the default index of functions is a bit meh. Here's an example of what I mean by this type of documentation: http://grunwaldlab.github.io/poppr/reference/poppr-package.html#data-import-export


Two things that make this very useful

  • People are used to typing ?help-thingy so it makes sense to make it work for a package.
  • I use it to click on the link to the Index of all help. I know there are other ways to do that but I never remember them.

I'll add another tool: pkgdown, which makes a website out of your existing package with very minimal effort. pkgdown is not on CRAN yet. But, for example, the tidyverse packages that have websites show a pretty good set of examples of what's possible.


I had a similar/related conversation here with some good points raised by Duncan Murdoch in response: http://blog.revolutionanalytics.com/2016/05/good-r-packages.html?cid=6a010534b1db25970b01b7c859cfe1970b#comment-6a010534b1db25970b01b7c859cfe1970b

I use package indices a lot, and usually get there via

help(p='dplyr')

(with partial parameter matching, because it's interactive and keystrokes matter). I think I once discovered a cool way with ? or ??, but I forgot how it worked before I tried it again.
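For anyone else who forgets, here is a short sketch of the lookups I know of, using the base package stats so it runs anywhere. The ? form may or may not be the one half-remembered above, and it only finds something if the package ships a package-level help topic.

```r
# Package index: every documented topic in the package.
# help(p = "stats") is the same call with partial argument matching.
idx <- help(package = "stats")
idx

# Package overview topic, if the package provides one.
# package?stats looks up the topic named "stats-package".
package?stats
help("stats-package")
```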

+1 for pkgdown! I'm a big fan. I especially find the function reference incredibly useful to get a quick overview of how functions in the package are grouped, and to make sure I'm not missing out on any functions not highlighted by vignettes or the readme. Here's a link to dplyr's function reference and tidyquant's function reference for those that might be interested!

I think pkgdown sites are nice for reviewing/rereading the documentation of your own packages for errors. I have to confess I have never looked at anyone else's pkgdown sites. I prefer ?function for single functions, and the .pdf documentation for looking through several functions, since there it is all in one document.

That's the approach I was thinking about. For me it works to have a `{r child = "vignettes/blah.Rmd"}` code chunk in my README.Rmd (though this breaks pkgdown for some reason).

On the other hand, I find it somewhat awkward to have the vignette and README be identical. I also like the dplyr approach, where you just have a very basic package abstract and the installation instructions.

?package seems to be the right place to document package options (if your package has options).
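A sketch of what that could look like with roxygen2, assuming a hypothetical package mypackage with a single made-up option mypackage.verbose. The "_PACKAGE" sentinel is one way roxygen2 lets you write the package-level help page.

```r
#' mypackage: A Placeholder Title
#'
#' @section Package options:
#' \describe{
#'   \item{\code{mypackage.verbose}}{Logical. Print progress messages?
#'     Treated as \code{FALSE} when unset.}
#' }
"_PACKAGE"

# Inside package code, read the option with an explicit fallback so the
# documented default and the actual behaviour stay in one place.
is_verbose <- function() {
  isTRUE(getOption("mypackage.verbose", default = FALSE))
}
```

Users would then switch it on with options(mypackage.verbose = TRUE) and find the explanation under ?mypackage.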

Maybe I'll end up going for the dplyr model and just put a link to the vignette in the README? (on Rpubs or something)

That's definitely a very viable option, with the child chunk. I've considered that, but for my packages so far having the vignette and README be identical made sense, it wasn't awkward. But if you do need them to be a bit different then your approach works.

And yes, there's also the route that many RStudio packages take, which is including a very concise message in the README with a link to the website.

I'll partially agree with you. I do personally use ?function the majority of the time, but I never look at the pdf documentation if I don't have to. For some reason I find them harder to navigate and connect the pieces. Just personal preference I'd say.

One other unsung feature of pkgdown is that it runs the example code and shows the output in the example section of each function's documentation. I find this really valuable to get up to speed quickly without having to run the examples myself. Here is a dplyr group_by() example where the example code has been run so you can immediately see what the function does.


I’ve seen these as pithy descriptions of what the package does, but I think this is better used to sort and highlight functions that are available to users since the default index of functions is a bit meh. Here’s an example of what I mean by this type of documentation: The poppr R package — poppr-package • poppr

That's a great idea! Much better than looking for related functions by wandering "See also" trails in documentation pages.

Formatting aside, I think that is the package index, which you can get to with help(package = "poppr").

Some packages do have too-large function families or OO-style methods (e.g. RSelenium) or other unique syntax (e.g. data.table, pipeR) that isn't easily documented in R in which case such an approach may be useful, though if there's a way to fix the docs, I'd always prefer that first.

Good question! I personally use ?pckgname all the time because I forget the syntax or want to see an example again and most packages have good help for those two things. When a package doesn't have it I get annoyed because it's just documenting your functions...

In regards to the README.md file for GitHub, I think it makes a huge difference to whether other people feel comfortable using your package. If I see a package on GitHub that has a bare minimum intro, I'm usually not going to spend time figuring out what it does and how easy it is to use. That might not matter to you (which is totally ok), but if you want other people, specifically strangers, using your package then I believe making it seem useful/helpful/easy to use goes a long way towards adoption.

Hmm, that is a good point. I am currently working on a package that mainly functions as an RStudio addin, so I didn't add any usage examples to the README. I'll try to think of something to put in there.


Yesterday I saw your post about pkgdown and I was so excited that I used it too! You can see another example for the DescriptiveStats.OBeu package. Thank you!


Another thing we've discussed is that, now that roxygen2 works with markdown, we might be able to automatically include README.md in the package documentation. I realised we didn't have an open issue for this, so I just created https://github.com/klutometis/roxygen/issues/669


Adding to this × what @hadley mentioned re. roxygen2 and README, one of the things that impacts ease of use is consistency across sources of documentation. I think that's part of the goal of pkgdown (right?!)

As a user, one of the most frustrating feelings is:
"I know I saw how to do this, but now I don't know where it is!" :bowing_man:

  • Part of this is about vocabulary (beginner's curse/curse of expertise): reference, articles, and news are all words that people have heard before, but not necessarily in this context. I think this is similar to some of what's being discussed in
    Building foundational skills for programming beginners - #5 by jennybryan
  • Part is consistency across sources vis-a-vis navigation. For example, DiagrammeR has gorgeous documentation…and I always have like twelve tabs of its docs open when I use it, because the elements of it feel like they're in different places in the docs, versus the GitHub README, versus docs within docs for GraphViz and Mermaid (e.g. I often want, but don't remember where the attribute styling is).

I'm a big fan of cross-referencing, and/but I don't know if this is pedagogically sound, or even actually useful to most people in developing a mental framework around a package, or concept (my thinking could probably benefit from a good dose of @jennybryan -style rectangling). There's also the challenge of dealing with various media/space constraints that might make this more or less practical.

In fact, now I'm not sure of whether this comment fits squarely in this thread re. docs, or fundamentals (@jessemaegan, I feel like this might be a learning how to learn thing).


You must be reading my mind today :laughing:

I've been thinking a lot about tidy data and what each of the components do, and am starting to try and create visuals to illustrate concepts, as that's the way that I learn best. I don't know how applicable it would be to this scenario, but perhaps some kind of visual mapping would be helpful.

Implementation would be tricky - but maybe it's a matter of creating a series of model pieces showing "this is how I map my knowledge" to inspire others?
