Template for documenting a dataset?

Sorry if this was asked somewhere else.

Is there a template for documenting data, a bit like the roxygen skeleton for functions?

I tend to copy-paste from other data docs and wonder whether there's a better way to prepare documentation for a dataset using roxygen2 tags.

@hadley's R Packages book has a section on documenting datasets with roxygen2:

#' Prices of 50,000 round cut diamonds.
#'
#' A dataset containing the prices and other attributes of almost 54,000
#' diamonds.
#'
#' @format A data frame with 53940 rows and 10 variables:
#' \describe{
#'   \item{price}{price, in US dollars}
#'   \item{carat}{weight of the diamond, in carats}
#'   ...
#' }
#' @source \url{http://www.diamondse.info/}
"diamonds"

There are two additional tags that are important for documenting datasets:

@format gives an overview of the dataset. For data frames, you should include a definition list that describes each variable. It’s usually a good idea to describe variables’ units here.

@source provides details of where you got the data, often a \url{}.

Never @export a data set.

There are also usethis::use_data() and usethis::use_data_raw(), but those do not appear to touch the documentation; they only prepare the actual data set.

So that's not really what you're looking for.

The roxygen skeleton for functions you're referring to is the Code > Insert Roxygen Skeleton button in RStudio, right?

Perhaps you want to add this as a feature suggestion to the RStudio IDE?
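
In the meantime, a small helper could generate the boilerplate for you. Below is a minimal sketch using only base R. The function name `data_roxygen_skeleton()` and the TODO placeholders are my own invention for illustration; nothing like this ships with roxygen2 or usethis as far as I know. It follows the layout of the diamonds example above: a title, a description, an `@format` block with one `\item{}{}` per column, an `@source`, and the quoted dataset name.

```r
# Hypothetical helper (not part of roxygen2 or usethis): builds a roxygen2
# documentation skeleton for a data frame and returns it as a character vector.
data_roxygen_skeleton <- function(df, name = deparse(substitute(df))) {
  stopifnot(is.data.frame(df))

  # One \item{}{} entry per column, pre-filled with the column's first class
  items <- vapply(
    names(df),
    function(col) {
      sprintf("#'   \\item{%s}{%s. TODO: describe, with units}",
              col, class(df[[col]])[1])
    },
    character(1),
    USE.NAMES = FALSE
  )

  c(
    "#' TODO: title",
    "#'",
    "#' TODO: description",
    "#'",
    sprintf("#' @format A data frame with %d rows and %d variables:",
            nrow(df), ncol(df)),
    "#' \\describe{",
    items,
    "#' }",
    "#' @source TODO: where the data came from",
    sprintf("\"%s\"", name)
  )
}

# Print the skeleton for a built-in dataset so it can be pasted into a file
skeleton <- data_roxygen_skeleton(mtcars)
writeLines(skeleton)
```

You would still have to fill in the descriptions and units by hand, but the row and column counts and the list of variables come for free. Pasting the output into something like R/data.R is just one way to use it.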

Thank you @maxheld83! Yes I was looking for something automatic. Your answer motivated me to open an issue in the usethis repo.
