roxygen2: documenting multiple datasets translated into different languages in the same file

I'm making a package that includes two datasets, one of which is a translation of the other into a different language. I'd like to document both datasets in the same file.

I'm wondering what the recommended approach is for the @format tag in this case. The only thing that differs is the variable names. Is it preferable to have one @format for each dataset (example 1, which I find a bit redundant), or to list each variable together with its translation in the same item (example 2)? Or should I simply say, e.g. in @details, that the variables in bar2 are translations of the variables in bar1?

example 1:

#' @title Foo
#' @description blah blah
#' @name foo
#' @examples
#' bar1
#' bar2
NULL

#' @rdname foo
#' @format A tibble with x rows and y variables:
#' \describe{
#'   \item{variable1}{description of variable1}
#'   \item{variable2}{description of variable2}
#' }
"bar1"

#' @rdname foo
#' @format A tibble with x rows and y variables:
#' \describe{
#'   \item{translated_variable1}{description of variable1}
#'   \item{translated_variable2}{description of variable2}
#' }
"bar2"

example 2:

#' @title Foo
#' @description blah blah
#' @format A tibble with x rows and y variables:
#' \describe{
#'   \item{variable1,translated_variable1}{description of variable1}
#'   \item{variable2,translated_variable2}{description of variable2}
#' }
#' @name foo
#' @examples
#' bar1
#' bar2
NULL

#' @rdname foo
"bar1"

#' @rdname foo
"bar2"

I don't think there are many examples of multilingual datasets in R packages. I personally do not know of any. I suspect that you are the pioneer here, and that there are no standard practices.

So, check how the manual pages look, and choose the solution you like. :smiley:
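One way to check the rendered manual page is sketched below. It assumes that devtools is installed, that the working directory is the package root, and that the shared help topic is named foo as in the examples above.

```r
# Regenerate the Rd files in man/ from the roxygen2 comments.
devtools::document()

# Load the development version of the package so its help pages can be previewed.
devtools::load_all()

# Open the shared help page for both datasets (topic name taken from @name foo).
?foo
```

Running this once for each layout makes it easy to compare how example 1 and example 2 render.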

Alright, thanks! I'm actually not very surprised about being a pioneer here, since I'm not making the typical kinds of packages.

I was initially planning to go with the example 2 structure, but having a single @format for two datasets automatically generates the following lines in the documentation:

An object of class tbl_df (inherits from tbl, data.frame) with 4 rows and 10 columns.

An object of class tbl_df (inherits from tbl, data.frame) with 4 rows and 10 columns.

This isn't a big issue, but I was wondering whether there's any way to stop these lines from showing up in the documentation, since they duplicate each other and are redundant with my own description of the data.

I've just found the answer to my question.

I needed to use #' @format NULL, as shown below:

#' @rdname foo
#' @format NULL
"bar1"

#' @rdname foo
#' @format NULL
"bar2"
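
Putting the pieces together, the whole file could look like the sketch below. It simply combines the example 2 layout with the @format NULL fix. The names foo, bar1 and bar2 and the x/y placeholders come from the examples above; the wording of the shared @format line is an assumption for illustration.

```r
# Shared documentation block: the single @format describes both datasets.
#' @title Foo
#' @description blah blah
#' @format Two tibbles, each with x rows and y variables:
#' \describe{
#'   \item{variable1,translated_variable1}{description of variable1}
#'   \item{variable2,translated_variable2}{description of variable2}
#' }
#' @name foo
#' @examples
#' bar1
#' bar2
NULL

# @format NULL suppresses the auto-generated "An object of class ..." line
# for each dataset, so only the shared @format above appears on the page.
#' @rdname foo
#' @format NULL
"bar1"

#' @rdname foo
#' @format NULL
"bar2"
```

With this layout both datasets share one help page, and the only format section is the hand-written one.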