Since the data frames are examples meant to test the function, I think it would be more natural to save them as external (exported) data sets rather than as internal data sets.
To do this, you can save them as binary R data files in the package subdirectory data/. For convenience, you can use usethis::use_data() to automate this. If df1 and df2 are defined in the current R session, you could run:
> usethis::use_data(df1, df2)
✔ Creating 'data/'
✔ Saving 'df1', 'df2' to 'data/df1.rda', 'data/df2.rda'
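For intuition, here is a minimal base-R sketch of roughly what that step amounts to: each object is written to its own .rda file and can be loaded back unchanged. The toy data frames, the temporary directory standing in for data/, and the plain save()/load() calls are assumptions for illustration, not the actual usethis implementation.

```r
# Toy stand-ins for the real example data frames (hypothetical values)
df1 <- data.frame(id = 1:3, x = c(10, 20, 30))
df2 <- data.frame(id = 1:3, y = c("a", "b", "c"))

# A temporary directory stands in for the package's data/ subdirectory
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)

# One .rda file per object, named after the object
save(df1, file = file.path(data_dir, "df1.rda"))
save(df2, file = file.path(data_dir, "df2.rda"))

# Load into a fresh environment to confirm the round trip
env <- new.env()
load(file.path(data_dir, "df1.rda"), envir = env)
load(file.path(data_dir, "df2.rda"), envir = env)
identical(env$df1, df1)
identical(env$df2, df2)
```

In a real package you would still use usethis::use_data(), which takes care of the file location for you.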
Then, when you want to use the example data sets, you would run the following (the data() calls can be skipped if the package sets LazyData: true in DESCRIPTION):
library(myPkg)
data(df1)
data(df2)
my_function(df1, df2)
If you subsequently modify df1 and df2 in the current R session, you can pass the updated data frames to the function:
# after modifying df1 and df2
my_function(df1, df2)
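Note that modifying df1 in your session changes only your working copy, not the version saved with the package. A minimal base-R sketch of this, using a temporary .rda file as a stand-in for the package data (the toy data frame and the file path are assumptions for illustration):

```r
# Toy stand-in for a data set shipped with a package (hypothetical values)
df1 <- data.frame(id = 1:3, x = c(10, 20, 30))
rda_file <- file.path(tempdir(), "df1.rda")
save(df1, file = rda_file)

# Modify the working copy in the current session
df1$x <- df1$x * 2

# Loading the file again shows the saved version is unchanged
saved <- new.env()
load(rda_file, envir = saved)
identical(saved$df1$x, c(10, 20, 30))
identical(df1$x, c(20, 40, 60))
```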
This also gives you (and any other users of the package) the freedom to pass data frames with other names:
my_function(df3, df4)
Here's a reproducible example using a modified version of the suggested function from @technocrat:
my_function <- function(x, y, j = NULL) {
  dplyr::left_join(x, y, by = j)
}
data("diamonds", package = "ggplot2")
df1 <- diamonds[, 1:7]
df2 <- diamonds[, c(1:3, 8:10)]
my_function(df1, df2, j = c("carat", "cut", "color"))
See the "Data" chapter of the book R Packages for more details on including data sets in R packages.