Licenses for data in Tidyverse

The tidyverse contains a number of datasets, such as the diamonds dataset, that come with no other attribution information. If the data is used separately, does it have its own license, for example a Creative Commons Attribution license?

The whole tidyverse has an MIT licence (MIT License • tidyverse), so I would guess that the datasets included in the tidyverse have the same licence.

It's a reasonable question. The nycflights13 dataset (for example) doesn't have a license explicitly associated with it: GitHub - tidyverse/nycflights13: An R data package containing all out-bound flights from NYC in 2013 + useful metdata

The license is CC0 in DESCRIPTION for nycflights13
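If you want to check this yourself, here is a minimal sketch of reading the License field from an installed package's DESCRIPTION file in R. It uses `utils::packageDescription()`, which ships with base R; `license_of()` is just a hypothetical helper name for illustration, and the sketch assumes the packages are installed locally.

```r
# Return the License field from an installed package's DESCRIPTION file.
# license_of() is a hypothetical helper name, not part of any package.
license_of <- function(pkg) {
  # packageDescription() returns NA (with a warning) if pkg is not installed,
  # so the warning is suppressed and the result coerced to character.
  as.character(suppressWarnings(
    utils::packageDescription(pkg, fields = "License")
  ))
}

# Assumes these packages are installed locally; uninstalled ones yield NA.
pkgs <- c("nycflights13", "ggplot2")
vapply(pkgs, license_of, character(1))
```

Keep in mind that this only reports what the package declares for itself. Whether that license also covers the bundled datasets is exactly the question in this thread.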

Thanks @mara! Is nycflights13 the only dataset that is actually part of the tidyverse? In the top-level tidyverse GitHub organization it is the only one that has its own repo...

ggplot2 also has a number of datasets: ggplot2/data-raw at main · tidyverse/ggplot2 · GitHub

Yeah, I believe so. I actually asked Hadley about the datasets, and basically his stance is that all datasets should be CC0. So, if you want to cite something, cite the package, or just cite the dataset as CC0.

Yeah, see my reply above Licenses for data in Tidyverse - #7 by mara
