I have been working on a package, textdata, whose goal is to let datasets be downloaded, stored on disk, and loaded as needed instead of being bundled inside packages. The aim is to be able to provide larger files that are not easily hosted on CRAN, and to deal with license issues such as this one.
The main idea is based on the way keras handles datasets, for example with the function keras::dataset_cifar10(). The functions will prompt with information about the dataset, including its size and license, so the user can make an informed decision about whether to download it.
Most of the magic happens in load_dataset, which checks whether the file has already been downloaded, loads it if it has, and downloads it first if it hasn't.
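To make that check-then-download flow concrete, here is a minimal sketch of the pattern. It is not the actual load_dataset code: the name load_or_download, its arguments, and the .rds file format are all assumptions made for illustration, and the "download" is faked so that the example runs offline.

```r
# Hypothetical helper illustrating the pattern; NOT the real load_dataset().
# `download_fun` is assumed to be a function that writes the dataset to `path`.
load_or_download <- function(name, dir, download_fun, read_fun = readRDS) {
  # file.path() joins components with "/", which R accepts on every platform
  path <- file.path(dir, paste0(name, ".rds"))

  if (!file.exists(path)) {
    # Create the storage folder (and any missing parents) before downloading
    dir.create(dir, recursive = TRUE, showWarnings = FALSE)
    download_fun(path)
  }

  # Either way, the file now exists on disk and can be read
  read_fun(path)
}

# Usage with a stand-in "download" that just writes a small data frame
fake_download <- function(path) {
  message("Downloading to ", path)
  saveRDS(data.frame(word = c("good", "bad"), score = c(1, -1)), path)
}

cache_dir <- tempfile("textdata_demo")
first  <- load_or_download("toy_lexicon", cache_dir, fake_download)  # downloads
second <- load_or_download("toy_lexicon", cache_dir, fake_download)  # reads from disk
```

The second call does not trigger the download message because the file is already in the folder.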
I have two main problems.
- I need help making sure the path creation is done correctly (it works on my macOS machine, but I don't have the knowledge or the means to test on other operating systems). Specifically I'm worried about these two lines.
- I would like the user to have access to information regarding where datasets have been saved on their computer.
textdata allows the user to change the directory where the datasets are stored, and it would be nice to be able to do a full deletion of all datasets if you want to "uninstall".
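For reference, this is roughly the behaviour I have in mind for both points, written with base R only. The function names dataset_dir, list_datasets, and delete_datasets are hypothetical and are not part of textdata; the sketch assumes all datasets live under a single folder.

```r
# Hypothetical helpers; the names are illustrative, not textdata's API.

# Build and create the storage folder in a platform-independent way.
dataset_dir <- function(dir) {
  dir <- path.expand(dir)  # expands a leading "~" to the home directory
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  # Return an absolute path with forward slashes, also on Windows
  normalizePath(dir, winslash = "/", mustWork = TRUE)
}

# Show the user what is stored and how much space it takes.
list_datasets <- function(dir) {
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  data.frame(
    path = files,
    size_bytes = file.size(files),
    stringsAsFactors = FALSE
  )
}

# Full "uninstall": remove the folder and everything in it.
delete_datasets <- function(dir) {
  unlink(dir, recursive = TRUE)
  !dir.exists(dir)  # TRUE if the folder is gone
}

# Usage in a temporary folder so nothing in the home directory is touched
demo_dir <- dataset_dir(file.path(tempdir(), "textdata_demo", "datasets"))
saveRDS(mtcars, file.path(demo_dir, "toy.rds"))
stored <- list_datasets(demo_dir)
removed <- delete_datasets(demo_dir)
```

Building every path with file.path() rather than pasting separators by hand is the part I would expect to matter most across operating systems. For choosing the default location, rappdirs::user_data_dir() or, on R 4.0.0 and later, tools::R_user_dir() might be alternatives to a hard-coded folder, but I have not tested either here.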