On RStudio Package Manager Beta

Hey,

I've been testing RSPM a little bit (https://rtask.thinkr.fr/blog/rstudio-package-manager/), so here's my first round of feedback. Nothing deeply technical for now; I'll have more feedback once I've tested it more extensively (I guess I'll update this post then).

Admin

Global

  • Can we create a curated CRAN from a specific date :date: ? (like, set a curated CRAN not to today's date but to a specific date). Would definitely be useful.

CLI

  • When you're adding something, you'll add source, when listing, you'll list sources (one singular, the other plural). This makes perfect sense, but as a user, you are likely to do add source then list source. Maybe a shortcut would be great :slight_smile:

  • CLI access to stats would be nice (rspm stats --repos maybe). Even better, if you want to create reports for your managers: a console-printed version plus an exportable CSV would be helpful.

  • Being able to provide a DESCRIPTION file that would be parsed into a curated CRAN source would be awesome: I'd copy my package to the server, do rspm add --description='/path/to/DESCRIPTION' --source=curated, and I'd then have a repo that I can send to someone, with everything required (and nothing more) to install my package. This is quite easy to do right now (you just have to parse the DESCRIPTION file), but a shortcut in RSPM would be nice.

  • rspm copy --old-repo --new-repo would be cool if you want to make an "old-repo + these sources" repo

  • In rspm help list, the edit entry reads "Commands to edit repo name and description", but we can also edit sources.

User interface

I'm pretty sure this would be integrated into the forthcoming version, but I'll list them anyway :slight_smile:

  • I'm not sure admin would want every user to have access to stats, or need to have a link to the admin guide (these should be accessible only to rspm admins). I also wonder if the Activity tab is to be accessible to the user?

  • With the current version, users have access to all repos in the web interface. This might not be the desired behavior: you might want one part of your team to have access to a specific repo, and another team to have access to a different one. I can also think of a situation where, inside the same company, a central service develops packages/apps that it sells to or shares with other services in the company. RSPM would be a perfect tool for this use case: as the person who developed the product, you just have to share the install.packages command with the right repo link. However, I might not want to share something I have developed for service A with service B. With the current version, as soon as I have the link to the repo UI, I can see everything -> http://192.168.0.10:4242/ redirects to http://192.168.0.10:4242/client, and http://192.168.0.10:4242/curated/latest (for ex) provides a link to http://192.168.0.10:4242/client.

  • Can we prevent the web client from being displayed?

=> To sum up, I think the current version is too much of a mix between what the admin needs (stats, activity, full repo) and what the end user needs to see (specific repo and documentation).

User

Interface

Let's say I have a repo called prod. If I click on the prod text in the menu bar (on the left of "Packages"), nothing happens except a small blurring of the main page.

=> It would be awesome to have room here for a custom README provided by the admin, or for a link to the setup guide.

This leads me to the idea that it would be more user-friendly if the first thing you see when you browse a repo is a "How to use" or "How to configure" page or something like that (maybe move the Setup page?) => As a user, right now, if I open the web client, the first thing I see is a list of packages and the help page of the first :package: of the repo. For example, if it's synced with CRAN, I get the help page of {A3}. That might be confusing for the user.

Also, some kind of contact page would be nice. As a user, I would know who to contact if I have trouble using the repo.

Setup

In the current setup guide, you have: "To check if R is already configured, run: options('repos') inside your R console. If the result includes http://192.168.0.10:4242/curated/latest, you're done!"
=> But what to do if it's not that?

The only other suggested option is through the RStudio interface, which is only available with the current daily of RStudio.

I guess including some other ways to use RSPM here would be nifty, all the more so because RSPM is aimed at companies that want to control what users install, where RStudio is not updated that frequently. I think we only need:

install.packages(
  "pkgtest", 
  repos = "http://192.168.0.10:4242/curated/latest"
)

That would let everyone use it.

Also, this would guide people in a context where they are not using RStudio: let's say I have created a Shiny app and want to deploy it on a server, manually or through Docker. I can't find (well, right now) any indication of how to do that. I guess if you're comfortable enough to do that you'd know how to change the repos parameter, but you can never be too cautious :slight_smile:

Help page

On the help page of each package, there is an install.packages() command shown without a repos argument.

That makes perfect sense, provided RStudio is properly configured. Adding repos = "<address>" would help, as it would work even if you don't have the repo address globally set.

Would it be possible to have access to documentation from this page? (Vignettes, etc.)

Share from RStudio

  • I'm pretty sure this will also be included in the forthcoming version of RStudio, but a "point and click" interface to share your own package with the Connect button would be helpful.

Download from RStudio

  • Would be nice to have an interface similar to the current "Install package" window to dl the packages (or maybe it will be integrated inside the current window, "From CRAN / From source / From rspm")

Hope this helps!

Best,
C.

6 Likes

Wow, thank you for the thorough feedback! I've responded with some ideas based on our internal discussions, and I've asked for clarification on a few items. I didn't reply to everything; some of the bullets without a response are great points that we plan to address!

Admin

Global

  • Can we create a curated CRAN from a specific date :date: ? (like, set a curated CRAN not to today's date but to a specific date). Would definitely be useful.

Unfortunately not. At this time the date is "frozen" when the curated source is created, and the update command brings the set of packages forward to the most recent CRAN sync available. Could you expand on the use case where it'd be useful to specify a specific date? We were imagining that most IT organizations would begin the process for creating a curated repository on an arbitrary date as opposed to having special "significant" CRAN dates in mind. The activity page does show the date when the source was created, making it easy to know how long it has been (e.g. in the case where you want to update once a year). This approach seemed to meet the needs of most IT organizations without getting into difficult questions that arise when using dates (e.g. does the day represent CRAN at the beginning or end of the day?).

CLI

  • When you're adding something, you'll add source, when listing, you'll list sources (one singular, the other plural). This makes perfect sense, but as a user, you are likely to do add source then list source. Maybe a shortcut would be great :slight_smile:

  • CLI access to stats would be nice (rspm stats --repos maybe). Even better, if you want to create reports for your managers: a console-printed version plus an exportable CSV would be helpful.

What granularity would you be interested in? Behind the scenes, we store two databases, one with 1 row per package with information like package name, author, license, etc. The other database simply records what packages are downloaded (1 row with a package id and the timestamp of the download). We use these two together to build up the metrics page and perform quite a bit of caching to ensure the page is performant. If you were to download a csv, would you just want it to replicate the summary tables displayed in the product? Perhaps we could surface this data through an API instead of a csv file.
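To make the question concrete, here is a minimal sketch in R of the kind of per-package summary a CSV export could contain, assuming the two tables described above. The column names (id, name, package_id, timestamp) and the rows are made up for illustration; they are not RSPM's actual schema or data.

```r
# Toy stand-ins for the two tables; names and rows are hypothetical.
packages <- data.frame(
  id = 1:3,
  name = c("dplyr", "shiny", "A3"),
  stringsAsFactors = FALSE
)
downloads <- data.frame(
  package_id = c(1L, 1L, 2L, 1L, 3L),
  timestamp = as.POSIXct(
    c("2018-10-01 09:00:00", "2018-10-01 10:30:00", "2018-10-02 14:00:00",
      "2018-10-03 08:15:00", "2018-10-03 16:45:00"),
    tz = "UTC"
  )
)

# Join each download to its package, then count downloads per package name
merged <- merge(downloads, packages, by.x = "package_id", by.y = "id")
merged$downloads <- 1L
stats <- aggregate(downloads ~ name, data = merged, FUN = sum)
stats <- stats[order(-stats$downloads), ]

# Console-printed version
print(stats)

# Exportable version (written to a temporary file in this sketch)
csv_path <- tempfile(fileext = ".csv")
write.csv(stats, csv_path, row.names = FALSE)
```

An API returning the same rows would serve equally well; the point is only that one row per package with a download count is probably enough for a manager-facing report.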

  • Being able to provide a DESCRIPTION file that would be parsed into a curated CRAN source would be awesome: I'd copy my package to the server, do rspm add --description='/path/to/DESCRIPTION' --source=curated, and I'd then have a repo that I can send to someone, with everything required (and nothing more) to install my package. This is quite easy to do right now (you just have to parse the DESCRIPTION file), but a shortcut in RSPM would be nice.

To make sure I understand this correctly, the end result is a repository with your internal package plus its dependencies from CRAN? I think this is an interesting use case with a few tricky edge cases. For example, what state of CRAN do you pull the dependencies from? When you update the package, do you want to update the CRAN dependencies? The current workaround would be to create a local source with your internal package, and a curated CRAN source with the dependencies. Creating the repository from these two sources requires you to be explicit about which versions of the CRAN packages get pulled in. You also have to be explicit with updates, either updating both sources together or updating one or the other (most likely updating both sources at the same time). If you wanted a more automatic process, we'd have to make choices about when to update each part. Short of that, we could provide a utility for parsing the DESCRIPTION file and outputting a CSV with the dependencies, to make creating the curated CRAN source easier.
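In the meantime, the parsing step can be done with base R. The following is a minimal sketch, not an RSPM feature: description_deps() is a hypothetical helper name, it only lists direct dependencies (not recursive ones), and the one-column CSV layout is an assumption rather than a format RSPM is known to accept.

```r
# Hypothetical helper: list the packages named in a DESCRIPTION file.
description_deps <- function(path,
                             fields = c("Depends", "Imports", "LinkingTo")) {
  desc <- read.dcf(path, fields = fields)[1, ]
  raw <- unlist(strsplit(desc[!is.na(desc)], ","))
  # Drop version constraints such as "(>= 1.0.0)" and surrounding whitespace
  pkgs <- trimws(gsub("\\(.*?\\)", "", raw, perl = TRUE))
  # "R" in Depends is a version requirement, not a package
  sort(unique(pkgs[nzchar(pkgs) & pkgs != "R"]))
}

# Example with a throwaway DESCRIPTION file
tmp_desc <- tempfile()
writeLines(c(
  "Package: pkgtest",
  "Version: 0.1.0",
  "Depends: R (>= 3.4.0)",
  "Imports: dplyr (>= 0.7.0),",
  "    shiny"
), tmp_desc)

deps <- description_deps(tmp_desc)
print(deps)

# Write the dependency list to a CSV file (temporary file in this sketch)
deps_csv <- tempfile(fileext = ".csv")
write.csv(data.frame(package = deps), deps_csv, row.names = FALSE)
```

Resolving recursive dependencies would need an extra step, for example tools::package_dependencies() against the CRAN state you want to pin.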

  • rspm copy --old-repo --new-repo would be cool if you want to make an "old-repo + these sources" repo

We likely won't add this one, as it makes the mental model for building repos from sources more complex, e.g. "If I copy an old repo to a new repo, then remove sources from the old repo, should those be removed from the new repo as well?" Instead, I prefer explicitly stating what sources make up a repo (and in what order), even if it requires a bit more typing!

  • In rspm help list, the edit entry reads "Commands to edit repo name and description", but we can also edit sources.

:heavy_check_mark:

User interface

I'm pretty sure this would be integrated into the forthcoming version, but I'll list them anyway :slight_smile:

  • I'm not sure admin would want every user to have access to stats, or need to have a link to the admin guide (these should be accessible only to rspm admins). I also wonder if the Activity tab is to be accessible to the user?

We had envisioned both the activity tab and the metrics being useful to admins and users. As an example, the metrics page helps an admin audit their risk exposure, but it helps an R user discover packages popular in the organization.

  • With the current version, users have access to all repos in the web interface. This might not be the desired behavior: you might want one part of your team to have access to a specific repo, and another team to have access to a different one. I can also think of a situation where, inside the same company, a central service develops packages/apps that it sells to or shares with other services in the company. RSPM would be a perfect tool for this use case: as the person who developed the product, you just have to share the install.packages command with the right repo link. However, I might not want to share something I have developed for service A with service B. With the current version, as soon as I have the link to the repo UI, I can see everything -> http://192.168.0.10:4242/ redirects to http://192.168.0.10:4242/client, and http://192.168.0.10:4242/curated/latest (for ex) provides a link to http://192.168.0.10:4242/client.

Unfortunately R (and install.packages itself) does not have a great notion for authenticating requests to a repository. (Short of supplying basic auth credentials in the repository URL... which is not ideal for many reasons). We do plan to add authentication and user roles to RStudio Package Manager, but for the use case you've described we'll also need to provide a different client for installing the packages. This work is planned, but not in the short term.

  • Can we prevent the web client from being displayed?

I don't think we'll tackle this directly, though we do plan on adding authentication to the UI as well as user roles (as described above).

=> To sum up, I think the current version is too much of a mix between what the admin needs (stats, activity, full repo) and what the end user needs to see (specific repo and documentation).

User

Interface

Let's say I have a repo called prod. If I click on the prod text in the menu bar (on the left of "Packages"), nothing happens except a small blurring of the main page.

=> It would be awesome to have room here for a custom README provided by the admin, or for a link to the setup guide.

This leads me to the idea that it would be more user-friendly if the first thing you see when you browse a repo is a "How to use" or "How to configure" page or something like that (maybe move the Setup page?) => As a user, right now, if I open the web client, the first thing I see is a list of packages and the help page of the first :package: of the repo. For example, if it's synced with CRAN, I get the help page of {A3}. That might be confusing for the user.

Also, some kind of contact page would be nice. As a user, I would know who to contact if I have trouble using the repo.

The drawback to these changes is, once set up, it is very annoying for a user to have to click through a setup page every time they want to quickly search for a package. As a compromise, we created repository descriptions. These descriptions can be added by an admin at the CLI and show up in the left-hand sidebar underneath the selected repository. This left-hand panel is pinned by default (though a user can remove the pin if they'd like). Our hope is that these descriptions can provide the context for a repo as well as a place for contact information or specific setup instructions.

Setup

In the current setup guide, you have: "To check if R is already configured, run: options('repos') inside your R console. If the result includes http://192.168.0.10:4242/curated/latest, you're done!"
=> But what to do if it's not that?

The only other suggested option is through the RStudio interface, which is only available with the current daily of RStudio.

I guess including some other ways to use RSPM here would be nifty, all the more so because RSPM is aimed at companies that want to control what users install, where RStudio is not updated that frequently.

Our suspicion is that in these organizations, an admin would set up the repository option on the server, so that users do not have to think about it!
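For anyone who lands on the "what if it's not that?" case, a minimal sketch of setting the option by hand follows. It assumes the repository URL from the setup guide quoted above, and that the lines go in Rprofile.site (server-wide) or ~/.Rprofile (per user); adjust both to your installation.

```r
# Assumed location: Rprofile.site for a server-wide default,
# or ~/.Rprofile for a single user.
local({
  repos <- getOption("repos")
  # Point the CRAN entry at the RSPM repository from the setup guide
  repos["CRAN"] <- "http://192.168.0.10:4242/curated/latest"
  options(repos = repos)
})

# The check from the setup guide should now list the RSPM URL
getOption("repos")
```

Once this is in place, a plain install.packages("pkgtest") resolves against the RSPM repository without a repos argument.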

I think we only need:

install.packages(
  "pkgtest", 
  repos = "http://192.168.0.10:4242/curated/latest"
)

Setting the repository option is preferred to adding an argument to each install command because setting the option helps avoid accidentally installing packages from different repositories. The goal for most RSPM users is consistency across projects and across users. Setting the repo option is more permanent which helps achieve this consistency. The repo option is also required for the integration with RStudio Connect. However, you and others have made the good point that when you view the help for a specific package, it'd be useful to see the command with the repo argument. We'll look at a good way to present both options without adding too much clutter.

That would let everyone use it.

Also, this would guide people in a context where they are not using RStudio: let's say I have created a Shiny app and want to deploy it on a server, manually or through Docker. I can't find (well, right now) any indication of how to do that. I guess if you're comfortable enough to do that you'd know how to change the repos parameter, but you can never be too cautious :slight_smile:

We wanted the Setup page to be as simple as possible for the majority of users. There is some documentation on how the repo option could be used with Docker or batch R tasks, but great point that more documentation is always better ... and I agree that a pointer from the setup page to more extensive docs would be helpful for these non-IDE use cases.
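As a rough illustration of the non-IDE case, a batch script or Docker build step could make sure the repo option is set before installing anything. This is a sketch under assumptions: RSPM_REPO_URL is a hypothetical environment variable and use_repo() is a hypothetical helper, neither of which comes from RSPM itself.

```r
# Hypothetical environment variable, e.g. set in the Dockerfile or job
# definition; falls back to the URL from the setup guide.
rspm_url <- Sys.getenv(
  "RSPM_REPO_URL",
  unset = "http://192.168.0.10:4242/curated/latest"
)

# Hypothetical helper: set the repo option only if the URL is not
# already configured, and return the resulting option invisibly.
use_repo <- function(url) {
  repos <- getOption("repos")
  if (!url %in% repos) {
    repos["CRAN"] <- url
    options(repos = repos)
  }
  invisible(getOption("repos"))
}

use_repo(rspm_url)

# From here on, install.packages("pkgtest") would resolve against rspm_url.
print(getOption("repos"))
```

Such a script would be run with Rscript at the start of the build, so every later install in the same session uses the same repository.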

Help page

On the help page of each package, there is an install.packages() command shown without a repos argument.

That makes perfect sense, provided RStudio is properly configured. Adding repos = "<address>" would help, as it would work even if you don't have the repo address globally set.

Would it be possible to have access to documentation from this page? (Vignettes, etc.)

We are working on incorporating documentation, and will most likely start by presenting a package's README file.

Share from RStudio

  • I'm pretty sure this will also be included in the forthcoming version of RStudio, but a "point and click" interface to share your own package with the Connect button would be helpful.

:heavy_plus_sign: ... with some time :wink:

Download from RStudio

  • Would be nice to have an interface similar to the current "Install package" window to dl the packages (or maybe it will be integrated inside the current window, "From CRAN / From source / From rspm")

RStudio 1.2's install packages window, and the packages pane, each have a tighter integration with RSPM.

2 Likes

2 posts were split to a new topic: "License expired" message with evaluation of RStudio Package Manager v0.7.0

I tend to have to package my code up and deploy it on servers or pre-built Docker images that often run older versions of R, and I frequently hit the 'This package is not available for R 3.4.1' message when I manually install from source.

The end result is me editing the package's DESCRIPTION file to allow my version of R, after which the package works completely fine, because the version dependency isn't recorded correctly.

It would be a godsend if RSPM could automate that as part of the install process, and give the user a warning instead of just blocking the installation.
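For what it's worth, the check itself is easy to sketch in base R. This is only an illustration of "warn instead of block", not something RSPM does: r_requirement() and warn_if_r_too_old() are hypothetical helper names, and the sketch only handles a lower bound written as R (>= x.y.z).

```r
# Hypothetical helper: extract the R version named in Depends, if any.
r_requirement <- function(path) {
  depends <- read.dcf(path, fields = "Depends")[1, 1]
  if (is.na(depends)) return(NA_character_)
  entries <- trimws(strsplit(depends, ",")[[1]])
  r_entry <- grep("^R\\s*\\(", entries, value = TRUE, perl = TRUE)
  if (length(r_entry) == 0) return(NA_character_)
  # Keep only the version number, e.g. "3.5.0" from "R (>= 3.5.0)"
  sub("^R\\s*\\(\\s*[<>=]+\\s*([0-9.-]+)\\s*\\)$", "\\1",
      r_entry[1], perl = TRUE)
}

# Hypothetical helper: warn (rather than stop) when R is older than declared.
warn_if_r_too_old <- function(path, current = getRversion()) {
  required <- r_requirement(path)
  ok <- is.na(required) || current >= numeric_version(required)
  if (!ok) {
    warning("Package declares R >= ", required,
            " but this is R ", as.character(current))
  }
  ok
}

# Example with a throwaway DESCRIPTION file
tmp_desc <- tempfile()
writeLines(c(
  "Package: pkgtest",
  "Version: 0.1.0",
  "Depends: R (>= 3.5.0), methods"
), tmp_desc)

r_requirement(tmp_desc)
```

Calling warn_if_r_too_old(tmp_desc, numeric_version("3.4.1")) would raise the warning and return FALSE, leaving the decision to install anyway with the user.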