Pins - access for non-licenced users

Hi there,

I'm testing out pins on RStudio Connect and it all looks great. However, I'm confused about how to proceed with my use case:

  1. I have an administrator account, so can register the server as a board and pin content to it
  2. I can then set the access setting for that pin as 'Anyone - no login required'
  3. For other R users to access the data in that pin, it looks like they have to register the board, but I don't think they can do that without logging in and creating an API key.

So the question is, how can an R user without an RStudio Connect licence get data from a pin which is set to 'anyone - no login'?

Many thanks,
Chris

Hi @CMT,

Unfortunately this is a known issue with using pins on RStudio Connect at the moment. The reason is that pins basically searches for the content on Connect and then loads the content. Pins can't do that search if it can't authenticate.

There is an open issue on the GitHub repo -- I've already added this thread there, but it'd be great if you want to up-vote it!

Can you share a little more about why you want to share a no-login pin? It's possible there may be a workaround I can suggest.

Thanks!

Hi @alexkgold,

Thanks for this. The use case is that we have shared resources that are in no way sensitive and are intensive to recreate from source code/files e.g. geographic boundary files which can be shared as sf objects across multiple pieces of analysis and content.

On the GitHub thread Cole mentioned using the vanity URL, which we do have and can share with users, but I haven't found a way to get the content using a combination of pins and the vanity URL.

I have been able to get the content without a login by using the vanity URL with the /data.csv path on the end of it, but a) this removes the speed benefit of using pins and b) it strips additional attributes, e.g. the spatial geometries of the boundary data I mentioned.

Hope this helps - I'd be really keen for this feature, it would transform quite a lot of our content!

Thanks,
Chris

Hi Chris,

That's a great use case and thanks for upvoting on the GitHub repo! It sounds like you've already found the CSV route, which is pointing readr::read_csv at <vanity-url>/data.csv (I'm mainly including it in case someone else runs across this thread and finds it helpful).

Sounds like the RDS version is what'd really be helpful for you. You can get that at <vanity-url>/data.rds with the download.file function.

For example, on our demo server, there's an Anyone pin (an XGBoost model) at https://colorado.rstudio.com/rsc/bike_rxgb/.

file_path <- tempfile()
# mode = "wb" forces a binary transfer, which matters on Windows
download.file("https://colorado.rstudio.com/rsc/bike_rxgb/data.rds", 
              file_path, mode = "wb")
readRDS(file_path)

Note that you could make file_path a real file path if you wanted to save the object permanently as opposed to just bringing it into the R session.

Thanks for the tip Alex. I don't think the RDS version is going to meet the need, for two reasons:

  1. Speed - I've not properly benchmarked, but I've pinned a large-ish simple feature collection (~50Mb) and with pins it loads in under a second. With the RDS solution the file takes around 5 seconds to download.

  2. There's something strange going on with readRDS - I get:

    Error in readRDS(file_path) : 
      ReadItem: unknown type 0, perhaps written by later version of R
    

    The R versions I'm using are the same across both machines, but different operating systems. Some users will have different R versions too.

I'm sure the second one of these can be fixed, but I think the point is that pins is so lovely and easy to use, anything like this is going to be a workaround in the same way as our current sharing methods.

Do you have a sense of when pins might include support for unauthenticated access to RStudio Connect?

Hi @CMT,

Sorry that solution didn't work for you. On the speed issue -- one reason it may be so fast with pins is that pins caches a local copy of anything it has downloaded before. This means that downloads after the first are likely to be relatively fast, but a first download through pins will probably take about as long as the other methods. Unfortunately, none of these solutions are able to take advantage of the caching, so they'll all be that "first time download" slow every time.
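To make the caching point concrete, here's a rough sketch of the idea in base R. It is not how pins implements its cache; cached_read_rds, its name argument, and the cache directory are all made up for illustration. It downloads the file only when no local copy exists yet, so only the first call pays the download cost.

```r
# Hypothetical helper (not part of pins): download an .rds file once and
# reuse the local copy on later calls.
cached_read_rds <- function(url, name,
                            cache_dir = file.path(tempdir(), "pin-cache")) {
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  cached_file <- file.path(cache_dir, paste0(name, ".rds"))

  if (!file.exists(cached_file)) {
    # First call for this name: download to a temporary file, then move it
    # into place so a failed download never leaves a half-written cache entry.
    tmp <- tempfile(tmpdir = cache_dir, fileext = ".part")
    download.file(url, tmp, mode = "wb", quiet = TRUE)
    file.rename(tmp, cached_file)
  }

  # Later calls skip the download and read straight from disk.
  readRDS(cached_file)
}

# Example call (needs network access, so it is left commented out):
# cached_read_rds("https://colorado.rstudio.com/rsc/bike_rxgb/data.rds", "rxgb")
```

Note that the default cache directory lives under tempdir(), so it only lasts for the current R session; pass a permanent directory as cache_dir to keep files between sessions. The sketch also never refreshes a cached file, so it only suits data that rarely changes.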

Two other things to try below. Hopefully one of these will work...even if speed isn't ideal...

Using Pins

[EDIT]: @CMT has suggested this is the best option and that the name has to be set for each piece of content separately.
pins::pin("https://colorado.rstudio.com/rsc/bike_rxgb/data.rds", extract = FALSE, name = "rxgb") %>% readRDS()

(Not a typo to use pins::pin instead of the expected pin_get)

Using httr

Here's a little function I wrote using httr:

#' Get a pin by the vanity URL
#' @param vanity_url the URL of the pin on RSC
#' @param outfile a file to write to (could be tempfile())
#' @return the R object stored in the pin's data.rds
#' @examples
#' read_rds_pin("https://colorado.rstudio.com/rsc/bike_rxgb/", tempfile())
read_rds_pin <- function(vanity_url, outfile) {
  # Drop any trailing slash so the URL doesn't end up with "//data.rds"
  rds_url <- paste0(sub("/+$", "", vanity_url), "/data.rds")
  resp <- httr::GET(rds_url)
  httr::stop_for_status(resp)
  # as = "raw" keeps the body as bytes instead of letting httr parse it
  writeBin(httr::content(resp, as = "raw"), outfile)
  readRDS(outfile)
}

Hopefully one of them will work for you...

In my testing, all 3 of these options took roughly the same amount of time...maybe a slight advantage to the pins version.

> system.time(lapply(1:10, function(x) read_rds_pin("https://colorado.rstudio.com/rsc/bike_rxgb/", tempfile())))/10
   user  system elapsed 
 3.1178  0.1735  3.9468 
> system.time(lapply(1:10, function(x) {pins::pin("https://colorado.rstudio.com/rsc/bike_rxgb/data.rds", extract = FALSE) %>% readRDS()}))/10
   user  system elapsed 
 2.8602  0.1785  3.0615 
> system.time(lapply(1:10, function(x) {file_path <- tempfile()
+  download.file("https://colorado.rstudio.com/rsc/bike_rxgb/data.rds", 
+                file_path)
+  readRDS(file_path)}))/10
   user  system elapsed 
 3.0960  0.1962  3.9921 

Hopefully one of these will work for you.

In terms of timing, I don't have a timeline, but the GitHub issue on the pins package will be your best bet for updates.

Hi @alexkgold

This is the one I think! I'll continue to play around with this but it seems to do what I need so far. For anyone coming across this, one extra argument I'd include is name. Otherwise the data gets pinned as "data" on the local board, which is fine as long as you're not querying multiple resources. If you are, I think pins' caching gets in the way and serves you back the contents of local/data even when you're asking for a different resource.
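One way to avoid setting the name by hand each time is to derive it from the URL. This is only a sketch: pin_name_from_url and get_public_pin are hypothetical helpers, not pins functions, and the pins::pin() call is copied from Alex's suggestion above, so it assumes the version of pins discussed in this thread.

```r
# Hypothetical helper: use the last path segment of the vanity URL as the
# pin name, so each resource gets its own entry on the local board.
pin_name_from_url <- function(rds_url) {
  basename(dirname(rds_url))
}

# Hypothetical wrapper around the pins::pin() call suggested in this thread.
get_public_pin <- function(rds_url) {
  cached_path <- pins::pin(rds_url, extract = FALSE,
                           name = pin_name_from_url(rds_url))
  readRDS(cached_path)
}

# For the demo pin this gives the name "bike_rxgb".
pin_name_from_url("https://colorado.rstudio.com/rsc/bike_rxgb/data.rds")
```

Two vanity URLs that end in the same path segment would still collide, so this only helps when the last segments are unique.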

Thanks for all your help, and I'll continue to keep an eye on the GitHub issue.

Cheers,
Chris

Great! Looks like including the name might get you some of the caching speed as well. That's cool!

I'll adjust the answer above to add the name argument for clarity. Thanks!
