I have a question about pins and a use case with AWS S3 boards, where I'm not sure what the recommended approach is. Here at Montoux, we develop a web-based SaaS platform that life insurance companies use to model their portfolios from an actuarial analysis point of view. We're developing a number of data science models in R, and we're working out the mechanics of productising these into our platform.
We've previously been using s3mpi (https://github.com/robertzk/s3mpi) to fetch data from S3 and cache it locally. Some of the data we consume is pretty large, so this has worked well for exploratory analysis. However, we feel like most of the community traction is around pins, so we're looking at how we can use it.
One area that's a little unknown to us is how we should approach data that hasn't yet been cached/pinned. For example, it may have been uploaded directly by a user, or produced as part of some other data processing, i.e. the metadata file data.txt hasn't been produced. One way to approach this would be to use aws.s3 to pull the file and then pin it, but this seems somewhat inefficient, and it would be nice if there were a way for pins to populate the cache from an existing S3 object. Is there anything I'm missing here, or is this a use case that is outside the scope of pins?
I'd really appreciate any experiences or recommendations anyone has.