Recommendations on setting up your first R-Server using containers

As a new user, I would like to try to set up an R-Server, but I don't have a Linux-based machine.

What recommendations does the community have for setting up an R-server using containers like Docker or Kubernetes, so I can compare them to cloud servers?
Are there sample setups similar to this one for setting up Azure Python containers?: https://github.com/jplane/azure-atl-containers

Because I know this topic is a bit complex, I have posted the background after my questions.

Background Information:
My Goals:

  1. Figure out how to easily set up a basic Docker/Kubernetes container that uses R 3.5, Ubuntu settings, and RAM, as well as the following package
    1.1) Figure out the basic terminology that I need to know about containers, cloud, etc., and what has changed since 2018.
    1.2) Figure out how to measure load for my R script on a local container, so I can better explain the advantages and disadvantages of premium cloud and serverless options.
    1.3) Figure out how to set up a Python Docker/Kubernetes container and have it communicate with my R container.
  2. Do a cost-benefit analysis against a cloud-based R-Server setup after setting things up locally.
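
For goal 1.2, one rough way to get numbers is to wrap the script in a small timing harness and run it inside the container. The sketch below is only an illustration: the function name `measure_command` is made up for this example, and it assumes a Unix-like environment (such as a Linux container), because Python's `resource` module is not available on Windows.

```python
import resource
import subprocess
import time


def measure_command(cmd):
    """Run cmd as a child process and report wall time, CPU time, and peak memory."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    completed = subprocess.run(cmd, check=False)
    elapsed = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "returncode": completed.returncode,
        "wall_seconds": elapsed,
        # user + system CPU time consumed by child processes during this call
        "cpu_seconds": (after.ru_utime - before.ru_utime)
        + (after.ru_stime - before.ru_stime),
        # High-water mark over all children waited for so far
        # (kilobytes on Linux, bytes on macOS)
        "peak_rss": after.ru_maxrss,
    }
```

Calling something like `measure_command(["Rscript", "analysis.R"])` (where `analysis.R` is a placeholder script name) would give wall time, CPU time, and peak memory figures that can be set next to the instance sizes a cloud provider offers.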

Background: Right now I am working for the local government, using R to help courts predict various objective measures of fairness. In the first testing phase, my team and I were able to successfully connect R and Tableau locally; however, we are currently evaluating the ability to connect Tableau Server with an R-server. Because I am not sure if I am allowed to share the video of our work publicly, I will try to give a background on the subject.

  • The measures of court fairness are scientific measures that Judges, DAs, and Defense Attorneys use to evaluate if a court system is obeying the guidelines of the 4th and 6th Amendments to the United States Constitution.
  • We need both R and Python to talk to each other because of the following data science requirement: unlike normal machine learning processes, the science of events (process management and mining) is not something that is completely possible to do in R or Python alone. In this domain, R is great at converting event logs, handling massive data, and feature engineering; however, it is horrible at the statistical calculations and modeling. Python has TensorFlow and basic process discovery algorithms but is horrible at loading massive sparse event logs or feature engineering (CNNs in particular cause problems). I bring this up because I don't want people to make assumptions, and I want people to understand the scope. Currently, I have not found a cloud instance or service that allows one to use both Python and R simultaneously, so my team does all work in R and then just uses reticulate and Python code to adjust our statistical assumptions in R.
  • I work for local government, not big business. To do any type of testing, I need more information. So I am reaching out to the R community for help in ultimately narrowing things down.
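
To make the R-to-Python handoff concrete, here is a minimal sketch of one possible arrangement, assuming the R container writes engineered features to a CSV file on a volume that both containers mount. This is not how my team does it today (we use reticulate); the column names, the `/shared/features.csv` path, and the `read_features` function are all hypothetical.

```python
import csv
import io


def read_features(handle):
    """Parse a CSV of engineered features (written by the R side) into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "case_id": row["case_id"],
            "n_events": int(row["n_events"]),
            "duration_days": float(row["duration_days"]),
        })
    return rows


# Stand-in for a file such as /shared/features.csv (hypothetical path) that the
# R container could produce with write.csv(features, ..., row.names = FALSE).
sample = io.StringIO("case_id,n_events,duration_days\nA-1,12,30.5\nB-2,7,4.0\n")
features = read_features(sample)
print(features[0])
```

A shared file keeps the two containers loosely coupled, at the cost of an extra write and read for every exchange.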

Hi @RGIS,

Thanks for posting! If you're looking for the quickest way to get started in Docker with RStudio's open source products, many people like using https://hub.docker.com/r/rocker/rstudio/.
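
As a rough sketch of what starting that image can look like, the snippet below builds a `docker run` command that publishes RStudio Server's port 8787, sets the required `PASSWORD` variable, and caps memory and CPU so the container roughly resembles a given instance size. The helper name `rstudio_run_command` and the default limits are assumptions for illustration; a specific R version would be selected with an image tag, which is worth checking on Docker Hub rather than taking from here.

```python
import shlex


def rstudio_run_command(password, memory="4g", cpus="2", host_port=8787,
                        image="rocker/rstudio"):
    """Build a `docker run` argument list for the rocker/rstudio image."""
    return [
        "docker", "run", "--rm",
        "-p", f"{host_port}:8787",     # RStudio Server listens on 8787 in the container
        "-e", f"PASSWORD={password}",  # login password for the default `rstudio` user
        "--memory", memory,            # cap RAM (illustrative default)
        "--cpus", cpus,                # cap CPU (illustrative default)
        image,
    ]


# Print the command so it can be reviewed before running it in a terminal.
print(shlex.join(rstudio_run_command("change-me")))
```

Once the container is up, RStudio should be reachable in a browser at `http://localhost:8787` with the username `rstudio` and the password you chose.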

Our professional (paid) products support both R and Python for developing in R and in Jupyter Notebooks and support both R and Python for hosting, viewing, and scheduling content. If you're interested, we'd be happy to set up some time to talk through how this all can work for your use case.

Consider setting up a Data Science Virtual Machine in Azure. This has R and Python preinstalled, along with several ML and stats packages, including TensorFlow.

Thank you. I have not gotten the configuration of Docker quite right, so I plan to follow up with both you and Hong. I will probably have a lot of conceptual questions, as I am writing documentation on best practices for my department as I learn. :smiley:
