As a new user, I would like to try setting up an R server, but I don't have a Linux-based machine.
What recommendations does the community have for setting up an R server using container tools such as Docker or Kubernetes, so I can compare them to cloud servers?
Are there sample setups similar to this one for setting up Azure Python containers? https://github.com/jplane/azure-atl-containers
Because I know this topic is a bit complex, I have posted the background after my questions.
Background Information:
My goals:
- Figure out how to easily set up a basic Docker/Kubernetes container that runs R 3.5 on Ubuntu with the right settings and RAM, as well as the following package
  1.1) Figure out the basic terminology that I need to know about containers, cloud, etc., and what has changed since 2018.
  1.2) Figure out how to measure the load of my R script in a local container, so I can better explain the advantages and disadvantages of premium cloud and serverless options.
  1.3) Figure out how to set up a Python Docker/Kubernetes container and have it communicate with my R container.
- Do a cost-benefit analysis against a cloud-based R server setup after setting things up locally.
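As a rough illustration of goal 1.2, the sketch below times a command and records its CPU time and peak memory. It is only one possible approach, not a recommended benchmark method. It assumes it runs inside a Linux container (Python's `resource` module is not available on Windows), and the function name `measure_command` and the script name `analysis.R` are hypothetical placeholders.

```python
import resource
import subprocess
import sys
import time


def measure_command(cmd):
    """Run cmd to completion and report wall time, CPU time and peak memory."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    completed = subprocess.run(cmd, check=False)
    wall_seconds = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time is cumulative over all waited-for children, so take the difference.
    cpu_seconds = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return {
        "returncode": completed.returncode,
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        # Largest resident set size among children waited for so far
        # (reported in KiB on Linux), so measure one command per process.
        "peak_rss_kib": after.ru_maxrss,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python measure.py Rscript analysis.R
    print(measure_command(sys.argv[1:]))
```

Running it once per input size would give numbers to set against the instance sizes a cloud provider offers.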
Background: Right now I work for a local government, using R to help courts predict various objective measures of fairness. In the first testing phase, my team and I successfully connected R and Tableau locally; however, we are currently evaluating whether we can connect Tableau Server with an R server. Because I am not sure whether I am allowed to share the video of our work publicly, I will try to give some background on the subject.
- The measures of court fairness are scientific measures that judges, DAs, and defense attorneys use to evaluate whether a court system is obeying the guidelines of the 4th and 6th Amendments to the United States Constitution.
- We need R and Python to talk to each other because of the following data science requirement: unlike normal machine learning processes, the science of events (process management and mining) turns out not to be something that can be done completely in R or Python alone. In this domain, R is great at converting event logs, handling massive data, and feature engineering; however, it is horrible at the statistical calculations and modeling. Python, in turn, has TensorFlow and basic process discovery algorithms but is horrible at loading massive sparse event logs and at feature engineering (CNNs in particular cause problems). I bring this up because I don't want people to make assumptions, and I want people to understand the scope. So far, I have not found a cloud instance or service that allows using Python and R simultaneously, so my team does all the work in R and then just uses reticulate and Python code to adjust our statistical assumptions in R.
- I work for local government, not big business. To do any type of testing, I need more information, so I am reaching out to the R community for help narrowing things down.
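To make the "R and Python talk to each other" requirement more concrete, here is a minimal sketch of one possible pattern: the Python container exposes a small HTTP endpoint, and the R container sends it JSON. This is an assumption about how the two could be wired together, not something we have running. The names `summarize`, `ScoreHandler`, and `make_server`, the port 8000, and the `{"values": [...]}` payload shape are all hypothetical, and `summarize` only stands in for the real modeling step.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(values):
    """Placeholder for the Python-side modeling step: count and mean."""
    mean = sum(values) / len(values) if values else None
    return {"n": len(values), "mean": mean}


class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
            result = summarize(payload.get("values", []))
            status = 200
        except (ValueError, TypeError, AttributeError):
            # Malformed JSON, a non-object payload, or non-numeric values.
            result = {"error": 'expected JSON like {"values": [1, 2, 3]}'}
            status = 400
        body = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep container logs quiet in this sketch


def make_server(host="0.0.0.0", port=8000):
    """Build the server; the caller decides when to start serving."""
    return HTTPServer((host, port), ScoreHandler)
```

Inside the Python container this would be started with `make_server().serve_forever()`, and the R container could then POST its engineered features as JSON to that port over the shared container network.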