Does RStudio need to be installed on all EMR nodes?


I have an AWS EMR Spark cluster with 6 nodes, and I installed RStudio on the master node. Do I need to install it on the rest of the nodes as well?



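It depends on what you want to do. You almost certainly do not need RStudio on the other nodes, since RStudio is only the IDE you work in. You might need R itself on them if you want R code to run on the nodes. If you only want to control the Spark cluster through one of the R packages such as sparklyr, then I don't think the nodes need R, because the local R session just sends native commands to the Spark cluster. The exception is anything that executes your own R functions on the workers, such as sparklyr's spark_apply(), which does need R installed on every node.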

If you provide some context about what you are trying to do, others may be able to give you a more specific answer.