What is the best way to use Hadoop and R together for analysis?

priaynkad309 · April 25, 2018, 2:42am

How can I import and sample data sets from HDFS into R?

cderv · April 25, 2018, 6:05am

I think working with Spark may be one of the best options. You should consult the sparklyr website from RStudio:
http://spark.rstudio.com/

Spark can serve as an analytics engine on top of a Hadoop cluster.
This explanation about data science with a data lake can help.
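
As a rough sketch of what this could look like with sparklyr: read a file from HDFS into Spark, sample it there, and collect only the sample into R. The HDFS path, the table name, the 1% fraction and the "yarn" master below are placeholders I made up, not something from your setup, so adapt them to your cluster (older Spark versions may need "yarn-client" instead).

```r
# Sketch only: assumes the sparklyr and dplyr packages are installed and that
# Spark runs on YARN. The path and table name are hypothetical placeholders.
sample_from_hdfs <- function(path = "hdfs:///data/my_dataset.csv",
                             fraction = 0.01,
                             master = "yarn") {
  sc <- sparklyr::spark_connect(master = master)
  # Close the connection when the function exits, even on error
  on.exit(sparklyr::spark_disconnect(sc), add = TRUE)

  # Register the HDFS file as a Spark table; memory = FALSE avoids caching all of it
  tbl <- sparklyr::spark_read_csv(sc, name = "my_dataset", path = path,
                                  memory = FALSE)

  # Sample inside Spark, then bring only the sampled rows into the R session
  sampled <- sparklyr::sdf_sample(tbl, fraction = fraction,
                                  replacement = FALSE, seed = 42)
  dplyr::collect(sampled)
}

# Usage (needs a reachable cluster):
# local_df <- sample_from_hdfs(fraction = 0.05)
```

The point of sampling before `collect()` is that the full data set stays in the cluster and only a small data frame ends up in R's memory.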

You could also ask a data engineer to expose the data (as a Hive table or something similar) so that you can access it remotely, for example with the Impala ODBC driver.
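
If you go the ODBC route, a minimal sketch with the DBI and odbc packages might look like the following. It assumes an ODBC data source for Impala has already been configured on your machine. The DSN name "Impala" and the table in the usage line are hypothetical, so ask your data engineer for the real ones.

```r
# Sketch only: assumes the DBI and odbc packages are installed and that an
# ODBC data source (DSN) for Impala is configured. The DSN name is hypothetical.
query_impala <- function(sql, dsn = "Impala") {
  con <- DBI::dbConnect(odbc::odbc(), dsn = dsn)
  # Always release the connection when the function exits
  on.exit(DBI::dbDisconnect(con), add = TRUE)

  # Run the query on the cluster and return the result as a data frame
  DBI::dbGetQuery(con, sql)
}

# Usage (needs a working DSN; the table name is a placeholder):
# df <- query_impala("SELECT * FROM my_db.my_table LIMIT 1000")
```

Note that `LIMIT` only caps the number of rows returned; it does not give you a random sample.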

There are other solutions, I think, but I will let someone else in the community talk about them.