Building Spark ML pipelines with sparklyr


This is a companion discussion topic for the original entry at:

https://www.rstudio.com/resources/videos/building-spark-ml-pipelines-with-sparklyr/


We provide an overview of the recently implemented Pipelines API in sparklyr, an R package for interfacing with Apache Spark. This new feature allows users to build and tune data transformation and machine learning pipelines that are interoperable with Scala and Python, simplifying handoffs between data science and data engineering. We go over the components of pipelines and walk through practical examples.

Building Spark ML pipelines with sparklyr, Kevin Kuo, @kevinykuo

Kevin Kuo - Software Engineer
Kevin is a software engineer focused on building R interfaces to big data and machine learning tools like Spark and TensorFlow. He has experience applying data analytics in a variety of settings from insurance claims analytics to predictive maintenance of industrial assets. Outside of data science, Kevin enjoys wine tasting and crafting fancy cocktails.

