What is the R equivalent to AutoML?


#1

What is the R equivalent to AutoML?

It seems like an interesting area of activity.

Thanks Enzo


#2

Do you mean the new Google product, AutoML?

If so, then it's integrated with the existing APIs such as cloudml and RoogleVision, so those R libraries will let you call the models people make online.

The idea is that it gives a web UI to existing models, so anyone can train a model.


#3

I realise I should have been more clear. My apologies.

There is an increasing effort to identify methods in meta-learning, algorithm selection, and algorithm configuration that can a) speed up the ML process; b) possibly simplify the overall set of tasks for data scientists in training (this is a slightly more doubtful kind of goal).

There is a website dedicated to this: http://www.ml4aad.org/automl/

There is a paid package from H2O and open source ones in Weka (AutoWeka) and Python (the Auto-sklearn package on GitHub; I cannot paste more than one link as I’m a new user).

I know Google has announced AutoML, possibly with similar goals.

My question is more R-focused. Is there any activity, e.g. around caret (different, but the most similar thing to scikit-learn in R) or other packages, along the lines of the principles outlined by http://www.ml4aad.org/automl/ and published in a variety of papers at NIPS etc. since 2015?


#4

H2O is free and open source


#5
  • caret and other packages have interfaces for hyperparameter optimization. There is plenty of activity there, with support for parallel processing, etc.

  • rBayesianOptimization is a package for Bayesian optimization (see presentation here)

  • recipes has a pretty extensive feature engineering/preprocessing engine.

You might want to look at the workshop slides from our conference to get a good survey of things. This presentation should give an overview of where things are headed.
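
For anyone new to the idea, here is a rough, language-neutral sketch of what "hyperparameter optimization" means in the first bullet: try several candidate settings, score each with cross-validation, and keep the best. It is plain Python with no libraries, and it is not caret's implementation. The toy data, the k-NN model, and the helper names (`make_data`, `knn_predict`, `cv_rmse`, `random_search`) are all made up for illustration.

```python
import random

def make_data(n=120, seed=0):
    """Synthetic 1-D regression data: y = x^2 plus small Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3, 3) for _ in range(n)]
    return [(x, x * x + rng.gauss(0, 0.3)) for x in xs]

def knn_predict(train, x, k):
    """Predict by averaging the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def cv_rmse(data, k, folds=5):
    """Cross-validated RMSE of k-NN regression; each point is held out exactly once."""
    sq_err = 0.0
    for f in range(folds):
        test = data[f::folds]                                    # every folds-th point
        train = [p for i, p in enumerate(data) if i % folds != f]
        sq_err += sum((knn_predict(train, x, k) - y) ** 2 for x, y in test)
    return (sq_err / len(data)) ** 0.5

def random_search(data, n_trials=10, seed=1):
    """Sample n_trials distinct values of k at random and score each by CV."""
    rng = random.Random(seed)
    candidates = rng.sample(range(1, 41), n_trials)
    scores = {k: cv_rmse(data, k) for k in candidates}
    best_k = min(scores, key=scores.get)                         # lowest RMSE wins
    return best_k, scores

data = make_data()
best_k, scores = random_search(data)
print("best k:", best_k, "CV RMSE:", round(scores[best_k], 3))
```

Random search is used here only because it is the shortest thing to write down; grid search or Bayesian optimization would slot into `random_search` in the same way, by changing how the candidates are proposed.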


#6

Kevin, H2O is open source, but “Driverless AI” (from H2O the company) is a licensed product ($$$). “Driverless AI” is H2O's AutoML product. Enzo


#7

Max, honoured to get your answer (yes, I’m a fan!): exactly what I was looking for.
I agree that caret has been a precursor of recent approaches in many ways (and I think this is not entirely recognised).
On the other hand, moves from players like H2O (but also Weka) make me think that the R community should possibly do a bit more, even if it were only making what already exists more public and widely known.
Maybe we need a second edition of your great book :wink:
Thanks Enzo


#8

Right, but http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html should
get you at least part of the way there, for free, depending on your use
case.


#9

caret has hyperparameter optimization built in, and I have seen a number of web pages that run all the models (mine, for example). For ensembling and stacking there is caretEnsemble. Feature elimination is really missing, though.

Python has just sklearn, but four AutoML packages on top of it. https://alternativeto.net/list/3478/automl-when-you-want-to-be-completely-out-of-the-thinkg-loop-


#10

I'm also recently interested in the AutoML space. Some packages that look interesting are:
https://github.com/mlr-org/automlr and https://github.com/ja-thomas/autoxgboost

I haven't tried either yet, but according to the autoxgboost paper, it performed decently against Auto-WEKA and auto-sklearn. It is one of the few single-learner AutoML frameworks.