What is the R equivalent to AutoML?
It seems an interesting area of activity.
Thanks Enzo
Do you mean the new Google product, AutoML?
If so, then it's integrated with the existing APIs such as cloudml and RoogleVision, so those R libraries will let you call the models people make online.
The idea is that it gives a webUI to existing models, so anyone can train up the model.
I realise I should have been more clear. My apologies.
There is an increasing effort to identify methods in meta-learning, algorithm selection, and algorithm configuration that can a) speed up the ML process; b) possibly simplify the overall set of tasks for data scientists in training (this is a slightly more doubtful kind of goal).
There is a website dedicated to this: http://www.ml4aad.org/automl/
There is a commercial package from H2O, and open source ones in Weka (AutoWeka) and Python (the Auto-sklearn package on GitHub; I cannot paste more than one link as I'm a new user).
I know Google has announced AutoML, possibly with similar goals.
My question is more R focused. Is there any activity, e.g. around caret (different, but the most similar thing to scikit-learn in R) or other packages, along the lines of the principles outlined at http://www.ml4aad.org/automl/ and published in a variety of papers at NIPS etc. since 2015?
H2O is free and open source.
caret and other packages have interfaces to hyperparameter optimization. There is plenty of activity there, with support for parallel processing, etc.
rBayesianOptimization is a package for Bayesian optimization (see presentation here).
recipes has a pretty extensive feature engineering/preprocessing engine.
You might want to look at the workshop slides from our conference to get a good survey of things. This presentation should give an overview of where things are headed.
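To make the hyperparameter optimization point concrete, here is a minimal sketch of tuning through caret's `train()`. The `iris` data, the k-NN model, and the candidate grid are assumptions chosen only for illustration; nothing here was posted in the thread.

```r
# Sketch only: assumes the caret package is installed. The data set (iris),
# the model (k-NN), and the grid of k values are illustrative choices.
library(caret)

set.seed(42)  # make the cross-validation folds reproducible

# 5-fold cross-validation is used to score every candidate setting
ctrl <- trainControl(method = "cv", number = 5)

# Candidate values for k, the only tuning parameter of method = "knn"
grid <- data.frame(k = seq(1, 15, by = 2))

fit <- train(
  Species ~ .,
  data = iris,
  method = "knn",
  preProcess = c("center", "scale"),  # standardize predictors within each resample
  trControl = ctrl,
  tuneGrid = grid
)

print(fit$results[, c("k", "Accuracy")])  # resampled accuracy for each k
print(fit$bestTune)                        # the k that train() selected
```

If a foreach backend (for example via doParallel) is registered before calling `train()`, the resampling loop should run in parallel without changes to this code.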
Kevin, H2O is open source, but "Driverless AI" (from H2O the company) is a licensed product ($$$). "Driverless AI" is H2O's AutoML product. Enzo
Max, honoured to get your answer (yes, I'm a fan!): exactly what I was looking for.
I agree that caret has been a precursor of recent approaches in many ways (and I think this is not entirely recognised).
On the other hand, moves from players like H2O (but also Weka) make me think that the R community should possibly do a bit more, even if it were only making what exists already more public and widely known.
Maybe we need a second edition of your great book.
Thanks Enzo
Right, but http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html should get you at least part of the way there, for free, depending on your use case.
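For anyone who wants to try the free route from R, a minimal sketch of calling H2O's AutoML through the open-source h2o package follows. It assumes the h2o package and a Java runtime are installed; the `iris` data and the small `max_models` budget are illustrative assumptions only.

```r
# Sketch only: assumes the h2o R package and Java are available locally.
# The data set and the model budget are illustrative, not recommendations.
library(h2o)

h2o.init()  # start (or connect to) a local H2O cluster

# Copy a built-in data frame into the H2O cluster
iris_hf <- as.h2o(iris)

# Train up to 5 base models (H2O may add stacked ensembles on top);
# all columns other than y are used as predictors by default
aml <- h2o.automl(
  y = "Species",
  training_frame = iris_hf,
  max_models = 5,
  seed = 1
)

print(aml@leaderboard)  # models ranked by the default metric
best <- aml@leader      # the top-ranked model, usable with h2o.predict()
```

Raising `max_models`, or setting `max_runtime_secs` instead, trades compute time for a broader search.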
caret has hyperparameter optimization built in, and I have seen a number of web pages that run all the models. Mine, for example. For ensembling and stacking there is caretEnsemble. What is really missing, though, is feature elimination.
Python has just scikit-learn, but four AutoML packages built on top of it: https://alternativeto.net/list/3478/automl-when-you-want-to-be-completely-out-of-the-thinkg-loop-
I've also recently become interested in the AutoML space. Some packages that look interesting are:
https://github.com/mlr-org/automlr and https://github.com/ja-thomas/autoxgboost
I haven't tried either yet, but according to the autoxgboost paper, it performed decently against Auto-WEKA and auto-sklearn. It is one of the few single-learner AutoML frameworks.