Do `tensorflow`, `keras`, `cloudml` and `tfestimators` topics belong in the Tidyverse Category?

EconomiCurtis · February 13, 2018, 7:18pm

Hi All,

Curious to get your thoughts.

Do you think topics related to tensorflow, keras, cloudml and tfestimators belong as tags in this tidyverse category?
Or should we create a new category dedicated to these topics?

andrie · February 13, 2018, 7:29pm

In my opinion, these deep learning topics are not part of the tidyverse, and should have a category of their own. I propose "deep learning" as the category name, so it can encompass:

keras
tensorflow
CNTK, theano, mxnet and the other keras back ends
tensorflow estimators
publication to cloudml and RStudio Connect using rsconnect
the other utility packages, including tfdatasets and tfruns

This is a very large set of topics, quite distinct from the tidyverse.

martin.R · February 13, 2018, 7:32pm

I agree that there should be a separate category.

Should it be "machine learning" rather than "deep learning", though, to capture any other ML topics that don't belong in the tidyverse?

andrie · February 13, 2018, 7:35pm

I agree there is an argument to bundle "deep learning" with "machine learning". However, deep learning is specialised (translate as weird, if you want) and broad enough to warrant its own category, IMO.

martin.R · February 13, 2018, 7:40pm

Fair enough, that makes sense, especially as they require completely different packages and infrastructure.

mara · February 13, 2018, 9:22pm

I agree, and I think this actually belongs in #meta!

cderv · February 13, 2018, 9:58pm

I also agree that deep learning and its related topics can be very specific.

However, there is no machine learning category yet, so if there is a separate deep learning one, shouldn't there be a machine learning one too? (For caret, modelr, yardstick, rsample, ...)

EconomiCurtis · February 14, 2018, 8:13am

Hi,

Thank you for all your comments.

Since many discussions with tags related to deep-learning, ml, and modeling have been put under tidyverse (and I've personally always thought of many of these modeling and ml packages as natural parts of the tidyverse), I wanted to touch base with y'all here before making any changes.

I set up a Machine Learning and Modeling category here: Machine Learning and Modeling - Posit Community, and I am moving all older related topics to this category.

The description currently reads:

For discussions related to modeling, machine learning and deep learning.

Not an exhaustive list, but some related topics include caret, modelr, yardstick, rsample, parsnip, tensorflow, keras, cloudml, and tfestimators.

Although there's no argument that deep-learning, machine learning, and modeling are broad areas and often distinct from one another, I include them together in one category to avoid confusing folks who are newer to the community. I feel that if we just had deep-learning without ml and modeling, you'd see those discussions end up there anyway. If we had a single category for each, I strongly suspect people would be confused as to where to post. We can split up categories later when the need is apparent.

* updated to address @Tazinho's concern.

Tazinho · February 14, 2018, 12:30pm

I like your introduction post on the ML and Modelling category. I also think that deep learning and modelling are in many ways the same. For example, in supervised learning most steps are general and the algorithms are just a tool. Also, the field of modelling is quite diverse, and deep learning is just one of many families of algos.

I like the pkgs that you mention, and I see that your formulation is not exclusive. It would be cool if this became a place where developers of ML frameworks also start their discussions.

I am not a native speaker, so I apologize if I am wrong. However, I think that, especially for newcomers to the field, the formulation might imply that other related pkgs (for example, those which are not developed by RStudio) are out of scope for this category. Maybe it could be formulated more loosely or broadly, without advertising specific pkgs.

DaveH · February 14, 2018, 9:10pm

... good thought. I like the idea of example packages/frameworks though, since it helps define the intended scope. Perhaps add some other frameworks that are clearly not RStudio-owned, e.g. SparkML and h2o (TensorFlow is already in the list).