Advice for Improving Community Response to Modeling and Machine Learning Topics


Motivation

I'd like to open a conversation in #meta about making guides for asking and answering questions in the #ml category, for topics related to modeling and machine learning.
Should #ml have its own FAQs, guides, or videos to help folks, distinct from what's already in the FAQ? Are there any patterns of behavior particular to #ml that we should be discouraging? And if so, what should they be?

As an example of what I have in mind, #tidyverse and #general have guides for asking questions (FAQ: Tips for writing R-related questions), for reprexes (FAQ: What's a reproducible example (`reprex`) and how do I do one?), and for other common issues that arise there (https://community.rstudio.com/faq). Plus, RStudio folks and sustainers (moderators) are generally familiar with such policies (e.g. we encourage everyone to ask questions as reprexes, not be snarky, etc., via friendly replies, messages, and so on).

This stuff has really helped us be nice and welcoming while staying efficient with people's time, and it helps avoid the kind of "help-vampirism" that can cause burnout in communities.

For #ml

  • For topics in ML (not to mention shiny, for which a lot of this stuff is being developed now), simple reprexes often don't cut it. Modeling questions can be quite involved, so the existing guides might not be well suited for #ml.
  • The goal is a single topic-thread asking and solving a relatively focused question (it's more discoverable and useful later on that way), so multipart questions should probably be discouraged.
  • But connected multipart questions that build on one another might be frequent in #ml, so how should we handle them?
  • When should a question be posted to Stack Overflow, Cross Validated, Stan, or elsewhere?
  • Active answerers here want to discourage the pattern of @name mentioning people "too" frequently (for reasons discussed in FAQ: Should I @name mention other users in my post?), so should this policy change in #ml? Or should the justification change?

Just as an example of some of these issues (and with permission from OP Adam to cite it): this thread, Decision Tree Rpart() Summary Interpretation, got a little out of hand. I feel that Adam got a lot of value, which is great, and the responder, Max, got a ton of goodwill. But I'm worried it's not particularly helpful to others in the future, despite how much time was spent on it.
Should I have interjected sooner to address these concerns, and if so, how? Or should we just chill out and let things be?

I'd appreciate your thoughts and advice.