set.seed() in the context of tidymodels

I know we have to set a seed whenever we want a reproducible random result. But what is the reason for setting a seed when tuning models, like here?
Where is the random part, and for which part is set.seed() necessary?

The first encounter with randomness in the text you linked to relates to the train/test data split.

I know. But why and where in detail?

You would need to know about the computations that are being done.

For example, you would set the seed before initial_split(), since it uses random numbers to decide which rows end up in the training set.
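A minimal sketch of what that looks like, assuming the rsample package is installed; the mtcars data, the seed value 123, and prop = 0.75 are arbitrary choices for illustration and are not from the linked text:

```r
library(rsample)

# Seed immediately before the random step
set.seed(123)
split_a <- initial_split(mtcars, prop = 0.75)

# Resetting the same seed reproduces the same partition
set.seed(123)
split_b <- initial_split(mtcars, prop = 0.75)

# Compare the training rows chosen by the two splits
same_split <- identical(training(split_a), training(split_b))
same_split

# No seed reset here, so this split continues the random stream
# and will usually select different rows
split_c <- initial_split(mtcars, prop = 0.75)
```

Because both splits start from the same seed, the comparison should come back TRUE, while split_c is free to differ.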

You might also use it before calling one of the tune_*() functions (or similar) if

  • the model uses random numbers (like random forests)
  • you are using the grid function to have tidymodels make a tuning parameter grid for you

and so on. It really depends.
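To make the tuning case concrete, here is a rough sketch, assuming the parsnip, rsample, tune, and ranger packages are installed. The helper run_tuning(), the mtcars data, and the seed values are hypothetical and only serve the illustration. Note that vfold_cv() also draws random numbers, so it gets its own seed here:

```r
library(parsnip)
library(rsample)
library(tune)

# Hypothetical helper: reseeds before each step that draws random numbers
run_tuning <- function(seed) {
  # Resampling assigns rows to folds at random
  set.seed(seed)
  folds <- vfold_cv(mtcars, v = 5)

  # A random forest uses random numbers while fitting
  spec <- rand_forest(min_n = tune(), trees = 100) |>
    set_engine("ranger") |>
    set_mode("regression")

  # grid = 5 asks tune to build the parameter grid itself
  set.seed(seed + 1)
  res <- tune_grid(spec, mpg ~ ., resamples = folds, grid = 5)

  collect_metrics(res)
}

metrics_a <- run_tuning(2024)
metrics_b <- run_tuning(2024)

# With the same seeds, the two runs should report the same metrics
all.equal(metrics_a$mean, metrics_b$mean)
```

If you drop the set.seed() calls inside the helper, the two runs will generally give slightly different numbers.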

If you are never going to change your script, you can set the seed once at the top. That is a pretty bad assumption, so we often set it multiple times in a script.
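A small base R sketch of why a single seed at the top is fragile. The step names and seed values are made up for illustration:

```r
# Version 1 of a script: one seed at the top
set.seed(2024)
step_b_v1 <- sample(10)

# Version 2: same seed, but a new random step was added before step B
set.seed(2024)
new_step <- runif(3)       # consumes random numbers
step_b_v2 <- sample(10)    # now starts from a different point in the stream

# Very likely FALSE: step B changed although its own code did not
identical(step_b_v1, step_b_v2)

# Seeding right before step B makes it independent of earlier edits
set.seed(2024)
new_step <- runif(3)
set.seed(99)
step_b_v3 <- sample(10)

set.seed(99)
step_b_v4 <- sample(10)

# TRUE: step B no longer depends on what ran before it
identical(step_b_v3, step_b_v4)
```

Setting the seed directly before each random step means that inserting or removing code earlier in the script does not silently change later results.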
