Thanks, I've started looking into it, and it seems interesting.
Say there are 100 categorical predictors and vtreat reduces them to 10. How would predictive performance on y change? Looking at the README here, I couldn't find benchmarks on, for example, computational efficiency or consistency. A UMAP paper, for instance, compares consistency between a 10% subsample and the full data set (on page 35).
I see that there is a video lecture on it, but I could not find a screencast with a coding example of categorical dimensionality reduction. Do you happen to know of one?