I could be misunderstanding, but it seems that prepping and juicing a recipe that includes
step_lda only produces the per-document, per-topic probabilities by default. How can I also extract the beta (per-topic term) probabilities so I can analyze the topics themselves?
Here's an example of what I was doing:
    devtools::install_github("EmilHvitfeldt/scotus")
    library(recipes)      # recipe(), prep(), juice()
    library(textrecipes)  # step_lda()
    library(scotus)       # scotus_sample

    scotus_lda_rec <- recipe(~ ., data = scotus_sample) %>%
      step_lda(text)

    set.seed(123)
    scotus_lda_prep <- prep(scotus_lda_rec)
    scotus_lda <- juice(scotus_lda_prep)
Then to get the top topic per document I'd do something like this:
    library(dplyr)
    library(tidyr)  # pivot_longer()

    scotus_lda2 <- scotus_lda %>%
      pivot_longer(lda_text_w1:lda_text_w10) %>%
      group_by(id) %>%
      top_n(1, value) %>%
      select(id, top_topic = name) %>%
      left_join(scotus_lda) %>%
      left_join(scotus_sample %>% select(id, text))
But it'd also be great to get the top terms per topic -- any help is appreciated!
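To make the goal concrete, here is a minimal sketch of the output I'm after. It assumes I could somehow get a long table with one row per topic/term pair. The `toy_beta` data frame and its values are made up for illustration; they are not something I know step_lda to return.

```r
library(dplyr)

# Hypothetical beta table: one row per topic/term pair.
# These numbers are invented for illustration, not model output.
toy_beta <- data.frame(
  topic = c(1, 1, 1, 2, 2, 2),
  term  = c("court", "appeal", "tax", "tax", "statute", "court"),
  beta  = c(0.5, 0.3, 0.2, 0.6, 0.3, 0.1)
)

# Keep the two highest-beta terms within each topic.
top_terms <- toy_beta %>%
  group_by(topic) %>%
  slice_max(beta, n = 2, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(topic, desc(beta))

print(top_terms)
```

With a real beta table in this shape, the same group_by/slice_max pattern would give the top terms per topic.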