Decision Trees (Logit Leaf Model LLM) issue

Hello !

I run into a problem after loading the LLM (Logit Leaf Model) package alongside partykit, stringr, RWeka, rJava, ... I used the cross-validation function of LLM (llm.cv) and get the following message:

glm.fit: algorithm did not converge

It seems to be linked to the logistic regression step, but I can't manage to solve it. My code was:
churn.llm<-llm.cv(X=ChurnIV,Y=ChurnDV$churn_dummy, cv = 5, threshold_pruning = 0.01, nbr_obs_leaf = 100)

Thanks!

Ophélie
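For readers hitting the same message: it is emitted by base R's `glm()` when the fitting iterations stop before the estimates settle. One common cause is a predictor that separates the two classes perfectly; that is only an assumption here and is not confirmed for this dataset. A minimal sketch with made-up toy data (`toy` and its values are hypothetical) that captures the warnings and inspects the `converged` flag:

```r
# Toy data (made up): x separates y perfectly, a frequent source of glm.fit warnings
toy <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c(0, 0, 0, 1, 1, 1))

# Collect warning messages instead of letting them scroll past
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, data = toy, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

fit$converged   # logical flag: did the IRLS iterations converge?
msgs            # warning text(s) raised by glm.fit
```

If the flag is `FALSE` on real data, looking for predictors that almost perfectly predict the outcome is a reasonable first check.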

Can you please provide a minimal reprex (reproducible example)? The goal of a reprex is to make it as easy as possible for me to recreate your problem so that I can fix it: please help me help you!

If you've never heard of a reprex before, start by reading "What is a reprex", and follow the advice further down that page.


Hello,

Thanks for your help! I actually solved the previous glm.fit issue (with some other help), but I can't manage to visualise my model (which is a bit annoying, of course). I used the LLM package (https://cran.r-project.org/web/packages/LLM/LLM.pdf). Here's the reprex:

## Libraries

library(tidyverse)
library(LLM)
library(readxl)
library(janitor)

## reading the churnDV

churnDV_df <- read_excel("ChurnDV.xlsx")
#> Error: `path` does not exist: 'ChurnDV.xlsx'

glimpse(churnDV_df)
#> Error in glimpse(churnDV_df): object 'churnDV_df' not found

## convert the true/false entry to factor

churnDV_df <- churnDV_df %>% mutate(churn_dummy = factor(churn_dummy, levels = c("true", "false")))
#> Error in eval(lhs, parent, parent): object 'churnDV_df' not found
                                    
## reading the churnIV dataset

churnIV_df <- read_excel("ChurnIV.xlsx") %>% 
  clean_names()
#> Error: `path` does not exist: 'ChurnIV.xlsx'

glimpse(churnIV_df)
#> Error in glimpse(churnIV_df): object 'churnIV_df' not found

## converted the churn_dummy variable into a binary variable

churnIV_df <- churnIV_df %>% mutate(churn_dummy = ifelse(churn_dummy == 1, "true", "false"),
                                    churn_dummy = factor(churn_dummy)) %>% 
  as.data.frame()
#> Error in eval(lhs, parent, parent): object 'churnIV_df' not found

## Actual Model Implementation

churn.llm <- llm(X = churnIV_df[, -c(14)], Y = churnIV_df$churn_dummy,
                 threshold_pruning = 0.10, nbr_obs_leaf = 500)
#> Error in nrow(X): object 'churnIV_df' not found
churn.llm
#> Error in eval(expr, envir, enclos): object 'churn.llm' not found

## Cross-validated Model Implementation
churn.llm.cv <- llm.cv(X = churnIV_df[, -c(14)], Y = churnIV_df$churn_dummy, cv = 10,
                 threshold_pruning = 0.10, nbr_obs_leaf = 500)
#> Error in cbind(Y, X): object 'churnIV_df' not found

churn.llm.cv
#> Error in eval(expr, envir, enclos): object 'churn.llm.cv' not found

## Save the output of the model to a html file
churn.viz <- table.llm.html(object = churn.llm, headertext = "LLM applied to churn", footertext = "R output", roundingnumbers = 2)
#> Error in table.llm.html(object = churn.llm, headertext = "LLM applied to churn", : object 'churn.llm' not found
## Optionally write it to your working directory

write(churn.viz, "Visualization_LLM_on_churn.html")
#> Error in cat(x, file = file, sep = c(rep.int(sep, ncolumns - 1), "\n"), : object 'churn.viz' not found

That code can't find your file.

Also, a reproducible example is better; since we don't have that file, we can't really help.

Hello,

I'm sorry but I can't seem to do that. I tried the dput function but as my dataset contains 80 000 lines or so, it only shows the last ones... I already tried changing max.print to 100 000 and setting the maximum number of lines displayed in the console up to 100 000 or 0 and I still get that issue.

I do not know how else to share my file. It's my first time using R outside of academic classes.

Thanks

Ophélie

The only way I can share my data with you is via Drive.

Is that OK?
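On the `dput` problem: the console truncates long output, so for a large data frame it usually helps to `dput` only a small slice, or to write the result to a file instead of printing it. A minimal sketch, assuming a stand-in data frame (`churn_df`, its columns, and the file name are made up for illustration):

```r
# Hypothetical stand-in for the real ~80,000-row dataset
churn_df <- data.frame(
  tenure      = seq_len(1000),
  churn_dummy = rep(c(1, 0), length.out = 1000)
)

# Keep only the first rows; this is usually enough to reproduce an error
small <- head(churn_df, 100)

# Option 1: print a pasteable definition of the small slice
dput(head(small, 5))

# Option 2: write the definition to a file, bypassing console limits
out_file <- file.path(tempdir(), "churn_sample.R")
dput(small, file = out_file)

# Anyone with the file can rebuild the data frame with dget()
restored <- dget(out_file)
```

The file written by option 2 can be attached or pasted into a post, so nobody needs access to the original spreadsheet.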

The error is "Error in x[[jj]][iseq] <- vjj : replacement has length zero"

I can get a little further if I make the column names valid R variable names, but otherwise it fails for no obvious reason. You can reproduce the issue with the first 100 rows, so I suggest using dput on those rows to file a bug report with the maintainer of LLM.


library(tidyverse)
library(LLM)
library(readxl)

dat <- read_excel("~/Downloads/ChurnIV.xlsx")

dat_factor_y <-
  dat %>% 
  mutate(churn_dummy = 
           factor(ifelse(churn_dummy == 1, "true", "false"), 
                  levels = c("true", "false"))) %>% 
  rename_at(vars(dplyr::contains("%")), funs(gsub("%", "pct", .))) %>% 
  rename_at(vars(dplyr::contains(" ")), funs(gsub(" ", "_", .))) 


churn.llm <- llm(X = dat_factor_y %>% dplyr::select(-churn_dummy), 
                 Y = dat_factor_y$churn_dummy,
                 threshold_pruning = 0.10, 
                 nbr_obs_leaf = 500)
#> Error in `[.default`(Y, rownbrs): invalid subscript type 'list'

smaller <- 
  dat_factor_y %>% 
  head(100)

churn.llm <- llm(X = smaller %>% dplyr::select(-churn_dummy), 
                 Y = smaller$churn_dummy,
                 threshold_pruning = 0.10, 
                 nbr_obs_leaf = 500)
#> Error in `[.default`(Y, rownbrs): invalid subscript type 'list'

Created on 2018-12-30 by the reprex package (v0.2.1)
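The reprex above cleans the column names by replacing `%` and spaces by hand. Base R's `make.names()` is a more general way to get syntactically valid names. A small sketch with a made-up data frame (the column names are hypothetical, not taken from ChurnIV.xlsx):

```r
# Made-up data frame with names that are not valid R variable names
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, check.names = FALSE)
names(df) <- c("churn dummy", "% usage", "2nd_plan")

# make.names() rewrites invalid names; unique = TRUE avoids duplicates
names(df) <- make.names(names(df), unique = TRUE)

names(df)
```

Whether valid names alone are enough for `llm()` to run is not established in this thread; the sketch only covers the renaming step.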


I did manage to fit the model though (so the llm function works), but I cannot visualise it, which makes it a bit useless. I can also run llm.cv and get the confusion matrix of the results, but I can't seem to use the table.llm.html function.

Thanks for trying. I'm running behind schedule, so I guess I'll just give up on the LLM model...
