Modeling collision_severity; subscript out of bounds error...

Hi,

I'm working with some traffic data and trying to evaluate which model best fits the data set. I believe I'm missing some basic step, because I've restricted myself to a maximum of 3 predictors and set up dummy variables, centering and scaling.

I'm trying to determine which model best describes accident severity (collision_severity) based on three predictors (all factors): type_of_collision, pcf_violation_category & chp_vehicle_type_at_fault.

Please find below a reprex of my attempt:

tibble::tribble(
                                                    ~collision_severity, ~type_of_collision,      ~pcf_violation_category,                ~chp_vehicle_type_at_fault,
                                                 "property damage only",        "sideswipe", "unsafe starting or backing",                  "passenger car, station",
                                                 "property damage only",         "rear end",      "following too closely",                  "passenger car, station",
                                                 "property damage only",        "broadside",  "traffic signals and signs",                  "passenger car, station",
                                                 "property damage only",       "hit object",                   "speeding",                  "passenger car, station",
                                                 "property damage only",        "sideswipe",           "improper turning",                               "mini-vans",
                                                 "property damage only",        "sideswipe",    "automobile right of way",                  "passenger car, station",
                                                 "property damage only",       "hit object",                        "dui",                        "pickups & panels",
                                                 "property damage only",        "sideswipe",         "unsafe lane change",                          "two axle truck",
                                                                 "pain",         "rear end",                   "speeding",                  "passenger car, station",
                                                 "property damage only",        "sideswipe",           "improper turning",                              "motorcycle",
                                                        "severe injury",       "pedestrian",                   "speeding",                  "passenger car, station",
                                                 "property damage only",        "sideswipe",           "improper turning",                          "two axle truck",
                                                         "other injury",         "rear end",                   "speeding",                  "passenger car, station",
                                                                 "pain",        "broadside",    "automobile right of way",                   "sport utility vehicle",
                                                        "severe injury",         "rear end",         "unsafe lane change",                  "passenger car, station",
                                                         "other injury",       "overturned",           "improper turning",                              "motorcycle",
                                                 "property damage only",        "sideswipe",         "wrong side of road", "unknown hit and run vehicle involvement",
                                                 "property damage only",         "rear end",      "following too closely",                  "passenger car, station",
                                                 "property damage only",         "rear end",                   "speeding", "unknown hit and run vehicle involvement",
                                                                 "pain",         "rear end",                   "speeding",                   "sport utility vehicle",
                                                 "property damage only",        "broadside",    "automobile right of way", "unknown hit and run vehicle involvement",
                                                         "other injury",        "broadside",         "wrong side of road",                                 "bicycle",
                                                 "property damage only",          "head-on",    "automobile right of way",                  "passenger car, station",
                                                         "other injury",        "broadside",    "automobile right of way",                        "pickups & panels",
                                                 "property damage only",        "sideswipe",           "improper turning", "unknown hit and run vehicle involvement",
                                                 "property damage only",        "sideswipe",    "automobile right of way",                  "passenger car, station",
                                                 "property damage only",         "rear end",      "following too closely",                  "passenger car, station",
                                                 "property damage only",         "rear end",                   "speeding",                   "sport utility vehicle",
                                                 "property damage only",        "broadside",    "automobile right of way",                        "pickups & panels",
                                                 "property damage only",         "rear end",                   "speeding", "unknown hit and run vehicle involvement",
                                                 "property damage only",         "rear end",                   "speeding", "unknown hit and run vehicle involvement",
                                                 "property damage only",         "rear end",                   "speeding",                  "passenger car, station",
                                                         "other injury",         "rear end",                   "speeding",                  "passenger car, station",
                                                 "property damage only",       "pedestrian",       "pedestrian violation",                              "pedestrian",
                                                 "property damage only",        "sideswipe",         "unsafe lane change",                  "passenger car, station",
                                                 "property damage only",        "sideswipe",         "unsafe lane change",                  "passenger car, station",
                                                 "property damage only",        "broadside",    "automobile right of way",                  "passenger car, station",
                                                 "property damage only",         "rear end",                   "speeding",                  "passenger car, station",
                                                 "property damage only",         "rear end",                   "speeding", "unknown hit and run vehicle involvement",
                                                 "property damage only",         "rear end",                   "speeding",                  "passenger car, station",
                                                                 "pain",         "rear end",                   "speeding",                   "sport utility vehicle",
                                                 "property damage only",         "rear end",                   "speeding",                         "pickup w/camper",
                                                 "property damage only",         "rear end",                   "speeding",                               "mini-vans",
                                                 "property damage only",        "sideswipe",         "unsafe lane change",                  "passenger car, station",
                                                                 "pain",         "rear end",      "following too closely",                        "pickups & panels",
                                                                 "pain",         "rear end",                   "speeding",                  "passenger car, station",
                                                 "property damage only",         "rear end",         "unsafe lane change",                  "passenger car, station",
                                                                 "pain",        "sideswipe",                        "dui",                  "passenger car, station",
                                                         "other injury",       "hit object",           "improper turning",                  "passenger car, station",
                                                                 "pain",          "head-on",    "automobile right of way",                  "passenger car, station",
                                                                 "pain",         "rear end",                   "speeding",                  "passenger car, station",
                                                 "property damage only",        "sideswipe",         "unsafe lane change",                        "pickups & panels",
                                                 "property damage only",         "rear end",      "following too closely", "unknown hit and run vehicle involvement",
                                                                 "pain",        "sideswipe",         "unsafe lane change",                        "pickups & panels",
                                                 "property damage only",       "hit object",                   "speeding",                  "passenger car, station",
                                                 "property damage only",         "rear end",           "improper turning",                  "passenger car, station",
                                                                 "pain",         "rear end",                   "speeding",                        "pickups & panels",
                                                 "property damage only",         "rear end",                   "speeding", "unknown hit and run vehicle involvement",
                                                 "property damage only",        "sideswipe",         "unsafe lane change",                          "two axle truck",
                                                 "property damage only",         "rear end",                   "speeding",                               "mini-vans",
                                                                 "pain",         "rear end",                   "speeding", "unknown hit and run vehicle involvement",
                                                 "property damage only",         "rear end",                   "speeding",                  "passenger car, station",
                                                 "property damage only",        "sideswipe",         "unsafe lane change",                        "pickups & panels",
                                                         "other injury",       "pedestrian",    "pedestrian right of way",                  "passenger car, station",
                                                                 "pain",        "broadside",    "automobile right of way",                  "passenger car, station",
                                                                 "pain",         "rear end",                   "speeding",                   "sport utility vehicle",
                                                                 "pain",        "broadside",  "traffic signals and signs",                  "passenger car, station",
                                                 "property damage only",        "sideswipe",         "unsafe lane change",                  "passenger car, station",
                                                         "other injury",         "rear end",           "improper turning",                  "passenger car, station",
                                                 "property damage only",        "broadside",    "automobile right of way",                        "pickups & panels",
                                                 "property damage only",         "rear end",                   "speeding",                  "passenger car, station",
                                                 "property damage only",          "head-on",    "automobile right of way",                  "passenger car, station",
                                                 "property damage only",       "hit object",           "improper turning",                   "sport utility vehicle",
                                                         "other injury",        "sideswipe",           "improper turning",                  "passenger car, station",
                                                                 "pain",         "rear end",                   "speeding",                                      "00",
                                                 "property damage only",         "rear end",                   "speeding",                        "pickups & panels",
                                                                 "pain",          "head-on",    "automobile right of way",                  "passenger car, station",
                                                 "property damage only",       "hit object",           "improper turning",                  "passenger car, station",
                                                         "other injury",        "sideswipe",    "automobile right of way", "unknown hit and run vehicle involvement",
                                                 "property damage only",        "broadside",    "automobile right of way",                  "passenger car, station",
                                                                 "pain",         "rear end",                   "speeding",                  "passenger car, station",
                                                         "other injury",       "pedestrian",         "wrong side of road",                                 "bicycle",
                                                 "property damage only",        "sideswipe",  "traffic signals and signs",                  "passenger car, station",
                                                 "property damage only",        "broadside",  "traffic signals and signs",                  "passenger car, station",
                                                                 "pain",        "sideswipe",         "unsafe lane change",                  "passenger car, station",
                                                 "property damage only",         "rear end",                   "speeding", "unknown hit and run vehicle involvement",
                                                         "other injury",          "head-on",                        "dui",                        "pickups & panels",
                                                 "property damage only",        "sideswipe",           "improper turning",                  "passenger car, station",
                                                                 "pain",         "rear end",                   "speeding",                               "mini-vans",
                                                 "property damage only",        "sideswipe",         "unsafe lane change",                  "passenger car, station",
                                                                 "pain",         "rear end",                   "speeding",                        "pickups & panels",
                                                         "other injury",        "broadside",    "automobile right of way",                  "passenger car, station",
                                                 "property damage only",         "rear end",                   "speeding",                          "two axle truck",
                                                 "property damage only",        "sideswipe",           "improper turning",                  "passenger car, station",
                                                 "property damage only",         "rear end",                   "speeding",                   "sport utility vehicle",
                                                                 "pain",        "broadside",    "automobile right of way",                  "passenger car, station",
                                                                 "pain",         "rear end",                   "speeding",                  "passenger car, station"
                                                 )
#> # A tibble: 97 x 4
#>    collision_severity   type_of_collision pcf_violation_cate~ chp_vehicle_type_~
#>    <chr>                <chr>             <chr>               <chr>             
#>  1 property damage only sideswipe         unsafe starting or~ passenger car, st~
#>  2 property damage only rear end          following too clos~ passenger car, st~
#>  3 property damage only broadside         traffic signals an~ passenger car, st~
#>  4 property damage only hit object        speeding            passenger car, st~
#>  5 property damage only sideswipe         improper turning    mini-vans         
#>  6 property damage only sideswipe         automobile right o~ passenger car, st~
#>  7 property damage only hit object        dui                 pickups & panels  
#>  8 property damage only sideswipe         unsafe lane change  two axle truck    
#>  9 pain                 rear end          speeding            passenger car, st~
#> 10 property damage only sideswipe         improper turning    motorcycle        
#> # ... with 87 more rows
# Splitting data into train and test data sets.
dfml_split <-
  rsample::initial_split(df_ml, strata = collision_severity)
#> Error in eval_select_impl(NULL, .vars, expr(c(!!!dots)), include = .include, : object 'df_ml' not found
dfml_train <- rsample::training(dfml_split)
#> Error in analysis(x): object 'dfml_split' not found
dfml_test <- rsample::testing(dfml_split)
#> Error in assessment(x): object 'dfml_split' not found

# Creating folds:
dfml_folds <-
  rsample::vfold_cv(dfml_train, strata = collision_severity,
                    v = 2)
#> Error in eval_select_impl(NULL, .vars, expr(c(!!!dots)), include = .include, : object 'dfml_train' not found


# Creating Recipes:
# Recipe 1: collision_severity as the response; predictor: type_of_collision
dfml_recipe <-
  recipes::recipe(collision_severity ~ type_of_collision,
                  data = dfml_test) %>%
  recipes::step_dummy(all_nominal_predictors()) %>% 
  recipes::step_center(all_numeric_predictors()) %>%
  recipes::step_scale(all_numeric_predictors())
#> Error in recipes::recipe(collision_severity ~ type_of_collision, data = dfml_test) %>% : could not find function "%>%"


# Recipe 2: collision_severity as the response; predictors: type_of_collision, pcf_violation_category
  dfml_recipe2 <-
    recipes::recipe(collision_severity ~ type_of_collision + pcf_violation_category,
                    data = dfml_test) %>%
    recipes::step_dummy(all_nominal_predictors()) %>% 
    recipes::step_center(all_numeric_predictors(), -all_outcomes()) %>%
    recipes::step_scale(all_numeric_predictors(), -all_outcomes())
#> Error in recipes::recipe(collision_severity ~ type_of_collision + pcf_violation_category, : could not find function "%>%"


# Recipe 3: collision_severity as the response; predictors: type_of_collision, pcf_violation_category, chp_vehicle_type_at_fault
dfml_recipe3 <-
  recipes::recipe(
    collision_severity ~ type_of_collision + pcf_violation_category + chp_vehicle_type_at_fault,
    data = dfml_test
  ) %>%
  recipes::step_dummy(all_nominal_predictors()) %>% 
  recipes::step_center(all_numeric_predictors(), -all_outcomes()) %>%
  recipes::step_scale(all_numeric_predictors(), -all_outcomes())
#> Error in recipes::recipe(collision_severity ~ type_of_collision + pcf_violation_category + : could not find function "%>%"

# Review of my recipes:
dfml_recipe
#> Error in eval(expr, envir, enclos): object 'dfml_recipe' not found
dfml_recipe2
#> Error in eval(expr, envir, enclos): object 'dfml_recipe2' not found
dfml_recipe3
#> Error in eval(expr, envir, enclos): object 'dfml_recipe3' not found




# Setting Model Specs:
log_reg <- parsnip::logistic_reg() %>%
  parsnip::set_engine("glm")
#> Error in parsnip::logistic_reg() %>% parsnip::set_engine("glm"): could not find function "%>%"

lda_spec <- discrim::discrim_linear() %>%
  parsnip::set_mode("classification") %>%
  parsnip::set_engine("MASS")
#> Error in discrim::discrim_linear() %>% parsnip::set_mode("classification") %>% : could not find function "%>%"

tree_spec <- baguette::bag_tree() %>%
  parsnip::set_mode("classification") %>%
  parsnip::set_engine("rpart")
#> Error in baguette::bag_tree() %>% parsnip::set_mode("classification") %>% : could not find function "%>%"

knn_spec <- parsnip::nearest_neighbor() %>%
  parsnip::set_engine("kknn") %>%
  parsnip::set_mode("classification")
#> Error in parsnip::nearest_neighbor() %>% parsnip::set_engine("kknn") %>% : could not find function "%>%"

bays_spec <- parsnip::decision_tree() %>%
  parsnip::set_mode("classification") %>%
  parsnip::set_engine("rpart")
#> Error in parsnip::decision_tree() %>% parsnip::set_mode("classification") %>% : could not find function "%>%"

svm_spec <- parsnip::svm_poly() %>%
  parsnip::set_mode("classification") %>%
  parsnip::set_engine("kernlab")
#> Error in parsnip::svm_poly() %>% parsnip::set_mode("classification") %>% : could not find function "%>%"

rf_spec <-  parsnip::rand_forest() %>%
  parsnip::set_mode("classification") %>%
  parsnip::set_engine("ranger")
#> Error in parsnip::rand_forest() %>% parsnip::set_mode("classification") %>% : could not find function "%>%"

xgboost_spec <-parsnip::boost_tree() %>%
  parsnip::set_mode("classification") %>%
  parsnip::set_engine("xgboost")
#> Error in parsnip::boost_tree() %>% parsnip::set_mode("classification") %>% : could not find function "%>%"


# Workflow:
dfml_wfset <- workflowsets::workflow_set(
  list(dfml_recipe, dfml_recipe2, dfml_recipe3),
  list(
    log_reg,
    # lda_spec,
    tree_spec,
    knn_spec,
    bays_spec,
    svm_spec,
    rf_spec,
    xgboost_spec
  ),
  cross = TRUE
)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
#> Error in workflowsets::workflow_set(list(dfml_recipe, dfml_recipe2, dfml_recipe3), : object 'dfml_recipe' not found

dfml_wfset
#> Error in eval(expr, envir, enclos): object 'dfml_wfset' not found

# Processing:
# doParallel::registerDoParallel()
set.seed(123)
dfml_rs <- workflowsets::workflow_map(
  dfml_wfset,
  "fit_resamples",
  resamples = dfml_folds,
  metrics = metric_set(sensitivity, specificity, kap)
)
#> Error in rlang::list2(...): object 'dfml_folds' not found
#> Execution stopped; returning current results
#> Error in rlang::is_list(x): object 'dfml_wfset' not found

dfml_rs
#> Error in eval(expr, envir, enclos): object 'dfml_rs' not found

Created on 2021-12-10 by the reprex package (v2.0.1)

Your reprex fails from initial_split() onwards because the object df_ml does not exist: the tribble() output is printed but never assigned to a name. Did you mean to assign the tibble to df_ml?
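A minimal sketch of that fix, assuming the intent was to store the data under the name `df_ml`. The small tibble here is a hypothetical stand-in for the 97-row table, not the real data. The later `could not find function "%>%"` errors have a separate cause: the reprex never attaches a package that provides the pipe, and `library(tidymodels)` is one way to get it.

```r
library(tidymodels)  # attaches rsample, recipes, parsnip, ... and the %>% pipe

set.seed(123)

# Hypothetical stand-in data. In the real reprex this would be
# df_ml <- tibble::tribble(...), with the rows shown in the question.
df_ml <- tibble::tibble(
  collision_severity = factor(rep(
    c("property damage only", "pain"),
    times = c(28, 12)
  )),
  type_of_collision = factor(rep(
    c("rear end", "sideswipe", "broadside", "head-on"),
    times = 10
  ))
)

# With df_ml defined, the split and the objects derived from it can be created.
dfml_split <- rsample::initial_split(df_ml, strata = collision_severity)
dfml_train <- rsample::training(dfml_split)
dfml_test  <- rsample::testing(dfml_split)
```

Every row ends up in either the training or the testing set, so the two row counts add up to `nrow(df_ml)`.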

Your outcome variable collision_severity has 4 different categories, but you use logistic regression, which is designed for a binary outcome.
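One possible substitute is a multinomial regression specification. The sketch below assumes the "nnet" engine is acceptable and uses a hypothetical toy data frame with three severity classes; it is only meant to show the shape of the spec, not a recommended final model.

```r
library(parsnip)

# Hypothetical toy data: three outcome classes, one categorical predictor.
toy <- data.frame(
  collision_severity = factor(rep(
    c("property damage only", "pain", "other injury"),
    times = c(12, 6, 6)
  )),
  type_of_collision = factor(rep(c("rear end", "sideswipe"), times = 12))
)

# multinom_reg() is meant for outcomes with more than two classes,
# whereas logistic_reg() expects exactly two.
multi_spec <- set_engine(multinom_reg(), "nnet")

multi_fit <- fit(multi_spec, collision_severity ~ type_of_collision, data = toy)

# Class predictions keep all three outcome levels.
preds <- predict(multi_fit, new_data = toy)
```

A spec like `multi_spec` could take the place of `log_reg` in the list passed to `workflow_set()`.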

Centering and scaling only make sense for numeric variables, but all the predictors in your example dataset are categorical. You might want to look into step_novel() and step_other() instead.
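A rough sketch of how those two steps could sit in front of step_dummy(). The data frame and the 0.1 threshold are illustrative assumptions, not values taken from the real data.

```r
library(recipes)

# Hypothetical training data with two rare collision types.
train <- data.frame(
  collision_severity = factor(rep(
    c("property damage only", "pain"),
    times = c(14, 6)
  )),
  type_of_collision = factor(c(
    rep("rear end", 10), rep("sideswipe", 8), "head-on", "overturned"
  ))
)

rec <- recipe(collision_severity ~ type_of_collision, data = train) %>%
  # Reserve a factor level for categories that were not seen in training.
  step_novel(all_nominal_predictors()) %>%
  # Pool levels rarer than the threshold into a single "other" level.
  step_other(all_nominal_predictors(), threshold = 0.1) %>%
  # Convert the remaining factor levels to indicator columns.
  step_dummy(all_nominal_predictors())

# Estimate the steps on the training data and return the processed result.
baked <- bake(prep(rec), new_data = NULL)
names(baked)
```

Pooling rare levels matters with small resamples such as `v = 2` folds, where a level can easily be present in one fold and absent from the other.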
