Dear tidyverse community,
I rely heavily on the tidyverse core packages and am generally very happy with them, especially with tidy eval. I have implemented a nested group k-fold cross-validation routine based on rsample and the Cubist algorithm. After a recent update of R and packages two days ago, my working memory fills up until my Linux system freezes. My toolset is broken, in particular this part:
join_predobs_y_vars <- function(..., object) {
  y_vars <- rlang::ensyms(...)
  # Unquoting `.x`: see 20.6.1 Map-reduce to generate code
  # in https://adv-r.hadley.nz/quasiquotation.html
  dfs_unnested <- purrr::map(.x = y_vars,
    ~ rename_predobs_y_var(object = object, y_var = !!.x))
  # // pb: 20180711: Remove `by = "resample_id"` argument because
  # there may be other common character or factor variables to join by
  purrr::reduce(dfs_unnested, function(x, y) dplyr::inner_join(x = x, y = y))
}
# Function to extract and nest resampling prediction results by a column
# variable contained in a data frame
rename_predobs_y_var <- function(object, y_var) {
  # use ensym() instead of enquo() to capture the y_var argument
  # supplied by the user; ensym() checks that the captured expression is
  # a string or a symbol, and returns a symbol in both cases
  quo_var <- rlang::ensym(y_var)
  new_names <- paste0(rlang::quo_name(quo_var), c("_pred", "_obs"))
  vars <- c("pred", "obs")
  names(vars) <- new_names
  df_renamed <- object %>%
    tidyr::unnest(!! quo_var) %>%
    # note: there will probably soon be a tidyselect feature request
    # to support "bang bang", !!; for now
    # the "triple bang", !!!, is needed, see
    # https://github.com/tidyverse/dplyr/issues/3030
    dplyr::rename(!!! vars)
  df_renamed
}
This function is part of a bigger custom nested resampling and model fitting workflow.
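The comments in `rename_predobs_y_var()` describe how `ensym()` accepts either a symbol or a string and returns a symbol in both cases. A minimal sketch of that capture step follows; `capture_name()` is a hypothetical helper introduced only for illustration and is not part of the workflow above.

```r
library(rlang)

# Hypothetical helper (not part of the original code): capture an argument
# with ensym() and return its name as a string
capture_name <- function(y_var) {
  as_string(ensym(y_var))
}

from_symbol <- capture_name(vg_alpha)    # bare symbol
from_string <- capture_name("vg_alpha")  # string

y_sym <- sym("vg_alpha")
from_unquoted <- capture_name(!!y_sym)   # symbol unquoted with `!!`

# Named vector in the form passed to dplyr::rename(!!! vars):
# names are the new column names, values are the existing ones
vars <- c("pred", "obs")
names(vars) <- paste0(from_symbol, c("_pred", "_obs"))
```

All three calls yield the same name, which is why `join_predobs_y_vars()` can forward each captured symbol with `!!.x`.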
Here is the error:
> cubist_outer_predobs_cm <- join_predobs_y_vars(
+ object = cubist_nested_results_predobs,
+ vg_theta_s, vg_theta_r, vg_alpha, vg_n,
+ kosugi_theta_s, kosugi_theta_r, kosugi_sigma, kosugi_h_mi)
Joining, by = "resample_id"
Joining, by = "resample_id"
Joining, by = "resample_id"
Joining, by = "resample_id"
Joining, by = "resample_id"
Error: std::bad_alloc
> traceback()
8: stop(list(message = "std::bad_alloc", call = NULL, cppstack = NULL))
7: inner_join_impl(x, y, by_x, by_y, aux_x, aux_y, na_matches, environment())
6: inner_join.tbl_df(x = x, y = y)
5: dplyr::inner_join(x = x, y = y) at resampling-cubist-rules.R#660
4: fn(out, elt, ...)
3: reduce_impl(.x, .f, ..., .init = .init, .dir = .dir)
2: purrr::reduce(dfs_unnested, function(x, y) dplyr::inner_join(x = x,
       y = y)) at resampling-cubist-rules.R#660
1: join_predobs_y_vars(object = cubist_nested_results_predobs, vg_theta_s,
       vg_theta_r, vg_alpha, vg_n, kosugi_theta_s, kosugi_theta_r,
       kosugi_sigma, kosugi_h_mi)
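Since the traceback ends inside `inner_join_impl()`, one check that might help narrow this down is to estimate how many rows a join would produce before running it. The sketch below uses hypothetical toy tables (`x`, `y`, and their columns are assumptions, not the real data) and only illustrates the arithmetic: when a key such as `resample_id` occurs several times in both inputs, `inner_join()` returns one row per matching pair. Whether this applies to the real data is not known.

```r
library(dplyr)
library(tibble)

# Hypothetical toy tables: each has 3 rows per `resample_id`
x <- tibble(resample_id = rep(c("Fold1", "Fold2"), each = 3), a_pred = 1:6)
y <- tibble(resample_id = rep(c("Fold1", "Fold2"), each = 3), b_pred = 6:1)

# Predict the size of the join result from the per-key row counts:
# for every shared key, rows in x times rows in y
key_counts <- inner_join(
  count(x, resample_id, name = "n_x"),
  count(y, resample_id, name = "n_y"),
  by = "resample_id"
)
expected_rows <- sum(as.numeric(key_counts$n_x) * as.numeric(key_counts$n_y))

# Actual join; newer dplyr versions warn about many-to-many matches,
# so the warning is suppressed in this toy example
xy <- suppressWarnings(inner_join(x, y, by = "resample_id"))
actual_rows <- nrow(xy)
```

Here 6-row inputs already give an 18-row result; repeating such a join inside `purrr::reduce()` would multiply the row count at every step.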
I'm very much a fan of open source and open science, but unfortunately I cannot share the research data in this case, because the data the publication is based on are the intellectual property of a big scientific institution. However, I could try to come up with an artificial data set.
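An artificial data set of this kind could look roughly like the sketch below. Its shape is an assumption inferred from `rename_predobs_y_var()`: one row per resample, a `resample_id` column, and one list-column per response holding a tibble with `pred` and `obs`. The helper `make_predobs()`, the fold labels, and the sizes are hypothetical.

```r
library(tibble)
library(purrr)

set.seed(1)

# Hypothetical helper: a small pred/obs tibble standing in for the
# predictions of one resample
make_predobs <- function(n) {
  tibble(pred = rnorm(n), obs = rnorm(n))
}

n_folds <- 3  # assumed number of outer resamples
n_obs <- 5    # assumed number of held-out observations per resample

# Assumed structure of the nested results object: one list-column of
# pred/obs tibbles per response variable
toy_predobs <- tibble(
  resample_id = paste0("Fold", seq_len(n_folds)),
  vg_theta_s = map(rep(n_obs, n_folds), make_predobs),
  vg_theta_r = map(rep(n_obs, n_folds), make_predobs)
)
```

Passing an object like `toy_predobs` to `join_predobs_y_vars()` with `vg_theta_s` and `vg_theta_r` would exercise the same unnest, rename, and join steps without any of the original data.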
I am really struggling to resolve this memory allocation issue (std::bad_alloc) and have unfortunately not been able to find the cause. Does anybody have an idea? I thought it might be related to C++ compilation issues or Rcpp.
I can create a reprex today. Maybe some of you already have a hint.
Here is my session info:
> sessioninfo::session_info()
─ Session info ──────────────────────────────────────────────────
 setting  value
 version  R version 3.6.0 (2019-04-26)
 os       Ubuntu 18.04.2 LTS
 system   x86_64, linux-gnu
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Zurich
 date     2019-05-20
─ Packages ──────────────────────────────────────────────────────
 ! package       * version    date       lib source
 P assertthat      0.2.1      2019-03-21 [?] CRAN (R 3.6.0)
   backports       1.1.4      2019-04-10 [1] CRAN (R 3.6.0)
   base64url       1.4        2018-05-14 [1] CRAN (R 3.6.0)
   broom         * 0.5.2      2019-04-07 [1] CRAN (R 3.6.0)
   caret         * 6.0-84     2019-04-27 [1] CRAN (R 3.6.0)
   cellranger      1.1.0      2016-07-27 [1] CRAN (R 3.6.0)
   class           7.3-15     2019-01-01 [4] CRAN (R 3.5.2)
 P cli             1.1.0      2019-03-19 [?] CRAN (R 3.6.0)
   codetools       0.2-16     2018-12-24 [4] CRAN (R 3.5.2)
   colorspace      1.4-1      2019-03-18 [1] CRAN (R 3.6.0)
 P crayon          1.3.4      2017-09-16 [?] CRAN (R 3.6.0)
   Cubist        * 0.2.2      2018-05-21 [1] CRAN (R 3.6.0)
   data.table    * 1.12.2     2019-04-07 [1] CRAN (R 3.6.0)
 P digest          0.6.18     2018-10-10 [?] CRAN (R 3.6.0)
   doFuture      * 0.8.0      2019-03-17 [1] CRAN (R 3.6.0)
   doParallel    * 1.0.14     2018-09-24 [1] CRAN (R 3.6.0)
   dplyr         * 0.8.1      2019-05-14 [1] CRAN (R 3.6.0)
   drake         * 7.3.0      2019-05-19 [1] CRAN (R 3.6.0)
   e1071           1.7-1      2019-03-19 [1] CRAN (R 3.6.0)
   forcats       * 0.4.0      2019-02-17 [1] CRAN (R 3.6.0)
   foreach       * 1.4.4      2017-12-12 [1] CRAN (R 3.6.0)
   future        * 1.13.0     2019-05-08 [1] CRAN (R 3.6.0)
   future.apply  * 1.2.0      2019-03-07 [1] CRAN (R 3.6.0)
   generics        0.0.2      2018-11-29 [1] CRAN (R 3.6.0)
   ggplot2       * 3.1.1      2019-04-07 [1] CRAN (R 3.6.0)
   globals       * 0.12.4     2018-10-11 [1] CRAN (R 3.6.0)
 P glue            1.3.1      2019-03-12 [?] CRAN (R 3.6.0)
   gower           0.2.1      2019-05-14 [1] CRAN (R 3.6.0)
   gridExtra     * 2.3        2017-09-09 [1] CRAN (R 3.6.0)
   gtable          0.3.0      2019-03-25 [1] CRAN (R 3.6.0)
   haven           2.1.0      2019-02-19 [1] CRAN (R 3.6.0)
   here          * 0.1        2017-05-28 [1] CRAN (R 3.6.0)
 P hms             0.4.2      2018-03-10 [?] CRAN (R 3.6.0)
   httr            1.4.0      2018-12-11 [1] CRAN (R 3.6.0)
   igraph          1.2.4.1    2019-04-22 [1] CRAN (R 3.6.0)
   ipred           0.9-9      2019-04-28 [1] CRAN (R 3.6.0)
   iterators     * 1.0.10     2018-07-13 [1] CRAN (R 3.6.0)
 P jsonlite        1.6        2018-12-07 [?] CRAN (R 3.6.0)
   lattice       * 0.20-38    2018-11-04 [4] CRAN (R 3.5.1)
   lava            1.6.5      2019-02-12 [1] CRAN (R 3.6.0)
   lazyeval        0.2.2      2019-03-15 [1] CRAN (R 3.6.0)
   listenv         0.7.0      2018-01-21 [1] CRAN (R 3.6.0)
   lubridate       1.7.4      2018-04-11 [1] CRAN (R 3.6.0)
 P magrittr        1.5        2014-11-22 [?] CRAN (R 3.6.0)
   MASS            7.3-51.1   2018-11-01 [4] CRAN (R 3.5.1)
   Matrix          1.2-17     2019-03-22 [4] CRAN (R 3.5.3)
   ModelMetrics    1.2.2      2018-11-03 [1] CRAN (R 3.6.0)
   modelr          0.1.4      2019-02-18 [1] CRAN (R 3.6.0)
   munsell         0.5.0      2018-06-12 [1] CRAN (R 3.6.0)
   nlme            3.1-139    2019-04-09 [4] CRAN (R 3.5.3)
   nls.multstart * 1.0.0      2018-03-06 [1] CRAN (R 3.6.0)
   nnet            7.3-12     2016-02-02 [4] CRAN (R 3.5.0)
   pillar          1.4.0      2019-05-11 [1] CRAN (R 3.6.0)
 P pkgconfig       2.0.2      2018-08-16 [?] CRAN (R 3.6.0)
   plyr            1.8.4      2016-06-08 [1] CRAN (R 3.6.0)
   prodlim         2018.04.18 2018-04-18 [1] CRAN (R 3.6.0)
   purrr         * 0.3.2      2019-03-15 [1] CRAN (R 3.6.0)
 P R6              2.4.0      2019-02-14 [?] CRAN (R 3.6.0)
   Rcpp            1.0.1      2019-03-17 [1] CRAN (R 3.6.0)
   readr         * 1.3.1      2018-12-21 [1] CRAN (R 3.6.0)
   readxl          1.3.1      2019-03-13 [1] CRAN (R 3.6.0)
   recipes         0.1.5      2019-03-21 [1] CRAN (R 3.6.0)
   reshape2        1.4.3      2017-12-11 [1] CRAN (R 3.6.0)
   rlang           0.3.4      2019-04-07 [1] CRAN (R 3.6.0)
   rpart           4.1-15     2019-04-12 [4] CRAN (R 3.6.0)
   rprojroot       1.3-2      2018-01-03 [1] CRAN (R 3.6.0)
   rsample       * 0.0.4      2019-01-07 [1] CRAN (R 3.6.0)
   rstudioapi      0.10       2019-03-19 [1] CRAN (R 3.6.0)
   rvest           0.3.4      2019-05-15 [1] CRAN (R 3.6.0)
   scales          1.0.0      2018-08-09 [1] CRAN (R 3.6.0)
   sessioninfo     1.1.1      2018-11-05 [1] CRAN (R 3.6.0)
   simplerspec   * 0.1.0      2019-05-19 [1] Github (philipp-baumann/simplerspec@333e070)
   storr           1.2.1      2018-10-18 [1] CRAN (R 3.6.0)
 P stringi         1.4.3      2019-03-12 [?] CRAN (R 3.6.0)
 P stringr       * 1.4.0      2019-02-10 [?] CRAN (R 3.6.0)
   survival        2.43-3     2018-11-26 [4] CRAN (R 3.5.1)
   tibble        * 2.1.1      2019-03-16 [1] CRAN (R 3.6.0)
   tidyr         * 0.8.3      2019-03-01 [1] CRAN (R 3.6.0)
   tidyselect      0.2.5      2018-10-11 [1] CRAN (R 3.6.0)
   tidyverse     * 1.2.1      2017-11-14 [1] CRAN (R 3.6.0)
   timeDate        3043.102   2018-02-21 [1] CRAN (R 3.6.0)
   withr           2.1.2      2018-03-15 [1] CRAN (R 3.6.0)
   xml2            1.2.0      2018-01-24 [1] CRAN (R 3.6.0)
[1] /home/baumanph/R/x86_64-pc-linux-gnu-library/3.6
[2] /usr/local/lib/R/site-library
[3] /usr/lib/R/site-library
[4] /usr/lib/R/library
P ── Loaded and on-disk path mismatch.
Looking forward to a wise hint from the community.
Best,
Philipp