I'm having trouble with the recipes::recipe() function when the data have a very wide set of predictors: I get an error saying "cannot allocate vector of size XX Gb."
I've worked up a reproducible example below. Any suggestions or workarounds would be greatly appreciated!
```r
library(AmesHousing)
library(tidyverse)
library(recipes)

## make a small tibble from the Ames housing package
ames <- make_ames() %>%
  select(Sale_Price, Longitude, Latitude) %>%
  ## make the outcome a binary indicator of sale price
  ## being above $150,000
  dplyr::mutate(Sale_Price = factor(sign(Sale_Price > 150000)) %>% fct_inseq())

## make a recipe with small p / few predictors
rec <- recipe(Sale_Price ~ ., data = ames)

## works, no problem!
rec
```
Up to this point everything runs smoothly, but if I try to add many more columns to the Ames data I can't get the same script to run:
```r
## add a large n x p matrix to ames
p <- 500000
set.seed(32798)
big.dat <- matrix(runif(n = nrow(ames) * p),
                  nrow = nrow(ames), ncol = p) %>%
  as_tibble()
big.ames <- ames %>% bind_cols(big.dat)

## make a recipe with the large-p ames dataset
rec <- recipe(Sale_Price ~ ., data = big.ames)
## this never completes!
## > Error: cannot allocate vector of size 3017.5 Gb
## > Execution halted
```
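For scale, here's my back-of-envelope check: the simulated predictor matrix itself should only be on the order of 11 GiB (make_ames() returns 2,930 rows, and doubles are 8 bytes each), nowhere near the 3017.5 Gb the error reports, which makes me suspect the blow-up happens somewhere in the formula handling rather than in the data itself:

```r
## back-of-envelope size of the 2930 x 500000 double matrix itself
n <- 2930    # rows in make_ames()
p <- 500000  # simulated predictors
gib <- n * p * 8 / 2^30
gib  # about 10.9 GiB -- far below the 3017.5 Gb in the error
```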
I'm running this on a machine with quite a lot of RAM, so I suspect the command is trying to allocate far more memory than it actually needs somewhere along the way, but I'm not sure where.
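One workaround I've been experimenting with, sketched here under the assumption that the data-frame method of recipe() (with its vars and roles arguments) skips the expensive expansion of the `Sale_Price ~ .` formula across all 500,002 columns, is to assign roles directly instead of using a formula:

```r
## sketch of the non-formula interface: declare roles by hand so that
## recipe() never has to expand `Sale_Price ~ .` over every column;
## Sale_Price is the first column of big.ames, everything else a predictor
rec <- recipe(big.ames,
              vars  = names(big.ames),
              roles = c("outcome",
                        rep("predictor", ncol(big.ames) - 1)))
rec
```

I haven't confirmed this avoids the allocation on the full 500,000-column data, but on smaller wide datasets it builds the same recipe without going through the formula machinery.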