Error in R: "long vectors not supported yet" when using SVM

I am using a CentOS 7 Linux compute cluster with 130 GB of RAM. I am trying to use the svm() function from the e1071 R package. My matrix dimension is rows = 350 and columns = 54250.

R script Code (file_testR.R)

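    # 350 samples x 54251 numeric features, filled with random values
    matris <- matrix(rnorm(350 * 54251), 350, 54251)
    matris <- as.data.frame(matris)
    # class label: first 175 rows "yes", remaining 175 rows "no"
    matris$new_variable <- 0
    matris$new_variable[1:175] <- "yes"
    matris$new_variable[176:350] <- "no"
    library(e1071)
    # linear-kernel SVM with 10-fold cross-validation, via the formula interface
    svmfit_test <- svm(as.factor(matris$new_variable) ~ ., data = matris, kernel = "linear", cross = 10)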

Bash code

    Rscript --max-ppsize=500000 file_testR.R

I am getting the error below:

    Error in model.matrix.default(Terms, m) :
    long vectors not supported yet: ../../src/include/Rinlinedfuns.h:522
    Calls: svm ... svm.formula -> model.matrix -> model.matrix.default

I would appreciate it if anybody could help me to understand this issue.

Hi and welcome!

I can suggest the shape of the underlying problem, but I can't extend any suggestions for what to do about it, unfortunately. I hope others will weigh in.

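The traceback shows the error being raised inside model.matrix.default(), which svm() reaches through its formula interface. That code is compiled against the Rinlinedfuns.h header, which you can find here. The message points at line 522, which is the last line of this chunk: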

    /* from dstruct.c */

    /*  length - length of objects  */

    int Rf_envlength(SEXP rho);

    /* TODO: a  Length(.) {say} which is  length() + dispatch (S3 + S4) if needed
             for one approach, see do_seq_along() in ../main/seq.c
    */
    INLINE_FUN R_len_t length(SEXP s)
    {
        switch (TYPEOF(s)) {
        case NILSXP:
            return 0;
        case LGLSXP:
        case INTSXP:
        case REALSXP:
        case CPLXSXP:
        case STRSXP:
        case CHARSXP:
        case VECSXP:
        case EXPRSXP:
        case RAWSXP:
            return LENGTH(s);

As I parse it, length(SEXP s) is the hangup.

The header is in C and this note observes that

Since arrays decay immediately into pointers, an array is never actually passed to a function.

And an array is just what we are dealing with. Why is that important? Because there may be a limit on how much memory it's possible to point to. This can be affected, of course, by RAM, but yours is orders of magnitude greater than the size of your object. It can also be limited by OS-set defaults such as ulimit, or by how much memory malloc in C is able to allocate. Note too that length() returns an R_len_t, which is a 32-bit int, so it cannot report a length above 2^31 - 1 elements no matter how much memory is available; anything longer is what R calls a long vector.
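To put rough numbers on that kind of size limit, here is a back-of-the-envelope sketch in R. It rests on an assumption of mine that is not confirmed anywhere in this thread: that the formula interface builds an intermediate integer matrix with one row per variable and one column per term (the "factors" attribute of a terms object), and that this matrix is the object that grows too long. The data frame `small` and the variable names are made up for illustration.

```r
# Small demonstration: the "factors" attribute of a terms object is an
# integer matrix with one row per variable and one column per term.
small <- data.frame(y = c(0, 1, 0, 1), a = 1:4, b = 4:1, c = c(2, 2, 3, 3))
f <- attr(terms(y ~ ., data = small), "factors")
dim_small <- dim(f)   # rows: y, a, b, c; columns: a, b, c
print(dim_small)

# Same bookkeeping for the question's data, assuming at least 54251 terms
# on the right-hand side of the formula (an assumption, see the text above).
n_terms <- 54251
n_cells <- (n_terms + 1) * n_terms    # computed in double precision
max_len <- .Machine$integer.max       # largest length of an ordinary R vector
exceeds_limit <- n_cells > max_len
print(exceeds_limit)
```

If that assumption holds, the problem would be the number of columns squared rather than the 130 GB of RAM, and it would start to bite somewhere around 46,000 columns, since the square root of 2^31 is about 46,341.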

I don't pretend to be able to unravel this puzzle, but I think that is its approximate shape.
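One thing that might be worth trying, offered as an untested sketch and not a confirmed fix: the traceback goes through svm.formula and model.matrix, and svm() also accepts a numeric matrix `x` and a response `y` directly, which skips the formula machinery altogether. The example below uses a deliberately small stand-in matrix (200 columns in place of 54,251), and the names `x` and `y` are mine. Whether this runs in acceptable time and memory at full size is something I have not checked.

```r
set.seed(1)

# Small stand-in for the real data: 350 samples, 200 features (hypothetical size).
x <- matrix(rnorm(350 * 200), nrow = 350, ncol = 200)

# Class labels as a factor, kept separate from the predictors.
y <- factor(rep(c("yes", "no"), each = 175))

# Passing x and y directly uses the matrix interface of svm(), so no formula
# is parsed. The guard keeps the script runnable where e1071 is not installed.
if (requireNamespace("e1071", quietly = TRUE)) {
  svmfit_test <- e1071::svm(x = x, y = y, kernel = "linear", cross = 10)
  print(summary(svmfit_test))
}
```

With the real data, `x` would be the original 350 x 54251 numeric matrix before the as.data.frame() step, and `y` the yes/no labels.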

Thank you very much for your suggestion!
