I have a small Rcpp function and an associated R wrapper function that outperformed the built-in rowSums function under R 3.x. Since moving to R 4.x, it is dramatically slower than the built-in function when running in RStudio under Windows 10 Professional. I contacted Dirk Eddelbuettel (the principal maintainer of Rcpp) via GitHub, assuming there was an issue between R 4.x and Rcpp. He was able to demonstrate that the problem does not occur when running R 4.x under Linux or macOS 10. I am left to assume that this is a Windows (or possibly Rtools 4) or RStudio problem. If anyone has experienced this issue, I would appreciate some guidance. Test code and results are below.
Running in my RStudio:
> x = matrix(data = rnorm(1e8), ncol = 1000)
> system.time(rowSums(x))
   user  system elapsed
   0.23    0.00    0.24
> system.time(rowSumsA(x))
   user  system elapsed
   1.74    0.00    1.74
Running on Ubuntu 20.04:
> x <- matrix(data = rnorm(1e8), ncol = 1000)
> system.time(rowSums(x))
   user  system elapsed
  0.167   0.000   0.167
> system.time(rowSumsA(x))
   user  system elapsed
  0.053   0.000   0.053
Code:
#include <Rcpp.h>

// [[Rcpp::export]]
SEXP rowSumsACPP(SEXP x, SEXP na_rm) {
  // Assumes that R wrapper function will only pass in a matrix
  using namespace Rcpp;

  NumericMatrix xin(x);
  long nRows = xin.nrow();
  long nCols = xin.ncol();
  bool removeNA = as<bool>(na_rm);
  NumericVector xout = NumericVector(nRows);

  if(removeNA) {
    for(long c = 0; c < nCols; c++) {
      for(long r = 0; r < nRows; r++) {
        // Only non-NA contents of xin will equal itself
        if(xin(r, c) == xin(r, c)) {
          xout[r] += xin(r, c);
        }
      }
    }
  } else {
    for(long c = 0; c < nCols; c++) {
      for(long r = 0; r < nRows; r++) {
        xout[r] += xin(r, c);
      }
    }
  }

  return(xout);
}

/*** R
rowSumsA <- function(x, na.rm = FALSE) {
  if(mode(na.rm) != "logical") {
    stop("na.rm parameter must be of type logical")
  }
  if(any(class(x) == "data.frame")) {
    return(rowSums(x, na.rm))
  }
  if(is.null(dim(x))) {
    # This case is assumed to be a vector, just return it unchanged
    xout = x
  } else {
    if(!(mode(x) %in% c("numeric", "logical"))) {
      stop("x parameter is not numeric or logical")
    }
    # This case is assumed to be a zoo or matrix
    # Rcpp will convert zoo to matrix during assignment, so we don't need to explicitly handle
    xout = rowSumsACPP(x, na.rm)
  }
  return(xout)
}
*/
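
One way to narrow this down might be to time the same loop as plain C++, with no R, Rcpp, or RStudio involved, built with the same compiler that Rtools 4 provides. The following is only a minimal sketch under stated assumptions: rowSumsPlain and timeRowSums are hypothetical names that are not part of the code above, the matrix is assumed to be stored column-major in a std::vector<double> (the layout R uses), and std::isnan stands in for the self-comparison NA test.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical standalone version of the row-sum loop (no R or Rcpp).
// `x` holds an nRows x nCols matrix in column-major order, as R stores it.
std::vector<double> rowSumsPlain(const std::vector<double>& x,
                                 std::size_t nRows, std::size_t nCols,
                                 bool removeNA) {
    std::vector<double> out(nRows, 0.0);
    for (std::size_t c = 0; c < nCols; ++c) {
        const double* col = x.data() + c * nRows;  // start of column c
        for (std::size_t r = 0; r < nRows; ++r) {
            // Skip NaN entries only when removeNA is requested
            if (removeNA && std::isnan(col[r])) continue;
            out[r] += col[r];
        }
    }
    return out;
}

// Times one call (removeNA = false) and returns elapsed wall-clock seconds.
double timeRowSums(const std::vector<double>& x,
                   std::size_t nRows, std::size_t nCols) {
    auto start = std::chrono::steady_clock::now();
    std::vector<double> result = rowSumsPlain(x, nRows, nCols, false);
    auto stop = std::chrono::steady_clock::now();
    // Read one element through a volatile so the call is not optimized away
    volatile double sink = result.empty() ? 0.0 : result[0];
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling timeRowSums from a small main() on a 100000 x 1000 matrix, on both Windows and Linux, would show whether the gap appears without Rcpp in the picture. If the plain loop is fast on Windows, the compiler is a less likely culprit than the way the Rcpp function is built or called.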