I don't know why, but I got completely engrossed in code golfing this question. The most succinct versions I've been able to come up with are:
x <- c("A", "B", "C")
df <- as.data.frame(sapply(x, function(x) numeric()))
df <- as.data.frame(vapply(x, function(x) numeric(), numeric()))
df <- as.data.frame(replicate(length(x), numeric()), col.names = x)
df <- as.data.frame(matrix(ncol = length(x), nrow=0, dimnames = list(NULL,x)))
df <- as.data.frame(matrix(numeric(),nrow = 0, ncol = length(x))); names(df) <- x
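As a quick sanity check, here is a minimal sketch that builds all five candidates and compares their shape, names, and column type. The `candidates` list and the `check` summary are names I made up for illustration, and `setNames()` stands in for the separate `names(df) <- x` step. One thing worth knowing: `matrix()` fills with `NA` when no data is given, so the `dimnames` version produces logical columns rather than numeric ones.

```r
x <- c("A", "B", "C")

# The five candidates, collected in a named list so they can be checked together.
candidates <- list(
  sapply          = as.data.frame(sapply(x, function(x) numeric())),
  vapply          = as.data.frame(vapply(x, function(x) numeric(), numeric())),
  replicate       = as.data.frame(replicate(length(x), numeric()), col.names = x),
  name_with       = as.data.frame(matrix(ncol = length(x), nrow = 0,
                                         dimnames = list(NULL, x))),
  # setNames() is equivalent to assigning names(df) <- x afterwards
  name_separately = setNames(
    as.data.frame(matrix(numeric(), nrow = 0, ncol = length(x))), x)
)

# One row per candidate: dimensions, whether the names match x, and the column class.
check <- data.frame(
  nrow      = vapply(candidates, nrow, integer(1)),
  ncol      = vapply(candidates, ncol, integer(1)),
  names_ok  = vapply(candidates, function(d) identical(names(d), x), logical(1)),
  col_class = vapply(candidates, function(d) class(d[[1]]), character(1))
)
print(check)
```

If the columns really have to be numeric, passing `numeric()` as the data to `matrix()` (as in the last candidate) avoids the logical default.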
But succinct doesn't translate into efficient. I timed each of these on data frames with 1 to 500 columns (why anyone would ever make 500 columns of a data frame like this, I don't know, but let's not have reality interfere with ridiculous fun).
The best performer ends up being
df <- as.data.frame(matrix(ncol = length(x), nrow=0, dimnames = list(NULL,x)))
in case you're really desperate to save yourself a few microseconds (and I might need a better hobby).
library(dplyr)
library(tidyr)
library(ggplot2)
library(microbenchmark)  # provides microbenchmark(), used in bench_fun() below
# Time all five approaches for a data frame with x columns named X1, X2, ...
bench_fun <- function(x){
  x <- sprintf("X%s", seq_len(x))
  microbenchmark(
    sapply = df <- as.data.frame(sapply(x, function(x) numeric())),
    vapply = df <- as.data.frame(vapply(x, function(x) numeric(), numeric())),
    replicate = df <- as.data.frame(replicate(length(x), numeric()), col.names = x),
    name_with = df <- as.data.frame(matrix(ncol = length(x), nrow=0, dimnames = list(NULL,x))),
    name_separately = {df <- as.data.frame(matrix(numeric(),nrow = 0, ncol = length(x))); names(df) <- x}
  )
}
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
Bench <-
  tibble(ncol = 1:500) %>%
  mutate(result = lapply(ncol, bench_fun))
X <- Bench %>%
  mutate(result = lapply(result, as.data.frame)) %>%
  unnest(result) %>%
  group_by(ncol, expr) %>%
  summarise(median = median(time, na.rm = TRUE)) %>%
  ungroup()
ggplot(data = X,
       mapping = aes(x = ncol,
                     y = median,
                     colour = expr)) +
  geom_line()