I'm developing a pipeline in R in which a user supplies information on how the nodes in a network are connected, and I want to get back a matrix that holds the same information in a distance-matrix-style layout. Here's a simple example. Suppose the user supplies a data frame that gives the strength of the (directed) connections between four nodes:
library(tidyverse)
edges_set <- tibble::tribble(
  ~from,   ~to,     ~weight,
  "node1", "node2", 1,
  "node2", "node3", 0.4,
  "node2", "node4", 0.6,
  "node3", "node4", 0.8,
  "node3", "node2", 0.2,
  "node4", "node1", 1
) %>% as.data.frame()
The analysis functions I'm using expect a matrix laid out like a distance matrix between the nodes above. I have a "brute-force" approach to building this matrix that relies on an inefficient loop over the edges:
nodes_names <- paste0("node", 1:4)
# set up weight matrix
m <- matrix(0, nrow=length(nodes_names), ncol=length(nodes_names))
rownames(m) <- colnames(m) <- nodes_names
# populate edge weights in appropriate matrix elements
for (dd in seq_len(nrow(edges_set))) {
  # seq_len() avoids iterating over c(1, 0) when there are no edges
  row_id <- edges_set[dd, "from"]
  col_id <- edges_set[dd, "to"]
  m[row_id, col_id] <- as.numeric(edges_set[dd, "weight"])
}
The performance is fine for a small network, but typical use cases for the analysis involve many more nodes and therefore many more edges. Is there a vectorized way of converting the edge-list data frame above (from, to, weight) to the matrix layout? I'm trying to squeeze every bit of performance out of this pipeline.
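To make the target concrete, here is a minimal sketch of one vectorized option that might fit, using base R's indexing of a matrix by a two-column character matrix of row and column names. It assumes every value in from and to appears in the matrix dimnames; the names m_vec and idx are hypothetical and used only for illustration, and the edge list is rebuilt with data.frame() so the snippet runs without the tidyverse.

```r
# Self-contained copy of the example edge list (base R only)
edges_set <- data.frame(
  from   = c("node1", "node2", "node2", "node3", "node3", "node4"),
  to     = c("node2", "node3", "node4", "node4", "node2", "node1"),
  weight = c(1, 0.4, 0.6, 0.8, 0.2, 1),
  stringsAsFactors = FALSE
)

nodes_names <- paste0("node", 1:4)

# Empty weight matrix with named rows and columns
# (m_vec is a hypothetical name, chosen to avoid clobbering m)
m_vec <- matrix(0,
                nrow = length(nodes_names),
                ncol = length(nodes_names),
                dimnames = list(nodes_names, nodes_names))

# A two-column character matrix used as an index picks one
# (row name, column name) cell per row, so every weight is
# assigned in a single call instead of one loop iteration per edge
idx <- as.matrix(edges_set[, c("from", "to")])
m_vec[idx] <- edges_set$weight
```

For this example, m_vec should hold the same values as the loop-built m. Whether this is fast enough for large networks is something I have not benchmarked, and a dense matrix may itself become the bottleneck when the network is large and sparse.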