# Using pmap with 2 tibbles

I'm fairly new to tidyverse. I have managed to get this code to work with a small example, but I'm sure there must be a better way. Here's the problem I'm trying to code.

I have 2 tibbles, tA and tB (my actual tibbles have 100,000+ rows)

``````tA <- tibble::tribble(
~name1, ~name2,
"k",  "fw",
"k",  "im",
"k",  "fw",
"g",  "fw",
"g",  "im",
)

tB <- tibble::tribble(
~k_im, ~k_fw, ~g_im, ~g_fw,
0.031, 0.053, 0.000, 0.090,
0.209, 0.105, 0.000, 0.105,
0.274, 0.125, 0.158, 0.132,
0.331, 0.186, 0.199, 0.185,
0.344, 0.205, 0.201, 0.271,
0.367, 0.235, 0.272, 0.308,
0.382, 0.270, 0.295, 0.368,
0.390, 0.285, 0.299, 0.430,
0.397, 0.355, 0.348, 0.443,
0.484, 0.406, 0.419, 0.524,
0.532, 0.470, 0.430, 0.531,
0.557, 0.530, 0.468, 0.601,
0.609, 0.590, 0.530, 0.614,
0.646, 0.646, 0.593, 0.631,
0.712, 0.692, 0.644, 0.687,
0.730, 0.700, 0.652, 0.694,
0.793, 0.725, 0.706, 0.707,
0.845, 0.768, 0.766, 0.772,
0.862, 0.831, 0.814, 0.778,
0.886, 0.876, 0.863, 0.788,
0.887, 0.918, 0.896, 0.835,
1.000, 0.926, 0.918, 0.904,
1.000, 0.976, 0.969, 0.990,
1.000, 1.000, 1.000, 1.000
)
``````

For every row in tA I need to generate a different random number between 0 and 1, then look up the appropriate variable in tB (so for row 1 in tA, that would be column k_fw in tB) and find the first row where the value in tB is greater than the random number.

This function does what I need, e.g. findX(tB, "k_im", 0.5) returns 11.

``````findX <- function(x,y,z){ match(TRUE,x[y]>z) }
``````

This gives the right output but involves inserting tB into every row. I think I must be missing something!

``````output <- tA %>%
mutate(name3 = paste0(name1,"_",name2)) %>%
mutate(rnm = runif(5)) %>%
mutate(xx = list(tB)) %>%
mutate(rr = unlist(purrr::pmap(list(xx, name3, rnm), findX)))
``````

Any thoughts? I think I've been staring at this for so long a fresh approach might be needed. Thanks!

Another solution that's not miles away from yours but doesn't repeat `tB`:

``````output2 <- tA %>%
dplyr::mutate(rnm = runif(5)) %>%
dplyr::mutate(rr = purrr::map2_int(paste0(name1, "_", name2), rnm, ~findX(tB, .x, .y)))
``````
1 Like

On my system this is faster and with a smaller memory footprint

``````library(tidyverse)
library(microbenchmark)
tA <- tibble::tribble(
~name1, ~name2,
"k", "fw",
"k", "im",
"k", "fw",
"g", "fw",
"g", "im",
)

tB <- tibble::tribble(
~k_im, ~k_fw, ~g_im, ~g_fw,
0.031, 0.053, 0.000, 0.090,
0.209, 0.105, 0.000, 0.105,
0.274, 0.125, 0.158, 0.132,
0.331, 0.186, 0.199, 0.185,
0.344, 0.205, 0.201, 0.271,
0.367, 0.235, 0.272, 0.308,
0.382, 0.270, 0.295, 0.368,
0.390, 0.285, 0.299, 0.430,
0.397, 0.355, 0.348, 0.443,
0.484, 0.406, 0.419, 0.524,
0.532, 0.470, 0.430, 0.531,
0.557, 0.530, 0.468, 0.601,
0.609, 0.590, 0.530, 0.614,
0.646, 0.646, 0.593, 0.631,
0.712, 0.692, 0.644, 0.687,
0.730, 0.700, 0.652, 0.694,
0.793, 0.725, 0.706, 0.707,
0.845, 0.768, 0.766, 0.772,
0.862, 0.831, 0.814, 0.778,
0.886, 0.876, 0.863, 0.788,
0.887, 0.918, 0.896, 0.835,
1.000, 0.926, 0.918, 0.904,
1.000, 0.976, 0.969, 0.990,
1.000, 1.000, 1.000, 1.000
)

findX <- function(x, y, z) {
match(TRUE, x[y] > z)
}

tA %>%
mutate(name3 = paste0(name1, "_", name2)) %>%
mutate(rnm = runif(5)) -> tA2

microbenchmark::microbenchmark(

orig = output <- tA2 %>%
mutate(xx = list(tB)) %>%
mutate(rr = unlist(purrr::pmap(list(xx, name3, rnm), findX))),

new = {
rr <- map_int(
1:nrow(tA2),
~ findX(tB, tA2\$name3[.], tA2\$rnm[.])
) %>% enframe(value = "rr", name = NULL)
newout <- bind_cols(tA, rr)
}
)``````
2 Likes

Thanks @ChrisL I hadn't realised map2_int would work like that, very useful.

Many thanks @nirgrahamuk, this is certainly a lot faster than my solution. Both answers work fine and get around the issue of repeating tB, but I have marked this as the solution as it seems more flexible when applied to my 'real' code. (I have to repeat the process several times adding a new 'rr' column each time.)

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.