Row-Wise Comparison of Values in Nested Lists

Hello. I have a tibble with 2 list-columns, where each cell holds a numeric vector. I would like to make a new column that, for each row, gives the percentage of values in the column_a cell that are greater than the corresponding values in the column_b cell. Every cell holds the same number of elements, if that matters.

I know there is a trick for two plain vectors: comparing them element-wise and taking the mean of the resulting logical vector, as in mean(vector_a > vector_b), gives the proportion I care about. I would like to use that approach if possible, but I don't know how to apply it when both vectors are nested and I have a tibble full of them.
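To make the trick concrete, here is a minimal sketch with two plain vectors. The values are made up for illustration and are not from my real data. The comparison returns a logical vector, and mean() counts TRUE as 1 and FALSE as 0, so the result is a proportion between 0 and 1 (multiply by 100 for a percentage).

```r
# Hypothetical example values, chosen only to illustrate the trick
vector_a <- c(3.1, 4.2, 5.0, 2.7)
vector_b <- c(2.9, 4.5, 4.1, 3.0)

# Element-wise comparison gives a logical vector
a_greater <- vector_a > vector_b   # TRUE FALSE TRUE FALSE

# mean() of a logical vector is the proportion of TRUE values
prop_greater <- mean(a_greater)    # 0.5
```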

Example of a similar tbl with nested lists:

library(tidyverse)

n <- 10

a <- tibble(mu = c(3, 4, 5), sd = 1) %>%
  mutate(column_a = map2(mu, sd, ~ rnorm(n, .x, .y))) %>%
  select(column_a)

b <- tibble(mu = c(3, 4, 5), sd = .8) %>%
  mutate(column_b = map2(mu, sd, ~ rnorm(n, .x, .y))) %>%
  select(column_b)

combined <- bind_cols(a, b)

combined

Do you mean something like this?

library(tidyverse)

n <- 10

a <- tibble(mu = c(3, 4, 5), sd = 1) %>%
  mutate(column_a = map2(mu, sd, ~ rnorm(n, .x, .y))) %>%
  select(column_a)

b <- tibble(mu = c(3, 4, 5), sd = .8) %>%
  mutate(column_b = map2(mu, sd, ~ rnorm(n, .x, .y))) %>%
  select(column_b)

combined <- bind_cols(a, b)

combined %>%
  dplyr::mutate(perc = purrr::map2_dbl(column_a, column_b, ~mean(.x > .y)))
#> # A tibble: 3 x 3
#>   column_a   column_b    perc
#>   <list>     <list>     <dbl>
#> 1 <dbl [10]> <dbl [10]>   0.6
#> 2 <dbl [10]> <dbl [10]>   0.5
#> 3 <dbl [10]> <dbl [10]>   0.6

Created on 2020-03-29 by the reprex package (v0.3.0)
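In case it helps to see why this works, here is a rough sketch on a tiny hand-written tibble, so the result does not depend on random draws. The object toy and its values are hypothetical and only for illustration. map2_dbl() walks the two list-columns in parallel, hands the i-th cell of each to the formula as .x and .y, and returns one double per row. The perc_base column shows what I believe is an equivalent base R call using mapply().

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical hand-written list-columns so the result is deterministic
toy <- tibble(
  column_a = list(c(1, 5, 3), c(2, 2, 2)),
  column_b = list(c(0, 6, 1), c(3, 1, 3))
)

toy_perc <- toy %>%
  mutate(
    # map2_dbl() pairs the i-th cell of column_a with the i-th cell of column_b
    perc = map2_dbl(column_a, column_b, ~ mean(.x > .y)),
    # base R alternative: mapply() also iterates over both lists in parallel
    perc_base = mapply(function(x, y) mean(x > y), column_a, column_b)
  )

# Row 1: 2 of 3 values in column_a are greater; row 2: 1 of 3
toy_perc$perc
```

Because every row must reduce to a single number, map2_dbl() is a natural fit; plain map2() would return a list-column instead.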


It's so simple! Thank you mishabalyasin!
