All combinations of a 20x2 df

Hello everyone,

I am trying to perform a "simple" task but am struggling to find a way to do it smartly.
I have a 20 by 2 table, with a character vector (colors) and a numeric vector (values).
In plain words, each color has a value attributed to it.

My goal is to create all the combinations of colors possible (Green | Blue, Red | Blue | Yellow, etc...), store that information in the first column vector, and store the average of their associated values in the second column.

So, the resulting table should be an X-by-2 table, with the colour combinations as strings in the first column and the result of a mathematical operation (the average) in the second column.

Do you have any ideas, please?

Here's a reprex that should get you started, and I think you'll be able to take it from here: you can get the average using mutate(), and then select the columns you want in your final data frame (what you're calling a table) using select().

I've generated every ordered pair of colours (including each colour paired with itself) using tidyr's expand_grid(). In my opinion, you're better off keeping the original colours in their own columns so you can join the values later (hence my use of remove = FALSE in the unite() call).

library(dplyr)
library(tidyr)

ex_df <- tibble::tribble(
   ~colour, ~value,
    "blue",     8L,
   "green",     4L,
     "red",     3L,
  "yellow",    11L,
    "pink",     2L,
  "violet",     6L
)

col_cross <- expand_grid(col1 = ex_df$colour, col2 = ex_df$colour)
col_cross
#> # A tibble: 36 x 2
#>    col1  col2  
#>    <chr> <chr> 
#>  1 blue  blue  
#>  2 blue  green 
#>  3 blue  red   
#>  4 blue  yellow
#>  5 blue  pink  
#>  6 blue  violet
#>  7 green blue  
#>  8 green green 
#>  9 green red   
#> 10 green yellow
#> # … with 26 more rows

col_cross %>%
  unite("col_combo", col1, col2, sep = ", ", remove = FALSE) %>%
  left_join(ex_df, by = c("col1" = "colour")) %>% # get value for first colour
  left_join(ex_df, by = c("col2" = "colour")) # get value for second colour
#> # A tibble: 36 x 5
#>    col_combo     col1  col2   value.x value.y
#>    <chr>         <chr> <chr>    <int>   <int>
#>  1 blue, blue    blue  blue         8       8
#>  2 blue, green   blue  green        8       4
#>  3 blue, red     blue  red          8       3
#>  4 blue, yellow  blue  yellow       8      11
#>  5 blue, pink    blue  pink         8       2
#>  6 blue, violet  blue  violet       8       6
#>  7 green, blue   green blue         4       8
#>  8 green, green  green green        4       4
#>  9 green, red    green red          4       3
#> 10 green, yellow green yellow       4      11
#> # … with 26 more rows

Created on 2021-04-05 by the reprex package (v1.0.0)

I'm sure there's a shorter way to do this, but I thought it would be useful to see all of the steps.
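To round off the steps above, here is a minimal sketch of the remaining mutate() and select() calls. The filter() line is my own assumption, not part of the reprex: it keeps each unordered pair once and drops a colour paired with itself. The names pair_avg and avg_value are purely illustrative.

```r
library(dplyr)
library(tidyr)

ex_df <- tibble::tribble(
  ~colour,  ~value,
  "blue",       8L,
  "green",      4L,
  "red",        3L,
  "yellow",    11L,
  "pink",       2L,
  "violet",     6L
)

pair_avg <- expand_grid(col1 = ex_df$colour, col2 = ex_df$colour) %>%
  # assumption: treat "blue, green" and "green, blue" as the same combination
  # and drop self-pairs; remove this line to keep all 36 ordered pairs
  filter(col1 < col2) %>%
  unite("col_combo", col1, col2, sep = ", ", remove = FALSE) %>%
  left_join(ex_df, by = c("col1" = "colour")) %>% # value.x after the second join
  left_join(ex_df, by = c("col2" = "colour")) %>% # value.y
  mutate(avg_value = (value.x + value.y) / 2) %>% # average of the two values
  select(col_combo, avg_value)                    # keep only the two final columns

pair_avg
```

With the filter() line in place, the six example colours give 15 rows, one per unordered pair.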

For future reference, providing us with a reprex (a self-contained reproducible example) makes it easier for us to help you!


Mara's answer is quite good.

As a small variant, one can also use tidyr::expand(), which I believe fits better in a chain of piped tidyverse commands.

The result is a bit more succinct. By the way, if the following is unclear, you can add and run the piped commands one at a time to see what each does. I do this by putting a comment symbol # before the pipe symbol %>% to interrupt the chain at that point.

Following Mara:

library(dplyr)
library(tidyr)

# make an example dataframe
ex_df <- tibble::tribble(
   ~colour, ~value,
    "blue",     8L,
   "green",     4L,
     "red",     3L,
  "yellow",    11L,
    "pink",     2L,
  "violet",     6L
)

# see help("expand")

col_cross <- ex_df %>%
 mutate(colour2 = colour) %>% # duplicate the colours, so they can be "crossed"
 expand(colour, colour2) %>% # generates all combination of the specified variables
 left_join(ex_df, by=c("colour")) %>% # add/join values for first colour
 left_join(ex_df, by=c("colour2" = "colour")) %>% # add/join values for colour2
 unite(comb_col, c(colour, colour2)) %>% # combine the colours into one col
 mutate(ave = (value.x + value.y)/2) # compute ave
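Both answers so far build pairs of colours. The original question also mentions larger groups (e.g. Red | Blue | Yellow), so here is a rough base R sketch for combinations of every size using combn(). The names all_combos, combo and avg_value are illustrative, and the four-colour data frame is a shortened version of the example data. Single colours are included as combinations of size one; start the sequence at 2 to skip them.

```r
ex_df <- data.frame(
  colour = c("blue", "green", "red", "yellow"),
  value  = c(8, 4, 3, 11),
  stringsAsFactors = FALSE
)

n <- nrow(ex_df)

# For each subset size k, combn() lists every set of k row indices;
# each set is turned into a label and the mean of its values.
all_combos <- do.call(rbind, lapply(seq_len(n), function(k) {
  idx <- combn(n, k, simplify = FALSE)
  data.frame(
    combo = vapply(idx, function(i) paste(ex_df$colour[i], collapse = " | "),
                   character(1)),
    avg_value = vapply(idx, function(i) mean(ex_df$value[i]), numeric(1)),
    stringsAsFactors = FALSE
  )
}))

all_combos
```

The number of rows is 2^n - 1, so the four colours here give 15 rows, while the 20 colours in the question would give 1,048,575. That is workable, but worth keeping in mind before printing or joining the result.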
