Reordering X and Y axes in geom_tile based on different variable factor

I am trying to build a correlation matrix heat map using geom_tile, but I want to reorder the X and Y axes based on the factors in another column. Below is a sample of the data that I'm using right now. These are just the first 10 rows; the full Pos.y column also contains QB, RB, WR, etc. values:

                 X1             X2         value Pos.x Pos.y
1    Baker Mayfield Baker Mayfield  1.0000000000    QB    QB
2       Kareem Hunt Baker Mayfield  0.0229086957    RB    QB
3        Nick Chubb Baker Mayfield  0.3685311077    RB    QB
4   DErnest Johnson Baker Mayfield  0.0001331921    RB    QB
5     Andy Janovich Baker Mayfield -0.0392098985    RB    QB
6     Jarvis Landry Baker Mayfield  0.4692240161    WR    QB
7     Odell Beckham Baker Mayfield  0.1755536358    WR    QB
8    KhaDarel Hodge Baker Mayfield  0.0086210505    WR    QB
9   Rashard Higgins Baker Mayfield  0.4290631835    WR    QB
10      JoJo Natson Baker Mayfield  0.1645174257    WR    QB

What I want to do is reorder the X-axis by Pos.x and the Y-axis by Pos.y, using the following factor levels:

data$Pos.x <- factor(data$Pos.x, levels = c("QB", "RB", "WR", "TE", "Def"))

Below is the code I currently have for my plot:

ggplot(data = data)+
  geom_tile(aes(x = X1, y = X2, fill = value))

I've tried using arrange() on the dataset and reorder() in the aesthetics but nothing seems to be working. Any suggestions?

Hi @hopnstop,
One solution is to set up ordered factors.

library(tidyverse)
# I edited your data to include FIVE different strings in Pos.x,
# and changed the "values" to make the differences clearer.

my_dat <- read.csv(sep=",", header=TRUE, strip.white=TRUE, text="
             X1,              X2,   value, Pos.x, Pos.y
 Baker Mayfield,  Baker Mayfield,   1.0,  QB,  QB
    Kareem Hunt,  Baker Mayfield,   0.1,  RB,  QB
     Nick Chubb,  Baker Mayfield,   0.2,  RB,  QB
DErnest Johnson,  Baker Mayfield,   0.3,  RB,  QB
  Andy Janovich,  Baker Mayfield,   0.4,  RB,  QB
  Jarvis Landry,  Baker Mayfield,   0.7,  WR,  QB
  Odell Beckham,  Baker Mayfield,   0.8,  WR,  QB
 KhaDarel Hodge,  Baker Mayfield,   0.01,  TE,  QB
Rashard Higgins,  Baker Mayfield,   0.5,  Def,  QB
    JoJo Natson,  Baker Mayfield,   0.6,  Def,  QB")

# What I want to do is reorder the X-axis by Pos.x and the Y-axis by Pos.y,
# using the following factor levels:

my_dat$ord.x <- factor(my_dat$Pos.x, ordered=TRUE, levels = c("QB", "RB", "WR", "TE", "Def"))
str(my_dat)
#> 'data.frame':    10 obs. of  6 variables:
#>  $ X1   : chr  "Baker Mayfield" "Kareem Hunt" "Nick Chubb" "DErnest Johnson" ...
#>  $ X2   : chr  "Baker Mayfield" "Baker Mayfield" "Baker Mayfield" "Baker Mayfield" ...
#>  $ value: num  1 0.1 0.2 0.3 0.4 0.7 0.8 0.01 0.5 0.6
#>  $ Pos.x: chr  "QB" "RB" "RB" "RB" ...
#>  $ Pos.y: chr  "QB" "QB" "QB" "QB" ...
#>  $ ord.x: Ord.factor w/ 5 levels "QB"<"RB"<"WR"<..: 1 2 2 2 2 3 3 4 5 5

# Compare the two plots:
# You don't have unique combinations of Pos.x and Pos.y in your data, so
# geom_tile() draws one tile per row and the tiles overlap; only the last one
# drawn at each position is visible. The group means below are for reference only.
my_dat %>%
  group_by(ord.x) %>%
  summarise(mvar = mean(value))
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 5 x 2
#>   ord.x  mvar
#>   <ord> <dbl>
#> 1 QB     1   
#> 2 RB     0.25
#> 3 WR     0.75
#> 4 TE     0.01
#> 5 Def    0.55

ggplot(data = my_dat)+
  geom_tile(aes(x = Pos.x, y = Pos.y, fill = value))


ggplot(data = my_dat)+
  geom_tile(aes(x = ord.x, y = Pos.y, fill = value))

Created on 2020-12-10 by the reprex package (v0.3.0)
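
A quick way to see which tile positions receive more than one row is to count rows per x/y combination before plotting. This is only a minimal sketch: `toy`, `tile_counts`, `overlaps` and `n_rows` are hypothetical names, and the data is a cut-down stand-in for the edited sample above.

```r
library(dplyr)

# Hypothetical miniature of the edited data: two rows share the RB/QB position
toy <- data.frame(
  Pos.x = c("QB", "RB", "RB", "WR"),
  Pos.y = c("QB", "QB", "QB", "QB"),
  value = c(1.0, 0.1, 0.2, 0.7)
)

# Number of rows that would be drawn at each tile position
tile_counts <- toy %>%
  count(Pos.x, Pos.y, name = "n_rows")

# Tile positions that receive more than one value
overlaps <- tile_counts %>%
  filter(n_rows > 1)

print(overlaps)
```

If `overlaps` is not empty, the plot is showing several values at the same position, so either aggregate first or plot a variable that is unique per row.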

HTH


If I'm not mistaken, hopnstop wants to plot each individual player on x and y, using Pos.x only to define the order of X1. So what you did is correct, but X1, rather than ord.x, is the variable that needs to become a factor. That should work by sorting the rows by position (Pos.x, in the desired level order) and then setting the levels of X1 in order of appearance with forcats::fct_inorder():

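dat2 <- my_dat %>%
  # Sort by the ordered factor; Pos.x is character in my_dat and would sort alphabetically
  arrange(ord.x) %>%
  # Levels of X1 follow the sorted row order
  mutate(X1 = fct_inorder(X1))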
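
For the full matrix, the same idea probably needs to be applied to both axes. Here is a rough sketch under some assumptions: the `players` lookup table, the random `value` column, and the names `pos_levels`, `long` and `player_order` are all made up for illustration, and I'm assuming every player appears in both X1 and X2. It uses plain `factor(levels = ...)` instead of `fct_inorder()` so that one ordering can be reused for X1 and X2.

```r
library(ggplot2)
library(dplyr)

# Hypothetical lookup of players and positions (not the real data)
players <- data.frame(
  name = c("Jarvis Landry", "Baker Mayfield", "Nick Chubb", "Kareem Hunt"),
  pos  = c("WR", "QB", "RB", "RB")
)
pos_levels <- c("QB", "RB", "WR", "TE", "Def")

# Build every player pairing with made-up values standing in for correlations
set.seed(1)
long <- expand.grid(X1 = players$name, X2 = players$name,
                    stringsAsFactors = FALSE) %>%
  mutate(
    value = runif(n(), -1, 1),
    Pos.x = factor(players$pos[match(X1, players$name)], levels = pos_levels),
    Pos.y = factor(players$pos[match(X2, players$name)], levels = pos_levels)
  )

# One ordering of player names: by position first, then by name within position
player_order <- long %>%
  distinct(X1, Pos.x) %>%
  arrange(Pos.x, X1) %>%
  pull(X1)

# Apply the same level order to both axes
long <- long %>%
  mutate(
    X1 = factor(X1, levels = player_order),
    X2 = factor(X2, levels = player_order)
  )

p <- ggplot(long, aes(x = X1, y = X2, fill = value)) +
  geom_tile()
```

Printing `p` should give a heat map with the QB first on both axes, then the RBs, then the WR. Positions with no players (TE and Def here) simply contribute no names to the axes.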
