Unable to produce superscript in labels within x-axis

For five days I have been trying to use a superscript in the labels on the x-axis, but I have been unable to do it. I managed it for the x-axis title itself, but using a superscript in the labels of the individual columns seems impossible for me. Can someone please help me?


I have two solutions.

1 Manually rewriting

The first method is to use scale_x_discrete() and rewrite every label by hand:

library(tidyverse)

df <- tibble(concentration = rep(c("Atropine 10^{-3}", "Atropine 10^{-4}"),each=5),
             heart = c(rnorm(5, 150, 50), rnorm(5,100,75)))

ggplot(df) +
  geom_violin(aes(x=concentration, y=heart)) +
  scale_x_discrete(labels = c(bquote(Atropine*10^{-3}), bquote(Atropine*10^{-4})))

2 string to expression

The second method is to use str2expression() to convert the strings to expressions. For this to work, you first need to process the strings so that they become plotmath-compatible expressions. In my example, I need to replace the spaces with * or ~. Here is the full description.
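To make the point about spaces concrete, here is a minimal sketch in base R only (gsub() stands in for str_replace_all() so that nothing needs to be loaded; that swap is my assumption, not part of the answer above). It shows that the raw strings do not parse, while the versions with * or ~ do. In plotmath, * places two terms next to each other and ~ does the same with a space in between.

```r
# Labels as they appear in the example data.
raw <- c("Atropine 10^{-3}", "Atropine 10^{-4}")

# A bare space between two terms is not valid R syntax, so parsing fails.
parse_attempt <- tryCatch(str2expression(raw), error = function(e) e)
failed <- inherits(parse_attempt, "error")

# Replace the space with "*" (no gap) or "~" (visible gap) before parsing.
tight  <- str2expression(gsub(" ", "*", raw, fixed = TRUE))
spaced <- str2expression(gsub(" ", "~", raw, fixed = TRUE))
```

Either tight or spaced could then be passed to the labels argument; which one looks better depends on whether you want a gap between the drug name and the number.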

So now we can do:

df$concentration2 <- str_replace_all(df$concentration, " ", "*")
ggplot(df) +
  geom_violin(aes(x=concentration, y=heart)) +
  scale_x_discrete(labels = str2expression(df$concentration2))

This almost works, but you will see there is a problem: concentration2 contains the full column (10 values here), so it has neither the right length nor necessarily the same order as the labels we need. The next plot just uses unique(), which is enough for my example, but you might need to reorder to make sure the values are in the right order. The difficulty is that once you've applied str2expression() you can't easily manipulate that vector anymore, so you need to create a correct character vector of labels first.

ggplot(df) +
  geom_violin(aes(x=concentration, y=heart)) +
  scale_x_discrete(labels = str2expression(unique(df$concentration2)))
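The ordering caveat can be checked without plotting. The sketch below assumes that a character column is placed on a discrete axis in the same sorted order that factor() uses for its levels, and it uses a made-up vector in which the 10^{-4} group happens to come first. It builds the labels from the sorted values rather than from the order of appearance.

```r
# Hypothetical column where the 10^{-4} rows come before the 10^{-3} rows.
concentration <- rep(c("Atropine 10^{-4}", "Atropine 10^{-3}"), each = 3)

# Order of first appearance in the data.
appearance_order <- unique(concentration)

# Sorted order, as used for factor levels (assumed to match the axis order).
axis_order <- levels(factor(concentration))

# Build the plotmath labels from the axis order, then convert once at the end.
axis_labels <- str2expression(gsub(" ", "*", axis_order, fixed = TRUE))
```

Here appearance_order and axis_order differ, so labels built from unique() alone would end up attached to the wrong groups.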


Oh and I just realized there is a better way to do the same thing:

3 with a function

We can simply put both the conversion of spaces to "*" and the str2expression() in a custom function, and give it to labels:

make_plotmath_labels <- function(labs){
  labs <- str_replace_all(labs, " ", "*")
  str2expression(labs)
}

ggplot(df) +
  geom_violin(aes(x=concentration, y=heart)) +
  scale_x_discrete(labels = make_plotmath_labels)

This keeps the automatic ordering.

So with your labels, you might want to use that function:

make_plotmath_labels <- function(labs){
  labs <- str_replace_all(labs, " 1x10-([0-9])", "*1~x~10^{-\\1}")
  str2expression(labs)
}
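To see what that pattern does before using it in a plot, you can run the replacement on a single string. The label "Atropine 1x10-3" below is a guess at what such a label might look like, not taken from the actual data, and base gsub() is used in place of str_replace_all() so the snippet runs without loading any package.

```r
# Hypothetical label in the "1x10-3" style.
label <- "Atropine 1x10-3"

# Same pattern and replacement as in the function above: the digit captured
# by ([0-9]) is reinserted via \\1 inside the superscript braces.
converted <- gsub(" 1x10-([0-9])", "*1~x~10^{-\\1}", label)
converted

# The converted string is valid plotmath and can be parsed.
parsed <- str2expression(converted)
```

If a label does not match the pattern exactly (for example a different spacing or a two-digit exponent), it is left unchanged, so it is worth checking the converted strings before plotting.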

Dear @AlexisW,

I am new to R, and I really appreciate your answer. I am sure this is what I was looking for, but I do not understand what df <-t means. I am guessing that this is a general formula and that, in order to make it useful, I should replace it with the data I am currently working on. I am sorry in advance if it is a stupid question. I can imagine that the R community is far more experienced with R, and that when they see an example with generic data they are able to replace it with their own, but I am still learning, and I am hoping that once it is explained I will be one step further with R.

P.S. Please do not get me wrong: I am doing everything by myself, as I know that only by making mistakes do we learn faster and become more independent, so I do not want you to produce the entire code for me; that is not the case. I just noticed that everybody bases their answers on generic data, and in many cases I was getting stuck, as I did not understand what some sections mean.
Thank you

Yes, it's all about practice : )

For your example, you didn't provide your data. So, in order to try something I generated some random data, just to illustrate the approach. The best way to understand might be to copy/paste these two lines and look at the result:

df <- tibble(concentration = rep(c("Atropine 10^{-3}", "Atropine 10^{-4}"),each=5),
             heart = c(rnorm(5, 150, 50), rnorm(5,100,75)))

df
# A tibble: 10 x 2
#   concentration    heart
#   <chr>            <dbl>
# 1 Atropine 10^{-3} 148. 
# 2 Atropine 10^{-3} 212. 
# 3 Atropine 10^{-3}  71.8
# 4 Atropine 10^{-3} 139. 
# 5 Atropine 10^{-3} 153. 
# 6 Atropine 10^{-4} 194. 
# 7 Atropine 10^{-4}  78.1
# 8 Atropine 10^{-4}  44.8
# 9 Atropine 10^{-4}  49.5
#10 Atropine 10^{-4} 198. 

So I generated a data frame (also called a tibble) that has 10 rows and two columns. The first column is "Atropine 10^{-3}" repeated 5 times, then "Atropine 10^{-4}" repeated 5 times; the second column is just 10 random numbers. You don't really need to understand that code, looking at the result should be enough, but it shouldn't be too hard to understand if you look up each function.
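If it helps, the two lines can be taken apart piece by piece. This is only a sketch: it uses a base data.frame() instead of tibble() so that no package is needed, and set.seed() is added so the random numbers are the same on every run; neither is in the code above.

```r
# rep(..., each = 5) repeats each element 5 times in a row.
concentration <- rep(c("Atropine 10^{-3}", "Atropine 10^{-4}"), each = 5)

# rnorm(n, mean, sd) draws n random numbers from a normal distribution.
set.seed(1)  # assumption: fixed seed, only to make the example repeatable
heart <- c(rnorm(5, 150, 50), rnorm(5, 100, 75))

# Put the two vectors side by side as columns of one table.
df_base <- data.frame(concentration = concentration, heart = heart)
dim(df_base)
```

With your own data you would skip all of this and use the table you already have in place of df.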

As for the solution, it so happens that this question is actually really hard: functions like bquote() are among the hardest things to understand in R, it's definitely for advanced users. So at this point I recommend that you don't try too hard to understand what I explained in titles 1 and 2, simply take the function in title 3, I think it should work on your data. In 1 year, when you're starting to be more comfortable with R, you can come back to this and try to understand, for now focus on the analysis side.


I keep practicing. It takes time, as I am not a programmer; I am about to become a scientist, and I have started to love RStudio, as you can produce incredible results with it. Following your advice, I took the function in title 3, and the problem below occurred.

You are a pure genius. So I used scale_x_discrete(labels = c(bquote(Atropine*10^{-3}), bquote(Atropine*10^{-4}))) and


I do not know how to thank you.

The function definition is not part of the ggplot() call, so the two should not be bound with +. Your code should look like:

library(a)
library(b)

make_plotmath_labels <- function(labs){
  ...
}

Atropine2$Concentration <- factor(...)

ggplot(Atropine2) +
    geom_boxplot(...) +
    geom_jitter(...) +
    labs(....) +
    scale_x_discrete(labels = make_plotmath_labels)

So, the code for the function has to be "executed" before you use it. You will use it in scale_x_discrete(). So you first load your libraries (always first), then you define the functions and data, and then you can call ggplot() on the data using the functions.

ggplot() is a bit peculiar in its structure of many blocks bound together with +, but each block has to be a ggplot building block (e.g. geom_*()).


Happy that it works : )

Yes, it did

Thank you so much once again.
