How can I add tags to my points on a PCA with ggplot2?

Hi, I made a PCA plot with ggplot2, and I would like each point to be labelled with its name. What function could I use? Thank you.
Taxon MOPF MA RA TV GT
1 Andesiops 75 25 0 0 detritívoro
2 Meridialaris 5 95 0 0 herbívoro
3 Polycentropus 20 0 80 0 depredador
4 Austrelmis 93 7 0 0 detritívoro
5 Prionocyphon 40 0 60 0 depredador
6 Neoplasta 98 2 0 0 detritívoro
7 Gigantodax 65 15 20 0 detritívoro
8 Simulium 30 65 5 0 herbívoro
9 Tipula 30 0 0 70 herbívoro
10 Alotanypus 60 20 20 0 detritívoro
11 Pentaneura 35 5 60 0 depredador
12 Corynoneura 15 85 0 0 herbívoro
13 Cricotopus 90 10 0 0 detritívoro
14 Tanytarsus 95 5 0 0 detritívoro
15 Podonomus 25 75 0 0 herbívoro
16 Podonomopsis 85 15 0 0 detritívoro
17 Parochlus 95 5 0 0 detritívoro

Here is a reprex of your data. Have you got your ggplot code?

df <- tibble::tribble(
           ~Taxon, ~MOPF, ~MA, ~RA, ~TV,           ~GT,
      "Andesiops",    75,  25,   0,   0, "detritívoro",
   "Meridialaris",     5,  95,   0,   0,   "herbívoro",
  "Polycentropus",    20,   0,  80,   0,  "depredador",
     "Austrelmis",    93,   7,   0,   0, "detritívoro",
   "Prionocyphon",    40,   0,  60,   0,  "depredador",
      "Neoplasta",    98,   2,   0,   0, "detritívoro",
     "Gigantodax",    65,  15,  20,   0, "detritívoro",
       "Simulium",    30,  65,   5,   0,   "herbívoro",
         "Tipula",    30,   0,   0,  70,   "herbívoro",
     "Alotanypus",    60,  20,  20,   0, "detritívoro",
     "Pentaneura",    35,   5,  60,   0,  "depredador",
    "Corynoneura",    15,  85,   0,   0,   "herbívoro",
     "Cricotopus",    90,  10,   0,   0, "detritívoro",
     "Tanytarsus",    95,   5,   0,   0, "detritívoro",
      "Podonomus",    25,  75,   0,   0,   "herbívoro",
   "Podonomopsis",    85,  15,   0,   0, "detritívoro",
      "Parochlus",    95,   5,   0,   0, "detritívoro"
  )

If you are just looking for a pointer, you can use geom_text() or geom_label(). If you need more specific advice, I recommend providing a reproducible example, as William requested. Since it seems you are a fellow Peruvian (judging by your university), you may prefer the Spanish version of the reprex guide.
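As a minimal sketch of that pointer, assuming the PCA is run with prcomp() on the four numeric columns and that the taxon names are the labels you want (`pca`, `scores` and `p` are names made up for this illustration):

```r
library(ggplot2)

# Data copied from the table posted in this thread
df <- data.frame(
  Taxon = c("Andesiops", "Meridialaris", "Polycentropus", "Austrelmis",
            "Prionocyphon", "Neoplasta", "Gigantodax", "Simulium", "Tipula",
            "Alotanypus", "Pentaneura", "Corynoneura", "Cricotopus",
            "Tanytarsus", "Podonomus", "Podonomopsis", "Parochlus"),
  MOPF = c(75, 5, 20, 93, 40, 98, 65, 30, 30, 60, 35, 15, 90, 95, 25, 85, 95),
  MA   = c(25, 95, 0, 7, 0, 2, 15, 65, 0, 20, 5, 85, 10, 5, 75, 15, 5),
  RA   = c(0, 0, 80, 0, 60, 0, 20, 5, 0, 20, 60, 0, 0, 0, 0, 0, 0),
  TV   = c(0, 0, 0, 0, 0, 0, 0, 0, 70, 0, 0, 0, 0, 0, 0, 0, 0),
  GT   = c("detritívoro", "herbívoro", "depredador", "detritívoro", "depredador",
           "detritívoro", "detritívoro", "herbívoro", "herbívoro", "detritívoro",
           "depredador", "herbívoro", "detritívoro", "detritívoro", "herbívoro",
           "detritívoro", "detritívoro")
)

# PCA on the numeric columns only
pca <- prcomp(df[, c("MOPF", "MA", "RA", "TV")])

# One row of scores per taxon, with the label and grouping columns attached
scores <- data.frame(pca$x[, 1:2], Taxon = df$Taxon, GT = df$GT)

# geom_text() draws the taxon name slightly above each point
p <- ggplot(scores, aes(x = PC1, y = PC2, colour = GT)) +
  geom_point() +
  geom_text(aes(label = Taxon), vjust = -0.8, show.legend = FALSE)
p
```

Swapping geom_text() for geom_label() draws each name inside a small box instead.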

library(ggfortify)
setwd("C:/Users/M4NU.51LV4/Desktop/Pca en R")

chi=data.frame(pca2)

chicucha<- chi[c(1, 2, 3, 4)]
autoplot(prcomp(chicucha))
autoplot(prcomp(chicucha), data = chi, colour = 'GT')
autoplot(prcomp(chicucha), data = chi, colour = 'GT', loadings = TRUE)

Thanks, I managed to add the names, but they overlap. Is there any code to avoid that?

There is an extension package called ggrepel that you can use for that.

You could also try including the argument check_overlap = TRUE inside of your geom_text() call (but outside of aes()), and see if that helps.
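A rough sketch of both suggestions, assuming the PCA scores have been pulled out of prcomp() into a data frame (the names `scores`, `p_overlap` and `p_repel` are invented here, and ggrepel must be installed):

```r
library(ggplot2)
library(ggrepel)

# Numeric columns and taxon names copied from the data posted in this thread
df <- data.frame(
  Taxon = c("Andesiops", "Meridialaris", "Polycentropus", "Austrelmis",
            "Prionocyphon", "Neoplasta", "Gigantodax", "Simulium", "Tipula",
            "Alotanypus", "Pentaneura", "Corynoneura", "Cricotopus",
            "Tanytarsus", "Podonomus", "Podonomopsis", "Parochlus"),
  MOPF = c(75, 5, 20, 93, 40, 98, 65, 30, 30, 60, 35, 15, 90, 95, 25, 85, 95),
  MA   = c(25, 95, 0, 7, 0, 2, 15, 65, 0, 20, 5, 85, 10, 5, 75, 15, 5),
  RA   = c(0, 0, 80, 0, 60, 0, 20, 5, 0, 20, 60, 0, 0, 0, 0, 0, 0),
  TV   = c(0, 0, 0, 0, 0, 0, 0, 0, 70, 0, 0, 0, 0, 0, 0, 0, 0)
)

pca <- prcomp(df[, c("MOPF", "MA", "RA", "TV")])
scores <- data.frame(pca$x[, 1:2], Taxon = df$Taxon)

base_plot <- ggplot(scores, aes(x = PC1, y = PC2)) + geom_point()

# Option 1: check_overlap = TRUE goes outside aes(); labels that would
# overlap an earlier label are simply not drawn
p_overlap <- base_plot +
  geom_text(aes(label = Taxon), check_overlap = TRUE, vjust = -0.8)

# Option 2: ggrepel moves labels away from each other and from the points;
# max.overlaps = Inf asks it to keep every label
p_repel <- base_plot +
  geom_text_repel(aes(label = Taxon), max.overlaps = Inf)

p_repel
```

The trade-off is that check_overlap hides some names, while geom_text_repel() keeps them all but may place a label some distance from its point.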

How could I add it to the following code?

autoplot(prcomp(chicucha), data = chi, colour = 'GT', label = TRUE, loadings = TRUE)

How could I add it to the following code?

autoplot(prcomp(chicucha), data = chi, colour = 'GT', label = TRUE, loadings = TRUE)

There is no need to repeat your posts; we can all see them.

As I said before, if you need more specific help, please provide a proper REPRoducible EXample (reprex) illustrating your issue.

Could you post the table chicucha for us? (The data you provided earlier, and which @williaml kindly posted, won't allow us to reproduce the steps you took.)

The easiest way to do this is to use the dput() function and post its output between a pair of triple backticks (```) like this:

```
[paste output of dput(chicucha) here]
```

If chicucha is a big table, then you should post the output of dput(chicucha %>% slice(1:50)) instead. This would be a big help to us in trying to help you.
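To illustrate why dput() output is so useful, here is a small sketch with a made-up three-row stand-in for chicucha. It uses base R's head() in place of slice() purely so that no extra package is needed (that swap is my assumption, not part of the suggestion above), and it shows that the pasted text can be turned back into the same table:

```r
# Hypothetical three-row stand-in for the real chicucha table
chicucha <- data.frame(
  MOPF = c(75, 5, 20),
  MA   = c(25, 95, 0),
  RA   = c(0, 0, 80),
  TV   = c(0, 0, 0)
)

# Print a text representation of (at most) the first 50 rows
dput(head(chicucha, 50))

# Anyone can paste that text back into R to rebuild the table;
# here the round trip is simulated with capture.output() and parse()
txt <- paste(capture.output(dput(head(chicucha, 50))), collapse = "\n")
rebuilt <- eval(parse(text = txt))

# Check that the rebuilt table matches the original
all.equal(rebuilt, head(chicucha, 50))
```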

Thank you very much for your support and your answers. I am new to the R environment, so I hope I have done it properly this time.

dput(chi)

chi
library(ggfortify)
library(ggplot2)
rownames(chi)<- chi$Taxon
chicucha<- chi[c(2, 3, 4, 5)]
chicucha
autoplot(prcomp(chicucha))
autoplot(prcomp(chicucha), data = chi, colour = 'GT')
autoplot(prcomp(chicucha), data = chi, colour = 'GT', label = TRUE, loadings = TRUE)

OK, in that case, please post the output of dput(chi): copy the output and paste it here between triple backticks (```), like this:

```
[paste output of dput(chi) here]
```

You need the triple backticks so we can copy it easily.

dput(chi)
chi

```
"Austrelmis", "Prionocyphon", "Neoplasta", "Gigantodax", "Simulium",
"Tipula", "Alotanypus", "Pentaneura", "Corynoneura", "Cricotopus",
"Tanytarsus", "Podonomus", "Podonomopsis", "Parochlus"), MOPF = c(75,
5, 20, 93, 40, 98, 65, 30, 30, 60, 35, 15, 90, 95, 25, 85, 95
), MA = c(25, 95, 0, 7, 0, 2, 15, 65, 0, 20, 5, 85, 10, 5, 75,
15, 5), RA = c(0, 0, 80, 0, 60, 0, 20, 5, 0, 20, 60, 0, 0, 0,
0, 0, 0), TV = c(0, 0, 0, 0, 0, 0, 0, 0, 70, 0, 0, 0, 0, 0, 0,
0, 0), GT = c("detritívoro", "herbívoro", "depredador", "detritívoro",
"depredador", "detritívoro", "detritívoro", "herbívoro", "herbívoro",
"detritívoro", "depredador", "herbívoro", "detritívoro", "detritívoro",
"herbívoro", "detritívoro", "detritívoro")), class = "data.frame", row.names = c("Andesiops",
"Meridialaris", "Polycentropus", "Austrelmis", "Prionocyphon",
"Neoplasta", "Gigantodax", "Simulium", "Tipula", "Alotanypus",
"Pentaneura", "Corynoneura", "Cricotopus", "Tanytarsus", "Podonomus",
"Podonomopsis", "Parochlus"))
```

library(ggfortify)
library(ggplot2)
rownames(chi)<- chi$Taxon
chicucha<- chi[c(2, 3, 4, 5)]
chicucha
autoplot(prcomp(chicucha))
autoplot(prcomp(chicucha), data = chi, colour = 'GT')
autoplot(prcomp(chicucha), data = chi, colour = 'GT', label  = TRUE, loadings = TRUE)

That's better! But you cut off part of the output from dput(chi) -- could you try again and only paste the output of dput(chi) and don't include any other code?

```
"Austrelmis", "Prionocyphon", "Neoplasta", "Gigantodax", "Simulium",
"Tipula", "Alotanypus", "Pentaneura", "Corynoneura", "Cricotopus",
"Tanytarsus", "Podonomus", "Podonomopsis", "Parochlus"), MOPF = c(75,
5, 20, 93, 40, 98, 65, 30, 30, 60, 35, 15, 90, 95, 25, 85, 95
), MA = c(25, 95, 0, 7, 0, 2, 15, 65, 0, 20, 5, 85, 10, 5, 75,
15, 5), RA = c(0, 0, 80, 0, 60, 0, 20, 5, 0, 20, 60, 0, 0, 0,
0, 0, 0), TV = c(0, 0, 0, 0, 0, 0, 0, 0, 70, 0, 0, 0, 0, 0, 0,
0, 0), GT = c("detritívoro", "herbívoro", "depredador", "detritívoro",
"depredador", "detritívoro", "detritívoro", "herbívoro", "herbívoro",
"detritívoro", "depredador", "herbívoro", "detritívoro", "detritívoro",
"herbívoro", "detritívoro", "detritívoro")), class = "data.frame", row.names = c("Andesiops",
"Meridialaris", "Polycentropus", "Austrelmis", "Prionocyphon",
"Neoplasta", "Gigantodax", "Simulium", "Tipula", "Alotanypus",
"Pentaneura", "Corynoneura", "Cricotopus", "Tanytarsus", "Podonomus",
"Podonomopsis", "Parochlus"))
```

Still cutting it off. How about you try running these commands, one at a time:

sink('dput_output.txt')
dput(chi)
sink()

This will send the output of the dput() command to a file called 'dput_output.txt', which you can then open, copy, and paste here. Could you try that?
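If sink() feels awkward, a possible alternative is the `file` argument of dput(), which writes straight to a file. The sketch below uses a tiny made-up version of chi and a temporary file path (both are assumptions for illustration), then reads the file back with dget() to confirm nothing was cut off:

```r
# Tiny hypothetical stand-in for the real chi table
chi <- data.frame(
  Taxon = c("Andesiops", "Meridialaris", "Polycentropus"),
  MOPF  = c(75, 5, 20),
  MA    = c(25, 95, 0)
)

# Write the dput() text representation directly to a file
out_file <- file.path(tempdir(), "dput_output.txt")
dput(chi, file = out_file)

# The file holds the complete text, ready to open, copy and paste
cat(readLines(out_file), sep = "\n")

# dget() rebuilds the object from the file
restored <- dget(out_file)
all.equal(restored, chi)
```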

structure(list(Taxon = c("Andesiops", "Meridialaris", "Polycentropus",
"Austrelmis", "Prionocyphon", "Neoplasta", "Gigantodax", "Simulium",
"Tipula", "Alotanypus", "Pentaneura", "Corynoneura", "Cricotopus",
"Tanytarsus", "Podonomus", "Podonomopsis", "Parochlus"), MOPF = c(75,
5, 20, 93, 40, 98, 65, 30, 30, 60, 35, 15, 90, 95, 25, 85, 95
), MA = c(25, 95, 0, 7, 0, 2, 15, 65, 0, 20, 5, 85, 10, 5, 75,
15, 5), RA = c(0, 0, 80, 0, 60, 0, 20, 5, 0, 20, 60, 0, 0, 0,
0, 0, 0), TV = c(0, 0, 0, 0, 0, 0, 0, 0, 70, 0, 0, 0, 0, 0, 0,
0, 0), GT = c("detritívoro", "herbívoro", "depredador", "detritívoro",
"depredador", "detritívoro", "detritívoro", "herbívoro", "herbívoro",
"detritívoro", "depredador", "herbívoro", "detritívoro", "detritívoro",
"herbívoro", "detritívoro", "detritívoro")), class = "data.frame", row.names = c("Andesiops",
"Meridialaris", "Polycentropus", "Austrelmis", "Prionocyphon",
"Neoplasta", "Gigantodax", "Simulium", "Tipula", "Alotanypus",
"Pentaneura", "Corynoneura", "Cricotopus", "Tanytarsus", "Podonomus",
"Podonomopsis", "Parochlus"))

This works, but in general you should place the output between a pair of triple backticks, like this:

```
[paste output of dput(chi) here]
```

One thing that confuses me is that if you apply prcomp() to this table, you don't get PCA coordinates for individual taxa; you get them for each of the numeric columns of chi -- how did you get taxon coordinates to make your plot?
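For what it's worth, a quick look at what prcomp() returns may help here. As far as I can tell, the per-column values are the loadings stored in `rotation`, while the per-row (per-taxon) scores are stored in `x`, which is presumably what autoplot() draws as points. A sketch using only the numeric columns from the dput() output above (the object names are mine):

```r
# Numeric columns of chi, with the taxa as row names, as in the dput() output
chi_num <- data.frame(
  MOPF = c(75, 5, 20, 93, 40, 98, 65, 30, 30, 60, 35, 15, 90, 95, 25, 85, 95),
  MA   = c(25, 95, 0, 7, 0, 2, 15, 65, 0, 20, 5, 85, 10, 5, 75, 15, 5),
  RA   = c(0, 0, 80, 0, 60, 0, 20, 5, 0, 20, 60, 0, 0, 0, 0, 0, 0),
  TV   = c(0, 0, 0, 0, 0, 0, 0, 0, 70, 0, 0, 0, 0, 0, 0, 0, 0),
  row.names = c("Andesiops", "Meridialaris", "Polycentropus", "Austrelmis",
                "Prionocyphon", "Neoplasta", "Gigantodax", "Simulium", "Tipula",
                "Alotanypus", "Pentaneura", "Corynoneura", "Cricotopus",
                "Tanytarsus", "Podonomus", "Podonomopsis", "Parochlus")
)

pca <- prcomp(chi_num)

# Loadings: one row per numeric column (4 variables x 4 components)
dim(pca$rotation)
rownames(pca$rotation)

# Scores: one row per taxon (17 taxa x 4 components)
dim(pca$x)
head(rownames(pca$x))
```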