PCA biplot error: "the condition has length > 1 and only the first element will be used"

Hello,
I'm trying to create a biplot of PCA including variables and individuals as below.

gene <- data.frame(
stringsAsFactors = FALSE,
ID = c("C1","C2","CP1","CP2","CS1","CS2","P1","P2","PP1","PP2","PS1","PS2"), # IDs must be unique to serve as row names
Group = c("C","C","CP","CP","CS","CS","P","P","PP","PP","PS","PS"),
Gene.1 = c(0,0,
-0.449544646,1.604164916,4.22837086,4.403123401,
2.477594943,1.452939095,2.582137758,1.460824362,
3.826064003,2.153493385),
Gene.2 = c(0,0,
-1.12287798,0.142498249,1.025037527,2.708123401,
1.947594943,1.639605762,1.425471091,0.645824362,
2.112730669,0.303493385),
Gene.3 = c(0,0,
-1.39287798,-1.737501751,2.17837086,2.221456735,
1.727594943,0.942939095,1.288804424,-1.027508971,
2.799397336,0.426826718),
Gene.4 = c(0,0,
-4.55287798,-0.447501751,1.961704194,2.171456735,
1.597594943,2.386272428,1.225471091,-2.200842304,
2.246064003,-0.319839948),
Gene.5 = c(0,0,
-1.87287798,0.595831583,3.95837086,4.808123401,
0.870928276,0.582939095,0.925471091,-0.367508971,
4.892730669,2.260160052),
Gene.6 = c(0,0,
0.333788687,-0.250835084,4.271704194,4.688123401,
1.670928276,-0.873727572,2.585471091,-0.330842304,
4.456064003,4.396826718),
Gene.7 = c(0,0,
-0.046211313,1.735831583,0.121704194,1.974790068,
0.270928276,-0.083727572,-0.411195576,-0.464175638,
0.056064003,-0.853173282)
)
library(FactoMineR)
library(factoextra)
library(ggplot2)

data <- data.frame(gene, row.names = gene$ID) # use the ID column as row names
Y <- data[,-c(1)] # drop the "ID" column; Y is the data frame passed to the PCA
pca <- PCA(Y, quali.sup = 1, ncp =2, scale.unit = FALSE, axes = c(1,2), graph = F) # run the PCA without drawing a graph; the first column (Group) is a supplementary qualitative variable

# Draw a biplot of the variables and individuals on axes 1 & 2 of the PCA. Color the variables by their contribution to axes 1 & 2. For the individuals, change the symbols (pointshape) and the colors (col.ind & fill.ind) according to the groups (Y$Group).

fviz_pca_biplot(pca,
fill.ind= Y$Group, palette = c("#2E9FDF","green","red","yellow","violet","dark orange"), pointsize = 5, mean.point = F, pointshape = c(20,20,20,15,15,15,16,16,16,17,17,17,18,18,18,19,19,19), col.ind = "black",
addEllipses = F, repel = T,
col.var = "contrib", gradient.cols = c("light grey", "purple", "red"), arrowsize = 0.7,
legend.title = list(color = "Gene contribution", fill = "Group")
)

The problem is with the individuals. When I try to change the point shape and the color of the symbols for the individuals, R can't generate a proper legend and gives this message:
In if (shape %in% colnames(data)) { :
the condition has length > 1 and only the first element will be used

I read that I need to vectorize something, but I don't know how. Please help.
Thank you in advance.

Hello, your issue is that if() is not vectorised: it expects a single TRUE or FALSE. R provides ifelse() for element-by-element conditions.

There is a good tutorial to look at here: adv-r
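
To make the message concrete, here is a minimal base-R sketch of the difference. The `shape` and `cols` vectors are made up for illustration; only the `shape %in% colnames(data)` condition comes from the warning quoted above.

```r
# Hypothetical stand-ins: several shape codes and some column names.
shape <- c(20, 20, 20, 15, 15, 15)
cols  <- c("Group", "Gene.1", "Gene.2")

# %in% is vectorised, so the condition has one element per shape code.
cond <- shape %in% cols
length(cond)   # 6, whereas if() expects a condition of length 1

# any() (or all()) collapses the vector to the single value if() needs.
any(cond)      # FALSE

# ifelse() evaluates the test element by element and returns a vector.
labels <- ifelse(shape == 20, "circle", "square")
labels
```

Note that the if() in question sits inside the package code, so this only explains the message; it is not something to paste into the plotting call.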

Thank you nirgrahamuk. But the problem is in this package factoextra, the legend is generated automatically and I don't know where to put ifelse() in my code.

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

Thank you. I have edited my post to include my script. Please take a look. I would be glad to get your help.

OK, it's simply that the pointshape parameter of fviz_pca_biplot does not support more than a single value; it's a limitation of the package.
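
One possible way around it, as a rough sketch rather than a definitive fix: pass a single fillable shape (21) and let fill.ind carry the group information. To keep the example self-contained it uses the built-in iris data with prcomp() as a stand-in for the gene table, and the three palette colors are arbitrary. The trade-off is that all individuals share one shape, so the groups differ by fill color only.

```r
library(factoextra)

# Stand-in data (assumption): iris instead of the gene table from the question.
iris.pca <- prcomp(iris[, -5], scale. = FALSE)

p <- fviz_pca_biplot(iris.pca,
  # Individuals: one fillable shape, fill color taken from the grouping factor
  geom.ind = "point",
  pointshape = 21,            # a single value, so the length-1 check passes
  pointsize = 3,
  fill.ind = iris$Species,
  col.ind = "black",          # outline color of the points
  palette = c("#2E9FDF", "green", "red"),
  mean.point = FALSE,
  addEllipses = FALSE,
  # Variables: colored by their contribution
  col.var = "contrib",
  gradient.cols = c("lightgrey", "purple", "red"),
  legend.title = list(fill = "Group", color = "Contribution"),
  repel = TRUE
)
print(p)
```

With the data from the question, the same call should work with pca in place of iris.pca and Y$Group in place of iris$Species, using six palette colors.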

Yeah, I think so. Maybe I'll find another way :(
Anyway, thank you so much!
