New variables after Principal Component Analysis with prcomp()

Hello,
I'm using the prcomp() function to perform Principal Component Analysis, but I am unsure which part of the PCA output I should use as the new variables for further modeling.
My code is below. (I set scale. = FALSE because I don't think scaling is necessary for my data.)

# PCA on the centered, unscaled data
res.PCA = prcomp(Bres_Data, scale. = FALSE)
# variance explained by each component
summary(res.PCA)
# scores on the first three components
y.pca.mat = res.PCA$x[, 1:3]

Originally I have 100 response variables, so I want to reduce their number. Based on the PCA summary, I decided to keep the first 3 PCs. But I'm not sure whether "res.PCA$x" is the right thing to use as the new response variables.
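As a rough illustration of picking the number of PCs from the summary, here is a minimal sketch. It assumes simulated data in place of Bres_Data, and the 95% cutoff and the names sim_data, cum_var and k are only placeholders, not something taken from the real analysis.

```r
# Simulated stand-in for Bres_Data (hypothetical): 66 rows, 100 correlated columns
set.seed(1)
n <- 66
p <- 100
latent <- matrix(rnorm(n * 3), n, 3)        # three underlying factors
loadings <- matrix(rnorm(3 * p), 3, p)
sim_data <- latent %*% loadings + matrix(rnorm(n * p, sd = 0.1), n, p)

res <- prcomp(sim_data, scale. = FALSE)

# summary()$importance holds sdev, proportion and cumulative proportion per PC
imp <- summary(res)$importance
cum_var <- imp["Cumulative Proportion", ]

# Smallest number of PCs reaching the (assumed) 95% threshold
k <- unname(which(cum_var >= 0.95)[1])

# Scores on the first k PCs: one row per observation
scores <- res$x[, seq_len(k), drop = FALSE]
dim(scores)
```

The cutoff is a judgment call; a scree plot via screeplot(res) is another common way to decide.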

One thing to mention: my data has n=66 rows (observations) and p=100 columns (variables). The 100 variables are all the same type of measurement and are highly correlated. prcomp() then produces only 66 PCs, not the 100 PCs I expected from what I learned in class. Is this reasonable? I think it may be related to the n<p situation, but I don't know why.

Could someone verify for me?
Thank you so much!

Hi, can you run glimpse(Bres_Data) to reveal the structure?
Your notation of n and p is ambiguous to me.
Thanks.

Observations: 66
Variables: 100
$ D_1   <dbl> 47.3, 46.7, 42.3, 48.1, 44.9, 45.3, 45.1, 44.3, 46.7, 39.7, 42.3, 44.7, 35.9, 47.7, 38.5, 43.7, 48.1, 44.9, 46.5, 47.1, 42.1, 51.9, 49.7, 66.1, 45.7, 54.3, 40.5, 46.1, 41.1, 26.1, 21.7, 48....
$ D_2   <dbl> 46.5, 45.7, 41.3, 45.9, 43.7, 44.5, 43.5, 43.5, 45.7, 38.1, 41.1, 43.9, 35.1, 46.5, 37.9, 42.9, 47.1, 43.9, 45.1, 46.1, 41.5, 50.9, 48.9, 65.1, 45.1, 53.3, 39.5, 45.1, 40.3, 24.3, 20.3, 47....
$ D_3   <dbl> 45.7, 45.3, 40.5, 44.5, 43.1, 44.1, 42.5, 42.7, 45.3, 37.3, 40.3, 43.5, 34.3, 45.5, 37.5, 42.1, 46.1, 43.3, 43.7, 45.7, 40.9, 49.9, 48.1, 64.3, 44.5, 52.7, 38.7, 44.3, 39.7, 21.7, 19.1, 46....
$ D_4   <dbl> 45.1, 44.7, 39.7, 43.7, 42.1, 43.7, 41.9, 42.1, 44.9, 36.1, 39.5, 43.1, 33.5, 44.9, 37.1, 41.7, 45.7, 42.7, 42.9, 45.1, 40.5, 49.1, 47.5, 63.7, 44.1, 51.7, 38.3, 43.7, 39.3, 16.1, 18.5, 46....
$ D_5   <dbl> 44.7, 44.3, 39.3, 43.1, 41.5, 43.3, 41.1, 41.5, 44.3, 35.5, 39.1, 42.7, 33.1, 44.5, 36.7, 41.3, 45.1, 42.1, 42.5, 44.5, 40.3, 48.7, 46.7, 63.3, 43.5, 51.3, 37.9, 43.1, 39.1, 14.7, 17.7, 45....
$ D_6   <dbl> 44.3, 43.9, 38.7, 42.5, 40.7, 42.9, 40.3, 41.1, 43.9, 34.7, 38.5, 42.3, 32.7, 44.1, 36.5, 41.1, 44.5, 41.7, 41.9, 44.3, 40.1, 48.1, 46.3, 62.9, 43.1, 50.5, 37.3, 42.5, 38.9, 8.3, 16.9, 45.5...
$ D_7   <dbl> 43.9, 43.5, 38.1, 41.9, 40.1, 42.7, 39.7, 40.5, 43.5, 34.1, 38.1, 42.1, 32.1, 43.7, 36.3, 40.9, 44.1, 41.5, 41.5, 43.9, 39.9, 47.5, 45.9, 62.3, 42.9, 50.3, 37.1, 42.1, 38.5, 7.7, 16.3, 45.1...
$ D_8   <dbl> 43.7, 43.3, 37.7, 41.3, 39.5, 42.5, 39.5, 40.1, 43.1, 33.5, 37.5, 41.7, 31.7, 43.1, 35.9, 40.5, 43.7, 41.1, 41.3, 43.5, 39.5, 47.3, 45.3, 62.1, 42.5, 49.9, 36.7, 41.7, 38.3, 7.3, 15.7, 44.7...
$ D_9   <dbl> 43.3, 42.9, 37.3, 40.7, 38.7, 42.3, 38.9, 39.7, 42.7, 32.9, 37.1, 41.5, 31.5, 42.9, 35.7, 40.3, 43.3, 40.7, 40.9, 43.1, 39.3, 46.9, 44.9, 61.7, 42.3, 49.3, 36.3, 41.3, 38.1, 5.3, 14.9, 44.5...
$ D_10  <dbl> 43.1, 42.7, 36.9, 40.3, 38.1, 41.9, 38.5, 39.3, 42.5, 32.3, 36.7, 41.3, 30.9, 42.5, 35.5, 40.3, 43.1, 40.3, 40.5, 42.9, 38.9, 46.5, 44.5, 61.3, 41.9, 48.9, 35.9, 40.9, 38.1, 4.9, 14.1, 44.3...
$ D_11  <dbl> 42.5, 42.5, 36.5, 39.7, 37.7, 41.7, 38.1, 38.9, 42.3, 31.9, 36.3, 40.9, 30.5, 42.3, 35.3, 40.1, 42.5, 40.1, 40.1, 42.7, 38.7, 46.3, 44.3, 61.1, 41.7, 48.5, 35.5, 40.5, 37.9, 4.7, 13.3, 43.9...
$ D_12  <dbl> 42.3, 42.1, 36.3, 39.3, 37.1, 41.7, 37.5, 37.7, 41.9, 31.5, 35.7, 40.7, 30.3, 42.1, 35.1, 39.9, 42.1, 39.9, 39.9, 42.5, 38.3, 45.9, 43.9, 60.7, 41.5, 48.1, 35.1, 40.1, 37.7, 4.3, 11.5, 43.7...
$ D_13  <dbl> 42.1, 41.9, 35.9, 38.9, 36.5, 41.5, 37.3, 35.7, 41.7, 30.9, 35.5, 40.5, 29.9, 41.7, 34.9, 39.7, 41.7, 39.5, 39.5, 42.1, 37.9, 45.5, 43.5, 60.5, 41.3, 47.7, 34.7, 39.7, 37.5, 4.1, 8.3, 43.5,...
$ D_14  <dbl> 41.7, 41.5, 35.5, 38.3, 35.9, 41.3, 36.9, 34.9, 41.5, 30.3, 35.1, 40.1, 29.7, 41.5, 34.7, 39.5, 41.1, 39.3, 39.1, 41.9, 37.5, 45.3, 42.9, 60.3, 41.1, 47.3, 34.3, 39.5, 37.5, 3.9, 7.9, 43.3,...
$ D_15  <dbl> 41.5, 41.3, 35.3, 37.9, 35.3, 41.1, 36.5, 34.1, 41.1, 30.1, 34.7, 39.9, 29.5, 41.5, 34.5, 39.3, 40.3, 39.1, 38.9, 41.5, 36.9, 44.9, 42.3, 59.9, 40.9, 46.9, 33.9, 39.3, 37.3, 3.7, 7.5, 43.1,...
$ D_16  <dbl> 41.3, 40.9, 35.1, 37.7, 34.5, 40.9, 36.3, 33.5, 40.9, 29.5, 34.3, 39.7, 29.3, 41.1, 34.3, 39.1, 39.7, 38.7, 38.7, 41.3, 36.5, 44.5, 41.5, 59.7, 40.7, 46.5, 33.5, 38.9, 37.1, 3.7, 5.9, 42.9,...
$ D_17  <dbl> 40.9, 40.5, 34.7, 37.1, 34.1, 40.9, 35.9, 32.9, 40.5, 29.3, 34.1, 39.5, 28.9, 40.7, 34.3, 38.9, 38.5, 38.5, 38.3, 41.3, 36.1, 44.3, 40.5, 59.5, 40.5, 46.1, 33.1, 38.7, 36.9, 3.5, 5.5, 42.7,...
$ D_18  <dbl> 40.7, 40.1, 34.5, 36.7, 33.5, 40.7, 35.7, 32.5, 40.3, 28.9, 33.7, 39.1, 28.5, 40.5, 34.1, 38.9, 37.3, 38.3, 38.1, 40.9, 35.5, 44.1, 39.5, 59.3, 40.3, 45.7, 32.5, 38.3, 36.9, 3.3, 5.3, 42.5,...
$ D_19  <dbl> 40.3, 39.9, 34.3, 36.3, 33.1, 40.5, 35.5, 31.7, 40.1, 28.3, 33.5, 38.9, 28.3, 40.3, 33.9, 38.7, 36.1, 38.1, 37.7, 40.7, 34.9, 43.7, 37.9, 58.9, 40.1, 45.5, 31.9, 37.9, 36.7, 3.1, 4.9, 42.3,...
$ D_20  <dbl> 40.1, 39.5, 34.1, 35.9, 32.5, 40.5, 35.1, 31.3, 39.9, 27.9, 33.1, 38.7, 28.3, 39.9, 33.7, 38.7, 35.1, 37.9, 37.3, 40.5, 33.9, 43.5, 35.9, 58.7, 39.9, 44.9, 31.5, 37.7, 36.7, 3.1, 4.3, 41.9,...
$ D_21  <dbl> 39.9, 39.1, 33.9, 35.5, 31.9, 40.3, 34.9, 30.9, 39.7, 27.5, 32.7, 38.5, 28.1, 39.7, 33.5, 38.5, 34.1, 37.7, 37.1, 40.3, 33.3, 43.3, 33.9, 58.5, 39.7, 44.7, 30.9, 37.5, 36.5, 2.9, 4.1, 41.7,...
$ D_22  <dbl> 39.7, 38.7, 33.5, 35.3, 31.3, 40.1, 34.5, 30.5, 39.5, 27.1, 32.5, 38.1, 27.7, 39.1, 33.3, 38.3, 33.3, 37.5, 36.7, 40.1, 32.1, 43.1, 32.7, 58.3, 39.5, 44.1, 30.3, 37.1, 36.3, 2.9, 3.9, 41.5,...
$ D_23  <dbl> 39.3, 38.5, 33.3, 34.9, 30.9, 39.9, 34.3, 29.9, 39.1, 26.7, 32.3, 37.9, 27.5, 38.9, 33.1, 38.1, 32.5, 37.3, 36.3, 39.9, 31.1, 42.9, 31.7, 58.1, 39.3, 43.9, 29.9, 36.9, 36.3, 2.7, 3.7, 41.5,...
$ D_24  <dbl> 39.1, 38.1, 32.9, 34.5, 30.3, 39.9, 34.1, 29.7, 38.7, 26.1, 31.9, 37.7, 27.5, 38.5, 32.9, 37.9, 32.1, 37.1, 35.5, 39.7, 30.1, 42.7, 30.9, 57.9, 39.1, 43.3, 29.1, 36.7, 36.1, 2.5, 3.5, 41.3,...
$ D_25  <dbl> 38.9, 37.7, 32.3, 34.3, 29.7, 39.7, 33.9, 29.1, 38.3, 25.9, 31.7, 37.3, 27.3, 38.1, 32.9, 37.9, 31.5, 36.7, 35.1, 39.5, 28.3, 42.5, 30.1, 57.7, 38.9, 42.9, 28.3, 36.5, 35.9, 2.5, 3.5, 41.1,...

The glimpse() output for my data is shown above; I'm only pasting the first 25 variables here.

My data has 66 rows and 100 columns. The 100 variables are all the same type of measurement and are highly correlated.

As far as I know, something is wrong, because you should get as many PCs as there were columns going in, not rows. Are you sure?

It's very hard to debug or explore without your data, though.

Yes, I am sure.
I have applied exactly the same code to the same number of rows, but with only 6 variables, and I got 6 PCs as usual.
That is not what happens when I expand to 100 variables. I suspect it is because my number of rows is smaller than my number of columns, but I don't know how to explain it.
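The n < p behaviour described here can be reproduced on simulated data. The sketch below is only an illustration, assuming random numbers in place of Bres_Data (the names wide and narrow are made up): prcomp() returns at most min(n, p) components, and because centering removes one degree of freedom, the last of the 66 has essentially zero variance.

```r
set.seed(42)

# Wide case: 66 observations, 100 variables (n < p)
wide <- matrix(rnorm(66 * 100), nrow = 66, ncol = 100)
res_wide <- prcomp(wide, scale. = FALSE)
dim(res_wide$x)          # scores: 66 rows, min(66, 100) = 66 columns
dim(res_wide$rotation)   # loadings: 100 rows, 66 columns
tail(res_wide$sdev, 1)   # numerically zero, since centered data has rank <= n - 1

# Narrow case: same rows, only 6 variables (n > p)
narrow <- wide[, 1:6]
res_narrow <- prcomp(narrow, scale. = FALSE)
ncol(res_narrow$x)       # 6 components, one per column
```

So 66 PCs from a 66 x 100 table is what prcomp() is expected to give, and at most 65 of them carry any variance.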

I also want to make sure that the x matrix returned by prcomp() is the right matrix to use as new variables for further analysis. One of the online help pages says this x matrix is "The coordinates of the individuals (observations) on the principal components." Could you please help?
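One way to see what the x matrix contains is to rebuild it by hand. This is a small sketch on made-up data (dat and the V1..V5 column names are hypothetical): with scale. = FALSE, x is the column-centered data multiplied by the rotation (loadings) matrix, which is why each row of x gives one observation's coordinates on the PCs.

```r
set.seed(7)
dat <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)
colnames(dat) <- paste0("V", 1:5)

res <- prcomp(dat, scale. = FALSE)

# Center each column with the means prcomp() stored, then project onto the loadings
centered <- sweep(dat, 2, res$center)
manual_scores <- centered %*% res$rotation

# Should agree with res$x up to floating-point error
all.equal(manual_scores, res$x)

# predict() applies the same centering and projection to new rows
new_scores <- predict(res, newdata = dat[1:3, , drop = FALSE])
```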

My thought was that duplicate columns in the source data might result in fewer PCs being generated than you would expect. I will check later about writing code to identify duplicated columns.
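A check for exactly duplicated columns could look roughly like this. It is a sketch on a toy data frame (df and its column names are invented for the example), and it only catches exact duplicates, not columns that are merely highly correlated.

```r
# Toy data frame where column "c" repeats column "a"
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 7), c = c(1, 2, 3))

# duplicated() on the list of columns flags later copies of an earlier column
dup <- duplicated(as.list(df))
dup_names <- names(df)[dup]
dup_names

# Drop the duplicated columns if desired
df_unique <- df[, !dup, drop = FALSE]
```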

Thanks for helping.
I just checked my data: the 100 columns are highly correlated, but none of them are duplicated.

Instead of fitting a multivariate model for 100 response variables, I would like to use PCA to reduce the number of response variables to 3. Can the x matrix returned by prcomp() be used directly as the new response variables when fitting the model?

Yes, that's my understanding anyway.
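For illustration, using the scores as responses might look like the sketch below. Everything here is assumed: the data are simulated, predictor is a hypothetical covariate, and a plain multivariate lm() stands in for whatever model is actually intended.

```r
set.seed(3)
n <- 66
predictor <- rnorm(n)   # hypothetical covariate, not from the original data

# Simulated 66 x 100 response table driven partly by the covariate
Y <- outer(predictor, rnorm(100)) + matrix(rnorm(n * 100), n, 100)

pca <- prcomp(Y, scale. = FALSE)
scores <- pca$x[, 1:3]  # first three PC scores as the new responses

# lm() accepts a matrix response and fits one regression per column
fit <- lm(scores ~ predictor)
coef(fit)               # 2 x 3: intercept and slope for PC1..PC3
```

Keep in mind that the fitted effects are then on the PC scale, so pca$rotation is needed to relate them back to the original 100 variables.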
