How do I fix a "length(groups) = 2, but nrow(x) = 1" error from dnapath() when the two should be equal?

Hey guys, first time posting because I got really really stumped last night for the first time in R. Hope this is the right forum for a question relating to code. If not, by all means remove. Thanks.

I have a data frame that is 3 columns x 36000 rows. Early in the code I transpose it, and the frame I am using when this hiccup occurs somehow ends up as 36000 columns x 2 rows.

BOP1_flg22_vs_WT_flg22_rnaseq_pre <- read.csv("TEST_DATA_COMPARE.csv")

BOP1_flg22_vs_WT_flg22_rnaseq_pre[1:4, 1:3]
    ï..gene WT.flg22.__pd__2..BAM. bop1.6D.flg22.__pd__2..BAM.
1 AT1G01010                   1383                        1260
2 AT1G01020                    742                         598
3 AT1G03987                      1                           5
4 AT1G01030                    229                         183

BOP1_flg22_vs_WT_flg22_gene_symbols <- BOP1_flg22_vs_WT_flg22_rnaseq_pre$ï..gene

BOP1_flg22_vs_WT_flg22_rnaseq <- t(BOP1_flg22_vs_WT_flg22_rnaseq_pre[, -1])

colnames(BOP1_flg22_vs_WT_flg22_rnaseq) <- BOP1_flg22_vs_WT_flg22_gene_symbols

BOP1_flg22_vs_WT_flg22_rnaseq <- data.frame(BOP1_flg22_vs_WT_flg22_rnaseq)

BOP1_flg22_vs_WT_flg22_rnaseq[1:2, 1:10]
                            AT1G01010 AT1G01020 AT1G03987 AT1G01030 AT1G01040 AT1G03993 AT1G01046
WT.flg22.__pd__2..BAM.           1383       742         1       229      2023         0        17
bop1.6D.flg22.__pd__2..BAM.      1260       598         5       183      1360         0         1
                            AT1G01050 AT1G03997 AT1G01060
WT.flg22.__pd__2..BAM.           3865         0      1818
bop1.6D.flg22.__pd__2..BAM.      3289         0      2333

BOP1_flg22_vs_WT_flg22_rnaseq_samples <- rownames(BOP1_flg22_vs_WT_flg22_rnaseq)

BOP1_flg22_vs_WT_flg22_rnaseq_group <- data.frame("group" = c("WT_flg22", "BOP1_flg22"))

row.names(BOP1_flg22_vs_WT_flg22_rnaseq_group) <- BOP1_flg22_vs_WT_flg22_rnaseq_samples

set.seed(1)

BOP1_flg22_vs_WT_flg22_rnaseq_results <- dnapath("BOP1_flg22_vs_WT_flg22_rnaseq", pathway_list = NULL, groups = BOP1_flg22_vs_WT_flg22_rnaseq_group, network_inference = run_pcor, seed = 1)

Error in dnapath("BOP1_flg22_vs_WT_flg22_rnaseq", pathway_list = NULL,  : 
length(groups) = 2, but nrow(x) = 1. These must be equal.

Does transposing the data somehow turn 3 columns (gene name and 2 values) into 2 rows?
Anyone know what is going on and how to fix??
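
A side note on the `ï..gene` column name in the output above: that prefix usually appears when a CSV starts with a UTF-8 byte order mark that gets read as part of the first header. Whether that is the case for TEST_DATA_COMPARE.csv is an assumption. The sketch below builds a tiny stand-in file (made-up header names, not the real data) and reads it with `fileEncoding = "UTF-8-BOM"`, which strips the mark so the first column comes through as plain `gene`.

```r
# Build a small CSV that starts with a UTF-8 byte order mark (EF BB BF).
# The file and its contents are illustrative, not the real data.
tmp <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("gene,WT,bop1\nAT1G01010,1383,1260\n")), tmp)

# "UTF-8-BOM" asks R to drop the mark while reading the file.
demo <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
names(demo)
```

This is unrelated to the dnapath() error; it only makes the gene column easier to refer to.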

I may be missing the point, but it seems to me that you are taking the transpose of the second and third columns only, having dropped the first one.

BOP1_flg22_vs_WT_flg22_rnaseq <- t(BOP1_flg22_vs_WT_flg22_rnaseq_pre[, -1])

I would expect two rows from that, as in this super simple example:

> DF <- data.frame(A = LETTERS[1:4], B = 1:4, C = 2:5)
> DF
  A B C
1 A 1 2
2 B 2 3
3 C 3 4
4 D 4 5
> t(DF[, -1])
  [,1] [,2] [,3] [,4]
B    1    2    3    4
C    2    3    4    5
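
Extending the same toy example, here is a small sketch of why the ID column is dropped before transposing and how it can be put back as column names. It reuses `DF` from above; `m` and `m_all` are just illustrative names.

```r
DF <- data.frame(A = LETTERS[1:4], B = 1:4, C = 2:5)

# Transposing everything, including the character column, coerces the
# whole result to character.
m_all <- t(DF)
is.character(m_all)

# Dropping the ID column first keeps the matrix numeric. The two
# remaining columns become the two rows.
m <- t(DF[, -1])
colnames(m) <- DF$A   # the dropped column comes back as column names
dim(DF)
dim(m)
is.numeric(m)
```

So a 4 x 3 data frame becomes a 2 x 4 numeric matrix: one row per value column, one column per original row.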

Ah, I see. Removing the first column before transposing, and then adding it back as column names, is what brought the number of rows down to 2 when I expected it to stay at 3.
I still can't figure out why the error occurs, though. I've created one group per row, giving 2 groups, and the dimensions of the matrix show 2 rows:

dim(BOP1_flg22_vs_WT_flg22_rnaseq)
[1]     2 32833

and

BOP1_flg22_vs_WT_flg22_rnaseq_group
                                 group
WT.flg22.__pd__2..BAM.        WT_flg22
bop1.6D.flg22.__pd__2..BAM. BOP1_flg22

I think that "BOP1_flg22_vs_WT_flg22_rnaseq" should not be in quotes in the call to dnapath().
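
To illustrate the point about quotes: a quoted name is just a length-one character vector, not the object it names. The sketch below uses a made-up `counts` data frame. How dnapath() handles a string internally is not shown in this thread, so the link to `nrow(x) = 1` is an assumption, but coercing a single string to a matrix does give exactly one row.

```r
# Hypothetical stand-in for the transposed count data (2 samples x 2 genes).
counts <- data.frame(AT1G01010 = c(1383, 1260), AT1G01020 = c(742, 598))

quoted <- "counts"         # a string that happens to spell the object's name
class(quoted)              # character, not data.frame
length(quoted)             # one element
dim(as.matrix(quoted))     # a 1 x 1 character matrix

dim(counts)                # the real object has 2 rows
```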

I had it that way originally, but when it is like that the function doesn't read the DF at all. Instead it results in an error as follows:

BOP1_flg22_vs_WT_flg22_rnaseq_results <- dnapath(BOP1_flg22_vs_WT_flg22_rnaseq, pathway_list = NULL, groups = BOP1_flg22_vs_WT_flg22_rnaseq_group, network_inference = run_pcor, seed = 1)
Error in matrix(0, nrow = p, ncol = p) : non-numeric matrix extent
In addition: Warning message:
In dnapath(BOP1_flg22_vs_WT_flg22_rnaseq, pathway_list = NULL, groups = BOP1_flg22_vs_WT_flg22_rnaseq_group,  :
  Only 0 permutations are possible with the given sample size. Setting 'n_perm' to this value.

The groups argument should be a vector. Quoting from the documentation:
"groups is a vector of length equal to the number of rows in x, and it should contain two unique elements (the two group names)"
My guess is that passing groups as a data frame is causing a problem.
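
A quick sketch of the difference, using a made-up `groups_df` that mirrors the one in the question. For a data frame, `length()` counts columns rather than rows, so a one-column data frame and a two-element vector do not look alike to code that checks `length(groups)`. Whether that is what trips up dnapath() here is a guess.

```r
groups_df <- data.frame(group = c("WT_flg22", "BOP1_flg22"))
length(groups_df)      # counts columns, so this is 1

# Pull the column out as a plain character vector.
groups_vec <- as.character(groups_df$group)
length(groups_vec)     # one label per sample, so this is 2
unique(groups_vec)     # the two group names
```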

Hmm. I tried that just now and it didn't fix anything. I have followed the code in the documentation almost exactly. I think it has something to do with the number of rows in the data frame. Every way of displaying the frame or its dimensions shows 2 rows, as I demonstrated, yet the error claims nrow(x) = 1. That unfortunately seems to be the issue; the number of groups is right where I want it (2).
I think I might have to scrap it and start fresh, since there really appears to be nothing wrong.

What happens if you run this version of the code?

BOP1_flg22_vs_WT_flg22_rnaseq_pre  <- read.csv("TEST_DATA_COMPARE.csv")
BOP1_flg22_vs_WT_flg22_rnaseq <- t(BOP1_flg22_vs_WT_flg22_rnaseq_pre[, -1])
BOP1_flg22_vs_WT_flg22_rnaseq_results <- dnapath(BOP1_flg22_vs_WT_flg22_rnaseq, pathway_list = NULL, groups = c("WT_flg22", "BOP1_flg22"), network_inference = run_pcor, seed = 1)

The idea is that dnapath() accepts matrices as well as data frames, and I do not think the row names matter. I could be wrong; I have never used this function.
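
Before calling dnapath(), it may help to confirm that the inputs have the shape the quoted documentation describes. Below is a minimal sketch of such a check. `check_dnapath_inputs` is a hypothetical helper written for this post, not part of the dnapath package, and the toy counts are taken from the first three genes shown in the question.

```r
# Hypothetical helper: stops with an error if the inputs do not match
# the documented requirements quoted earlier in the thread.
check_dnapath_inputs <- function(x, groups) {
  x <- as.matrix(x)
  stopifnot(
    is.numeric(x),                      # expression values, not strings
    is.atomic(groups),                  # a vector or factor, not a data frame
    is.null(dim(groups)),
    length(groups) == nrow(x),          # one label per sample (row)
    length(unique(groups)) == 2         # exactly two group names
  )
  invisible(TRUE)
}

# Toy input: 2 samples (rows) x 3 genes (columns).
x <- t(data.frame(WT = c(1383, 742, 1), bop1 = c(1260, 598, 5)))
groups <- c("WT_flg22", "BOP1_flg22")
check_dnapath_inputs(x, groups)         # passes silently
```

Passing the quoted name `"x"` or a data frame of groups to this helper raises an error instead, which makes those two mistakes easy to spot before dnapath() is involved.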
