Can't sort data frame by column name with a "-" in it

I have a data frame that I want to sort by a column named "Tcrg-C1".

When I run: df.sort<- df[order(-df$"Tcrg-C1"), ]

I get: Error in -df$"Tcrg-C1" : invalid argument to unary operator

When I run: df.sort<- df[order(-df$Tcrg-C1), ]

I get: Error in eval(quote(list(...)), env) : object 'C1' not found

and finally when I use backticks: df.sort<- df[order(-V.4$`Tcrg-C1`), ]

I get: Error in -V.4$`Tcrg-C1` : invalid argument to unary operator

Any ideas how to make the computer sort on this column?

Thanks and happy holidays!

Make friends with _ and ditch - in the names of any R object. It will save you no end of grief. If you have a compelling reason to use -, you will need to subset by position instead. Assuming Tcrg-C1 is in column 4:

df.sort<- df[order(-df[4, ]

While I agree with your recommendation about usage of - in column headings, your code fragment is incorrect and misses the closing ), ]:

df[order(-df[4, ]

will order the columns descending by the values in row 4 (if you add ), ] to the statement).

order() returns a vector of row indices that would sort df, but you also need to specify the columns:

df.sort <- df[order(-df[ , 4]), ]

Indexing is df[row, column], so order(-df[ , 4]) gives the row order that sorts df descending by column 4. The trailing , ] leaves the column index empty, which selects all columns in the final result.
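
As a minimal sketch of the row/column indexing described above, here is a made-up data frame (the id column and the values are assumptions for illustration, not the original poster's data). It also shows `[[ ]]` with the name as a string, which avoids typing the non-syntactic name after `$`:

```r
# Toy data; check.names = FALSE keeps the hyphen in the column name
df <- data.frame(id = c("a", "b", "c"),
                 `Tcrg-C1` = c(2.5, 7.1, 0.3),
                 check.names = FALSE)

# Sort rows descending by the second column, referenced by position
by_index <- df[order(-df[, 2]), ]

# Same sort, referencing the column by its name as a string
by_name <- df[order(-df[["Tcrg-C1"]]), ]

# Alternative that avoids unary minus altogether
by_flag <- df[order(df[["Tcrg-C1"]], decreasing = TRUE), ]
```

All three calls should give the same row order here. The decreasing = TRUE form has the advantage of not depending on the column being numeric.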

Thanks. That's one of the hazards of working without a reproducible example (a reprex) and being too lazy to recreate a test data frame.

Thanks guys! I didn't think about calling the column by its index (duh!!!)

Just to let you know, you can also use the non-syntactic name, but you have to surround it with backticks, not double quotes. See this example:

df <- data.frame(`Tcrg-C1` = rnorm(10),
                 check.names = FALSE)
df[order(-df$`Tcrg-C1`), ]
#>  [1]  1.55230177  0.80692436 -0.06985415 -0.11250646 -0.35477148 -0.47891313
#>  [7] -0.76296825 -1.44910684 -2.03221197 -2.07937508

Great. Could you mark a solution (suggest @andresrcs's) for the benefit of those to follow?

I just want to add a point to this discussion: the reason Andres's solution worked, while @cook675's failed with the error invalid argument to unary operator, is the use of check.names = FALSE. It ensures that the data frame created actually has a column named Tcrg-C1, and not Tcrg.C1. Check below:

df1 <- data.frame(`Tcrg-C1` = rnorm(10), check.names = FALSE)
names(x = df1)
#> [1] "Tcrg-C1"

df2 <- data.frame(`Tcrg-C1` = rnorm(10)) # `check.names` is TRUE by default
names(x = df2)
#> [1] "Tcrg.C1"

Created on 2019-12-24 by the reprex package (v0.3.0)

So, without check.names = FALSE, df$`Tcrg-C1` is just NULL, and hence unary - cannot be applied to it.
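
To see this directly, here is a small sketch (the numeric values are arbitrary) that captures the error message instead of stopping:

```r
# check.names is TRUE by default, so the column is renamed to Tcrg.C1
df2 <- data.frame(`Tcrg-C1` = c(1, 2, 3))

# No column has the exact name Tcrg-C1, so `$` returns NULL
col_is_null <- is.null(df2$`Tcrg-C1`)

# Negating NULL raises the error; tryCatch keeps only its message
msg <- tryCatch(-df2$`Tcrg-C1`, error = function(e) conditionMessage(e))
msg
```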

Hi guys, sorry for not replying sooner, I was on holiday. Thanks Andres, but I tried using backticks in my original example and I get the same error.

I also tried @Yarnabrina's and your idea of setting check.names = FALSE, but this did not fix the issue. Here is what I tried:

df.sort<- data.frame(df[order(-df$`Tcrg-C1`), ], check.names = FALSE)

And the same error follows: Error in -df$`Tcrg-C1` : invalid argument to unary operator
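
The thread does not establish the cause here, so the following is only a sketch of how one might narrow it down. Note that check.names = FALSE only has an effect where the data frame is created or read (for instance in data.frame() or read.csv()); wrapping the expression that already fails does not change the column names of df. The same message also appears when the column exists but is not numeric, for example character. The data below is hypothetical:

```r
# Hypothetical data: the column exists but holds text, not numbers
df <- data.frame(`Tcrg-C1` = c("3.2", "10.5", "0.7"),
                 check.names = FALSE, stringsAsFactors = FALSE)

# Does the column exist under this exact name?
has_col <- "Tcrg-C1" %in% names(df)

# Is it numeric, character, or factor?
col_class <- class(df[["Tcrg-C1"]])

# Unary minus on a character vector raises the same error
msg <- tryCatch(-df[["Tcrg-C1"]], error = function(e) conditionMessage(e))

# If the values are numbers stored as text, convert before sorting;
# drop = FALSE keeps the result a data frame even with one column
df.sort <- df[order(-as.numeric(df[["Tcrg-C1"]])), , drop = FALSE]
```

If has_col is FALSE, checking names(df) for a variant such as Tcrg.C1 would be the next step; if col_class is not numeric, order(..., decreasing = TRUE) or a conversion like the one above may be what is needed.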
