Selecting specific rows in a dataframe and keeping the row names

The following code can be copied and pasted into an R script:

select_dataframe_rows = function(ds, sel) {
  cnames = colnames(ds)
  rnames = rownames(ds)
  ds = data.frame(ds[sel,])
  colnames(ds) = cnames
  rownames(ds) = rnames[sel]
  return (ds)
}

df = data.frame(value = c(3,6,2,5,8,9,1,4,7))
rownames(df) = sprintf("Entry %02d", 1:nrow(df))

# Base R has no printf(); cat() prints the message as-is
cat("I have the following dataframe:\n")
df

cat("This is a subset of it:\n")
df2 = select_dataframe_rows(df, c(2, 5))
df2

This is the output:

# I have the following dataframe:
#          value
# Entry 01     3
# Entry 02     6
# Entry 03     2
# Entry 04     5
# Entry 05     8
# Entry 06     9
# Entry 07     1
# Entry 08     4
# Entry 09     7

# This is a subset of it:
#          value
# Entry 02     6
# Entry 05     8

My question is: can the body of the function select_dataframe_rows() be simplified?
I think this implementation is a bit convoluted, and surely there is a cleaner way to handle such a common use case.

Be aware that one requirement is to keep the row names.

Hello,

I've never liked the whole row-names concept in R :slight_smile:
Sometimes it's treated as a column, and sometimes it's just invisible or you need workarounds to do something useful with it (unless you save it as a csv and then suddenly it's saved as a column lol).

I've come up with a tidyverse implementation for your question, and I think it's slightly more condensed and readable than your code.

library(dplyr)
library(tibble)

df = data.frame(value = c(3,6,2,5,8,9,1,4,7))
rownames(df) = sprintf("Entry %02d", 1:nrow(df))

df %>% rownames_to_column("rowname") %>% 
  slice(2,5) %>% 
  column_to_rownames("rowname")

         value
Entry 02     6
Entry 05     8

With dplyr, the usual approach is to convert the row names to a column before working on the rows and then turn that column back into row names afterwards, and that's exactly what this code does.
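
For comparison, here is a minimal base R sketch. It assumes the row names only get lost in the original helper because `ds[sel,]` on a single-column data frame collapses to a plain vector; passing `drop = FALSE` to the subsetting operator should avoid that, so no helper function or extra package would be needed.

```r
# Same example data as in the question
df <- data.frame(value = c(3, 6, 2, 5, 8, 9, 1, 4, 7))
rownames(df) <- sprintf("Entry %02d", 1:nrow(df))

# drop = FALSE keeps the result a data frame even with a single column,
# so the row names of the selected rows are carried along
df2 <- df[c(2, 5), , drop = FALSE]
df2
```

With more than one column, `df[c(2, 5), ]` already returns a data frame, so `drop = FALSE` only matters for the single-column case shown here.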

Hope this helps,
PJ

Thank you @pieterjanvc. I think that's the best way to do it :wink:
