How to rename the rownames of a dataframe with matching characters from another dataframe?

Hi there!!!
I have a dataframe named transposed_physeq_rarefied like this:

transposed_physeq_rarefied <- structure(list(OTU1 = c(10, 20, 30, 10, 40, 50), OTU2 = c(10, 20, 30, 10, 40, 50), OTU3 = c(10, 20, 30, 10, 40, 50) ), row.names = c("ERR260261_profile", "ERR260264_profile", "ERR260265_profile", "ERR260268_profile", "ERR260263_profile", "ERR275252_profile"), class = "data.frame")

I have another dataframe called samplemat:

samplemat <- structure(list(SampleID = c("ERR275252_profile", "ERR260268_profile", 
"ERR260265_profile", "ERR260264_profile", "ERR260263_profile", 
"ERR260261_profile", "ERR260260_profile", "ERR260259_profile", 
"ERR260258_profile", "ERR260252_profile"), type = c("obese", 
"control", "control", "control", "control", "obese", "control", 
"control", "obese", "control")), row.names = c("ERR275252_profile", 
"ERR260268_profile", "ERR260265_profile", "ERR260264_profile", 
"ERR260263_profile", "ERR260261_profile", "ERR260260_profile", 
"ERR260259_profile", "ERR260258_profile", "ERR260252_profile"
), class = "data.frame")

The two dataframes have some (or all) common rownames but not in the same order. Now, I want the rownames of the first dataframe (transposed_physeq_rarefied) to be changed according to their respective value in the 2nd column (type) of the second dataframe (samplemat).

So the rownames of transposed_physeq_rarefied would change from

rownames(transposed_physeq_rarefied)
[1] "ERR260261_profile" "ERR260264_profile" "ERR260265_profile" "ERR260268_profile" "ERR260263_profile" "ERR275252_profile"

to

rownames(transposed_physeq_rarefied)
[1] "obese" "control" "control" "control" "control" "obese"

Please note that the rownames in the two dataframes are not in the same order.

Can anyone please help me?
Thanks and regards,
DC7

At its simplest:

cars <- cars[1:32,]
head(cars)
#>   speed dist
#> 1     4    2
#> 2     4   10
#> 3     7    4
#> 4     7   22
#> 5     8   16
#> 6     9   10
rownames(cars) <- rownames(mtcars)
cars
#>                     speed dist
#> Mazda RX4               4    2
#> Mazda RX4 Wag           4   10
#> Datsun 710              7    4
#> Hornet 4 Drive          7   22
#> Hornet Sportabout       8   16
#> Valiant                 9   10
#> Duster 360             10   18
#> Merc 240D              10   26
#> Merc 230               10   34
#> Merc 280               11   17
#> Merc 280C              11   28
#> Merc 450SE             12   14
#> Merc 450SL             12   20
#> Merc 450SLC            12   24
#> Cadillac Fleetwood     12   28
#> Lincoln Continental    13   26
#> Chrysler Imperial      13   34
#> Fiat 128               13   34
#> Honda Civic            13   46
#> Toyota Corolla         14   26
#> Toyota Corona          14   36
#> Dodge Challenger       14   60
#> AMC Javelin            14   80
#> Camaro Z28             15   20
#> Pontiac Firebird       15   26
#> Fiat X1-9              15   54
#> Porsche 914-2          16   32
#> Lotus Europa           16   40
#> Ford Pantera L         17   32
#> Ferrari Dino           17   40
#> Maserati Bora          17   50
#> Volvo 142E             18   42

See the FAQ "How to do a minimal reproducible example (reprex) for beginners" for your case. I can't cut and paste your code and get workable objects. You only need as much data as illustrates the problem.
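One common way to share a small, pasteable object is dput(). The sketch below uses a slice of the built-in mtcars data purely as a stand-in (the object name small is hypothetical); it is an illustration of the idea, not the only way to build a reprex.

```r
# A few rows of a built-in data set stand in for the real data (assumption).
small <- head(mtcars[, c("mpg", "cyl")], 3)

# dput() prints R code that recreates the object; paste that output
# into the question so others can run it directly.
dput(small)

# Round trip: evaluating the printed code rebuilds the same data frame.
rebuilt <- eval(parse(text = paste(capture.output(dput(small)), collapse = "\n")))
all.equal(small, rebuilt)
```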

Thanks @technocrat for your kind response. Actually, that is not what I wanted to do. It is a little less straightforward than just changing the rownames. I have changed the code in the question to make it simpler, so that you can get a workable object and understand the problem.
Thanks again,
DC7

I would use tibble::rownames_to_column() to move the rownames into an actual column, and then use dplyr::left_join() or another kind of join on the dataframes.

When you work with dataframes, a lot of the time (although not always) you can do what you want without worrying about the order of the rows and columns.
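A minimal sketch of that join approach, assuming the tibble and dplyr packages are installed. The data are copied from the question; the result name labelled is hypothetical.

```r
library(dplyr)
library(tibble)

# Data from the question, with sample IDs stored as row names.
transposed_physeq_rarefied <- data.frame(
  OTU1 = c(10, 20, 30, 10, 40, 50),
  OTU2 = c(10, 20, 30, 10, 40, 50),
  OTU3 = c(10, 20, 30, 10, 40, 50),
  row.names = c("ERR260261_profile", "ERR260264_profile",
                "ERR260265_profile", "ERR260268_profile",
                "ERR260263_profile", "ERR275252_profile")
)

samplemat <- data.frame(
  SampleID = c("ERR275252_profile", "ERR260268_profile",
               "ERR260265_profile", "ERR260264_profile",
               "ERR260263_profile", "ERR260261_profile",
               "ERR260260_profile", "ERR260259_profile",
               "ERR260258_profile", "ERR260252_profile"),
  type = c("obese", "control", "control", "control", "control",
           "obese", "control", "control", "obese", "control")
)

# Move the row names into a SampleID column, then join on that column.
# left_join() keeps the rows of the left table in their original order,
# so the row order of samplemat does not matter.
labelled <- transposed_physeq_rarefied %>%
  rownames_to_column("SampleID") %>%
  left_join(samplemat, by = "SampleID")

labelled$type
```

With left_join(), a sample missing from samplemat would keep its row and get NA in type, whereas inner_join() would drop it.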

Another way to match things without using the tidyverse is the match() function, which returns, for each element of its first argument, the position of its first match in the second argument (or NA if there is none).
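A base R sketch of the match() approach, using the data from the question. The names idx and new_type are hypothetical. One caveat worth flagging: a data frame cannot have duplicate row names, so labels such as "control" cannot be used as row names unchanged; here they are stored as a column, and make.unique() is shown as one possible workaround if row names are really needed.

```r
transposed_physeq_rarefied <- data.frame(
  OTU1 = c(10, 20, 30, 10, 40, 50),
  OTU2 = c(10, 20, 30, 10, 40, 50),
  OTU3 = c(10, 20, 30, 10, 40, 50),
  row.names = c("ERR260261_profile", "ERR260264_profile",
                "ERR260265_profile", "ERR260268_profile",
                "ERR260263_profile", "ERR275252_profile")
)

samplemat <- data.frame(
  SampleID = c("ERR275252_profile", "ERR260268_profile",
               "ERR260265_profile", "ERR260264_profile",
               "ERR260263_profile", "ERR260261_profile",
               "ERR260260_profile", "ERR260259_profile",
               "ERR260258_profile", "ERR260252_profile"),
  type = c("obese", "control", "control", "control", "control",
           "obese", "control", "control", "obese", "control")
)

# For each row name, find the position of the same ID in samplemat.
idx <- match(rownames(transposed_physeq_rarefied), samplemat$SampleID)

# Look up the type for each row, in the row order of the first data frame.
new_type <- samplemat$type[idx]

# Safest option: keep the label as an ordinary column.
transposed_physeq_rarefied$type <- new_type

# If row names are required, they must be unique; make.unique() adds
# numeric suffixes to repeated labels.
rownames(transposed_physeq_rarefied) <- make.unique(new_type)
rownames(transposed_physeq_rarefied)
```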


This is still not workable.

transposed_physeq_rarefied <- new("otu_table", .Data = structure(c(
  0, 0, 15954, 0, 35691, 12176,
  0, 0, 230, 0, 0, 0, 1670, 11649, 27563, 83210, 21974, 31237,
  0, 0, 0, 0, 0, 0
), .Dim = c(6L, 4L), .Dimnames = list(c(
  "ERR275252_profile",
  "ERR260264_profile", "ERR260265_profile", "ERR260263_profile",
  "ERR260268_profile", "ERR260261_profile"
), c(
  "OTU198", "OTU32",
  "OTU83", "OTU259"
))), taxa_are_rows = FALSE)
#> Error in getClass(Class, where = topenv(parent.frame())): "otu_table" is not a defined class

Thanks @technocrat for your time. I have edited it again in a different way. I hope it works this time.

DC7


In line with @woodward's suggestion:

suppressPackageStartupMessages({
  library(dplyr)
})

# data frame name shortened for convenience; row names dropped in favor
# of a SampleID column

tpr <- data.frame(
  OTU1 = c(10, 20, 30, 10, 40, 50),
  OTU2 = c(10, 20, 30, 10, 40, 50),
  OTU3 = c(10, 20, 30, 10, 40, 50),
  SampleID = c("ERR260261_profile", "ERR260264_profile", 
               "ERR260265_profile", "ERR260268_profile", 
               "ERR260263_profile", "ERR275252_profile")
)

# row names also dropped

samplemat <- data.frame(
  SampleID = c("ERR275252_profile", "ERR260268_profile",
  "ERR260265_profile", "ERR260264_profile", "ERR260263_profile",
  "ERR260261_profile", "ERR260260_profile", "ERR260259_profile",
  "ERR260258_profile", "ERR260252_profile"), 
  type = c(
  "obese","control", "control", "control", "control", "obese", "control",
  "control", "obese", "control")
)

inner_join(tpr, samplemat, by = "SampleID") %>%
  select(SampleID, everything()) -> combined

combined
#>            SampleID OTU1 OTU2 OTU3    type
#> 1 ERR260261_profile   10   10   10   obese
#> 2 ERR260264_profile   20   20   20 control
#> 3 ERR260265_profile   30   30   30 control
#> 4 ERR260268_profile   10   10   10 control
#> 5 ERR260263_profile   40   40   40 control
#> 6 ERR275252_profile   50   50   50   obese

Thanks, @technocrat and @woodward, for your help. It works.

DC7
