RStudio rownames_to_column - How to find name of first column to use in this function?

I understand the syntax for the tibble function rownames_to_column, e.g., rownames_to_column(mtcars, var = "das_Auto").

But how do I find the name of the row column (das_Auto) in the first place?

The value of the var argument is the name of the column that will be added to the data frame to accept the row names. You can set that to whatever you want or accept the default value of "rowname".
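To make that concrete, here is a minimal sketch that calls rownames_to_column() once with the default and once with a custom name. The object names default_name and custom_name, and the column name "model", are just illustrative choices.

```r
library(tibble)

# Default: the new column that receives the row names is called "rowname"
default_name <- rownames_to_column(mtcars)
colnames(default_name)[1]

# Custom: pass any name you like through var (here "model")
custom_name <- rownames_to_column(mtcars, var = "model")
colnames(custom_name)[1]
```

In both cases the new column is added as the first column, so the result has one more column than mtcars.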

Oh! So, first columns never have names? You have to assign them?

In the mtcars example, why would someone want to create a major, defining column that identifies each unique observation, like make/model, and leave the column name blank?

Another example was the hotel_booking sample data set. I couldn't find a reservation_ID column.
I don't get it...
Thanks!

library(tibble)

# mtcars is a *named* data frame: just as each column has a name, each row has a name

mtcars %>% head()
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

# the row names can be converted to a new column, which in this case I named "model"
# it is now an *unnamed* data frame with the rows just numbered
# most of the data frames I work with are unnamed

mtcars %>% rownames_to_column("model") %>% head()
#>               model  mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1         Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2     Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 4    Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> 5 Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> 6           Valiant 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Created on 2021-08-18 by the reprex package (v2.0.1)

In the mtcars data frame, the make/model names are not a data column with a blank column name. They aren't a column at all. They are row names. One way to see this is the following:

ncol(mtcars)
#> [1] 11

colnames(mtcars)
#>  [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
#> [11] "carb"

rownames(mtcars)
#>  [1] "Mazda RX4"           "Mazda RX4 Wag"       "Datsun 710"         
#>  [4] "Hornet 4 Drive"      "Hornet Sportabout"   "Valiant"            
#>  [7] "Duster 360"          "Merc 240D"           "Merc 230"           
#> [10] "Merc 280"            "Merc 280C"           "Merc 450SE"         
#> [13] "Merc 450SL"          "Merc 450SLC"         "Cadillac Fleetwood" 
#> [16] "Lincoln Continental" "Chrysler Imperial"   "Fiat 128"           
#> [19] "Honda Civic"         "Toyota Corolla"      "Toyota Corona"      
#> [22] "Dodge Challenger"    "AMC Javelin"         "Camaro Z28"         
#> [25] "Pontiac Firebird"    "Fiat X1-9"           "Porsche 914-2"      
#> [28] "Lotus Europa"        "Ford Pantera L"      "Ferrari Dino"       
#> [31] "Maserati Bora"       "Volvo 142E"

Note that mtcars has 11 columns of data, none of which have a blank name. It's true that when you type mtcars in the console, the data frame is printed to the console with the row names on the left, and this might look like a "column" of data. But this printed output shouldn't be confused with the actual structure of the mtcars data frame object as shown above.
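A few more base R checks, as a rough sketch, show that the make/model text lives in the row labels rather than in any column. The variable names here are only for illustration.

```r
# The printed "first column" is not counted among the columns
mtcars_dim <- dim(mtcars)
mtcars_dim

# Row names act as labels you can index by, like column names
rx4_mpg <- mtcars["Mazda RX4", "mpg"]
rx4_mpg

# The make/model text is found in the row names...
in_rownames <- "Mazda RX4" %in% rownames(mtcars)
in_rownames

# ...and no actual column holds character data
has_char_col <- any(sapply(mtcars, is.character))
has_char_col
```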

As EconProf showed, you can use mtcars = mtcars %>% rownames_to_column("model") as a convenient way to add a new column called model that contains the make/model values that used to be row names. You could also do the following with just base R functions:

# Starting data frame
d = mtcars

# Create a new column that contains the make/model names
d$model = rownames(d)

# Remove the rownames (the rownames will now just be the row numbers)
rownames(d) = NULL 

# Final result
head(d)
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb             model
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4         Mazda RX4
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4     Mazda RX4 Wag
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1        Datsun 710
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1    Hornet 4 Drive
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 Hornet Sportabout
#> 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1           Valiant

With data frames, row names are generally a pain to work with. If you're working on a data frame that has informative row names (that is, row names other than the default row numbers), turn them into a data column so that they have the same status as the other columns and can be processed in the same way.
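As a small illustration of why this helps, once the names are an ordinary column you can filter and sort on them like any other variable. This is only a sketch in base R; the names mercs and by_name are made up for the example.

```r
# Move the row names into a regular column, as in the base R approach
d <- mtcars
d$model <- rownames(d)
rownames(d) <- NULL

# Filter: keep only the models whose name starts with "Merc"
mercs <- d[grepl("^Merc", d$model), ]
mercs$model

# Sort: order the rows alphabetically by model name
by_name <- d[order(d$model), ]
head(by_name$model, 3)
```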

You asked why a column would not have a name and it was pointed out that these are rownames.

The reason (I believe) is that data frames were not included in R at the start and were added later on. (A data frame is a list of equal-length vectors.)

Think about a matrix which has optional rownames (and colnames). The rownames are not part of the matrix data because a matrix has a uniform data type. Rownames for dataframes are thus the equivalent of rownames for matrices.
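A tiny matrix makes the point, as a sketch with made-up labels: the row names are stored as an attribute beside the data, so they don't change the dimensions or the data type.

```r
# A 2 x 3 integer matrix with optional row and column names
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("a", "b"), c("x", "y", "z")))

# The labels are not counted as data: still 2 rows by 3 columns
dim(m)

# The data keep one uniform type even though the labels are character
typeof(m)
typeof(rownames(m))

# The labels are used for indexing
m["a", "y"]
```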

In order to rectify this (mainly) undesirable feature, tibbles and data.tables, which are both enhanced dataframes, do not use rownames.
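For the tibble side, a short sketch of what that means in practice: converting mtcars with as_tibble() drops the row names unless you ask for them to be kept as a column through the rownames argument. The column name "model" is an arbitrary choice here.

```r
library(tibble)

# Plain conversion: the make/model row names are dropped
tb <- as_tibble(mtcars)
has_rownames(mtcars)
has_rownames(tb)

# Keep them by naming a column to receive them
tb_named <- as_tibble(mtcars, rownames = "model")
colnames(tb_named)[1]
```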
