How to save variables description from dataframe to csv?

Is there a way to save the description of each variable from the dataframe to a csv file?

One possibility is the write_labelled_csv() function from the expss package.

Unfortunately, write_labelled_csv is not portable—it creates a file intended to be read by SPSS. The primary virtue of csv is universality. Although there is a read_labelled_csv() companion function to reverse the process, that assumes restricting use of the csv to R use cases. For that, it is simpler to borrow the set attributes value to add metadata to a data frame object and then saveRDS() to serialize in native format.

library(expss)
#> Loading required package: maditr
#> 
#> Use magrittr pipe '%>%' to chain several operations:
#>              mtcars %>%
#>                  let(mpg_hp = mpg/hp) %>%
#>                  take(mean(mpg_hp), by = am)
#> 
#> 
#> Use 'expss_output_viewer()' to display tables in the RStudio Viewer.
#>  To return to the console output, use 'expss_output_default()'.
library(readr)
#> 
#> Attaching package: 'readr'
#> The following object is masked from 'package:maditr':
#> 
#>     cols
mtcars = mtcars %>% 
  apply_labels(
    mpg = "Miles/(US) gallon",
    cyl = "Number of cylinders",
    disp = "Displacement (cu.in.)",
    hp = "Gross horsepower",
    drat = "Rear axle ratio",
    wt = "Weight (lb/1000)",
    qsec = "1/4 mile time",
    vs = "Engine",
    vs = c("V-engine" = 0, 
           "Straight engine" = 1),
    am = "Transmission",
    am = c(automatic = 0, 
           manual=1),
    gear = "Number of forward gears",
    carb = "Number of carburetors"
  )

write_labelled_csv(mtcars, "mtcars.csv")
d <- read.csv("/Users/ro/projects/demo/mtcars.csv")
head(d)
#>   X.dict variable value                 label   meta
#> 1      #      mpg           Miles/(US) gallon varlab
#> 2      #      cyl         Number of cylinders varlab
#> 3      #     disp       Displacement (cu.in.) varlab
#> 4      #       hp            Gross horsepower varlab
#> 5      #     drat             Rear axle ratio varlab
#> 6      #       wt            Weight (lb/1000) varlab
tail(d)
#>     X.dict variable value label meta
#> 109     15        8   301   335 3.54
#> 110   3.57     14.6     0     1    5
#> 111      8                          
#> 112   21.4        4   121   109 4.11
#> 113   2.78     18.6     1     1    4
#> 114      2

Created on 2023-05-20 with reprex v2.0.2

Easy-peasy if you know where to steal from :slight_smile: .

library(tidyverse)
t <- tibble(a = 1, b = 2)
attr(t$a, "label") <- "hello"
attr(t$b, "label") <- "bye"

vars_labels =  sapply(t,function(x){attr(x,"label")})
  # see https://stackoverflow.com/questions/57396453/extract-labels-from-dataframe
t2 <- rbind(vars_labels,t)

write_csv(t2,"foo.csv")
1 Like

Metadata shouldn't be coded as data.

t1 <- tibble::tibble(a = 1, b = 2)
attr(t1$a, "label") <- "hello"
attr(t1$b, "label") <- "bye"

vars_labels =  sapply(t,function(x){attr(x,"label")})
# see https://stackoverflow.com/questions/57396453/extract-labels-from-dataframe
t2 <- rbind(vars_labels,t)
#> Error in rbind(vars_labels, t): cannot coerce type 'closure' to vector of type 'list'

write_csv(t2,"t3.csv")
#> Error in write_csv(t2, "t3.csv"): could not find function "write_csv"

t3 <- read.csv("/Users/ro/projects/demo/t3.csv")
identical(t2,t3) # back in as data frame without using readr::read_csv()
#> Error in eval(expr, envir, enclos): object 't2' not found
str(t2)
#> Error in eval(expr, envir, enclos): object 't2' not found
str(t3) # t1 numeric variables are now chr becasue rbind
#> 'data.frame':    2 obs. of  2 variables:
#>  $ a: chr  "hello" "1"
#>  $ b: chr  "bye" "2"
str(t1)
#> tibble [1 Ă— 2] (S3: tbl_df/tbl/data.frame)
#>  $ a: num 1
#>   ..- attr(*, "label")= chr "hello"
#>  $ b: num 2
#>   ..- attr(*, "label")= chr "bye"


t1 <- tibble::tibble(a = 1, b = 2)
attr(t1$a, "label") <- "hello"
attr(t1$b, "label") <- "bye"
saveRDS(t1,"t1.Rds")
t4 <- readRDS("/Users/ro/projects/demo/t1.Rds")
identical(t1,t4)
#> [1] TRUE
str(t4)
#> tibble [1 Ă— 2] (S3: tbl_df/tbl/data.frame)
#>  $ a: num 1
#>   ..- attr(*, "label")= chr "hello"
#>  $ b: num 2
#>   ..- attr(*, "label")= chr "bye"

Agreed. But if the OP needs a csv file they may need to go this route.

My comeback is that a csv file that changes variables typeof to tack on the metadata is no csv file at all. :grin:

I suppose one could write an R script to tack on the attributes—two files to pass around, but would have to drop the meta row to convert to typeof num anyway

You can do:

library(Hmisc)
names(df) <- Hmisc::label(df)

and then export as csv or excel file.

But

library(Hmisc)
#> 
#> Attaching package: 'Hmisc'
#> The following objects are masked from 'package:base':
#> 
#>     format.pval, units
x1 <- 1:10
x2 <- 10:1
label(x2) <- 'Label for x2'
units(x2) <- 'mmHg'
dframe <- data.frame(x1, x2)
Label(dframe)
#> label(x1)    <- ''
#> label(x2)    <- 'Label for x2'
write.csv(dframe,"/Users/ro/projects/demo/dframe.csv")

dframe2 <- read.csv("/Users/ro/projects/demo/dframe.csv")

identical(dframe,dframe2)
#> [1] FALSE
Label(dframe)
#> label(x1)    <- ''
#> label(x2)    <- 'Label for x2'
Label(dframe2)
#> label(X) <- ''
#> label(x1)    <- ''
#> label(x2)    <- ''

Hi Technocrat,

Could you please explain what are you suggesting in your post ? I would like to understant it correctly, that during saving to csv file when Hmisc package was used previously, some informations are lost ?
And my second question, I try to grasp OP's intentions as when I used "Hmisc way" which is quick and then saved to csv,
it opens in excel in a good way and variables' names are replaced by variables' labels and it looks nicely. It looks nicely in R as well.

Kind regards,
Andrzej

Hi @Andrzej

Do you get something different than shown in my reprex when you apply it to your example, save it to csv and then read it back in?

I have used your mtcars example dataset.

I saved it into csv and opened in excel that I often work with. In excel I have done "text to columns" and saved. Then imported back to R. Looks like this:

Identical(mtcars, mtcars2)

gives me FALSE, which is even better, as columns with numbers are now numerical. Before they were of "labelled" class.

In terms of my usefulness this is OK and enough for me, but as I said I do not know what were OP's intensions with regard to do something further with that csv file.

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.