Write.csv with custom column number format


#1

Hello all,

I have a script with a function that cleans data in a batch of files in a folder. One important column in the dataframes is in number format that start with 0, where the 0 is important. Importing into R with extra code helps the zeros display correctly only if it is manually saved in the csv a certain way. When opened in excel, saving the column format as Custom 00000000000000 ( it needs 14 spaces) manually does the trick.

I would like to know if there is a way to add this as part of the function when saving/writing the csv files.

Any suggestions?


#2

In order to preserve the leading zeros, one approach is to save/read that column as character strings. The following example shows the basics:

my_df1 <- data.frame(xx=letters[1:3],
                    yy=c("00000000000987",
                         "00000000123456",
                         "00000000000022"))
my_df1
write.csv(my_df1, row.names=FALSE)  # Show output in console
write.csv(my_df1, file="my_df.csv", row.names=FALSE) # Save to disk

my_df2 <- read.csv("my_df.csv") # Default read drops the leading zeros!
my_df2

my_df3 <- read.csv("my_df.csv",
                   colClasses=c(NA,"character")) # Capture leading zeros from column 2
my_df3

Alternatively, you could read and process the column as numbers but pad with zeros
just before outputting as strings:

library(stringr)
str_pad(my_df2$yy, 14, pad = "0")

Or, my preferred option, read and process the column as numbers, then create and format the column directly in an Excel file (xlsx not csv) on-the-fly:

library(openxlsx)

wb <- createWorkbook()
addWorksheet(wb, "My_data")
writeData(wb, 1, my_df2)

fixed14 <- createStyle(numFmt = "00000000000000")
addStyle(wb, 1, style = fixed14,
         cols = 2, 
         rows=1:nrow(my_df2)+1, 
         gridExpand = TRUE)

saveWorkbook(wb, "fixed14.xlsx", overwrite = TRUE)

HTH