Is it possible to (partially) recover a corrupt RDS file?

I have a corrupted RDS file. Is it possible to recover any of the contents of this file? It was saved with a standard saveRDS(data, "data.rds") call.

readRDS("data.rds")
## Error in readRDS("data.rds") :  error reading from connection

I was able to dump text from the RDS file using

file <- gzfile("data.rds", "rb")
x <- readLines(file)
close(file)

This text appears to be the saved tibble, column-by-column with each row as a new line. I was able to use regular expressions to grab an ID column that I can use to reconstruct the data set, so I've "solved" my issue.
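For anyone trying the same thing, here is a rough sketch of that approach using raw bytes instead of readLines(), so that embedded nuls and newlines inside the binary data don't split the strings. The helper names read_partial_rds_bytes() and extract_strings() are hypothetical (they are not base R functions), and I'm assuming that a truncated gzip stream either returns a short read or raises an error, which may not hold for every kind of corruption.

```r
# Sketch: read whatever decompresses cleanly, then pull out printable ASCII runs.
# Both function names are hypothetical helpers, not part of base R.

read_partial_rds_bytes <- function(path, chunk_size = 4096L) {
  con <- gzfile(path, "rb")
  on.exit(close(con))
  chunks <- list()
  repeat {
    # Assumption: a damaged stream either yields a short read or raises an
    # error; in both cases we stop and keep what was read so far.
    chunk <- tryCatch(
      readBin(con, what = "raw", n = chunk_size),
      error = function(e) raw(0)
    )
    if (length(chunk) == 0L) break
    chunks[[length(chunks) + 1L]] <- chunk
  }
  if (length(chunks) == 0L) return(raw(0))
  do.call(c, chunks)
}

extract_strings <- function(bytes, min_len = 4L) {
  # Mark bytes in the printable ASCII range (space through tilde).
  codes <- as.integer(bytes)
  printable <- codes >= 32L & codes <= 126L
  # Find consecutive runs of printable bytes and keep the long ones.
  runs <- rle(printable)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1L
  keep <- which(runs$values & runs$lengths >= min_len)
  vapply(keep, function(i) rawToChar(bytes[starts[i]:ends[i]]), character(1))
}
```

Calling extract_strings(read_partial_rds_bytes("data.rds")) should then give a character vector that can be filtered with a regular expression for the ID values. Numeric columns are stored as binary doubles, so they won't come out this way.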

Digging around a bit more, I've discovered two more pieces of interesting information. First, some metadata about the saved object (such as its class) seems to be stored at the end of the file. Second, in my case I may be able to read the plain text because the file wasn't completely written. This is what a similar process looks like for a valid RDS file:

> library(tibble)
> mtcars_tbl <- as_tibble(mtcars)
> saveRDS(mtcars_tbl, fs::path_temp("mtcars.rds"))
> file <- gzfile(fs::path_temp("mtcars.rds"), "rb")
> readLines(file)
##  [1] "X"
##  [2] ""
##  [3] "=p\xa4@\t333333@\006\024z\xe1G\xae\024@\t\xae\024z\xe1G\xae@"
##  [4] "\x85\036\xb8Q\xeb\x85@\017\\(\xf5\u008f\\@\017\\(\xf5\u008f\\@\017\\(\xf5\u008f\\@\b\x8f\\(\xf5\u008f@\b\x8f\\(\xf5\u008f@\b\x8f\\(\xf5\u008f@\ap\xa3\xd7"
##  [5] "=q@\b"
##  [6] "=p\xa3\xd7@\020Q\xeb\x85\036\xb8R@\023\xb8Q\xeb\x85\036\xb8@\020\xe1G\xae\024z\xe1@"
##  [7] "\x99\x99\x99\x99\x99\x9a@\006\024z\xe1G\xae\024@\t333333@"
##  [8] "\xd7"
##  [9] "=p\xa3\xd7@\b\xa3\xd7"
## [10] "=p\xa4@\020Q\xeb\x85\036\xb8R@\021\xb8Q\xeb\x85\036\xb8@\016(\xf5\u008f\\)@\020\xe1G\xae\024z\xe1@\f\xf5\u008f\\(\xf6@\fQ\xeb\x85\036\xb8R@\020p\xa3\xd7"
## [11] "=q"
## [12] "\xd7"
## [13] "=p\xa3\xd7@\016=p\xa3\xd7"
## [14] "=@\025"
## [15] "=p\xa3\xd7?\xfd\\(\xf5\u008f\\@\003\xb8Q\xeb\x85\036\xb8@\f(\xf5\u008f\\)@\vz\xe1G\xae\024{@\016\xb8Q\xeb\x85\036\xb8@\016\u008f\\(\xf5\xc3?\xfe\xf5\u008f\\(\xf6@\001\036\xb8Q\xeb\x85\037?\xf85?|\xed\x91h@\t\\(\xf5\u008f\\@\006(\xf5\u008f\\)@\f\x8f\\(\xf5\u008f@\006=p\xa3\xd7"
## [16] "="
## [17] "=q@1\005\036\xb8Q\xeb\x85@48Q\xeb\x85\036\xb8@/\xae\024z\xe1G\xae@4"
## [18] ""
## [19] "data.frame"
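As far as I can tell, the leading "X" is the format marker of R's serialization stream (XDR, i.e. big-endian binary), and it is followed by a few version integers. Attributes such as the class of a vector or list seem to be written after its contents, which would explain why "data.frame" shows up last. Below is a minimal sketch that decodes just that header. read_rds_header() is a hypothetical helper, and the layout is my reading of the serialization section of the R Internals manual, so treat it as an assumption rather than a reference parser.

```r
# Sketch: decode the header of a gzip-compressed, XDR-format RDS file.
# read_rds_header() is a hypothetical helper, not part of base R.
read_rds_header <- function(path) {
  con <- gzfile(path, "rb")
  on.exit(close(con))

  # First two bytes: format marker plus newline ("X\n" for XDR binary).
  fmt <- readBin(con, what = "raw", n = 2L)
  if (length(fmt) < 2L) stop("file too short to contain a header")

  # Next: three big-endian 32-bit integers.
  ints <- readBin(con, what = "integer", n = 3L, size = 4L, endian = "big")
  if (length(ints) < 3L) stop("header is truncated")

  # R versions are packed as major * 65536 + minor * 256 + patch.
  decode_version <- function(v) {
    paste(v %/% 65536L, (v %/% 256L) %% 256L, v %% 256L, sep = ".")
  }

  header <- list(
    format = rawToChar(fmt[1]),
    serialization_version = ints[1],
    written_by = decode_version(ints[2]),
    min_reader = decode_version(ints[3])
  )

  # Assumption: version 3 streams also store the native encoding as a
  # length-prefixed string right after the version integers.
  if (ints[1] >= 3L) {
    enc_len <- readBin(con, what = "integer", n = 1L, size = 4L, endian = "big")
    if (length(enc_len) == 1L && enc_len > 0L) {
      header$native_encoding <- rawToChar(readBin(con, what = "raw", n = enc_len))
    }
  }
  header
}
```

Running read_rds_header(fs::path_temp("mtcars.rds")) on the file above should return a small list with the format, the serialization version, and the R version that wrote the file. If even this fails on a damaged file, the corruption probably starts at the very beginning of the gzip stream.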

I would still appreciate any pointers about the structure of RDS files.
