Ways to reverse data frame construction so it's reproducible?


#1

Is it possible to reverse engineer a data frame so that it's reproducible?

Say I have a giant CSV from an employer or a webpage that I want to make entirely reproducible. Is there a package or method to turn it into code like tibble(name = "...", id = "...", ...)?


#2

You should take a look at the datapasta package. It has some size limitations (or rather, a lack of testing past certain sizes), but it should do what you are looking for: turning a data frame into the code that constructs it as a tibble.


#3

Woah. That's incredible. Thanks for sharing!


#4

There's also my own package, wrapr, which has draw_frame() for this: https://winvector.github.io/wrapr/reference/draw_frame.html


#5

For some reason, I'd never heard of dput() in base R, but this is basically what I was looking for.
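
For anyone finding this later, here is a minimal sketch of the dput() round trip using only base R. The data frame and the temporary file are made up for illustration; dput() writes out an R expression for the object, and dget() evaluates that expression to rebuild it.

```r
# Illustrative data (not from the thread); stringsAsFactors = FALSE keeps
# the character column as character on R versions before 4.0 too.
d <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)

# Print the constructing expression to the console, ready to paste
# into a script or a question.
dput(d)

# Or write it to a file and read it back to check the round trip.
tmp <- tempfile(fileext = ".R")
dput(d, file = tmp)
d2 <- dget(tmp)

# TRUE when the rebuilt object matches the original exactly.
identical(d, d2)
```

Writing to a file avoids copying very long output by hand, which matters once the data frame has more than a few rows.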


#6

For complex types, dput() may be your only option. But if your column types are simple, one of the other alternatives will be much more legible. Examples:

d <- data.frame(x = 1)

dput(d)
# structure(list(x = 1), class = "data.frame", row.names = c(NA, 
# -1L))

cat(wrapr::draw_frame(d))
# d <- wrapr::build_frame(
#   "x" |
#   1   )

#7

Love it. I'll give it a try. So is the main problem with dput() the messy output?


#8

Yes, dput() works great, but its output doesn't look like anything a user would type.
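
To make the legibility point concrete, here is a rough base R sketch of what "code a user would type" could look like. frame_to_code() is a hypothetical helper written for this example, not a function from datapasta or wrapr, and it assumes simple columns (integer, character, logical, or short numeric) with syntactic column names. Factors, dates, and list columns would fall back to the same structure(...) noise that dput() produces, and doubles may lose precision when deparsed.

```r
# Hypothetical helper: emit a data.frame(...) call, one column per line.
# Assumes atomic, non-factor columns and syntactic column names.
frame_to_code <- function(df, name = "d") {
  cols <- vapply(names(df), function(nm) {
    # deparse() may split long vectors over several strings; rejoin them.
    paste0("  ", nm, " = ", paste(deparse(df[[nm]]), collapse = " "))
  }, character(1))
  paste0(
    name, " <- data.frame(\n",
    paste(cols, collapse = ",\n"),
    ",\n  stringsAsFactors = FALSE\n)"
  )
}

# Illustrative data (not from the thread).
original <- data.frame(
  id = 1:3,
  name = c("a", "b", "c"),
  flag = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

code <- frame_to_code(original, name = "rebuilt")
cat(code, "\n")

# Evaluate the generated code to confirm it rebuilds the same data frame.
env <- new.env()
eval(parse(text = code), envir = env)
identical(original, env$rebuilt)
```

For anything beyond simple columns, the packages mentioned above or plain dput() are the safer choice.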