@EconomiCurtis split this out of FAQ: What's a reproducible example (`reprex`) and how do I do one?.
Curious if you have anything additional to add specifically on “how to prepare your own data for use in a
reprex if you can’t, or don’t know how to reproduce a problem with a built-in dataset.”
I think @jessemaegan’s post is about 80% there. The piece it is missing, if your average Stack Overflow post is any indication, is an explanation of how to prepare your own data for use in a reprex if you can’t, or don’t know how to, reproduce a problem with a built-in dataset.
Some handy things to know for this situation:
The ugly as sin, gold standard:
head(my_data, 2) %>% deparse()
returning something like:
structure(list(date = list(structure(-61289950328, class = c("POSIXct", "POSIXt"), tzone = ""), structure(-61258327928, class = c("POSIXct", "POSIXt"), tzone = "")), id = c("0001234", "0001235"), ammount = c("$18.50", "-$18.50")), class = "data.frame", .Names = c("date", "id", "ammount" ), row.names = c(NA, -2L))
Which is not beginner friendly… what’s a `structure`? But it is really the only method that will not mess with the data types. It also works with both listy structures and data.frame-ish ones.
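To make the "will not mess with the data types" point concrete, here is a minimal base R sketch of the round trip a reader would do with your deparsed output. The data frame `my_data` and its values are made up for illustration, and `tz = "UTC"` is an assumption chosen so the result does not depend on the local time zone.

```r
# Hypothetical stand-in for your own data (values are made up)
my_data <- data.frame(
  date = as.POSIXct(c("2016-10-27 21:00", "2016-10-28 21:05"), tz = "UTC"),
  id = c("0001234", "0001235"),
  amount = c("$18.50", "-$18.50"),
  stringsAsFactors = FALSE
)

# What you would paste into your post: the deparsed structure(...) text
code <- paste(deparse(my_data), collapse = "\n")
cat(code, "\n")

# What a reader does with it: parse and evaluate the text to rebuild the object
rebuilt <- eval(parse(text = code))

# Column classes survive the trip, including the date-time and the
# character ids with their leading zeros
sapply(rebuilt, function(col) class(col)[1])
all.equal(rebuilt, my_data)
```

`dput(my_data)` prints the same kind of text directly to the console, which is often the more convenient way to copy it.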
`tibble::tribble()` is handy if you have the patience to hand-type some data for your audience in a pretty format. There is a severe limitation in that not all data types can be represented in a tribble(). The previous would be something close to:
library(dplyr)  # provides %>% and mutate()

tibble::tribble(
  ~date,              ~id,       ~ammount,
  "27/10/2016 21:00", "0001234", "$18.50",
  "28/10/2016 21:05", "0001235", "-$18.50"
) %>%
  mutate(date = lubridate::parse_date_time(date, orders = c("d!/m!/Y! H!:M!")))
The trailing mutate() fixes the date, which could not be represented directly in the tribble(). It would be remiss of me not to plug datapasta::tribble_paste(), which can save you some typing here.
It’s possible to represent your data, complete with type specification, as a read_csv() call. The previous would be:
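readr::read_csv(
  'date, id, amount
"27/10/2016 21:00", 0001234, $18.50
"28/10/2016 21:05", 0001235, -$18.50',
  col_types = readr::cols(
    # the values carry a time of day, so parse as date-time rather than date
    readr::col_datetime(format = "%d/%m/%Y %H:%M"),
    readr::col_character(),
    readr::col_character()
  )
)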
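If your audience may not have readr installed, the same inline-CSV idea can be sketched in base R with `read.csv(text = ...)`. This is an analogue of the approach, not the readr call itself: the variable name `csv_text` is made up for illustration, and `tz = "UTC"` is an assumption since the original data does not state a time zone.

```r
# Inline CSV text; csv_text is a hypothetical name
csv_text <- 'date,id,amount
"27/10/2016 21:00",0001234,$18.50
"28/10/2016 21:05",0001235,-$18.50'

my_data <- read.csv(
  text = csv_text,
  colClasses = "character",  # read everything as text: keeps leading zeros and "$"
  stringsAsFactors = FALSE
)

# Convert the date column explicitly (tz = "UTC" is an assumption)
my_data$date <- as.POSIXct(my_data$date, format = "%d/%m/%Y %H:%M", tz = "UTC")

str(my_data)
```

Reading every column as character and converting afterwards is a deliberate choice here: it keeps the ids from being silently turned into numbers.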
Not yet on CRAN: the deparse package (https://github.com/krlmlr/deparse), a nicer version of the deparse() approach above that can also get you directly to a tribble() call in some cases.
Edit: you can always use data.frame(), tibble(), list() etc!
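For a couple of rows, building the object by hand might look something like the sketch below. The values mirror the example data above, and `tz = "UTC"` is again an assumption.

```r
# A hand-built version of the example data (tz = "UTC" is an assumption)
my_data <- data.frame(
  date = as.POSIXct(c("2016-10-27 21:00", "2016-10-28 21:05"), tz = "UTC"),
  id = c("0001234", "0001235"),      # quoted, so the leading zeros survive
  amount = c("$18.50", "-$18.50"),
  stringsAsFactors = FALSE           # keep text columns as character
)

str(my_data)
```

The types are spelled out in the constructor call, so readers can see them at a glance without decoding a structure().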