Issue with dput()

blackish952 · June 12, 2018, 3:18am

Hello,

I have read many posts about producing a reproducible data sample with dput(). However, my output does not contain anything like "structure(...".
It seems to me that, because I have over 50,000 rows and 10 columns, dput() cannot fit all of the information in the output.
Even when I do:

dput(data[1:10,])

The output still does not have "structure(...".

Members of the R community suggest that posts include a reproducible sample, but since I cannot produce one with dput(), what can I do to post an acceptable question with all the information needed to address it?

Thank you.

jcblum · June 12, 2018, 4:50am

I agree that a very large dataset is not a good fit for the dput() strategy (some people will argue that there are very few problems where you really need to include all of a large dataset in your reproducible example). There have been a couple of discussions here with ideas for sharing data beyond dput():

Best Practices: how to prepare your own data for use in a `reprex` if you can’t, or don’t know how to reproduce a problem with a built-in dataset?

I’m curious what’s going wrong for you when you try dput() while selecting just a few rows. You said you don’t get output that starts with structure() — what do you get? What happens when you try running dput() on a slice of a built-in dataset? For example:

dput(head(ggplot2::diamonds))

(I get this...)

structure(list(carat = c(0.23, 0.21, 0.23, 0.29, 0.31, 0.24), 
    cut = structure(c(5L, 4L, 2L, 4L, 2L, 3L), .Label = c("Fair", 
    "Good", "Very Good", "Premium", "Ideal"), class = c("ordered", 
    "factor")), color = structure(c(2L, 2L, 2L, 6L, 7L, 7L), .Label = c("D", 
    "E", "F", "G", "H", "I", "J"), class = c("ordered", "factor"
    )), clarity = structure(c(2L, 3L, 5L, 4L, 2L, 6L), .Label = c("I1", 
    "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"), class = c("ordered", 
    "factor")), depth = c(61.5, 59.8, 56.9, 62.4, 63.3, 62.8), 
    table = c(55, 61, 65, 58, 58, 57), price = c(326L, 326L, 
    327L, 334L, 335L, 336L), x = c(3.95, 3.89, 4.05, 4.2, 4.34, 
    3.94), y = c(3.98, 3.84, 4.07, 4.23, 4.35, 3.96), z = c(2.43, 
    2.31, 2.31, 2.63, 2.75, 2.48)), .Names = c("carat", "cut", 
"color", "clarity", "depth", "table", "price", "x", "y", "z"), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))
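The reason this output is useful in a post is that it is itself R code: pasting it back into R rebuilds the object. Here is a minimal sketch of that round trip, using the built-in iris data as a stand-in (an assumption on my part, so that no extra package is needed); the names small, txt and rebuilt are just illustrative.

```r
# Stand-in data: the first three rows of the built-in iris data frame
small <- head(iris, 3)

# Capture what dput() would print to the console as a character vector
txt <- capture.output(dput(small))

# The captured text begins with structure(, like the output above
startsWith(txt[1], "structure(")

# Evaluating that text recreates the data frame, factor column included
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```

Someone reading your post would normally just paste the structure(...) text after something like `my_data <-` rather than going through parse().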

blackish952 · June 12, 2018, 4:58am

@jcblum
What I get is a ton of lines. When I scroll all the way up, I reach a point where I cannot scroll any further, and the output stops right there.

I did trim down the columns. Otherwise, it would be a mess.
I assume the syntax for selecting a few rows in my original post above is correct. Am I right?
If not, I don’t know what to do.

jcblum · June 12, 2018, 5:45am

Yes, the syntax you used should select the first 10 rows and all columns, giving a 10 row x 10 column data frame. That shouldn’t produce an inordinately long amount of text, unless there are some really long values in your data frame.
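One possible cause worth ruling out (a guess, since we haven't seen the actual data): a factor column keeps all of its levels when you subset rows, so dput() on 10 rows can still print thousands of level labels. The sketch below uses a made-up data frame called big to show the effect and how droplevels() trims it.

```r
# Hypothetical data: 50,000 rows, with a factor column that has 50,000 levels
big <- data.frame(
  id    = factor(paste0("id_", seq_len(50000))),
  value = seq_len(50000)
)

slice <- big[1:10, ]
dim(slice)          # 10 rows, 2 columns
nlevels(slice$id)   # the unused levels are still attached to the slice

# Count the console lines dput() would print, before and after dropping unused levels
n_full <- length(capture.output(dput(slice)))
slim   <- droplevels(slice)
n_slim <- length(capture.output(dput(slim)))
c(full = n_full, slim = n_slim)
```

If the two counts differ a lot on your own data, running droplevels() on the slice before dput() should give output short enough to paste.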

But either way, if the output is too long for your console's scrollback buffer, you can use sink() to send all the console output to a file instead. For instance:

sink("dput_diamonds.txt")  # output to specified file in working directory
dput(ggplot2::diamonds)
sink()  # cancels sink, output to console again

(the result is a 2.7MB text file ... diamonds has >50,000 rows)
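As an alternative to sink(), dput() takes a file argument of its own, and dget() reads such a file back in. A small sketch, assuming a slice of the built-in mtcars data in place of the real data and a temporary file in place of a real path:

```r
# Stand-in data: ten rows of the built-in mtcars data frame
slice <- head(mtcars, 10)

# Hypothetical output location; tempfile() avoids cluttering the working directory
out_file <- tempfile(fileext = ".txt")

dput(slice, file = out_file)   # write the structure(...) text straight to the file
restored <- dget(out_file)     # read the file back into an R object

all.equal(slice, restored)
```

This avoids having to remember the closing sink() call, and the file contains only the dput() text rather than everything printed to the console in between.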

blackish952 · June 12, 2018, 12:47pm

@jcblum
Hello,
I was able to extract the output to *.txt file.

Thank you for your input.
I appreciate it.

jcblum · June 12, 2018, 1:42pm

Fantastic! Happy to help

JohnMount · June 12, 2018, 4:05pm

For this sort of task I also suggest trying wrapr::draw_frame().

cat(wrapr::draw_frame(head(ggplot2::diamonds)))

wrapr::build_frame(
   "carat", "cut"      , "color", "clarity", "depth", "table", "price", "x" , "y" , "z"  |
   0.23   , "Ideal"    , "E"    , "SI2"    , 61.5   , 55     , 326L   , 3.95, 3.98, 2.43 |
   0.21   , "Premium"  , "E"    , "SI1"    , 59.8   , 61     , 326L   , 3.89, 3.84, 2.31 |
   0.23   , "Good"     , "E"    , "VS1"    , 56.9   , 65     , 327L   , 4.05, 4.07, 2.31 |
   0.29   , "Premium"  , "I"    , "VS2"    , 62.4   , 58     , 334L   , 4.2 , 4.23, 2.63 |
   0.31   , "Good"     , "J"    , "SI2"    , 63.3   , 58     , 335L   , 4.34, 4.35, 2.75 |
   0.24   , "Very Good", "J"    , "VVS2"   , 62.8   , 57     , 336L   , 3.94, 3.96, 2.48 )
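For anyone who wants to try it, here is a rough sketch of the round trip with a cut-down, two-column version of the table above. It assumes wrapr is installed from CRAN and skips itself otherwise. Note that, judging by the output above, factor columns such as cut are written out as plain strings.

```r
# install.packages("wrapr")   # one-time setup, if the package is not installed yet

if (requireNamespace("wrapr", quietly = TRUE)) {
  # Rebuild a small data frame from the pasted table text
  d <- wrapr::build_frame(
    "carat", "cut"     |
    0.23   , "Ideal"   |
    0.21   , "Premium" )

  str(d)

  # Print the data frame back out in the same pasteable layout
  cat(wrapr::draw_frame(d))
}
```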

jcblum · June 12, 2018, 5:08pm

Ooh, very nice! Can you post that to the FAQ thread about ways of including your data? Ideally with brief, clear instructions for basic use in this context and the best way to get the package, since we send a lot of inexperienced useRs to that thread.

JohnMount · June 12, 2018, 5:45pm

Wow, thanks. I will add it to the FAQ in a bit!