Convert copied dput output (in clipboard) to a data frame or tibble/tribble?


#1

This question is somewhat related to my previous question.

Is there any package which can convert the dput() output to a dataframe or tibble/tribble? For example:

  • Copy dput output to clipboard (Ctrl + C)

  • Run through a magic function or addin

  • Paste and get a nicely formatted df or tibble

I have tested the datapasta and overflow packages, but neither worked.

First, copy this:

structure(list(
    mean  = c(NA, NA, 1.62, 1.48, 1.43, 1.55, 1.60, 1.63, 1.48, 1.43, 1.43, 1.41, NA),
    lower = c(NA, NA, 1.23, 0.95, 1.04, 1.15, 1.01, 1.83, 1.15, 1.04, 1.07, 0.79, NA),
    upper = c(NA, NA, 2.14, 2.31, 1.95, 2.09, 2.53, 5.68, 1.91, 1.95, 1.92, 2.54, NA)),
    .Names = c("mean", "lower", "upper"),
    row.names = c(NA, -13L),
    class = "data.frame")

Then run:

devtools::install_github("mrdwab/overflow-mrdwab")
library(overflow)
soread()
#> Error in read.table(text = temp, header = header, stringsAsFactors = stringsAsFactors,  : 
#>   more columns than column names

library(datapasta)
tribble_paste()
#> Could not paste clipboard as tibble. Text could not be parsed as table.
#> NULL

df_paste()
#> Could not paste clipboard as tibble. Text could not be parsed as table.
#> NULL

Thank you!


#2

I don't know of a specific package, but there is this question on SO with some possible workarounds to make dput() output a bit more 'palatable'.


#3

I recently added a write.so function to my read.so package that calls dput and restructures the results for you:

For example,

df <- tibble::data_frame(
    i = 1:6, 
    d = i + .1, 
    c = letters[i], 
    f = factor(c)
)

read.so::write.so(df)
#> df <- data_frame(
#>     i = 1:6,
#>     d = c(1.1, 2.1, 3.1, 4.1, 5.1, 6.1),
#>     c = c("a", "b", "c", "d", "e", "f"),
#>     f = structure(1:6, .Label = c("a", "b", "c", "d", "e", "f"), class = "factor")
#> )

Output is copied to the clipboard by default, though that's settable, as is indentation. The package is experimental and may change, but it's mostly only useful for interactive work anyway, not as a dependency.


#4

You could also try wrapr::draw_frame().

d <- structure(list(
  mean  = c(NA, NA, 1.62, 1.48, 1.43, 1.55, 1.60, 1.63, 1.48, 1.43, 1.43, 1.41, NA),
  lower = c(NA, NA, 1.23, 0.95, 1.04, 1.15, 1.01, 1.83, 1.15, 1.04, 1.07, 0.79, NA),
  upper = c(NA, NA, 2.14, 2.31, 1.95, 2.09, 2.53, 5.68, 1.91, 1.95, 1.92, 2.54, NA)),
  .Names = c("mean", "lower", "upper"),
  row.names = c(NA, -13L),
  class = "data.frame")

cat(wrapr::draw_frame(d))

# d <- wrapr::build_frame(
#     "mean", "lower", "upper" |
#     NA    , NA     , NA      |
#     NA    , NA     , NA      |
#     1.62  , 1.23   , 2.14    |
#     1.48  , 0.95   , 2.31    |
#     1.43  , 1.04   , 1.95    |
#     1.55  , 1.15   , 2.09    |
#     1.6   , 1.01   , 2.53    |
#     1.63  , 1.83   , 5.68    |
#     1.48  , 1.15   , 1.91    |
#     1.43  , 1.04   , 1.95    |
#     1.43  , 1.07   , 1.92    |
#     1.41  , 0.79   , 2.54    |
#     NA    , NA     , NA      )

#5

Can you show how to convert the clipboard contents, after copying the structure above, into a data frame or tibble? Thank you.


#6

Thank you John! wrapr looks useful, but it is not quite what I am looking for.

Imagine people sharing their sample data online using the output of dput(). You want to copy that structure(list()) to the clipboard, immediately turn it into a data frame or tibble using a function/addin, then paste the processed output into your RStudio session. If I paste the shared dput directly into RStudio, RStudio automatically reformats it and makes it look terrible. For example, the code below spreads up to 600 cols


#7

read.so::write.so does the dput for you, but the source shows how it does it. The actual dput happens on line 41:

after which it constructs both a call and a string version of the result.

You could rebuild a version that takes a dput, if you like, though a simpler solution would be to run the dput and call write.so on the result.


#8

Just so I’m understanding what you’re looking for — you want to be able to convert dput() output (of a data frame) into an equivalent data.frame() call so that you can… replace the dput() output in an example for the purposes of further sharing? Or are you trying to automate making your own data frames into a representation that looks prettier in examples you want to share?

(I’m assuming that if you just wanted to work with someone else’s example, you’d run the line with all the ugly structure() stuff and then you’d have the data frame object in your environment).

I’m with you on RStudio’s ugly dput() output pasting. If you haven’t seen it already, we had a whole thread on that, including a couple of workarounds.

Personally I still can’t figure out the principles that the current auto-indent-on-paste algorithm is using to arrive at this ugly indentation — it’s not the same as the auto-format method (Code menu), because that makes a less-ugly but way too long version. If somebody else gets the logic that leads to the ugly, I’d love to hear an explanation!


#9

Is this specifically for data frames? Is this close to what you have in mind?

library(magrittr)

# simulate copying dput() output to the clipboard
clipr::write_clip(
  '
  structure(list(
  mean  = c(NA, NA, 1.62, 1.48, 1.43, 1.55, 1.60, 1.63, 1.48, 1.43, 1.43, 1.41, NA),
  lower = c(NA, NA, 1.23, 0.95, 1.04, 1.15, 1.01, 1.83, 1.15, 1.04, 1.07, 0.79, NA),
  upper = c(NA, NA, 2.14, 2.31, 1.95, 2.09, 2.53, 5.68, 1.91, 1.95, 1.92, 2.54, NA)),
  .Names = c("mean", "lower", "upper"),
  row.names = c(NA, -13L),
  class = "data.frame")
  '
)

# write as tribble call
clipr::read_clip() %>%
  parse(text = .) %>%
  eval() %>%
  datapasta::tribble_paste()
#> tibble::tribble(
#>   ~mean, ~lower, ~upper,
#>      NA,     NA,     NA,
#>      NA,     NA,     NA,
#>    1.62,   1.23,   2.14,
#>    1.48,   0.95,   2.31,
#>    1.43,   1.04,   1.95,
#>    1.55,   1.15,   2.09,
#>     1.6,   1.01,   2.53,
#>    1.63,   1.83,   5.68,
#>    1.48,   1.15,   1.91,
#>    1.43,   1.04,   1.95,
#>    1.43,   1.07,   1.92,
#>    1.41,   0.79,   2.54,
#>      NA,     NA,     NA
#>   )
  

# write as a data.frame call
clipr::read_clip() %>%
  parse(text = .) %>%
  eval() %>%
  datapasta::df_paste()
#> data.frame(
#>         mean = c(NA, NA, 1.62, 1.48, 1.43, 1.55, 1.6, 1.63, 1.48, 1.43, 1.43,
#>                  1.41, NA),
#>        lower = c(NA, NA, 1.23, 0.95, 1.04, 1.15, 1.01, 1.83, 1.15, 1.04, 1.07,
#>                  0.79, NA),
#>        upper = c(NA, NA, 2.14, 2.31, 1.95, 2.09, 2.53, 5.68, 1.91, 1.95, 1.92,
#>                  2.54, NA)
#> )

Created on 2018-10-10 by the reprex package (v0.2.1)

In either case, I don't think row names are preserved, and list columns seem to be only partially supported in tribble_paste. If this is the right direction and you feel strongly about either feature, you could contribute it back to the datapasta package.
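If you want the parse-and-eval step from the pipeline above as a reusable function, here is a minimal sketch. The name dput_text_to_df is made up for illustration and is not part of clipr or datapasta. It assumes the text holds a single structure() call, and it only does a light sanity check before evaluating, so it is not a sandbox: only run it on dput output you trust.

```r
# Hypothetical helper (not part of clipr or datapasta): turn dput() text
# into a data frame. `txt` is a character vector of lines, e.g. what
# clipr::read_clip() returns.
dput_text_to_df <- function(txt) {
  exprs <- parse(text = txt)
  # Expect exactly one expression, and expect its outermost call to be
  # structure(), so text that is obviously not dput() output is rejected.
  # Nested calls are NOT inspected, so this is not a security boundary.
  stopifnot(length(exprs) == 1L)
  expr <- exprs[[1L]]
  stopifnot(is.call(expr), identical(expr[[1L]], as.name("structure")))
  out <- eval(expr, envir = baseenv())
  stopifnot(is.data.frame(out))
  out
}

# Shortened stand-in for the dput() output from the question
txt <- c(
  "structure(list(",
  "  mean  = c(NA, 1.62, 1.48),",
  "  lower = c(NA, 1.23, 0.95)),",
  "  row.names = c(NA, -3L),",
  "  class = \"data.frame\")"
)
d <- dput_text_to_df(txt)
str(d)

# Interactively, one might instead run:
#   d <- dput_text_to_df(clipr::read_clip())
#   datapasta::tribble_paste(d)
```

Keeping the clipboard read outside the helper means it can be tried on any character vector, without touching the clipboard.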


#10

This is a typical scenario:

  1. You come to SO and see a question that you can answer
  2. You copy the shared dput from the OP's question to clipboard
  3. Paste it into RStudio source editor window
  4. RStudio auto-formats that dput into hundreds of columns

What I can do (but I hate it):
2.1. Turn off "Auto indent" in RStudio global settings
3. Paste
3.1. Turn on "Auto indent" again

So what I'm looking for is:
2.1. Run a function or addin that reads clipboard content then magically turns the copied dput() output into a data frame or tibble
3. Paste it into RStudio source editor & good to go

Let me know if that makes sense to you.


#11

This is exactly what I'm looking for. Thank you very much!

Please contribute this feature to the datapasta package. I'm sure it will be very useful for many people.


#12

Oh, I see. I think you're working too hard. I type df <- into the console, paste the dput (yes, it's ugly, but it doesn't matter), and run it. After constructing an answer, I use write.so (or datapasta::dpasta if you prefer its formatting) to generate code that recreates the data in my answer, so I can run the whole thing with reprex.

Alternatively, if you want to keep the original dput in your answer, if you paste it in an Rmd file (not a pure R file), it keeps the original indentation. (I have no idea why there's a difference, but there is.)
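
A rough sketch of that console workflow, using a shortened stand-in for the pasted dput (three rows instead of the original thirteen). The round trip at the end only illustrates that dput-style text is ordinary R code that rebuilds the object when evaluated. It uses base R's deparse() rather than write.so or dpasta.

```r
# Step 1: type `df <-` and paste the dput() output after it, ugly or not.
df <- structure(list(
  mean  = c(NA, 1.62, 1.48),
  lower = c(NA, 1.23, 0.95)),
  row.names = c(NA, -3L),
  class = "data.frame")

# Step 2: the object now lives in the session and can be used as usual.
str(df)

# dput()-style text is just R code: deparsing and evaluating it again
# rebuilds an equivalent object.
rebuilt <- eval(parse(text = deparse(df)))
print(all.equal(df, rebuilt))
```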


#13

This is dope! I will make write.so the method of choice for my short guide to loading data in a reprex. Thanks!


#14

if you paste it in an Rmd file (not a pure R file), it keeps the original indentation. (I have no idea why there's a difference, but there is.)

Not sure if it's a feature or a bug in RStudio, but good to know nonetheless :)


#15

Yup. That helps, thanks!

That’s way too much fiddling for me, too. When I need to paste dput() output into a script file, I do the thing with commenting-reflowing-uncommenting that I described in the other thread (still not pretty, but also not a zillion cols wide!). Or I avoid it by creating the object via the console first, like @alistaire does.

I guess one could also play around with making an addin to reformat dput() output in a nicer way (maybe with the help of styler?).
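
As a starting point for such a reformatter, here is a minimal base R sketch. It does not use styler. Instead it deparses each column separately and wraps the pieces in a data.frame() call. The function name format_df_call and its arguments are made up for illustration, and it assumes syntactic column names and no list columns.

```r
# Hypothetical helper: render a data frame as a compact data.frame() call.
# Assumes syntactic column names and atomic (non-list) columns.
format_df_call <- function(df, name = "df", width = 60L) {
  stopifnot(is.data.frame(df))
  cols <- vapply(names(df), function(nm) {
    # deparse() may return several lines for a long vector; join them with
    # a newline plus indentation so continuation lines stay readable.
    body <- paste(deparse(df[[nm]], width.cutoff = width), collapse = "\n    ")
    paste0("  ", nm, " = ", body)
  }, character(1))
  paste0(
    name, " <- data.frame(\n",
    paste(cols, collapse = ",\n"),
    # Keep character columns as character on R versions before 4.0.
    ",\n  stringsAsFactors = FALSE\n)"
  )
}

# Shortened stand-in for the data in the question
d <- data.frame(mean = c(NA, 1.62, 1.48), lower = c(NA, 1.23, 0.95))
out <- format_df_call(d, name = "d2")
cat(out, "\n")
```

The result is a single string, so an addin could write it to the clipboard (for example with clipr::write_clip()) or insert it into the editor. Row names are not preserved in this sketch.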