I used tesseract::ocr to extract text from a vector of PNG files. The result is a character vector with one long, concatenated string per PNG file, like this:
start <-
c("PreVIous Day CompOSIte Report\nStandard Previous Day Composite Report\nAs of 04l16l2018",
"PreVIous Day CompOSIte Report\nStandard Previous Day Composite Report\nAs of 04l17l2018")
I want to convert this into a table with one row per line of text, where lines are delimited by "\n", something like this:
target <-
tibble::tribble(
~page, ~line, ~text,
1, 1, "PreVIous Day CompOSIte Report",
1, 2, "Standard Previous Day Composite Report",
1, 3, "As of 04l16l2018",
2, 1, "PreVIous Day CompOSIte Report",
2, 2, "Standard Previous Day Composite Report",
2, 3, "As of 04l17l2018")
I bet this is something simple with readr or tidytext, but it's eluding me!
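
A minimal base R sketch of one way this could be done, assuming every line break in the OCR output is a plain "\n". The names `lines_by_page` and `result` are illustrative and not part of the original code, and the output is an ordinary data frame with integer `page` and `line` columns rather than a tibble with doubles:

```r
start <-
  c("PreVIous Day CompOSIte Report\nStandard Previous Day Composite Report\nAs of 04l16l2018",
    "PreVIous Day CompOSIte Report\nStandard Previous Day Composite Report\nAs of 04l17l2018")

# Split each page's string on the literal newline; gives a list with one
# character vector of lines per page.
lines_by_page <- strsplit(start, "\n", fixed = TRUE)

# Number of lines found on each page.
n_lines <- lengths(lines_by_page)

result <- data.frame(
  page = rep(seq_along(lines_by_page), n_lines),  # page index repeated once per line
  line = sequence(n_lines),                       # line counter restarting at 1 on each page
  text = unlist(lines_by_page),
  stringsAsFactors = FALSE
)
```

If a tibble is preferred, `tibble::as_tibble(result)` should convert it. Pages with different numbers of lines are handled as well, since `rep()` and `sequence()` both work from the per-page line counts.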