I am using the Tesseract engine to detect text in images (scanned PDFs). One important requirement is to detect the position of a word inside a document, and the function ocr_data (from tesseract) does just that: it outputs the words it finds along with their coordinates.
Is there a way to produce the same output for individual symbols, that is, for every letter it has detected?
For example, for the word hello, ocr_data produces the following output: hello, 0.98, 10,20,60,30. I would like to produce the following output instead: h,0.98,5,20,15,30; e,0.98,6,20,16,30 etc. In Python, the Tesseract engine has a method called GetUTF8Text that outputs just that.
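To make the desired per-symbol output concrete, here is a minimal Python sketch. It assumes the pytesseract wrapper and its image_to_boxes function, which returns Tesseract's box-file text: one line per symbol in the form "symbol left bottom right top page", with the origin at the bottom-left corner of the image. The helper names parse_boxes and symbol_boxes are hypothetical, and the box format carries no confidence value, so this covers only the symbol and its coordinates.

```python
from typing import List, Tuple

# (symbol, left, top, right, bottom), with the origin at the top-left corner
SymbolBox = Tuple[str, int, int, int, int]


def parse_boxes(box_text: str, image_height: int) -> List[SymbolBox]:
    """Parse Tesseract box-file text into per-symbol boxes.

    Box-file coordinates are measured from the bottom-left corner, so the
    vertical values are flipped using the image height.
    """
    boxes: List[SymbolBox] = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        symbol = parts[0]
        left, bottom, right, top = (int(p) for p in parts[1:5])
        boxes.append((symbol, left, image_height - top, right, image_height - bottom))
    return boxes


def symbol_boxes(image_path: str) -> List[SymbolBox]:
    """Run OCR on an image file and return one box per detected symbol.

    Assumes Pillow, pytesseract, and the Tesseract binary are installed.
    """
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return parse_boxes(pytesseract.image_to_boxes(image), image.height)


# Made-up box-file text for a 100-pixel-high image, for illustration only.
sample = "h 5 70 15 80 0\ne 16 70 26 80 0"
for box in parse_boxes(sample, image_height=100):
    print(",".join(str(value) for value in box))
```

The parsing step is kept separate from the OCR call so it can be tried on any box-file text without running Tesseract.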
Thank you.