I need to read 100 PDF documents, extract the text from each one, and export the results to Excel. Each PDF contains various pieces of text from which I need to build a data table. Here is part of one PDF showing the information I need to extract:
I am doing my job in the company(Employee Id : 12345678)
Name : XXXXX YYYYY
Date of Birth : 12/12/2001
Place : AAAAAAAA
Address: 111, BLOCK 1,
XYZ LOCALITY
BANGKOK
Email id: firstname.lastname@example.org
I have to create a column for each field and extract the corresponding values from all the PDFs into one Excel sheet.
I am trying to use Tesseract and pdf_convert.
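
To make the goal concrete, here is a minimal sketch of the parsing step I have in mind, assuming the OCR step has already returned each PDF's text as a plain string. The column names, the `parse_record` and `write_csv` helpers, and the `employees.csv` file name are hypothetical, and the patterns only cover the labels shown in the sample, so noisy OCR output would probably need looser matching.

```python
import csv
import re

FLAGS = re.IGNORECASE | re.MULTILINE

# Hypothetical column names mapped to patterns for the labels in the sample.
PATTERNS = {
    "employee_id": re.compile(r"Employee\s*Id\s*:\s*(\d+)", FLAGS),
    "name": re.compile(r"^Name[ \t]*:[ \t]*(.+)$", FLAGS),
    "date_of_birth": re.compile(r"^Date\s+of\s+Birth[ \t]*:[ \t]*(.+)$", FLAGS),
    "place": re.compile(r"^Place[ \t]*:[ \t]*(.+)$", FLAGS),
    # The address spans several lines, so capture up to the next label.
    "address": re.compile(
        r"^Address[ \t]*:[ \t]*(.+?)(?=^Email\s*id\s*:|\Z)", FLAGS | re.DOTALL
    ),
    "email": re.compile(r"^Email\s*id[ \t]*:[ \t]*(\S+)", FLAGS),
}


def parse_record(text):
    """Return one row (a dict keyed by column name) for one document's text."""
    # Strip padding and any stray asterisks around each line.
    cleaned = "\n".join(line.strip(" *\t") for line in text.splitlines())
    record = {}
    for column, pattern in PATTERNS.items():
        match = pattern.search(cleaned)
        value = match.group(1) if match else ""
        # Fold a multi-line value (the address) into one comma-separated cell.
        parts = [part.strip(" ,") for part in value.splitlines()]
        record[column] = ", ".join(part for part in parts if part)
    return record


def write_csv(records, path):
    """Write the rows to a CSV file, which Excel can open directly."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(PATTERNS))
        writer.writeheader()
        writer.writerows(records)


SAMPLE = """I am doing my job in the company(Employee Id : 12345678)
Name : XXXXX YYYYY
Date of Birth : 12/12/2001
Place : AAAAAAAA
Address: 111, BLOCK 1,
XYZ LOCALITY
BANGKOK
Email id: firstname.lastname@example.org
"""

if __name__ == "__main__":
    # One row per PDF; here a single row built from the sample text.
    write_csv([parse_record(SAMPLE)], "employees.csv")
```

The sketch writes CSV only to stay within the standard library. If pandas and openpyxl are installed, `pandas.DataFrame(records).to_excel("employees.xlsx", index=False)` should produce a real .xlsx file from the same list of dicts.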