How can I extract data from a PDF written in Hindi that uses WinAnsiEncoding?

- June 25, 2015

I have a PDF that contains data in Hindi. It has different blocks that store the same type of information. I have to extract the data from the PDF and store it in CSV/Excel format so that it can be used for further processing.

I have tried OCR and various tools and Python libraries (such as Tesseract and pdfminer), but I have not been able to get satisfactory results. (Somewhere or other there is a problem with the Hindi 'matra'.)

Please help me with this. I have been stuck for 3-4 days.

Hi,

Are you trying to extract data from a PDF form, or to convert a normal PDF to Excel?

Thanks,

Abhishek
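
For what it's worth, a Hindi PDF that uses WinAnsiEncoding often means the text was set in a legacy (non-Unicode) Hindi font, so extractors return Latin characters and the short "i" matra comes out on the wrong side of its consonant. This is only a guess about the file in question. Under that assumption, a minimal Python sketch of the remapping step could look like the one below. The `GLYPH_TO_UNICODE` table and the function names are hypothetical and purely illustrative: the real table has to be built for whatever font is embedded in the PDF, and conjuncts and the reph are not handled here.

```python
import csv
import io

# Hypothetical glyph table: legacy-font character -> Unicode Devanagari.
# Illustrative only; the real mapping depends on the font used in the PDF.
GLYPH_TO_UNICODE = {
    "d": "\u0915",  # KA
    "e": "\u092e",  # MA
    "k": "\u093e",  # vowel sign AA
    "f": "\u093f",  # vowel sign I (stored before its consonant in legacy fonts)
}

PRE_BASE_MATRA = "\u093f"


def is_consonant(ch):
    """True for Devanagari consonants KA..HA (U+0915..U+0939)."""
    return len(ch) == 1 and "\u0915" <= ch <= "\u0939"


def legacy_to_unicode(text):
    """Map legacy glyph codes to Unicode and move the short-i matra
    after the consonant that follows it, as Unicode order requires."""
    out = []
    pending = ""  # a short-i matra waiting for its consonant
    for ch in text:
        mapped = GLYPH_TO_UNICODE.get(ch, ch)
        if mapped == PRE_BASE_MATRA:
            out.append(pending)  # flush an earlier unattached matra, if any
            pending = mapped
        elif pending and is_consonant(mapped):
            out.append(mapped + pending)  # consonant first, then the matra
            pending = ""
        else:
            out.append(pending + mapped)  # nothing to reorder
            pending = ""
    out.append(pending)
    return "".join(out)


def rows_to_csv(rows):
    """Serialize rows of already-converted strings as CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    converted = [legacy_to_unicode(s) for s in ("fd", "dke")]
    print(rows_to_csv([converted]))
```

When writing the result to a file for Excel, opening it with `encoding="utf-8-sig"` and `newline=""` usually helps Excel recognize the Devanagari text.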
