How can I extract data from a PDF written in Hindi that uses WinAnsiEncoding?
I have a PDF that contains data in Hindi. It has different blocks that store the same type of information. I have to extract the data from the PDF and store it in CSV/Excel format so that it can be used for further processing.
I have tried OCR and different tools and Python libraries (such as Tesseract and pdfminer), but I have not been able to get satisfactory results (somewhere or other there is a problem with the Hindi 'matra').
Please help me with this. I have been stuck for 3-4 days.
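For the last step described above (storing the extracted blocks as CSV), a minimal sketch might look like the following. It assumes the text of each block has already been recovered as Unicode Devanagari and parsed into a dict; it does not solve the matra problem itself. The function names `blocks_to_csv` and `save_csv` and the field names are hypothetical, chosen only for illustration.

```python
import csv
import io
import unicodedata


def blocks_to_csv(blocks, fieldnames, out):
    """Write a list of dicts (one per block) as CSV rows to a text stream.

    Assumption: each block is a dict whose values are already Unicode strings.
    """
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for block in blocks:
        # Normalize to NFC so equivalent character sequences are stored consistently.
        row = {
            key: unicodedata.normalize("NFC", block.get(key, ""))
            for key in fieldnames
        }
        writer.writerow(row)


def save_csv(blocks, fieldnames, path):
    """Save blocks to a file that Excel should open with Hindi text intact."""
    # "utf-8-sig" writes a byte order mark, which helps Excel detect UTF-8.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        blocks_to_csv(blocks, fieldnames, f)
```

Calling `save_csv(blocks, ["name", "city"], "output.csv")` with a list of block dicts would then produce a file that can be opened in Excel or loaded for further processing.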
hi,
Are you trying to extract data from a form-style PDF, or to convert a normal PDF to Excel?
thanks,
abhishek