Is it possible to extract non-English (Tamil) text from a PDF?
Hi,
Is there any library that I can use to extract non-English (Tamil) text from a PDF file?
Any direction is greatly appreciated.
Thanks,
Divakar. V
If the text is part of an image (a scanned page, for example), you'll need OCR software that supports Tamil. No Adobe software does, though.
If the text is not part of an image, you can simply select all of it, copy it, and paste it into another application.
It should come out correctly, provided the font encoding used in the PDF is correct.
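As a rough way to check whether the pasted or extracted text actually came out as Tamil, here is a minimal Python sketch. It only assumes that correctly encoded Tamil text falls in the Unicode Tamil block (U+0B80 to U+0BFF). The function names and the 0.5 threshold are my own illustrative choices, not part of any library.

```python
# Unicode Tamil block: U+0B80 through U+0BFF.
TAMIL_START, TAMIL_END = 0x0B80, 0x0BFF


def tamil_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters in the Tamil block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    tamil = sum(1 for c in chars if TAMIL_START <= ord(c) <= TAMIL_END)
    return tamil / len(chars)


def looks_like_tamil(text: str, threshold: float = 0.5) -> bool:
    """Heuristic (assumed threshold): True if most characters are Tamil."""
    return tamil_ratio(text) >= threshold


if __name__ == "__main__":
    # The word "Tamil" written in Tamil script, given as escapes.
    sample = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"
    print(looks_like_tamil(sample))   # correctly encoded text
    print(looks_like_tamil("jkpo;"))  # Latin characters, e.g. from a bad font encoding
```

If the result is mostly Latin letters or symbols rather than Tamil code points, the PDF's font encoding is probably not mapping to Unicode, and OCR may be the more reliable route.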