Extracting content from PDF file - Issue with symbols in output

New Here, Mar 13, 2018

Hi,

I used the library provided by Datalogics (Acrobat's) in my Java program to extract the content from a PDF file, but after extraction the content contains symbols and unreadable fonts. The same program works fine when I extract the content of another PDF file.

Can you please let me know how to read the exact content from the PDF file, avoiding the symbols and unreadable fonts?

Thanks,

Mahesh

TOPICS
Acrobat SDK and JavaScript


Correct Answer

Most Valuable Participant, Mar 18, 2018

I replied to your duplicate post.

Duplicate post: Acrobat XI Pro: -> Need SDK/API for programmatic process.

Response edited by Forum Moderator.

Adobe Employee, Mar 13, 2018

Since you licensed the library from Datalogics, you should contact them for support – it’s included in your contract.

New Here, Mar 18, 2018

Hi Team,

Thank you for your reply.

Could you please go through the points below and help me understand how I can proceed?

1) We have Adobe Acrobat Pro, which we use to convert PDF files of any kind (encrypted, protected) into plain, readable PDF files.

2) We then use these readable PDF files as input to our Java programs and extract the content from them.

3) When I use our Java program to extract the content from these kinds of encrypted PDF files, the output text consists of symbols and unrecognized fonts. So I hope that if I use your "SDK\Library\jar" while extracting the content from the PDF files, I'll get the content in a readable text format. Even when I copy the content from the PDF and paste it into a text file, the same thing happens: the pasted content has symbols and unrecognized fonts.

4) We'd be happy if you could let us know which SDK\Library\jar Adobe Acrobat Pro uses to convert these encrypted PDF files into plain PDF files, so that we can use the same library in our Java programs to convert the encrypted PDF files into plain PDF files and then extract the content in text format.

5) I can see an SDK for JavaScript here: "https://www.adobe.com/devnet/acrobat/sdk/eula.html", but could not find one for Java. Also, I’m not sure whether it would meet my requirement.
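
A note on point 3: when the same garbage appears after copying text out of a PDF viewer, the cause is often that the file's fonts carry no usable mapping back to Unicode, in which case switching extraction libraries may not change the result. The sketch below is one possible way to flag such files in a Java pipeline before processing them further. It does not use the Datalogics or Acrobat APIs; the class name `ExtractionCheck`, its methods, and the threshold are hypothetical choices made for illustration. It only counts code points that rarely occur in correctly extracted text (private-use, unassigned, control, lone surrogates, U+FFFD), so text that was mapped to the wrong but otherwise valid characters will pass unnoticed.

```java
/**
 * Heuristic check for text that was extracted from a PDF but looks garbled.
 * Hypothetical helper for illustration; not part of any PDF library.
 */
final class ExtractionCheck {

    private ExtractionCheck() {}

    /** True for code points that rarely appear in correctly extracted text. */
    static boolean isSuspicious(int codePoint) {
        if (codePoint == 0xFFFD) {
            return true; // U+FFFD REPLACEMENT CHARACTER
        }
        switch (Character.getType(codePoint)) {
            case Character.PRIVATE_USE:
            case Character.UNASSIGNED:
            case Character.SURROGATE: // unpaired surrogate in the string
                return true;
            case Character.CONTROL:
                // Line breaks and tabs are normal in extracted text.
                return codePoint != '\n' && codePoint != '\r' && codePoint != '\t';
            default:
                return false;
        }
    }

    /** Fraction of non-whitespace code points that are suspicious (0.0 for empty text). */
    static double suspiciousRatio(String text) {
        int total = 0;
        int suspicious = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                continue; // whitespace says nothing about extraction quality
            }
            total++;
            if (isSuspicious(cp)) {
                suspicious++;
            }
        }
        return total == 0 ? 0.0 : (double) suspicious / total;
    }

    /** True when the suspicious fraction exceeds the caller-chosen threshold. */
    static boolean looksGarbled(String text, double threshold) {
        return suspiciousRatio(text) > threshold;
    }

    public static void main(String[] args) {
        // Assumed input: the string returned by whatever extraction call you already use.
        String extracted = args.length > 0 ? args[0] : "Invoice total: 120.00";
        System.out.println("suspicious ratio = " + suspiciousRatio(extracted));
        System.out.println("looks garbled    = " + looksGarbled(extracted, 0.10));
    }
}
```

Files flagged this way are candidates for a different route, such as OCR on the rendered pages, rather than another text-extraction attempt. The 0.10 threshold is an arbitrary starting point and would need tuning against real documents.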

I'd appreciate your assistance as soon as possible, as this is very urgent. Thanks in advance.

Thanks,

Mahesh
