tarekahf
Inspiring
July 19, 2019
Answered

Add reference to Adobe Acrobat Type Library in Eclipse Java Project

I know how to develop Windows applications in Visual Studio that control Adobe Acrobat and PDF documents using OLE Automation.

I am referring to page 21 in this guide:

https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/iac_developer_guide.pdf

I have done that many times in the past and the result was 100% successful.

I need to do the same but using Java and Eclipse.

My ultimate objective is to extract text from a flattened PDF that is an appraisal form. The form's fields and values are part of the flattened page content, and the form follows a strict, fixed layout.

So, I want to write a Windows desktop application that opens a flattened PDF, finds a field caption, jumps to the field value, and extracts the text. From my research so far, it seems I have to call the Doc method "getPageNthWord()" through the OLE JSObject from Java.

I was able to use this code sample in the console window to extract the text of the current page:

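// Collect every word on the current page into one string.
var len = this.getPageNumWords(this.pageNum);
var txt = "";
for (var i = 0; i < len; i++) {
    var w = this.getPageNthWord(this.pageNum, i);
    txt += w + " ";
}
txt; // the console prints the value of the last expression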
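
The "find the caption, then jump to the value" step described above could be sketched on top of the same two Doc methods. This is only a sketch under assumptions: `normalize` and `findValueAfterCaption` are hypothetical helper names, not part of the Acrobat API, and the code assumes the value immediately follows the caption in word order and has a known number of words, which a fixed-layout form may or may not guarantee.

```javascript
// Hypothetical helper: lowercase a word and drop punctuation so that
// "Address:" and "address" compare equal.
function normalize(word) {
  return String(word).toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Hypothetical helper: return the `valueWordCount` words that follow
// `caption` on page `pageNum`, or null if the caption is not found.
// `doc` is any object exposing getPageNumWords/getPageNthWord,
// for example `this` in the Acrobat console.
function findValueAfterCaption(doc, pageNum, caption, valueWordCount) {
  var target = caption.split(/\s+/).map(normalize).filter(Boolean);
  if (target.length === 0) {
    return null;
  }

  // Read all words of the page once, in the order Acrobat reports them.
  var total = doc.getPageNumWords(pageNum);
  var words = [];
  for (var i = 0; i < total; i++) {
    words.push(doc.getPageNthWord(pageNum, i));
  }

  // Slide over the word list looking for the caption's word sequence.
  for (var start = 0; start + target.length <= words.length; start++) {
    var matched = true;
    for (var j = 0; j < target.length; j++) {
      if (normalize(words[start + j]) !== target[j]) {
        matched = false;
        break;
      }
    }
    if (matched) {
      var from = start + target.length;
      return words.slice(from, from + valueWordCount).join(" ");
    }
  }
  return null;
}
```

In the console this might be called as `findValueAfterCaption(this, this.pageNum, "Borrower", 2)`, where the caption text and the word count are placeholders for whatever the actual form uses.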

Questions:

- How can I add a reference to the Acrobat type library in a Java project in Eclipse?

- Is there any method other than "getPageNthWord()" that I can use to extract the text from a PDF? I was expecting to find a method that extracts a paragraph or the complete text of a given page.

Any help would be greatly appreciated.

Tarek

3 replies

Test Screen Name
Correct answer
Legend
July 19, 2019

Suggestion: forget Java. Use VB.

Legend
July 19, 2019

All text extraction in Adobe interfaces starts with words. Paragraphs only exist in our perception, so you need to use guesswork and fuzzy logic.

If you want to use JSObject, I recommend you use VB. Converting this to another platform will take a lot of your time.
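
One form the "guesswork" mentioned above can take is grouping words into lines by their position on the page. The sketch below uses the Doc method `getPageNthWordQuads()` together with `getPageNthWord()`. It rests on assumptions: `pageWordsToLines` is a hypothetical helper name, the tolerance value is something you would tune per form, each word is treated as having a single quad of 8 numbers (the code accepts either a flat array or an array of such arrays), and page rotation is ignored.

```javascript
// Hypothetical helper: return the page text as an array of lines, grouping
// words whose vertical centers lie within `tolerance` points of each other.
function pageWordsToLines(doc, pageNum, tolerance) {
  var total = doc.getPageNumWords(pageNum);
  var lines = []; // each entry: { y: number, words: [{ x, text }] }

  for (var i = 0; i < total; i++) {
    var quads = doc.getPageNthWordQuads(pageNum, i);
    // Assumption: accept either [x1, y1, ..., x4, y4] or [[x1, y1, ..., x4, y4], ...].
    var q = (typeof quads[0] === "number") ? quads : quads[0];

    // Left edge and vertical center, independent of the corner order.
    var x = Math.min(q[0], q[2], q[4], q[6]);
    var y = (q[1] + q[3] + q[5] + q[7]) / 4;
    var text = doc.getPageNthWord(pageNum, i);

    // Put the word on the first existing line that is close enough vertically.
    var line = null;
    for (var k = 0; k < lines.length; k++) {
      if (Math.abs(lines[k].y - y) <= tolerance) {
        line = lines[k];
        break;
      }
    }
    if (!line) {
      line = { y: y, words: [] };
      lines.push(line);
    }
    line.words.push({ x: x, text: text });
  }

  // In PDF user space y grows upward, so the top line has the largest y.
  lines.sort(function (a, b) { return b.y - a.y; });

  return lines.map(function (ln) {
    ln.words.sort(function (a, b) { return a.x - b.x; });
    return ln.words.map(function (w) { return w.text; }).join(" ");
  });
}
```

Grouping lines into paragraphs would need a further heuristic, such as the vertical gap between consecutive lines, and that is where the fuzzy logic comes in.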

tarekahf
Author
Inspiring
July 19, 2019

What is your solution or recommendation?

Please provide details.

Tarek

Bernd Alheit
Community Expert
July 19, 2019

The Adobe PDF Library has a Java interface:

https://dev.datalogics.com/adobe-pdf-library/

Karl Heinz Kremer
Community Expert
July 19, 2019

From Bernd’s reply it may not be clear, but the Adobe PDF Library is a separate product with a separate price tag. You can license it via Datalogics:

https://www.datalogics.com/products/pdf/pdflibrary/