Add reference to Adobe Acrobat Type Library in Eclipse Java Project
I know how to develop a Windows application in Visual Studio to control Adobe Acrobat and PDF documents using OLE Automation.
I am referring to page 21 in this guide:
https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/iac_developer_guide.pdf
I have done that many times in the past, and it has always worked.
I need to do the same but using Java and Eclipse.
My ultimate objective is to extract text from a flattened PDF of an appraisal form. The form's fields and values are flattened into the page content, and the form follows a strict, fixed layout.
So, I want to write a Windows desktop application that opens a flattened PDF, finds a field caption, jumps to the field value, and extracts its text. From my research so far, it seems I have to call the Doc method "getPageNthWord()" through the OLE JSObject from Java.
I was able to use this code sample in the console window to extract the text of the current page:
var len = this.getPageNumWords(this.pageNum);
var txt = "";
for (var i = 0; i < len; i++) {
    var w = this.getPageNthWord(this.pageNum, i);
    txt += w + " ";
}
txt;
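For illustration, the caption-then-value lookup described above could be sketched on top of the same two Doc methods. This is only a sketch under assumptions: the helper names pageWords and valueAfterCaption, the example caption, and the fixed value length are hypothetical, and it assumes the value's words immediately follow the caption in Acrobat's word order, which may not hold for every layout.

```javascript
// Sketch only. `doc` is any object exposing the Acrobat Doc methods
// getPageNumWords(nPage) and getPageNthWord(nPage, nWord).
// In the Acrobat JavaScript console, pass `this` as `doc`.

// Hypothetical helper: collect all words of one page into an array.
function pageWords(doc, pageNum) {
  var words = [];
  var len = doc.getPageNumWords(pageNum);
  for (var i = 0; i < len; i++) {
    words.push(doc.getPageNthWord(pageNum, i));
  }
  return words;
}

// Hypothetical helper: find the caption (an array of words) on the page and
// return the next `valueLen` words joined by spaces, or null if not found.
function valueAfterCaption(doc, pageNum, captionWords, valueLen) {
  var words = pageWords(doc, pageNum);
  for (var i = 0; i + captionWords.length <= words.length; i++) {
    var match = true;
    for (var j = 0; j < captionWords.length; j++) {
      if (words[i + j] !== captionWords[j]) {
        match = false;
        break;
      }
    }
    if (match) {
      var start = i + captionWords.length;
      return words.slice(start, start + valueLen).join(" ");
    }
  }
  return null;
}
```

In the console this might be called as valueAfterCaption(this, this.pageNum, ["Borrower", "Name"], 2), where the caption words are made up for the example. Since getPageNthWord() strips punctuation by default, a caption printed as "Name:" would be matched as "Name".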
Questions:
- How can I add a reference to the Acrobat Type Library in a Java project in Eclipse?
- Is there any method other than "getPageNthWord()" that I can use to extract text from the PDF? I was expecting to find a method that returns a paragraph or the complete text of a given page.
Any help would be greatly appreciated.
Tarek
