
Powershell and Adobe Acrobat

New Here, Apr 22, 2016

Dear all,

I have a lot of PDF files from which I want to extract information. In particular, I want to extract the IDs that are printed in bold. The PDF documents contain a list of events, and the events that interest me are printed in bold. Doing this by hand is no problem, but since I have several documents (a few hundred of them), I would rather not do it by hand.

I own Adobe Acrobat 9 Pro and have heard about the SDK. I would like to automate this task, possibly using PowerShell.

I have heard that this can also be achieved with other programming languages.

Where can I find more information on this?

Thank you very much in advance

Erik

TOPICS
Acrobat SDK and JavaScript, Windows
LEGEND, Apr 22, 2016

All Acrobat developers need the SDK. There is no command-line interface, but there is a powerful combination of OLE and JavaScript. If you know the text is always in the same place, you could consider extracting each word on every page and checking its location. These interfaces would not give you font names, though. (Bold is NOT a style in PDF.) A C++ plug-in could maybe do this, but developing one is unlikely to save you time compared with handling a few hundred documents by hand.

But get the SDK, it's free and essential!
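
The "check the location" idea can be sketched in a few lines. This is only an illustration in Python, not Acrobat code: it assumes the words and their page coordinates have already been read out through one of the interfaces mentioned above. The `Word` tuple, the sample IDs, and the column boundaries are all hypothetical. Note that it filters by position only; it cannot tell whether a word is bold.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """One word on a page (hypothetical structure, not an Acrobat type)."""
    text: str
    x: float  # left edge of the word, in points
    y: float  # bottom edge of the word, in points


def words_in_region(words, x_min, y_min, x_max, y_max):
    """Return the text of every word whose anchor point lies inside the box."""
    return [
        w.text
        for w in words
        if x_min <= w.x <= x_max and y_min <= w.y <= y_max
    ]


# Hypothetical sample data standing in for the words read from one page.
page_words = [
    Word("Event", 72.0, 700.0),
    Word("A-1001", 150.0, 700.0),
    Word("Event", 72.0, 680.0),
    Word("B-2002", 150.0, 680.0),
]

# Assumption: the IDs always sit in a column between x=140 and x=220.
ids = words_in_region(page_words, 140.0, 0.0, 220.0, 792.0)
print(ids)
```

This only helps if the layout is stable across documents; if the interesting events can appear anywhere in the list, position alone will not single them out.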

LEGEND, Apr 22, 2016

By the way, I don't think the Acrobat 9 SDK is available any more. Adobe only supports development on supported versions of Acrobat.

New Here, Apr 22, 2016

It would be enough if the files could be saved automatically as HTML. The HTML output is nicely formatted, exactly the way I need it, and the bold information is preserved.

All I would need is a script that opens a PDF document and saves it as HTML, and does this for every file in the directory.

I get several hundred of these PDF files per month. Until now we have been doing this work by hand, and I want to automate the stupid part of my job.

Thank you very much

E.

LEGEND, Apr 22, 2016

You can do this with an action, I think, with no need for programming or scripting. I may be wrong; actions might not exist in 9. I think in 9 it was Advanced > Batch Processing.

New Here, Apr 22, 2016

Thank you very much,

this worked fine for me; I tried it out just now.
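
Once the PDFs have been batch-saved as HTML, the bold entries can be pulled out with a short script. The following is a minimal sketch in Python using only the standard library. It assumes that the exported HTML marks bold text with `<b>` or `<strong>` tags; the thread does not say how Acrobat encodes bold, and if the export uses CSS (for example a `font-weight` style) the tag check would need to be adapted. The sample HTML and IDs are made up for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path


class BoldTextExtractor(HTMLParser):
    """Collect the text found inside <b> and <strong> elements."""

    # Assumption: bold is marked with these tags in the exported HTML.
    BOLD_TAGS = {"b", "strong"}

    def __init__(self):
        super().__init__()
        self._depth = 0      # number of bold elements currently open
        self._buffer = []    # text pieces of the bold run being read
        self.bold_runs = []  # finished bold runs, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.BOLD_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.BOLD_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join the pieces and collapse runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.bold_runs.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract_bold(html_text):
    """Return the list of bold text runs found in an HTML string."""
    parser = BoldTextExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.bold_runs


def extract_bold_from_directory(directory):
    """Map each .html file name in the directory to its bold text runs."""
    results = {}
    for path in sorted(Path(directory).glob("*.html")):
        html_text = path.read_text(encoding="utf-8", errors="replace")
        results[path.name] = extract_bold(html_text)
    return results


# Made-up sample standing in for one exported document.
sample = (
    "<p>Event <b>A-1001</b> started</p>"
    "<p>Event B-2002</p>"
    "<p><strong>C-3003</strong></p>"
)
print(extract_bold(sample))
```

Calling `extract_bold_from_directory` on the folder that holds the exported files would give one list of bold entries per document, which could then be written to a CSV file or filtered further for the IDs.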
