
Help extracting info from PDFs to create automatic file-naming script

New Here, Jan 09, 2023


Hi All - Very new to the community and trying to solve a problem for one of our employees.

We have hundreds of PDF documents that were given to us as printouts. One of our employees needs to open each file, determine some info from the data inside (such as the date, the customer, etc.), and then rename the PDF using that data.

 

Is there a way to automate this using JavaScript or something else? I'd like to provide the basic details to our Development team so they can quickly write us a script to batch these.


Thank you!

Ryan

Topics: Acrobat SDK and JavaScript, Windows

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
New Here, Jan 09, 2023


I should add we are scanning these documents into PDFs...

Community Expert, Jan 09, 2023


Yes, it's possible! Use the doc.getPageNthWord() function to extract page text. Of course, the scanned pages will need to be high quality and OCR'd.

Here's the reference entry:

 

https://opensource.adobe.com/dc-acrobat-sdk-docs/library/jsapiref/doc.html#getpagenthword
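
For the development team, here is a minimal sketch of walking the words on one page. It assumes the documented doc.getPageNumWords(nPage) and doc.getPageNthWord(nPage, nWord, bStrip) calls. The helper name collectPageWords and the fakeDoc stand-in are hypothetical and exist only so the loop can be tried outside Acrobat.

```javascript
// Collect every word on one page into an array.
// `doc` is any object exposing getPageNumWords(nPage) and
// getPageNthWord(nPage, nWord, bStrip), as the Acrobat Doc object does.
function collectPageWords(doc, pageNum, strip) {
    var words = [];
    var count = doc.getPageNumWords(pageNum);
    for (var i = 0; i < count; i++) {
        // strip = true asks Acrobat to drop punctuation and white space
        words.push(doc.getPageNthWord(pageNum, i, strip));
    }
    return words;
}

// Hypothetical stand-in for an OCR'd document (not real data).
var fakeDoc = {
    pages: [["Invoice", "Date", "Customer", "Acme"]],
    getPageNumWords: function (nPage) { return this.pages[nPage].length; },
    getPageNthWord: function (nPage, nWord, bStrip) { return this.pages[nPage][nWord]; }
};

var pageWords = collectPageWords(fakeDoc, 0, true);
```

Inside an Acrobat Action script, the call would likely be collectPageWords(this, 0, true), since `this` refers to the document being processed there.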

 

 

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often

New Here, Jan 09, 2023


Thank you Thom! I was pretty certain this was possible.  Much appreciated 🙂

Community Expert, Jan 09, 2023


Be aware that this function is particularly slow. It may take up to 10 seconds to walk through all the words on a page, depending on the number of words and the speed of the computer.

I use the function a lot. If I need to find multi-word phrases or patterns, I'll collect all the words into a single string before searching.
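
A rough sketch of that "single string" approach is below. It assumes the words were fetched with bStrip set to false, so each one keeps its trailing punctuation and spacing and can be joined directly. The "Date:" and "Customer:" labels, the MM/DD/YYYY pattern, and the helper names are all hypothetical and would need adjusting to the actual scanned layout.

```javascript
// Join unstripped words into one string and collapse runs of white space.
function wordsToText(words) {
    return words.join("").replace(/\s+/g, " ").replace(/^ | $/g, "");
}

// Pull a date and a one-word customer name out of the page text.
// Both patterns are assumptions about the layout, not part of any Acrobat API.
function extractFields(text) {
    var date = /\d{1,2}\/\d{1,2}\/\d{4}/.exec(text);
    var customer = /Customer:\s*(\w+)/.exec(text);
    return {
        date: date ? date[0] : null,
        customer: customer ? customer[1] : null
    };
}

// Hypothetical words as they might come back with bStrip = false.
var sampleWords = ["Invoice ", "Date: ", "01/09/2023 ", "Customer: ", "Acme ", "Total: ", "100"];
var sampleText = wordsToText(sampleWords);
var sampleFields = extractFields(sampleText);
```

Searching the joined string once means a regular expression can span several words, which a word-by-word comparison cannot do easily.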

 

An alternative method that works for Actions (i.e., batch processing) is to use the redaction "search and mark" feature as the first command in the batch sequence. This method is perhaps 100x to 1000x faster than the JavaScript doc.getPageNthWord() function. It marks the found words with redaction annots. You can then use a script as the second command in the batch sequence to read the selected words from the Annot.contents property of the redactions, and then delete the redactions. There is a commenting preference that needs to be set so that the text is picked up by the redact annot.
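
The second command in that sequence could look roughly like the sketch below. It assumes doc.getAnnots() returns null when a document has no annotations, that redaction marks report their type as "Redact", and that annot.destroy() removes a mark. The function name and the fakeAnnotDoc stand-in are hypothetical.

```javascript
// Read the text captured by each redaction mark, then remove the mark.
function harvestRedactionText(doc) {
    var found = [];
    var annots = doc.getAnnots() || [];   // treat "no annotations" as an empty list
    for (var i = 0; i < annots.length; i++) {
        if (annots[i].type === "Redact") {
            found.push(annots[i].contents);
            annots[i].destroy();          // delete the redaction mark itself
        }
    }
    return found;
}

// Hypothetical stand-in so the loop can be tried outside Acrobat.
function makeFakeAnnot(type, contents) {
    return {
        type: type,
        contents: contents,
        destroyed: false,
        destroy: function () { this.destroyed = true; }
    };
}
var fakeAnnotDoc = {
    annots: [makeFakeAnnot("Redact", "01/09/2023"), makeFakeAnnot("Text", "reviewer note")],
    getAnnots: function () { return this.annots; }
};

var harvested = harvestRedactionText(fakeAnnotDoc);
```

Other comment types are left untouched, so existing notes on the scans survive the batch run.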

 

 

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often

Community Expert, Jan 09, 2023


As mentioned, this can be done using a custom-made script, but it's not a trivial task. Also, a script can't actually rename a file. It can only save a copy under a new name, and the original file will not be removed.
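
To illustrate the save-a-copy step, here is a minimal sketch assuming doc.saveAs() is called with a cPath argument from a privileged context such as an Action or the console. The helper names, the "/c/renamed/" folder, and the naming scheme are hypothetical.

```javascript
// Build a Windows-safe file name from extracted values.
function buildFileName(date, customer) {
    var raw = date + "_" + customer;
    // Replace characters Windows does not allow in file names, then spaces.
    var safe = raw.replace(/[\\\/:*?"<>|]/g, "-").replace(/\s+/g, "_");
    return safe + ".pdf";
}

// Save a copy under the new name; the original file stays where it is.
function saveCopyAs(doc, folderPath, fileName) {
    // folderPath is a device-independent path ending in "/", e.g. "/c/renamed/"
    doc.saveAs({ cPath: folderPath + fileName });
}

// Hypothetical stand-in that just records the requested path.
var fakeSaveDoc = {
    savedTo: null,
    saveAs: function (args) { this.savedTo = args.cPath; }
};

var newName = buildFileName("01/09/2023", "Acme Corp");
saveCopyAs(fakeSaveDoc, "/c/renamed/", newName);
```

Because the originals remain in place, removing them afterwards would have to happen outside the script, for example by hand once the copies have been checked.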


If you're interested in hiring a professional to create it for you, feel free to contact me privately by clicking my username and then "Send a Message".

Community Expert, Jan 19, 2023


You could try the AutoSplit plugin (30-day trial): https://evermap.com/AutoSplit.asp
