Participating Frequently
April 3, 2020
Question

Find Specific Text String and Automatically Create Bookmarks Based on That String

Is it possible in Acrobat to automatically create bookmarks whenever a particular string (e.g., "Order #") is encountered? I am trying to create individual work order files from one large PDF file (using Split Document into multiple files by bookmarks), so I need a bookmark each time the string "Order #" appears. Because the pages vary with the work order specs (drawings, materials needed, instructions, etc.), this string is not located in a predictable spot on each page. Once "Order #" is found, I need to insert a bookmark that includes "Order #" and the 9 characters that follow it. I know how to do this manually, but there could be 100 or more orders in one file. Any help is greatly appreciated. Thanks!

August 5, 2020

I have finally created a custom redaction pattern that I can use to find and mark for redaction the work order numbers that I need as the new file names. My question now is: how do I extract each group of pages with the same work order into its own file, named after that work order? The Find, Highlight and Extract action puts them all into one file, and I need a new file per work order number. Thanks for any help you can give me.

Thom Parker
Community Expert
August 5, 2020

The next step after running the redaction search is to loop through all the redact annots. Use the annot rectangle to find the text at that location, i.e. the order number. Then find all the pages associated with this number and extract them to a separate file.  Repeat until you've run through the annots.
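
The first two steps above (loop through the redact annots, read the text under each rectangle) could be sketched roughly as follows. This is only a sketch under assumptions: the helper names wordsInRect and orderNumbersFromRedactions are made up for illustration, a word counts as "inside" the annot when the centre of its first quad falls within the annot's rect, and the pages are assumed to be unrotated so that word quads and annot rects share the same coordinates. Test it in the console on your own file before relying on it.

```javascript
// Return the words on page p whose centre lies inside rect.
// rect is [left, bottom, right, top], like an annotation's rect property.
function wordsInRect(doc, p, rect) {
  var found = [];
  var n = doc.getPageNumWords(p);
  for (var i = 0; i < n; i++) {
    var quads = doc.getPageNthWordQuads(p, i);
    if (!quads || quads.length === 0) continue;
    var q = quads[0]; // eight numbers: four (x, y) corner points
    var cx = (q[0] + q[2] + q[4] + q[6]) / 4;
    var cy = (q[1] + q[3] + q[5] + q[7]) / 4;
    if (cx >= rect[0] && cx <= rect[2] && cy >= rect[1] && cy <= rect[3]) {
      found.push(doc.getPageNthWord(p, i, true));
    }
  }
  return found;
}

// Hypothetical helper: map every redaction annot to the text beneath it.
function orderNumbersFromRedactions(doc) {
  var annots = doc.getAnnots() || []; // getAnnots() returns null when there are none
  var result = [];
  for (var i = 0; i < annots.length; i++) {
    var a = annots[i];
    if (a.type !== "Redact") continue; // skip highlights, notes, etc.
    result.push({ page: a.page, text: wordsInRect(doc, a.page, a.rect).join(" ") });
  }
  return result;
}
```

In Acrobat you would call orderNumbersFromRedactions(this) from the console or an Action; the result is a list of page/text pairs that the later steps (grouping pages and extracting them) can work from.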

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
Participating Frequently
May 5, 2020

Is it not possible just to do a nested calculation with getPageNumWords and getPageNthWord to get the last word on the page, for example getPageNthWord(p, getPageNumWords(p) - 1)? Then, if the result does not resemble 2#######, the value from the previous page is used.

Thom Parker
Community Expert
May 6, 2020

You can do what you want with a script. No problem.

But a calculation script is not the correct location for this type of code. It needs to be either a batch script or a folder-level script.

Like I said earlier, you need to do a bit of code testing in the console window.

Run this code in the console while a page with the order number is displayed:

this.getPageNthWord(this.pageNum, this.getPageNumWords(this.pageNum) - 1)

What is the exact text that is returned?

Once you have verified this, you can create a regular expression to identify the order number.

Then we can help you design a complete script to perform this task.
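
As an illustration of what such a regular expression might look like: going by the "2#######" description earlier in the thread, the order number is assumed here to be exactly eight digits starting with 2. That shape is a guess until the console test confirms it, and the name isOrderNumber is just a placeholder.

```javascript
// Assumed shape: a 2 followed by seven more digits, and nothing else.
var ORDER_NUMBER = /^2\d{7}$/;

// True only when the whole word looks like an order number.
function isOrderNumber(word) {
  return typeof word === "string" && ORDER_NUMBER.test(word);
}
```

The ^ and $ anchors matter: without them, a longer number such as a phone number containing a matching run of digits would also pass.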

 

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
Participating Frequently
May 7, 2020

If I use this.getPageNthWord(this.pageNum, this.getPageNumWords(this.pageNum) - 1), I get exactly what I need, the work order number, EXCEPT on a few pages that do not have the WO number anywhere on the page (drawings, maps, etc.). Where a page does not have a WO number, I would like to use the WO number from the previous page, since these unmarked pages come after the main WO pages. I'm thinking an IF...ELSE statement could handle this, but I'm not sure what the exact code needs to be. Thank you for taking the time to help me.
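
For what it's worth, the "use the previous page's number" logic described here could be sketched along these lines. It assumes the WO number is always the last word on a marked page and matches the 2####### shape mentioned earlier; orderPerPage is a made-up helper name, not something built into Acrobat.

```javascript
var ORDER_RE = /^2\d{7}$/; // assumed shape of a WO number

// Build an array with one WO number per page. Pages without a number
// inherit the most recent one; pages before the first number get null.
function orderPerPage(doc) {
  var orders = [];
  var current = null;
  for (var p = 0; p < doc.numPages; p++) {
    var n = doc.getPageNumWords(p);
    var last = n > 0 ? doc.getPageNthWord(p, n - 1) : "";
    if (ORDER_RE.test(last)) {
      current = last; // this page carries its own WO number
    }
    // otherwise current keeps the value from the previous page
    orders.push(current);
  }
  return orders;
}
```

An explicit ELSE branch turns out not to be needed: simply not updating the variable is what carries the previous page's number forward.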

JR Boulay
Community Expert
April 4, 2020

If I understand correctly, there is no point in creating bookmarks since what you want to do is extract pages based on their content.
In that case you are in luck, because that is precisely the subject of the thread linked below, which provides several versions of scripts to do this.

 

Google translate is your friend: https://abracadabrapdf.net/forum/index.php/topic,3410.0.html

Acrobate du PDF, InDesigner et Photoshopographe
Thom Parker
Community Expert
April 4, 2020

See the search and highlight Action here. If it can be used to find your words, then it's a short trip to splitting the PDF. 

https://www.acrobatusers.com/actions-exchange/

 

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
Participating Frequently
April 13, 2020

Thank you for getting back to me so quickly. 🙂 

 

I have been able to use the Action you referred me to, but instead of searching for just "Order #", I need it to also highlight the 9 characters that follow (1 space and the 8 digits that identify each work order), so that when I split the file into multiple PDFs, each one has "Order #" plus the 8-digit work order number. How do I do this?

 

Thanks again!

Thom Parker
Community Expert
April 13, 2020

That's more complicated. To do that you need to either write a custom JavaScript search, or specify a custom redaction search pattern.

The custom redaction search pattern is easier:

https://blogs.adobe.com/acrolaw/2011/05/creating_and_using_custom_redact/
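
For comparison, the custom JavaScript search route might look something like the sketch below. It rebuilds the page text by joining the unstripped words and then looks for "Order #" followed by 8 digits. How Acrobat splits "Order # 12345678" into words can vary from file to file, so treat the joining step as an assumption to check in the console; findOrderLabel is a hypothetical name.

```javascript
// Return "Order # NNNNNNNN" for the first match on page p, or null.
function findOrderLabel(doc, p) {
  var text = "";
  var n = doc.getPageNumWords(p);
  for (var i = 0; i < n; i++) {
    // bStrip = false keeps the punctuation and spacing attached to each word
    text += doc.getPageNthWord(p, i, false);
  }
  // "Order", optional space, "#", optional space, exactly 8 digits
  var m = /Order\s*#\s*(\d{8})(?!\d)/.exec(text);
  return m ? "Order # " + m[1] : null;
}
```

The returned string can then serve as a bookmark title or a file name for that page.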

 

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
try67
Community Expert
April 3, 2020

Yes, if the text can be identified based on a specific pattern then it should be possible, but will require a custom-made script.
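
To give a rough idea of the bookmark-creating part of such a script: assuming you already have one label per page (or null for pages without a match), a minimal sketch could use bookmarkRoot.createChild as below. The labels array and the function name addOrderBookmarks are hypothetical; how the labels are found depends on the pattern in your file.

```javascript
// labels[p] is the bookmark title for page p, or null for no bookmark.
// Returns the number of bookmarks created.
function addOrderBookmarks(doc, labels) {
  var count = 0;
  for (var p = 0; p < labels.length; p++) {
    if (!labels[p]) continue;
    // The third argument is the insert position. It defaults to 0, which
    // would list the bookmarks in reverse order, so append instead.
    doc.bookmarkRoot.createChild(labels[p], "this.pageNum = " + p + ";", count);
    count++;
  }
  return count;
}
```

Each bookmark's action is a one-line script that jumps to the page where the label was found.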

By the way, a script can just split the file directly. There's no need to create bookmarks and then use the Split Document command based on that...
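
A bare-bones sketch of that direct split, assuming you already have an array with one order number per page (null where none is known) and a hypothetical output folder given as a device-independent path such as "/C/Orders/". The names groupRuns and splitByOrder are invented for this example. As far as I know, extractPages with a cPath only works from a privileged context such as the console, an Action, or a trusted folder-level function.

```javascript
// Collapse a per-page list of order numbers into contiguous page runs.
function groupRuns(orders) {
  var runs = [];
  for (var p = 0; p < orders.length; p++) {
    var last = runs[runs.length - 1];
    if (last && last.order === orders[p]) {
      last.end = p; // same order as the previous page: extend the run
    } else {
      runs.push({ order: orders[p], start: p, end: p });
    }
  }
  return runs;
}

// Write one file per run; returns the number of files written.
function splitByOrder(doc, orders, folder) {
  var runs = groupRuns(orders);
  var written = 0;
  for (var i = 0; i < runs.length; i++) {
    var r = runs[i];
    if (r.order === null) continue; // pages before the first order number
    doc.extractPages({
      nStart: r.start,
      nEnd: r.end,
      cPath: folder + "Order " + r.order + ".pdf"
    });
    written++;
  }
  return written;
}
```

Note that this assumes each order occupies one contiguous block of pages. If the same order number turns up again later in the file, the second run would be saved under the same name as the first.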

 

I've developed many similar tools for my clients and would be happy to create one for you as well (for a fee, of course).

You can contact me privately via [try6767 at gmail.com] to discuss it further.