Participant
April 9, 2024
Answered

Extract and Rename

  • April 9, 2024
  • 1 reply
  • 1755 views

I have previously posted about this issue, and I believe I am going in the right direction. Let me explain: I have thousands of scanned PDFs that I need to rename to a specific number. These PDFs have been OCR'd, and the portion I need read is print, not handwriting. I believe I can have Adobe read the pattern of numbers, rename the PDF, and then save it to a specific file based on the number I extracted. The number pattern is simple, always the same, and appears on the document only once. Is it as simple as having JavaScript "Find number pattern", "Print", "File Save", "Use Number", "Save"? Obviously there is more to it, but it should follow that pattern, correct? My last question: could I run this in batches, say when I get 100 pages at a time?

Are there any more in-depth tutorials like this one, but closer to my issue? https://acrobatusers.com/tutorials/how-save-pdf-acrobat-javascript/


Thom Parker
Community Expert
April 9, 2024

Yes, that can all be done with a script, provided of course that the scan returns usable text. Try using the text select tool to select the text of interest from the OCR'd PDF and paste it into a text editor. If the text is correct, then you have a chance, but if it is garbage, then you'll need to refine the scan settings.

 

Extracting text from a PDF is a more advanced task. Individual words are acquired with the "doc.getPageNthWord" function. In Acrobat JS, words are separated at any whitespace or punctuation boundary. So if the number pattern contains punctuation (like a phone number), then the script will need to collect all words on the page into a single string for searching.
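
As a rough, untested sketch of that approach: the code below joins the words on each page into one string, searches it with a regular expression, and saves the file under the number it finds. getPageNumWords, getPageNthWord, numPages and saveAs come from the Acrobat JavaScript API; the helper names (getPageText, findNumber, saveRenamed), the pattern, and the output folder are all made up for illustration and would need to be adapted to the real documents.

```javascript
// Sketch only: helper names, the pattern, and the folder are hypothetical.

// Join every word on one page into a single searchable string.
// `doc` is an Acrobat Doc object (`this` in the console or in an Action step).
function getPageText(doc, pageNum) {
    var words = [];
    var count = doc.getPageNumWords(pageNum);
    for (var i = 0; i < count; i++) {
        // bStrip = false asks Acrobat to leave punctuation and whitespace
        // attached to each word, so joining them approximates the page text.
        words.push(doc.getPageNthWord(pageNum, i, false));
    }
    return words.join("");
}

// Return the first match of `pattern` anywhere in the document, or null.
// Pass a non-global RegExp so exec() keeps no state between pages.
function findNumber(doc, pattern) {
    for (var p = 0; p < doc.numPages; p++) {
        var match = pattern.exec(getPageText(doc, p));
        if (match) {
            return match[0];
        }
    }
    return null;
}

// Save the document under the extracted number.
// Returns the new path, or null when no number was found.
function saveRenamed(doc, pattern, folder) {
    var number = findNumber(doc, pattern);
    if (number === null) {
        return null; // nothing matched, so the file is left alone
    }
    var path = folder + number + ".pdf";
    doc.saveAs({ cPath: path });
    return path;
}
```

A call might look like saveRenamed(this, /\d{3}-\d{4}/, "/c/renamed/"), where both the pattern and the device-independent folder path are placeholders. Note that saveAs is restricted to privileged contexts such as the console or a batch run, so for many files the likely home for this is a JavaScript step in an Action, which runs once per document.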

 

Here's a better version of the save article:

https://www.pdfscripting.com/public/How-to-Save-a-PDF-2.cfm

 

You'll find lots of info on scripting Acrobat at this site, although there isn't anything about finding and extracting page text.

 

  

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
IsaacXRYW (Author)
Participant
April 9, 2024

So I found AutoSplit Pro by Evermap, and it does exactly what I was looking for: "Renaming files from text location/text search". I'm sure it's possible to do manually, but I think this may be the best solution. Any opinions?

Thom Parker
Community Expert
Correct answer
April 9, 2024

Evermap makes a bunch of great tools. If it works for you then use it. Much better than writing the code manually. 

 
