Participant
September 3, 2021
Question

Can I auto-split, save, and rename files using OCR?


Hi, I'm looking for suggestions on how to do the following as an automated process:

 

  1. Automatically split a multi-page PDF into individual pages
  2. Use OCR to find a reference number in a specific area of each single-page document
  3. Save each individual document with a file name made of that reference number combined with today's date
  4. If possible, create a CSV file listing the reference numbers scanned and save it to the same folder
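
Steps 3 and 4 above are the simpler part to script. Here is a minimal sketch in Python, assuming the reference numbers have already been read from the pages by some OCR step (not shown). The function names, the `YYYY-MM-DD` date format, and the `references.csv` file name are illustrative choices, not part of Acrobat or any plugin:

```python
import csv
import re
from datetime import date
from pathlib import Path


def build_filename(reference, day=None):
    """Return '<reference>_<YYYY-MM-DD>.pdf' (hypothetical naming scheme)."""
    day = day or date.today()
    # Replace characters that are unsafe in file names on common systems.
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", reference.strip())
    return f"{safe}_{day.isoformat()}.pdf"


def write_reference_csv(references, folder, name="references.csv"):
    """Write one row per reference with the file name it would be saved under."""
    day = date.today()
    out = Path(folder) / name
    with out.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["reference", "file_name"])
        for ref in references:
            writer.writerow([ref, build_filename(ref, day)])
    return out
```

Calling `build_filename` once per extracted page and `write_reference_csv` once at the end of a run would cover the naming and the CSV listing; the splitting and OCR still have to come from a PDF tool.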

 

Basically, I'm looking to make a very time-consuming manual process a little easier to manage.

 

Thanks in advance for any suggestions.


4 replies

Participant
November 25, 2024
AnandSri
Legend
December 30, 2024

Thanks for sharing a workaround.

^AB

Participant
September 30, 2021

We use Adobe Acrobat DC with the EverMap AutoSplit Pro plugin. It does exactly what you need, but both Acrobat and the plugin cost money.

JayWHurlAuthor
Participant
October 1, 2021

Many thanks, this looks like the route I will take, much appreciated!

try67
Community Expert
September 3, 2021

I would not split the file as the first step. Instead, OCR the entire file, then run a script that searches each page for the reference number and, when it finds one, extracts that page as a new file. This makes it easier to generate the output CSV file and also saves you a couple of steps along the way.

The script would have to be custom-made, though, and would be quite complex.
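
As a rough illustration of the search step only, here is a minimal Python sketch. It assumes the OCR text of each page is already available as a list of strings; extracting that text from the PDF and saving the matching pages as new files would need a PDF library and is not shown. The `REF-` plus six digits pattern and the function name are hypothetical, since the real reference format is not given in this thread:

```python
import re

# Hypothetical reference format: "REF-" followed by six digits.
REFERENCE_PATTERN = re.compile(r"\bREF-\d{6}\b")


def find_references(page_texts, pattern=REFERENCE_PATTERN):
    """Map 0-based page index to the first reference found on that page.

    Pages with no match are left out, so they can be reviewed by hand.
    """
    found = {}
    for index, text in enumerate(page_texts):
        match = pattern.search(text)
        if match:
            found[index] = match.group(0)
    return found
```

Each key is a page to extract and each value is the reference to name it after, so the same dictionary is enough to build the CSV. Restricting the search to a specific area of the page, as the question asks, would need positional text data and is beyond this sketch.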

 

If you're interested in hiring someone to create it for you (for a fee, of course) feel free to contact me privately via [try6767 at gmail.com] to discuss it further.

Abambo
Community Expert
September 3, 2021

All of this is possible, but you are looking for a highly automated process that needs extensive programming. It's probably worth investigating if you have a lot (not a thousand, but thousands) of documents to work on. And it won't be a cheap solution…

ABAMBO | Hard- and Software Engineer | Photographer