OCR Scanning only the first page of multiple documents with action wizard

March 13, 2023
2 replies
2530 views

Hello,

I have been trying to solve this problem for a couple of weeks with information from the internet and have not been able to.

I have thousands of image-only PDF documents. I would like to OCR only the first page of each one and then save that single page as a separate document named "OCR_" + the original name.

I have found the following javascript code to save only the first page of a document:

this.extractPages(0, 0, this.path.replace(/\.pdf$/i, "_p1.pdf"));

but I can't manage to add the step of scanning only that page before saving it.

Could someone help me?

Thank you very much in advance.


BarlaeDC

Community Expert

Hi,

I don't think you can manage this in the order you want. What you can do, though, is extract the page that you want OCR'd and then run another action to OCR the extracted documents.

This is because the OCR action does not have any settings to control what gets OCR'd; it is all or nothing.

So the full workflow would be:

1 - Run action to extract Page1 to new location

2 - Run action to OCR the new documents.
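As a rough illustration of step 1, the JavaScript step of the first action could be sketched as below. This is only a sketch under assumptions: the helper names `firstPagePath` and `extractFirstPage` and the output folder `/C/scans/first_pages/` are made up for illustration, the folder is assumed to exist already, and paths are in Acrobat's device-independent form. The only Acrobat API call used is `extractPages`.

```javascript
// Hypothetical helper: build "<outDir>OCR_<original file name>".
// Acrobat paths are device-independent, e.g. "/C/scans/invoice.pdf".
function firstPagePath(path, outDir) {
    var slash = path.lastIndexOf("/");
    return outDir + "OCR_" + path.substring(slash + 1);
}

// Hypothetical helper: save page 1 (index 0) of `doc` as its own file.
function extractFirstPage(doc, outDir) {
    var target = firstPagePath(doc.path, outDir);
    doc.extractPages({ nStart: 0, nEnd: 0, cPath: target });
    return target;
}

// In an Action Wizard "Execute JavaScript" step, `this` is the current
// document, so the call would be (output folder is an assumption):
// extractFirstPage(this, "/C/scans/first_pages/");
```

The second action would then be pointed at the output folder, so that only the extracted single-page files are OCR'd.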

Meme Vintage28851519ihrt (Author)

Participating Frequently

Thank you very much for your answer Barlae,

I had already thought about doing it in two steps. The problem is that I may have hundreds of thousands of documents, and doing it in two steps could double the processing time. I thought there would be a way to run the OCR scan using JavaScript code.

BarlaeDC

Community Expert

Hi,

The OCR function is not really available from JavaScript at all, despite lots of requests for it.

Here is a little bit of a crazy idea.

1. Change your flow to delete all but the first page (you might want to copy the documents first).

2. OCR the now one-page document.

This action would look something like -
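The JavaScript step of such an action might be sketched as follows. This is not the poster's original action, only an illustration: the helper name `keepOnlyFirstPage` is made up, and it assumes the action runs on copies of the files, since the deleted pages are lost once the result is saved. The only Acrobat API members used are `numPages` and `deletePages`.

```javascript
// Hypothetical helper: remove every page after the first one.
// A single-page document is left untouched.
function keepOnlyFirstPage(doc) {
    if (doc.numPages > 1) {
        // Pages are zero-based, so page 2 is index 1.
        doc.deletePages({ nStart: 1, nEnd: doc.numPages - 1 });
    }
}

// In an Action Wizard "Execute JavaScript" step, `this` is the current
// document:
// keepOnlyFirstPage(this);
```

The same action would then continue with its OCR step and a save step, so each file is opened only once.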

Bernd Alheit

Community Expert

Use 2 actions

Extract the first pages
Run OCR on the extracted pages
