Participating Frequently
March 13, 2023
Question

Scanning only the first page of multiple documents with action wizard

Hello,

I have been trying to solve this problem for a couple of weeks using information found online, without success.

I have thousands of documents, and for each one I would like to scan only the first page and then save that single page as a separate document named "OCR_" + the original name.

I have found the following JavaScript code to save only the first page of a document:

this.extractPages(0, 0, this.path.replace(/\.pdf$/i, "_p1.pdf"));

but I can't work out how to add the step of scanning only that page before saving it.

Could someone help me?

Thank you very much in advance.


try67
Community Expert
March 14, 2023

You can't run the Recognize Text command from JavaScript (at least not fully automatically). If you're using Actions, you would first need to extract the first page of each file (possibly to a different folder), then run Recognize Text on those pages using a second Action, merge them back into the originals (matched by file name), and remove the old pages.
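
As a rough sketch of the first step only (extracting page 1 of each file under a new name), something like the following could go in an "Execute JavaScript" step of an Action. The helper name ocrOutputPath is hypothetical, and saving next to the original with an "OCR_" prefix is an assumption taken from the question. The script does not perform OCR, so Recognize Text still has to be run on the extracted files afterwards.

```javascript
// Hypothetical helper: build "OCR_" + original file name in the same folder.
// Acrobat reports this.path in device-independent form, e.g. "/C/docs/report.pdf".
function ocrOutputPath(devicePath) {
  var slash = devicePath.lastIndexOf("/");
  var folder = devicePath.substring(0, slash + 1); // empty if there is no folder part
  var name = devicePath.substring(slash + 1);
  return folder + "OCR_" + name;
}

// Inside an Action, `this` is the document currently being processed.
// The guard keeps the snippet harmless when it is run outside Acrobat.
if (typeof app !== "undefined" && this.extractPages) {
  // Page numbers are zero-based, so 0..0 is the first page only.
  this.extractPages({ nStart: 0, nEnd: 0, cPath: ocrOutputPath(this.path) });
}
```

Note that extractPages only accepts a cPath in a batch or console context, which an Action provides. To follow the suggestion of using a different folder, the folder part of the returned path would need to be swapped for the target folder.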

Bernd Alheit
Community Expert
March 13, 2023

What do you mean by "scan"?

Participating Frequently
March 13, 2023

I'm sorry, I didn't express myself well. The documents are image-only PDFs. By "scan" I mean applying OCR to the images to convert them into searchable text.

Bernd Alheit
Community Expert
March 13, 2023

Put only the first page in the scanner.

Participating Frequently
March 13, 2023

How can I configure this? I have thousands of documents, and I want a process set up through the Action Wizard that opens the first document (of 200 pages), scans only the first page, saves it separately, and then repeats the same procedure for the rest.
