Hello,
I have been trying to solve this problem for a couple of weeks with information from the internet and have not been able to.
I have thousands of image-only PDF documents. For each one, I would like to OCR only the first page and then save that single page as a separate document named "OCR_" + the original name.
I have found the following JavaScript code to save only the first page of a document:
this.extractPages(0, 0, this.path.replace(/\.pdf$/i, "_p1.pdf"));
but I can't manage to add the step of running OCR on only that page before saving it.
Could someone help me?
Thank you very much in advance.
Use two actions.
Hi,
I don't think you can manage this in the order you want. What you can do, though, is extract the page that you want OCR'd and then run another action to OCR the extracted documents.
The OCR action does not have any settings to control what gets OCR'd; it is all or nothing.
So the full workflow would be:
1 - Run action to extract Page1 to new location
2 - Run action to OCR the new documents.
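As a rough sketch of step 1 (not tested against your files), the extraction script could build the "OCR_" name itself. The helper names `prefixedPath` and `extractFirstPage` are made up for illustration, and the sketch assumes `doc.path` is Acrobat's device-independent path with "/" separators. It writes next to the original; change the folder part if you want a new location.

```javascript
// Build the output path: same folder, file name prefixed with e.g. "OCR_".
// Assumes "/" is the path separator, as in Acrobat's Doc.path.
function prefixedPath(path, prefix) {
  var slash = path.lastIndexOf("/");
  return path.slice(0, slash + 1) + prefix + path.slice(slash + 1);
}

// Extract only the first page (index 0) of `doc` into the prefixed file.
// `doc` is expected to be an Acrobat Doc object.
function extractFirstPage(doc) {
  var target = prefixedPath(doc.path, "OCR_");
  doc.extractPages({ nStart: 0, nEnd: 0, cPath: target });
  return target;
}
```

In an Action Wizard "Execute JavaScript" step you would call it as extractFirstPage(this); since extractPages with a cPath is only allowed in batch and console contexts. The extracted page is still not OCR'd at this point; that is what the second action is for.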
Thank you very much for your answer Barlae,
I had already thought about doing it in two steps. The problem is that I may have hundreds of thousands of documents, and doing it in two steps could double the processing time. I thought there would be a way to run the OCR scan using JavaScript code.
Hi,
The OCR function is not really available from JavaScript at all, despite lots of requests for it.
Here is a little bit of a crazy idea.
1. Change your flow to delete all but the first page (you might want to copy the documents first).
2. OCR the now one-page document.
This action would look something like -
Thank you @BarlaeDC,
I think it's a good idea. What would be the JavaScript code to remove all pages except the first one?
Hi,
You should be able to create the JavaScript required from here - https://opensource.adobe.com/dc-acrobat-sdk-docs/library/jsapiref/doc.html#deletepages
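Going by that reference, a minimal sketch could look like the following. The function name `keepFirstPageOnly` is just for illustration, and I am assuming the script runs on a copy of each file, because the deletion is destructive once the action saves the document.

```javascript
// Delete every page after the first one and report how many were removed.
// `doc` is expected to be an Acrobat Doc object; page indexes are 0-based.
function keepFirstPageOnly(doc) {
  var extraPages = doc.numPages - 1;
  if (extraPages > 0) {
    // Remove pages 1 .. last, leaving only page 0.
    doc.deletePages({ nStart: 1, nEnd: doc.numPages - 1 });
    return extraPages;
  }
  // Single-page documents are left untouched.
  return 0;
}
```

In the action's "Execute JavaScript" step you would call keepFirstPageOnly(this); and then put the OCR step after it. The guard matters because single-page files have nothing to delete, and a document cannot end up with zero pages.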