We have recently been trying to create an OCR-enabled document from a PDF file. With the help of Adobe’s PDF Services connector we got part of the way, but the real challenge came when the files were as large as 350-400 pages: Adobe’s OCR action is limited to 50 pages at a time. Our workaround is to split the file, OCR the parts, and merge them back together to get the expected outcome. We figured out how to do that, but we are still seeing problems when merging the files.
The approach:
Use the Split PDF action to split the file into parts of 49 pages each -> run the OCR action on each part (Apply to each) -> create and store the OCRed files in a temp folder -> get the temp files’ content and append it to a temp array variable
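To make the split step concrete, here is a minimal sketch of how the 49-page ranges fall out for a given page count. The function name `page_ranges` is hypothetical and this is plain Python for illustration only, not a Power Automate expression or an Adobe API call.

```python
def page_ranges(total_pages: int, chunk_size: int = 49) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (start, end) page ranges covering the document."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size must be >= 1")
    return [
        # The last range is clipped so it never runs past the final page.
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


ranges = page_ranges(400)
```

For a 400-page file this gives nine parts, the last one being shorter than the rest, so every part stays under the 50-page OCR limit.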
The problem:
We realized that there is a size limitation on array variables in Power Automate.
Are there any ideas or solutions for merging multiple documents with the PDF "merge" action without hitting the variable limitations?
Thank You!
PS: And just in case there is a possibility to open a service ticket for a paid API subscription... I would love to find it!
Are you keeping the file content in an array, or are you writing it to a file store such as SharePoint? If you are reaching array limits, writing to a file store would be the way to get around it.
Thank you for the fast reply!
We are writing the split files to a SharePoint temp folder. After that we tried to merge all the files, but there we reached the array limit. Is your recommendation to merge them one by one into the final PDF, or to merge them all at once?
Thank You!
Do you mean like this?
If so, it still gives me the same corrupted-format error, because the action needs the content of the file.
And if I add a step to get the file content, it goes into a loop, and I do not understand how the merge step would work in a loop.
Could you please send a screenshot of the steps to make it clearer?
If you have to split, you don't have to merge one file at a time; that would be inefficient.
Suppose the limit is 10 and you have N files. You can combine 10, then another 10, and so on until you have N/10 files.
Now repeat the process with the N/10 files until you have only 1 file. With a limit of 10, you could combine 1000 files in 111 steps (100 + 10 + 1).
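The batching idea above can be sketched roughly as follows. This is plain Python for illustration, not a Power Automate flow: `merge_in_rounds` and `merge_batch` are hypothetical names, and `merge_batch` stands in for whatever actually combines one batch (for example, a call to the merge action on up to `limit` stored files).

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def merge_in_rounds(
    items: Sequence[T],
    merge_batch: Callable[[List[T]], T],
    limit: int = 10,
) -> Tuple[T, int]:
    """Merge items in batches of at most `limit`, round after round,
    until one item remains. Returns (merged item, number of merge calls)."""
    if limit < 2:
        raise ValueError("limit must be at least 2")
    if not items:
        raise ValueError("nothing to merge")

    current = list(items)
    steps = 0
    while len(current) > 1:
        next_round: List[T] = []
        for i in range(0, len(current), limit):
            batch = current[i:i + limit]
            if len(batch) == 1:
                # A leftover single item is carried forward unmerged.
                next_round.append(batch[0])
            else:
                next_round.append(merge_batch(batch))
                steps += 1
        current = next_round
    return current[0], steps


# Stand-in data: strings play the role of files, joining plays the role of merging.
parts = [f"part{i};" for i in range(25)]
merged, steps = merge_in_rounds(parts, "".join, limit=10)
```

Only one batch is held at a time, and order is preserved because batches are taken left to right in every round. In a flow, each intermediate result would presumably be written back to the file store rather than kept in a variable.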
The Merge PDF action asks for file content. To get the file content I have to run a loop and store every file's content in an array variable, and when I do that the variable exceeds the limits.
As mentioned before, could you please provide a screenshot of what you mean?
AIM: we are trying to split a file, OCR it, and merge it back.