I have a two-page document that we run through a batch Adobe OCR process overnight.
All our documents go through this process as part of our filing system.
This particular file is a PDF that was created by exporting to PDF from Word; it is not a scanned document.
For some reason the last page always gets rotated.
We are running Adobe Acrobat Pro DC Version 2020.006.20042.
I have turned off Deskew in the Enhance Scan settings as recommended in this old post: https://community.adobe.com/t5/acrobat/is-there-a-way-to-turn-off-page-rotation-prior-to-ocring-a-sc...
The document has one table, 3 lines of text and a one line footer. There are no obvious "images" or text that could be considered upside down for the OCR to correct.
I have looked at some previous posts about this issue, but they have not helped, and most are at least two years old, so I am looking to see if anyone has a more recent update or fix.
We are sorry for the trouble. As described, a page is being rotated during OCR.
In general, OCR is trying to figure out what the main text orientation is in the document you are processing. This can be tricky if you have text vertically and horizontally. In such a case, the OCR process may interpret the text orientation wrong and you would end up with a rotated page.
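To illustrate how a page with mixed text directions can end up rotated, here is a minimal sketch of a majority-vote orientation guess. It is an assumption for illustration only and not Acrobat's actual algorithm; the function names, the (angle, text) input format, and the sample page are all hypothetical.

```python
from collections import Counter

def dominant_orientation(lines):
    """Return the text angle (in degrees) backed by the most characters.

    `lines` is a list of (angle_degrees, text) pairs. This input format is
    hypothetical; a real OCR engine derives orientation from image analysis.
    """
    votes = Counter()
    for angle, text in lines:
        # Each text line votes for its angle, weighted by its length.
        votes[angle % 360] += len(text.strip())
    if not votes:
        return 0  # No text found: leave the page as it is.
    return votes.most_common(1)[0][0]

def rotation_to_apply(lines):
    """Rotation (in degrees) that would bring the dominant text upright."""
    return (360 - dominant_orientation(lines)) % 360

# Hypothetical page: two short horizontal labels and two long vertical
# table headers. The vertical text outweighs the horizontal text.
page = [
    (0, "Scheme name"),
    (0, "Total"),
    (90, "Employer contributions this period"),
    (90, "Member contributions this period"),
]

print(dominant_orientation(page))  # 90
print(rotation_to_apply(page))     # 270
```

In this sketch the page is upright, but the long vertical headers win the vote, so the whole page would be turned. A table with rotated headers on an otherwise short page is one way such a misjudgment could arise.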
Does this happen with one particular PDF file or with all PDFs? Please try a different PDF file and check. If it's a file-specific issue, please share the file with us for testing. Upload the file to the Document Cloud (https://documentcloud.adobe.com/link/home/), generate the link, and share it with us.
Also, would you mind sharing the workflow or steps you did to create the PDF file from Word? Please try to recreate the PDF file as described in the help article (https://helpx.adobe.com/in/acrobat/how-to/create-pdf-files-word-excel-website.html) and see if that helps.
You may also try to repair the installation (Windows only): go to Help > Repair Installation and see if that works.
Let us know how it goes.
I've spoken to the employee who sent the document and apparently I mixed up where the PDF comes from. It is apparently generated by the Website for the Pensions Regulator (a UK Government body) and saved to our system.
Unfortunately I cannot give you any more information on how the document is made.
I have uploaded the doc here:
I have redacted some sensitive information on the first page, however the second page remains untouched.
Thank you for sharing the file. We tried to run OCR on this file and it's giving us the error 'Acrobat could not perform recognition because this page contains renderable text'.
For more information, please refer to the help article (https://helpx.adobe.com/acrobat/kb/error-could-perform-recognition-acrobat.html).
Is that an error that would appear on our copy of Adobe as well or is that a developer only error from debug?
I ask because I just tried running the OCR process on the same redacted file that I provided, and I was not shown that error.
You will get the message when you use
Tools > Scan & OCR > Recognize Text > ...
In that case, can I confirm that Adobe Acrobat Pro DC Version 2020.006.20042 is the same version that you are testing the file on? I don't get the same error at all.
Do you use Recognize Text?
I'm happy to do a screen share/teamviewer option if you want to see it in action.
P.S. Sorry for the late reply; I took a week off.
The Adobe employee should have stated here that you need to select the option Recognize Text > Settings > Output > Searchable Image (Exact).
This will disregard text already in the document and only process images.
But in our case it always rotates documents no matter what.
We found the problem does not happen in Acrobat 2017 and earlier, but in DC and Pro 2020 it is definitely a program error: pages are not returned to their original orientation.
It has nothing to do with deskew as we have already tested that.
Why do you perform OCR on this document? The document already has text.
It's part of our document filing system.
Everything that gets scanned or imported to our system gets processed in the same way, and the final step after indexing and approving is that all new files are OCR'ed overnight.
Yes, the file is a generated PDF rather than a printer scan, but it all goes through the same system.
It is the same for us: we get documents of maybe 500-1000 pages, where some pages are text and others are images that need to be OCR'ed, and having random pages out of 1000 rotated is extremely annoying.
The problem only happens in the DC and DC 2020 versions of Acrobat.
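For large batches like this, one possible workaround is to record each page's rotation before the overnight run and compare it afterwards, so the rotated pages can at least be found without paging through the whole file. The sketch below is only an assumption about how that check could look: the function name is hypothetical, and reading the per-page rotation values out of the PDF is left to whatever PDF tooling you use. It also assumes the rotation shows up as a change in the page's recorded rotation; if the OCR step re-renders the page content instead, this check will not catch it.

```python
def rotated_pages(before, after):
    """Return (page_number, delta_degrees) for each page whose rotation changed.

    `before` and `after` are lists of per-page rotation values in degrees,
    taken from the same document before and after the OCR run.
    """
    if len(before) != len(after):
        raise ValueError("page counts differ; cannot compare page by page")
    changed = []
    for number, (old, new) in enumerate(zip(before, after), start=1):
        delta = (new - old) % 360  # Normalise to the range 0-359.
        if delta:
            changed.append((number, delta))
    return changed

# Hypothetical three-page document where only page 2 was flipped.
print(rotated_pages([0, 0, 90], [0, 180, 90]))  # [(2, 180)]
```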
I have just observed the same issue. I have not used the OCR function on sheet music files for some time. In earlier versions of Adobe Professional I could use OCR not only to correct the horizontal alignment of scanned images using the de-skew function, but it would also automatically rotate upside-down images correctly. This is useful when scanning a bound volume that requires rotation to fit on a flatbed scanner where the lid cannot be removed.
Any progress on fixing this in future upgrades?
Today I scanned a set of PDF files that were all correct and the software has rotated random pages 180 degrees. This