I would like to OCR the attached PDF page, preferably into editable text and images. However, whatever I try, it rotates the elements on the page by something like 20°. If I select Searchable Image (Exact), then the recognized text gets rotated. The rotation doesn’t even make sense.
How do I tell Acrobat not to rotate the elements on the page?
My guess is that it's focusing on these lines (in red) and not on the vertical line.
Let me know if you're using the new or old User Interface and I'll tell you how to turn the auto-rotate off.
Thanks for the quick and detailed reply, Gary!
I am using the new interface, but can switch back to the old interface in case that helps. I faintly remember encountering the same issue years ago.
OK, I also had the Old UI going, so I used that.
First, go into Scan & OCR.
Next, along the top bar, select "Enhance" and then Scanned Document.
Next, click on the Gear icon for Settings.
And lastly, at the bottom of the window (seen above), click Edit for the Text Recognition Options.
Then go to Deskew and turn that off.
As an aside, on a related issue: I get a business-related journal that I scan for storage. On my flatbed scanner, I scan one side, flip 180°, scan the next page, wash, rinse, repeat for all ≈ 60+ pages. With Deskew turned on, Acrobat's OCR recognizes that the text is 180° off and rotates the page back to 0° and I'm good to go. So, it definitely has some advantages, and when you're done with this, you might want to turn it back on.
Let me know if this solves your issue.
Thank you very much, Gary! Unfortunately, this does not seem to work if I want to get editable text and images. And Searchable Image as output has a big drawback: file size.
There are many pages that have black text on white background, plus a grayscale image. With editable text and images, the grayscale image is stored separately from the text, so each can be compressed using a different algorithm. This brings down PDF size to a fraction of the original size, while maintaining quality. With searchable image as output, the PDF size remains high.
I think I should start looking at other software. This issue has been in Acrobat for ages.
I have now switched to doing OCR from the command line on a Linux machine using a tool called OCRmyPDF. That is much faster, compression can be finely tuned, and I don’t get the issue with skewed pages.
I still marked your answer as accepted. It is very detailed and probably the best that can be done with Acrobat at the moment. It’s a pity, though, that this bug has not been fixed in years. It looks like Acrobat doesn’t get much development other than user interface overhauls.
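For anyone wanting to try the same route, here is a minimal sketch of what such a command-line run could look like, wrapped in Python. The post above does not say which options were actually used, so everything here is an assumption: the helper `build_ocrmypdf_command` and the file names are hypothetical. `--deskew`, `--rotate-pages`, `--optimize`, and `--language` are OCRmyPDF command-line options. Deskewing is left off by default, and `--rotate-pages` is the option that would handle pages scanned upside down.

```python
import os
import shutil
import subprocess
from typing import List


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    language: str = "eng",
    optimize: int = 1,
    deskew: bool = False,
    rotate_pages: bool = False,
) -> List[str]:
    """Assemble an ocrmypdf argument list (hypothetical helper, not part of OCRmyPDF)."""
    # OCRmyPDF accepts optimization levels 0 (off) through 3 (most aggressive).
    if optimize not in (0, 1, 2, 3):
        raise ValueError("optimize must be 0, 1, 2 or 3")

    cmd = ["ocrmypdf", "--language", language, "--optimize", str(optimize)]
    if deskew:
        # Straighten slightly crooked scans; off here to avoid unwanted rotation.
        cmd.append("--deskew")
    if rotate_pages:
        # Detect page orientation and fix pages scanned sideways or upside down.
        cmd.append("--rotate-pages")
    cmd += [input_pdf, output_pdf]
    return cmd


if __name__ == "__main__":
    # "scan.pdf" and "scan_ocr.pdf" are placeholder file names.
    cmd = build_ocrmypdf_command("scan.pdf", "scan_ocr.pdf", rotate_pages=True)
    print(" ".join(cmd))
    # Only run the tool if it is installed and the input file actually exists.
    if shutil.which("ocrmypdf") and os.path.exists("scan.pdf"):
        subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes it easy to inspect the exact command before pointing it at a large batch of scans.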
@feklee, thank you for that. What marking answers correct mostly does is help those with similar issues find answers without having to wait for the answers to come to them (if they are helpful and if they come! :>))
Your issue is a bit unique, so while I'm disappointed it didn't work, I was only ever hopeful that it might. Why Acrobat's OCR engine would follow those lines is beyond me. And you are correct; Adobe has not done much with Acrobat's OCR for some time. Adobe pays for the OCR engine (for the life of me, I cannot remember which company they use). The one thing I do wish is that they'd use some of this AI stuff going around to help the OCR process. There is so much that could be done with that, but it's not being done at all, or at least not yet. Maybe OCRmyPDF will do that and embarrass Adobe into doing something as well!
Good luck!