I am trying to edit the typed text of a book that was scanned to PDF and has a bunch of handwritten notes on it. This isn't as easy as I thought it would be. I used the OCR in Acrobat Pro and tried to erase and add text, but the page started to get hard to control. Everything is moving out of line. Can someone explain what I need to do to accomplish this?
If you have handwritten notes, squiggles, coffee stains, etc., overlapping the text that you want to OCR, there will be issues. You can try opening these documents in Photoshop to remove the handwritten marks, but that will be tedious work. It is often easier to retype the text than to try to OCR over such marks. Can you get a copy of the book without the handwritten notes?
Also, just in case you might find this handy, here's a blog I wrote some years back on scanning and OCR.
Good luck!
Thank you for your concise answer!
So, you're saying that if I had a copy without the handwriting, I should be able to do it fine?
Well, you would not have the issues mentioned above, but there is another one that I did not mention before because it is a separate problem: page curl.
This is the "curl" that pages have when you're working with a bound book. I'm sure you've seen it when photocopying sections from a book. Be aware that OCR accuracy goes downhill with this curl, and the only way to get around it is to partially or fully destroy the book. The former means over-flexing the pages so they lie as flat as possible; the latter means cutting the book into pieces so that each section can lie flat. The human eye can read through this curl distortion; computers, not so much.
As an aside, I got into a short conversation the other day with a person in the forum about using AI to help OCR. I'm all for it and would love to see this. The first OCR product that comes out with a viable approach will be leaps and bounds above anything else on the market. But we do not have anything like this yet!
I appreciate you taking the time to reply. I should have mentioned that the book was typed on loose paper, and I may be able to get a copy without the handwriting. If that is the case, it should go fairly smoothly, yes?
Oh, in that case YES! Please be sure to read the blog I linked to before. It will give you some strong guidance as to how to get the best scan for OCR.