Scripted OCR doesn't let me find text by script, but manual OCR does
When I script the OCR of an image-only PDF, Acrobat creates bounding boxes, and FindText can't find text unless the cursor is inside the particular box that contains it.
However, if I OCR the file manually (Enhance Scans > Recognize Text > In this file > Settings > Output = Editable Text and Images, OK), the FindText command works.
The document is already open when I run this VBA script:
Set aApp = CreateObject("AcroExch.App")
Set aAVDoc = aApp.GetActiveDoc()
Set aPageView = aAVDoc.GetAVPageView()
Set aPdDoc = aAVDoc.GetPDDoc()
pageCount = aPdDoc.GetNumPages
' Get PDF OCR'd
For curPage = 0 To pageCount - 1
    aPageView.GoTo curPage
    aApp.MenuItemExecute ("TouchUp:EditDocument")
Next curPage
rtgFound = aAVDoc.FindText("accordingly", 0, 0, 1)
rtgFound is False. If I manually OCR the document and run this code:
Set aApp = CreateObject("AcroExch.App")
Set aAVDoc = aApp.GetActiveDoc()
Set aPageView = aAVDoc.GetAVPageView()
Set aPdDoc = aAVDoc.GetPDDoc()
pageCount = aPdDoc.GetNumPages
rtgFound = aAVDoc.FindText("accordingly", 0, 0, 1)
rtgFound is True. Is it possible to automate Acrobat to OCR into "Editable Text and Images"? That is already the default setting in the UI, but it doesn't seem to make a difference to the scripted OCR.
If I have to search every one of the hundreds of little boxes, what would I have to loop through? Are there other options?
Many thanks!
