charlesk56914348
Participant
January 8, 2019
Question

manual zones OCR

  • 1 reply
  • 992 views

I used the OmniPage and ABBYY lines of OCR products at my previous employer. At my current employer I am now using your OCR technology, which is credited to Image Recognition Integrated Systems S.A. (Copyright 1987 - 2014):

I find that your OCR auto-selects zones (types, sizes, and/or shapes) that are nearly impossible to edit, because so many "wrong" or excess zones are auto-created. Is there any way to treat the image as 100% text, to at least allow extraction to Word or Excel for proper formatting? Or is there a way to merge many lines of overlapping zones? Even better, I would love to create my own zone sizes and types manually; I just can't seem to find instructions on how.

Thanks for any suggestions at all.



gary_sc
Community Expert
January 8, 2019

Hi Charles,

Yup, quite a bit different.

Unfortunately, I'm not sure I understand the nature/design of the page you're trying to capture. Is it like a magazine page? Newspaper? Textbook?

Also, what are you trying to get out of this process: just the text, or to recreate the whole page?

And effectively no, the OCR engine in Acrobat will not treat an image as text. If there are text elements within an image, then maybe, unless part of the image crosses over the text.

And just to ask ahead of time, in case it is necessary: what is your OS (and release), and which version of Acrobat (and release)? Similarly, which scanner are you using, and which software is Acrobat using to do the scanning?

Sorry for all the questions, but most if not all are necessary to give you the best help we can.

charlesk56914348
Participant
July 22, 2019

Just to "close" my query: your reply solved the issue, i.e. the answer is "no".

What I was trying to do was take old PDF files (font-based, not bitmap), created by Adobe's "printer" from MS Word files, and convert them back to Word files. The original Word files were from long-gone faculty members.

There are lots of formulas, tables, and special formatting that "translate back wrong". I was trying to simplify the reversal by modifying the auto-generated zones before exporting. Thanks anyway. I did a manual fix based on the part Acrobat got right.

C