Recognize Text - Output - Description of options

Report · Jun 05, 2022

Product: Adobe Acrobat DC

Function: Recognize Text

The settings for this function provide three options in the Output drop-down menu.

Output:

(1). Searchable Image

(2). Searchable Image (Exact)

(3). Editable Text and Images

Could someone please explain the detailed differences between these options?

I am unable to find clear and detailed information about this in Adobe Help, hence this post.

Thanks in advance.

Report · Jun 05, 2022

#1 - Provides OCR output whose glyphs have no stroke or fill, so the text is "invisible" or "hidden".

This method also dresses up the image a wee bit. Thus, the result is an altered image rather than the exact image provided by the scanner.

Consequently, #1 is typically not acceptable to a FedGov agency (or any entity with an interest in a document of record having the proper "provenance").

#2 - OCR output is developed as in #1, but the exact image remains untouched.

Typically this is what a FedGov agency requires when you submit a scanned image of text.

So, the original image out of the scanner maintains its integrity and the OCR output supports find / search.

#3 - ClearScan, introduced a few versions back. When the bitmap of a character's image is recognized, it is replaced with a font glyph (the glyph is visible because it has fill and stroke applied). What is not recognized is left as is. And more magic...

Bottom line - That image out of the scanner that *was* the exact replica of the hardcopy and thus a valid/legal document of record is blown away, gone. Typically not acceptable for something submitted to a FedGov agency.
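For anyone who wants to check which kind of text layer a file ended up with: in PDF terms, glyphs with "no stroke or fill" correspond to text rendering mode 3, set with the Tr operator in a page's content stream. Below is a rough Python sketch that looks for that operator. It assumes Acrobat writes the hidden OCR layer this way (not confirmed above), and it assumes you already have the decoded, uncompressed content stream as bytes. The function names are made up for illustration, and the pattern match is deliberately naive (it does not skip string literals).

```python
import re

# PDF text rendering modes (Tr operator):
# 0 = fill, 1 = stroke, 2 = fill then stroke, 3 = neither (invisible text).
INVISIBLE_MODE = 3

# Matches a single-digit operand followed by the Tr operator, e.g. b"3 Tr".
# Naive on purpose: it does not parse string literals or inline images.
_TR_PATTERN = re.compile(rb"(?<![\w.])(\d)\s+Tr(?![A-Za-z])")


def text_render_modes(content_stream: bytes) -> list[int]:
    """Return every text rendering mode set in a decoded content stream."""
    return [int(m.group(1)) for m in _TR_PATTERN.finditer(content_stream)]


def has_invisible_text(content_stream: bytes) -> bool:
    """True if the stream switches to rendering mode 3 at least once."""
    return INVISIBLE_MODE in text_render_modes(content_stream)


# Hypothetical content stream fragment, written by hand for this example.
sample = b"BT /F1 12 Tf 3 Tr 72 700 Td (Hello) Tj ET"
print(has_invisible_text(sample))
```

If this assumption holds, output from #1 or #2 would be expected to show mode 3 text over the page image, while #3 would show visible text (no Tr operator, or mode 0). Real page streams are usually compressed, so they have to be decoded with a PDF library before a check like this is meaningful.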
