I work in the pharmaceutical industry, and I thought it was a requirement to have all PDF files OCR'd when sending them to the FDA. When I mentioned that, I was told that when you OCR a PDF file the text in that file will change, and that once we OCR a file we need to compare the two files.
Is this true?
How did you create the PDF files?
Acrobat (Pro or Standard) offers 3 ways to OCR.
(1) Searchable Image
(2) Searchable Image (Exact)
(3) ClearScan
#1 - Provides an OCR output whose glyphs have no stroke or fill -- so, "invisible" or "hidden".
This method also dresses up the image a wee bit. Thus, you get an altered image rather than the exact image as provided by the scanner.
Consequently #1 is typically not acceptable to a FedGov agency (or any entity with an interest in a document of record having the proper "provenance").
#2 - An OCR output developed as in #1, but the exact image remains untouched.
Typically this is what a FedGov agency requires if you are submitting a scanned image of text.
So, the original image out of the scanner maintains its integrity and the OCR output supports find / search.
#3 - ClearScan. Introduced a few versions back. When the bitmap of a character's image is recognized, it is replaced with a font glyph (the glyph is visible because it has fill and stroke applied). What is not recognized is left as image. And more magic...
Bottom line - That image out of the scanner that *was* the exact replica of the hardcopy, and thus a valid / legal document of record, is blown away, gone, dent de lion in the wind eh. Typically not acceptable for something submitted to a FedGov agency.
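If you want to confirm that the scanned image really did survive OCR untouched, one rough approach is to hash the image data before and after. The sketch below is only an illustration: it assumes you have already pulled the raw image bytes out of both PDFs with a tool of your choice, and the function names are made up for this example. Note that a mismatch may only mean the stream was re-encoded, so treat it as a flag to look closer rather than proof that the pixels changed.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as a hex string."""
    return hashlib.sha256(data).hexdigest()


def image_untouched(before: bytes, after: bytes) -> bool:
    """True when the two image payloads are byte-for-byte identical."""
    return sha256_hex(before) == sha256_hex(after)


if __name__ == "__main__":
    # Placeholder bytes standing in for extracted image streams (hypothetical data).
    scanned = b"raw image stream from the scanner"
    after_exact = b"raw image stream from the scanner"
    after_cleanup = b"raw image stream, deskewed and recompressed"

    print(image_untouched(scanned, after_exact))    # identical payloads
    print(image_untouched(scanned, after_cleanup))  # altered payload
```

Keeping the hash of the original scan on file also gives you something to point to later if the provenance of the image is questioned.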
So - You use #2. But, there is more! What is the required resolution? Often it is 300 ppi. Was lossy compression used? Typically a no-no.
Gotchas like these may result in submittal rejections.
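The resolution check is simple arithmetic: pixels divided by inches. Here is a minimal sketch, assuming you know the scan's pixel dimensions and the physical page size. The 300 ppi default is just the "often" figure mentioned above, not any agency's official number, and the function names are hypothetical.

```python
def effective_ppi(pixel_width, pixel_height, width_in, height_in):
    """Return (horizontal, vertical) pixels per inch for a scanned page."""
    if width_in <= 0 or height_in <= 0:
        raise ValueError("page dimensions must be positive")
    return pixel_width / width_in, pixel_height / height_in


def meets_minimum_ppi(pixel_width, pixel_height, width_in, height_in,
                      minimum_ppi=300.0):
    """True when both axes reach the minimum resolution.

    minimum_ppi defaults to 300 as an assumption; use the value from
    the agency's own guidelines.
    """
    x_ppi, y_ppi = effective_ppi(pixel_width, pixel_height, width_in, height_in)
    return min(x_ppi, y_ppi) >= minimum_ppi


if __name__ == "__main__":
    # A US Letter page (8.5 x 11 in) scanned at 2550 x 3300 pixels.
    print(effective_ppi(2550, 3300, 8.5, 11))
    print(meets_minimum_ppi(2550, 3300, 8.5, 11))
    # The same page scanned at 1700 x 2200 pixels falls short of 300 ppi.
    print(meets_minimum_ppi(1700, 2200, 8.5, 11))
```

This only covers resolution. Whether lossy compression was applied has to be checked separately, from the scanner settings or the image encoding recorded in the PDF.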
For a submittal to a FedGov agency never-ever rely on hearsay; talk is cheap and like as not wrong or incomplete.
It is "your" submittal eh. Fetch and become one with the agency submittal guidelines / requirements. That's your success path.
Be well...