OCR a PDF that is printed from another PDF

New Here, Aug 24, 2020

I have an original PDF that has been OCR'd. For a certain purpose, I have to print that PDF to PDF. (Save As wouldn't serve the purpose for me.) I have found that the printed PDF is no longer OCR'd and can't be OCR'd again. Is there any way I can:

1. Print a PDF to PDF and keep the OCR on the printed PDF

or

2. OCR the printed PDF

Thank you,

TOPICS
Create PDFs , Edit and convert PDFs , How to , Print and prepress , Scan documents and OCR

Community Expert, Aug 24, 2020

Printing to PDF will likely remove all of the OCR text.

OCR should still work on the printed PDF if the resolution is sufficient.

Can you explain why you need to print to PDF? There may be a better option.

Community Expert, Aug 24, 2020

A bit different from Luke's question: how did you print to PDF? Acrobat doesn't allow that. Did you print (to PDF or Adobe PDF) via a third-party application (e.g., Apple's Preview)?

Community Expert, Aug 24, 2020

Interesting - I would never have tried to print a PDF from Acrobat, but since you mentioned one can't, I tried! I guess I like to live life on the edge 😉

It took me through the motions after I selected Adobe PDF as the desired printer, and way in the back (behind a few other windows) was a Save PDF File As... dialog box. Who knew?

As long as the PDF is image based, one should be able to run OCR on it. The result depends heavily on the quality and contrast of the text image.

Dave

Community Expert, Aug 24, 2020

Dave, I found this out once when I wanted a PDF of a PDF with multiple pages on each sheet (like 2-up or 4-up). Can't be done.

Community Expert, Aug 24, 2020

Printing to PDF (or "refrying the PDF") will always result in a lower quality PDF. That does not necessarily mean quality as in resolution, but features that go missing.

When you OCR a PDF, the font that is used for the recognized text is created on the fly, based on the glyphs that are in the text. When you then print to PDF, the information about how to get from the glyph (the drawing of a character) back to the original character is lost. Also, because the file has been OCRed, it is no longer a plain image, so OCR cannot simply be run on it again.

Your only option at this point is to save the PDF as individual images (e.g. high resolution TIFF images), combine these images into a PDF file, and then OCR again. Not a straightforward approach, but it will work.

Having said that, chances are that there is a way to accomplish what you need without printing to PDF and then trying to OCR again. What is it that you think is not possible without refrying the document?
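For anyone who wants to script the "turn the pages back into images, then OCR again" route outside Acrobat, here is a minimal sketch. It is not the Acrobat workflow described above: it assumes the open-source OCRmyPDF command-line tool is installed, and it relies on that tool's `--force-ocr` option, which rasterizes each page and runs OCR on the result. The function names and the file names `printed.pdf` and `printed_ocr.pdf` are placeholders made up for this example.

```python
import shutil
import subprocess


def reocr_command(src: str, dst: str) -> list[str]:
    """Build the OCRmyPDF command that rasterizes each page and OCRs it again."""
    # --force-ocr makes OCRmyPDF rasterize existing text and vector content
    # and run OCR on it, rather than stopping when it finds pages that
    # already contain text.
    return ["ocrmypdf", "--force-ocr", src, dst]


def reocr(src: str, dst: str) -> bool:
    """Run the command if OCRmyPDF is installed; return True on success."""
    if shutil.which("ocrmypdf") is None:
        return False  # tool not installed (assumption not met), nothing was run
    result = subprocess.run(reocr_command(src, dst), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder file names; only prints the command, does not run it.
    print(" ".join(reocr_command("printed.pdf", "printed_ocr.pdf")))
```

As with the manual route, the quality of the new text layer depends on the resolution and contrast of the rendered pages, so check the output before relying on it.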

LEGEND, Aug 24, 2020

People give many, many reasons why printing PDF to PDF is a bad idea. This is one reason. I know you state you must do this, but perhaps if you share the issue you are solving, we can help you find a way that is less damaging.
