
Using Optical Character Recognition (OCR) to Edit PDFs in InDesign

Contributor, Sep 06, 2024

I have a large collection of 1-2 page PDFs that I need to make small edits to and maintain as documents for future edits. I am able to import a document into Adobe Acrobat Pro and use the optical character recognition feature to convert things to editable text and make changes, but I can't save the document in any lossless way from within Acrobat (other than exporting another flattened PDF). Does InDesign support similar optical character recognition for importing PDFs? I'm looking for a way to do this without having to start over on the many documents I have in PDF format. Since I know Adobe is capable of optical character recognition (Acrobat Pro) and has great document-managing software (InDesign), I'm hoping to find a way to do this within the Adobe Creative Suite, without having to pay $300 for third-party conversion software. Can I copy-paste from Acrobat Pro to InDesign (after Acrobat has done the text conversion)?

<Title renamed by MOD>

TOPICS: How to, Import and export
Community Expert, Sep 06, 2024

PDF was designed as an end format, just like exporting to an image or printing to paper. The weak editing capabilities only came along later and, as you've discovered, aren't very good. The short answer is that there isn't any good way to turn a PDF back into an editable document. In theory, InDesign is adding such a feature, you can edit PDFs a bit clumsily in Illustrator, and there are some third-party solutions, but I know of nothing that is available now, easy to use, "capable" and cheap.

Contributor, Sep 06, 2024

It's kind of lame that Acrobat Pro doesn't let you save out an "Acrobat" file (similar to a PSD) so you can save your efforts before exporting to a PDF (like you can do in all their other applications).

Community Expert, Sep 06, 2024

It does. It's called an INDD file, a DOC/X file, an XLS/X file... PDF was never meant to be an editable or source document file in its own right. It's a "print to digital" format that produces perfect digital page images from almost any app: some with their own export code, some with a shared export engine, and some through the feeble "print to PDF" driver that lets any app do just that.

The model has always been that if you needed changes, you went back to your source doc and app and re-exported to PDF. There's no room in that model for some sort of intermediate, editable source document.

The current slow trend towards re-import and editability is hacky reverse engineering, and I doubt it will ever get better than the current PDF editing in Illustrator (AI).

LEGEND, Sep 06, 2024

Try the InDesign Beta, but the text has to be live text, not bitmaps.

Or you can always use Illustrator.

Community Expert, Sep 09, 2024

Good point. Also, the beta only works with PDFs created from InDesign, which means the text is most likely live text rather than images.

In fact, if the PDFs are all images and no actual text, I doubt anything would work other than running OCR and exporting to Word to clean up.

David Creamer: Community Expert (ACI and ACE 1995-2023)
LEGEND, Sep 09, 2024
quote

Good point. Also, the beta only works with PDFs created from InDesign, which means the text is most likely live text rather than images.

By @Dave Creamer of IDEAS

That's what I thought, and I mentioned it in another post, but I think @Peter Spier corrected me: the PDF doesn't have to be created in the Beta, it can be any PDF.

Community Expert, Sep 09, 2024

It can be any PDF created from InDesign, per my test.

image.png

David Creamer: Community Expert (ACI and ACE 1995-2023)
LEGEND, Sep 09, 2024
quote

It can be any PDF created from InDesign, per my test.

By @Dave Creamer of IDEAS

Right, so maybe the comment was "from any InDesign", not "from any program".

Community Expert, Sep 09, 2024

Pretty sure that wasn't me....

Community Expert, Sep 09, 2024

If you have "many" documents in PDF format, what would the cost per document be for conversion software at $219-299 US per year? If you have 150 documents, that works out to roughly $1.50-$2 per document. The time needed to recreate the documents, or even to cut and paste, would probably cost more.

There is also a conversion company that charges $0.25 per page.
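
To plug in your own numbers, here is a minimal Python sketch of that arithmetic. The function names are made up for illustration, the 150-document batch is just the example figure used above, and the 1-2 pages per document comes from the original question.

```python
def cost_per_document(yearly_price: float, documents: int) -> float:
    """Spread one year's licence price evenly across a batch of documents."""
    if documents <= 0:
        raise ValueError("documents must be a positive number")
    return yearly_price / documents


def per_page_total(documents: int, pages_per_document: int,
                   price_per_page: float = 0.25) -> float:
    """Total cost of a per-page conversion service for the whole batch."""
    return documents * pages_per_document * price_per_page


if __name__ == "__main__":
    documents = 150  # example batch size, not a real count
    for price in (219, 299):
        per_doc = cost_per_document(price, documents)
        print(f"${price}/year: ${per_doc:.2f} per document")
    for pages in (1, 2):
        total = per_page_total(documents, pages)
        print(f"{pages} page(s) each at $0.25/page: ${total:.2f} total")
```

Change `documents` to your real count to see where a yearly licence and a per-page service cross over.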

 

A couple of notes:

If the PDF was created from Word or PowerPoint (check the Document Properties in Acrobat), you should be able to get a clean export back to Word or PowerPoint respectively. If the PDF was created in InDesign, see if you have access to the Beta; convert the PDF to InDesign and save as IDML (CS4 or later). If it was not created in any of those programs, you can always try exporting to Word and see how messy it is.

If you go the copy/paste route, Acrobat can export all your graphics for you at once. Then import them into InDesign.
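
If you are sorting a big pile of PDFs, the first note boils down to a simple lookup on the creating application. A rough Python sketch, assuming you have already read the application name from Document Properties in Acrobat; the function name and the substring matching are illustrative guesses, since the exact wording of that field varies by application and version.

```python
def suggest_route(creator: str) -> str:
    """Map a PDF's creating application to the route suggested in the notes.

    `creator` is whatever Acrobat's Document Properties shows for the
    application; matching on substrings is an assumption, not a rule.
    """
    name = creator.lower()
    if "word" in name:
        return "export to Word from Acrobat"
    if "powerpoint" in name:
        return "export to PowerPoint from Acrobat"
    if "indesign" in name:
        return "convert with the InDesign Beta, then save as IDML"
    # Anything else: no clean round trip, so test the Word export first.
    return "try exporting to Word and see how messy it is"


if __name__ == "__main__":
    # Hypothetical property values, for illustration only.
    for creator in ("Microsoft Word", "Adobe InDesign", "Some Scanner App"):
        print(f"{creator}: {suggest_route(creator)}")
```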

David Creamer: Community Expert (ACI and ACE 1995-2023)