
How to Convert PDF Photo of Newspaper Article

New Here,
Jan 24, 2021

I subscribe to Newspapers.com and am clipping newspaper articles and downloading them to my computer as PDF files. I am then trying to convert those PDFs to Word. When I do that, I find that the Word document is a photo image and I cannot edit the text. How can I capture the text of the newspaper article in an editable Word document? Need instructions. See sample file below.

TOPICS
Edit and convert PDFs , How to
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
1 ACCEPTED SOLUTION
People's Champ,
Jan 24, 2021

Your downloaded PDFs are graphics of the newspaper pages, not live editable text.

 

You can attempt to convert them to editable, searchable text with Acrobat's OCR tool, Scan & OCR. It will attempt to interpret the graphical text and give you something you can then export to an MS Word .docx file.

 

And you'll need either Acrobat Pro or Standard to do this, not the free Reader.

1. Select Scan & OCR from the RIGHT toolbar, or top left Tools Tab.

2. Select Recognize Text from top Tools.

 

Once the text is recognized (OCR'ed), you can now Save As / Save As Type = Word Document .docx and open the recovered text in MS Word.

 

|    Bevi Chagnon   |  Designer, Trainer, & Technologist for Accessible Documents |
|    PubCom |    Classes & Books for Accessible InDesign, PDFs & MS Office |

Community Expert,
Jan 24, 2021

I just took a look at that website and noticed that it's carrying newspaper articles from way way way back. That's cool!

 

However, what they've done is just post images saved in the PDF format, like the one you linked in your question. They were never made searchable; they are simply images saved in the PDF format. I was easily able to make it searchable, though, because I have Acrobat Pro DC.

 

If you are using Adobe Reader, you cannot do this and will need to upgrade your Acrobat application. If you are on Windows, you can upgrade to Acrobat or Acrobat Pro. If you are on a Mac, the only option is Acrobat Pro.

 

Please note that I do not work for Adobe. Like most people in these forums, I am just someone with Acrobat experience who is willing to take the time to help others out. I do not receive a commission for any sales; I'm just answering your question to the best of my ability.

New Here,
Jul 02, 2023

Hi, I did this, and now I have the same image but as a Word file. Is there a way to get just the text, so that it looks like a standard Word document, not a newspaper?

Community Expert,
Jul 02, 2023

Did you run OCR on the file? The problem is probably that the news clippings are images. If you convert a PDF image to Word, you get a Word image. I would expect that.

 

To get the text as text (and real images as images), you need to run an OCR program on the image. Such a program tries to find the text and convert it back into computer-readable text.

 

When you export to Word, you will get the result of this OCR as a Word file, but Acrobat tries to keep the formatting as it was, so you will probably need to modify it to get a nice Word file. It helps if you work with layouts that are not too complex.
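
If the exported layout is too messy to repair, one alternative is to skip the layout entirely and write only the recognized text into a fresh document. The sketch below is not an Acrobat feature. It is a minimal Python illustration using only the standard library, and the function name `paragraphs_to_docx_bytes` is hypothetical. It assumes you already have the OCR text as a list of paragraph strings, and it builds a bare-bones .docx package (a zip archive with three XML parts) with no styles or images, so treat it as a starting point and check the result in your own copy of Word.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Fixed package parts that every minimal .docx needs.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)


def paragraphs_to_docx_bytes(paragraphs):
    """Return the bytes of a plain .docx with one Word paragraph per string."""
    # escape() protects &, < and > so the OCR text cannot break the XML.
    body = "".join(
        '<w:p><w:r><w:t xml:space="preserve">' + escape(p) + "</w:t></w:r></w:p>"
        for p in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
        "<w:body>" + body + "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", RELS)
        zf.writestr("word/document.xml", document)
    return buffer.getvalue()
```

To save a file, write the returned bytes to disk, for example `open("article.docx", "wb").write(paragraphs_to_docx_bytes(["First paragraph.", "Second paragraph."]))`. The file name here is only an example.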

ABAMBO | Hard- and Software Engineer | Photographer
Community Expert,
Jul 02, 2023

There is no way for a computer system to know that you do not want the image, only the text. 

What you need to do, after OCR-ing the file, is copy the text and paste it into a Word document. Be ready to correct the text because there will be errors. The best and fastest way to do this is manually.
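
Part of that cleanup can be scripted before you start proofreading. The sketch below is a hypothetical helper, `reflow_ocr_text`, written in plain Python with the standard library only. It assumes the text copied out of the OCR'ed PDF has a hard line break at the end of every newspaper column line, words hyphenated across those breaks, and blank lines between paragraphs. It does not fix misrecognized letters, so the manual check is still needed.

```python
import re


def reflow_ocr_text(raw):
    """Turn column-wrapped OCR text into plain paragraphs."""
    # Normalize line endings and trim stray spaces on every line.
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    text = "\n".join(line.strip() for line in text.split("\n"))

    # Rejoin words split across a line break: "news-\npaper" -> "newspaper".
    # Assumption: lowercase letters on both sides mean a split word. This can
    # wrongly merge a real compound such as "well-\nknown", so review the output.
    text = re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)

    # Blank lines separate paragraphs; inside a paragraph, join lines with spaces.
    paragraphs = []
    for block in re.split(r"\n{2,}", text):
        joined = " ".join(block.split())
        if joined:
            paragraphs.append(joined)
    return "\n\n".join(paragraphs)


print(reflow_ocr_text("The news-\npaper was\nclipped."))
# prints: The newspaper was clipped.
```

The result can then be pasted into Word as ordinary paragraphs instead of one short line per column row.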

Good luck!

Community Expert,
Jul 02, 2023

One other thing that may help you is this blog I wrote for Adobe a number of years ago. 

 

http://photosbycoyne.com/Gary's_Help/Scanning/clean-scanning.html

 

 

Community Expert,
Jun 13, 2024

@YT Editing Hu38036276gndq ,

Do you have a message for us?

ABAMBO | Hard- and Software Engineer | Photographer