Converting a PDF to MS Word - Font spacing issue.

New Here,
May 15, 2020

I have an old manually typed document that I scanned to PDF. The PDF reads well. I then converted the PDF to a Word file from Acrobat DC. In some instances the Word file had compressed text (letters and spaces mashed together). I tried changing fonts, paragraph spacing, and line spacing, and could not correct it. When I pasted the compressed line into this text box, the compression went away, so I made a JPEG with a screen capture.

 

PDF ot Word.jpg

 

TOPICS
Edit and convert PDFs, General troubleshooting, Scan documents and OCR

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert,
May 15, 2020

Hi Relmeaux,

 

First off, I can't help you. However, I can join in your grief: I also had a typed document that gave dreadful results after scanning and OCR. Parts were so bad it was easier to retype them than to repair the text.

 

Other than to raise my hand and say "Me Too," I was curious as to what typewriter produced the original. The original that I was dealing with used the Courier font. Was yours a Selectric, where you could change the font ball?

 

Anyhow, good luck to all. I find it interesting that when processing text in a document, manually typed text seems to work less well than other kinds of text on paper. Curious.

Adobe Employee,
May 20, 2020

Hi,

 

Thank you for reporting the issue. Have you started facing this issue recently?

Can you please tell us whether it is specific to a particular font or file? If possible, please share the input and output files with us so that we can look into the issue and provide you with a better experience with our services.

 

Thanks and Regards,

Akanksha

Software Engineer II
Adobe Acrobat Team

Community Expert,
May 20, 2020

We had a similar example about a year ago.

We found that the text (or portions of it) had kerning/tracking applied, probably introduced by Acrobat's OCR utility in an attempt to mimic the visual appearance of the original scanned text.

 

Adobe, please check your utility: it should not be adding tracking/kerning/letterspacing to any text during the OCR process.

 

We corrected the resulting Word file by removing all manual overrides at the character level. Two ways to do that in Word/Windows (Mac-ers, only the second method is available in Word/Mac):

  1. Open Word's Navigator panel.
  2. Select the text (we recommend selecting ALL the text with Control + A).
  3. Click the bottom-most pink eraser to erase manual overrides (formatting).
  4. The text should snap and appear normal.

 

OR ...

  1. Select the text.
  2. Open Word's Font Panel (see screen capture below).
  3. Select the Advanced tab at the top.
  4. Set the Spacing to Normal and leave the "By" field blank.
  5. The text should snap and appear normal.

Correct bad text kerning/tracking in Word.
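
For a batch of files, the same "Spacing: Normal" reset can be approximated outside Word. The following is only a sketch under assumptions that are not from this thread: that the squeezed text comes from character-spacing overrides stored as `<w:spacing .../>` inside run properties (`<w:rPr>`) in `word/document.xml`, that the file uses the usual `w:` prefix and UTF-8 encoding, and that nothing in the styles needs changing. The names `strip_run_spacing` and `fix_docx` are made up for illustration. Work on a copy of the file.

```python
import re
import zipfile

# Assumption: run properties appear as <w:rPr>...</w:rPr> with the usual "w:" prefix.
RUN_PROPS = re.compile(r"<w:rPr>.*?</w:rPr>", re.DOTALL)
# A character-spacing override inside run properties, e.g. <w:spacing w:val="-20"/>.
CHAR_SPACING = re.compile(r"<w:spacing\b[^>]*/>")


def strip_run_spacing(document_xml: str) -> str:
    """Remove <w:spacing/> from run properties only.

    Paragraph spacing (<w:spacing> directly inside <w:pPr>) is left untouched,
    because the substitution is applied only within <w:rPr> blocks.
    """
    return RUN_PROPS.sub(lambda m: CHAR_SPACING.sub("", m.group(0)), document_xml)


def fix_docx(src_path: str, dst_path: str) -> None:
    """Copy a .docx archive, rewriting only word/document.xml."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                text = data.decode("utf-8")  # assumption: UTF-8 encoded part
                data = strip_run_spacing(text).encode("utf-8")
            dst.writestr(item, data)
```

Calling `fix_docx("converted.docx", "converted-fixed.docx")` (hypothetical file names) writes a new file and leaves the original alone. The text is edited as a string rather than re-serialized through an XML library so that everything else in the part stays byte-for-byte as Word wrote it.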

 

Hope this helps.

 

Bevi Chagnon | PubCom | Designer & Technologist for Accessible Documents
| Books & Classes | Accessible InDesign | Accessible PDFs | Accessible MS Office |

Community Expert,
May 20, 2020

Hi Bevi,

 

Interesting approach, good to know for the future. 

 

What I did was save the documents as straight, unformatted text (.txt), which removed ALL formatting, including the page breaks. I then imported that text back into Word for subsequent formatting and for correcting all of the OCR errors.
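
The plain-text round trip can also be scripted when there are many files. This is a rough sketch, not what Word's own "Save as .txt" does: it reads `word/document.xml` from the .docx archive and keeps only the text of each run, one paragraph per line. The function names are hypothetical, and text boxes, footnotes, headers, and footers are ignored.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace, in ElementTree's "{uri}tag" form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def paragraphs_from_xml(document_xml):
    """Return the text of each paragraph, with all formatting dropped."""
    root = ET.fromstring(document_xml)
    paragraphs = []
    for p in root.iter(W + "p"):
        parts = []
        for run in p.iter(W + "r"):
            # Look only at direct children of the run, so tab-stop
            # definitions in the paragraph properties are not counted.
            for node in run:
                if node.tag == W + "t":
                    parts.append(node.text or "")
                elif node.tag == W + "tab":
                    parts.append("\t")
                elif node.tag in (W + "br", W + "cr"):
                    parts.append("\n")
        paragraphs.append("".join(parts))
    return paragraphs


def docx_to_txt(src_path, dst_path):
    """Write the document body of a .docx as UTF-8 plain text."""
    with zipfile.ZipFile(src_path) as archive:
        xml_bytes = archive.read("word/document.xml")
    with open(dst_path, "w", encoding="utf-8") as out:
        out.write("\n".join(paragraphs_from_xml(xml_bytes)))
```

The resulting .txt can be opened in Word and reformatted from scratch. The OCR errors themselves still have to be corrected by hand.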

 

 

Community Expert,
Sep 14, 2022

Hi @gary_sc

Saving to .txt format (ASCII text) often does the trick on this type of legacy formatting.

But not always.

We've found that ASCII text retains and passes through some deep formatting, like mail merge codes and section breaks.

So give .txt a try, and if that doesn't completely strip the file down to its pure content, you'll have to do more extensive stripping. It's one reason why we keep a copy of Corel WordPerfect on one of our computers: its Reveal Codes utility is a lifesaver! (FYI, Corel is now Alludo.)

 

Bevi Chagnon | PubCom | Designer & Technologist for Accessible Documents
| Books & Classes | Accessible InDesign | Accessible PDFs | Accessible MS Office |

Community Expert,
Sep 15, 2022

Hi Bevi,

 

Interesting. Unfortunately, I am not sure one can find WordPerfect for the Mac anymore. I wonder if BBEdit (by Bare Bones Software) might also be a viable option. I can't check right now as I'm on holiday and away from my computer with all of the software that I normally have access to. (I'm using my wife's laptop right now.)

 

When I get back in early Oct I'll see what BBEdit can do with those issues. 

 

Thanks for the thumbs up on this issue, I didn't know that.

New Here,
Sep 14, 2022

It's not helpful.

 
