Converting a PDF to MS Word - Font spacing issue.

Report · May 15, 2020

I have an old manually typed document that I scanned to PDF. The PDF document reads well. I then converted the PDF to a Word file from Acrobat DC. In some instances the Word file had text compression (letters and spaces mashed together). I tried changing fonts, paragraph spacing, and line spacing and could not correct it. When I pasted the compressed line into this text box, the compression went away, so I made a JPEG with a screen capture.

Report · May 15, 2020

Hi Relmeaux,

First off, I can't help you. However, I can join in your grief in that I also had a typed document that, after scanning and OCR-ing, had dreadful results. Parts were so bad it was easier to retype them than to repair the text.

Other than to raise my hand and say "Me Too," I was curious as to what typewriter did the original. The original that I was dealing with used the Courier font. Was yours a Selectric where you could change the font ball?

Anyhow, good luck to all. I find it interesting that when processing text in a document, manually typed text seems to work less well than other kinds of text on paper. Curious.

Report · May 20, 2020

Hi,

Thank you for reporting the issue. Have you started facing this issue recently?

Can you please tell us if it is specific to some font or a file? If possible, can you please share the input and output files with us so that we can look into the issue and provide you a better experience with our services?

Thanks and Regards,

Akanksha

Software Engineer II
Adobe Acrobat Team

Report · May 20, 2020

We had a similar example about a year ago.

Found that the text (or portions of the text) had kerning/tracking applied, and it probably was introduced by Acrobat's OCR utility, attempting to mimic/represent the visual appearance of the original scanned text.

Adobe, please check your utility: it should not be adding tracking/kerning/letterspacing to any text during the OCR process.

We corrected the resulting Word file by removing all manual overrides at the character level. Two ways to do that in Word/Windows (Mac-ers, only the second method is available in Word/Mac):

Open Word's Navigator panel.
Select the text (we recommend selecting ALL the text with Control + A).
Click the bottom-most pink eraser to erase manual overrides (formatting).
The text should snap and appear normal.

OR ...

Select the text.
Open Word's Font Panel (see screen capture below).
Select the Advanced tab at the top.
Set the Spacing to Normal and leave the "By" field blank.
The text should snap and appear normal.

Correct bad text kerning/tracking in Word.
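For anyone with many converted files, the second method (setting Spacing to Normal) can be roughly approximated with a script. This is only a minimal sketch, not an Adobe or Microsoft tool. It assumes the compression is stored as run-level `<w:spacing w:val="..."/>` elements in the .docx file's `word/document.xml` part, and the function names `strip_run_spacing` and `fix_docx` are made up for illustration. It does not touch character scaling, styles, headers, or footers, so work on a copy of your file.

```python
import re
import zipfile

# Run-level character spacing looks like <w:spacing w:val="-20"/>.
# Paragraph-level <w:spacing> uses w:before/w:after/w:line attributes
# rather than w:val, so this pattern leaves paragraph spacing alone.
RUN_SPACING = re.compile(r'<w:spacing\s+w:val="[^"]*"\s*/>')


def strip_run_spacing(xml: str) -> str:
    """Remove character-spacing overrides from WordprocessingML text."""
    return RUN_SPACING.sub("", xml)


def fix_docx(src_path: str, dst_path: str) -> None:
    """Copy a .docx, removing run-level spacing from the main document part."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                text = data.decode("utf-8")
                data = strip_run_spacing(text).encode("utf-8")
            # Every other part of the package is copied unchanged.
            dst.writestr(item, data)
```

Hypothetical usage: `fix_docx("converted.docx", "converted_fixed.docx")`, then open the new file in Word and check that the text has snapped back to normal spacing.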

Hope this helps.

Report · May 20, 2020

Hi Bevi,

Interesting approach, good to know for the future.

What I did was save the documents as straight, unformatted text (.txt). That removed ALL formatting, including the page breaks. I then imported the text back into Word for subsequent formatting and for correcting all of the OCR errors.
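The same plain-text idea can be sketched in a script for batches of files. This is a rough illustration only, assuming a standard .docx whose body text lives in `word/document.xml`; the function names `paragraphs_from_xml` and `docx_to_text` are hypothetical. It keeps one line per paragraph and drops everything else, including tabs, manual line breaks, and page breaks.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def paragraphs_from_xml(xml_bytes: bytes) -> list:
    """Return the text of each <w:p> paragraph, ignoring all formatting."""
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for p in root.iter(W + "p"):
        # Join the text runs; run properties (fonts, spacing) are skipped.
        text = "".join(t.text or "" for t in p.iter(W + "t"))
        paragraphs.append(text)
    return paragraphs


def docx_to_text(path: str) -> str:
    """Read a .docx and return its body text, one paragraph per line."""
    with zipfile.ZipFile(path) as z:
        return "\n".join(paragraphs_from_xml(z.read("word/document.xml")))
```

The result can be written to a .txt file and pasted or imported back into Word for fresh formatting and OCR cleanup.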

Report · Sep 14, 2022

Hi @gary_sc,

Saving to .txt format (ASCII text) often does the trick on this type of legacy formatting.

But not always.

We've found ASCII retains and passes through some deep formatting, like mail merge codes and section breaks.

So give .txt a try, and if that doesn't completely strip the file down to its pure content, you'll have to do more extensive stripping. It's one reason why we keep a copy of Corel WordPerfect on one of our computers: its Reveal Codes utility is a lifesaver! (FYI, Corel is now Alludo.)

Report · Sep 15, 2022

Hi Bevi,

Interesting. Unfortunately I am not sure one can find WordPerfect any more for the Mac. I wonder if BBEdit (by BareBones Software) might also be a viable option. I can't check right now as I'm on holiday and am away from my computer with all of the software that I normally have access to. (I'm using my wife's laptop right now.)

When I get back in early Oct I'll see what BBEdit can do with those issues.

Thanks for the thumbs up on this issue, I didn't know that.

Report · Sep 14, 2022

It's not helpful.

Report · Aug 09, 2023

I found a solution that works for me: when exporting from the PDF, export as the older Word format (.doc). Everything should look okay, and you can then save it as a .docx file.

Report · Jul 11, 2024

• Font size: 10 or 12
• Font: Times New Roman
• Line spacing: Single line spacing
• Font color: As displayed
• 1 IMAGE OUTPUT 1 PDF