
Remove hyphenation

New Here, Sep 16, 2018

The PDF-to-Word converter doesn't handle line breaks well. Hyphens at line breaks are kept as hard hyphens, and line breaks that are only a result of formatting are kept as hard line breaks. It takes a lot of work in Word afterwards to remove them.

It would be nice to have some options for the conversion process so you could make the conversion a bit more intelligent.

Correct answer

Community Expert, Sep 16, 2018

The fault is with the application that created the PDF. It probably added the hyphens and line-breaks to it, which it shouldn't have. The export simply maintains what it finds in the PDF.

LEGEND, Sep 17, 2018

In an untagged PDF there is no mark to say whether a hyphen is hard or soft. They are the same character, so it's down to guesswork. There are no paragraph marks and no soft or hard returns, just the text where it is. It's all guesswork: sometimes Acrobat guesses the way we would like, sometimes not.
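
To make the guesswork concrete, here is a minimal Python sketch of the kind of heuristic a converter might apply. It is not Acrobat's actual method, which isn't described in this thread. The function name `join_line_break` and the tiny word list `KNOWN_WORDS` are hypothetical; a real tool would need a full dictionary and would still get some cases wrong.

```python
# Hypothetical vocabulary for illustration; a real tool would need a full dictionary.
KNOWN_WORDS = {"conversion", "hyphenation", "seven-day"}

def join_line_break(left, right, known_words=KNOWN_WORDS):
    """Guess how to join a line-end fragment ending in '-' with the next fragment.

    The hyphen character itself carries no information, so the only
    evidence available is whether the joined result looks like a word.
    """
    stem = left.rstrip("-")
    joined = stem + right            # hyphen treated as soft: drop it
    hyphenated = stem + "-" + right  # hyphen treated as hard: keep it
    if joined.lower() in known_words:
        return joined
    if hyphenated.lower() in known_words:
        return hyphenated
    # Unknown either way: keep the hyphen, because nothing says it is safe to drop.
    return hyphenated

print(join_line_break("conver-", "sion"))  # dictionary says the hyphen was soft
print(join_line_break("seven-", "day"))    # dictionary says the hyphen was hard
```

Any word missing from the list falls through to the default, which is one way a conversion ends up keeping hyphens that a human reader would remove.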

Community Expert, Sep 17, 2018

Some PDF creators add line-break characters at the end of lines in a PDF, although they are unnecessary, and that causes the output to be incorrect.

LEGEND, Sep 17, 2018

There isn't really such a thing as a line-break character in an untagged PDF. That said, some PDF creators might put CR and/or LF characters in a string, which luckily show as nothing at all in most fonts... but in the PDF they aren't line endings, or special in any way.

New Here, Mar 05, 2020

In case anybody is interested, I found a sort of workaround for this problem (and it is definitely a problem). After creating the Word doc, I use "Replace" (as in Find/Replace) in Word to replace "- " (hyphen space) with an optional hyphen (go to More > Special > Optional Hyphen). That swaps the hard hyphen for an optional (discretionary) hyphen: the word still breaks there automatically, but it is now one word, and the discretionary hyphenation stays intact for future purposes. You're going to have to hit the Replace button one at a time, rather than "Replace All", so you can see what you're replacing as you go, because there may be legitimate instances (such as "seven-day basis") that you don't want to change. So pay attention. It's the best option I've figured out so far for this problem.
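
For anyone who would rather script the same review-as-you-go idea on exported plain text, a rough Python sketch follows. It works on a string, not on a Word document, and it assumes the Unicode soft hyphen (U+00AD) as a stand-in for Word's optional hyphen. The names `replace_reviewed` and `approve` are made up for illustration; `approve` plays the role of deciding at each hit whether to click Replace or skip.

```python
import re

SOFT_HYPHEN = "\u00ad"  # Unicode soft hyphen, assumed stand-in for Word's optional hyphen

# "hyphen space" sitting between two letters, e.g. "conver- sion"
CANDIDATE = re.compile(r"(?<=[A-Za-z])- (?=[A-Za-z])")

def replace_reviewed(text, approve):
    """Replace each 'hyphen space' hit only when approve(left, right) returns True."""
    def decide(match):
        # Word fragments on either side of the hit, handed to the reviewer.
        left = re.search(r"[A-Za-z]+$", text[:match.start()]).group()
        right = re.match(r"[A-Za-z]+", text[match.end():]).group()
        return SOFT_HYPHEN if approve(left, right) else match.group()
    return CANDIDATE.sub(decide, text)

# Hypothetical reviewer: skip the legitimate compound "seven-day", accept the rest.
def approve(left, right):
    return (left.lower(), right.lower()) != ("seven", "day")

sample = "The conver- sion runs on a seven- day basis."
result = replace_reviewed(sample, approve)
```

In practice `approve` could prompt a person for each hit, which is the scripted equivalent of stepping through with the Replace button instead of using "Replace All".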

 

 

 

Community Expert, Mar 07, 2020

Try searching for a "hard" hyphen followed by a line break and replacing it with a "soft" hyphen.
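
On plain text, that search can be sketched as a single regular expression in Python. This is only an illustration under assumptions: the function name `soften_line_end_hyphens` is hypothetical, U+00AD stands in for the soft hyphen, and requiring a lowercase letter after the break is an extra guess added here to skip things like list items and names. Inside Word itself the search would presumably use its own special-character codes for the line break and the optional hyphen.

```python
import re

SOFT_HYPHEN = "\u00ad"  # Unicode soft hyphen (assumed target character)

def soften_line_end_hyphens(text):
    """Turn 'letter, hyphen, line break, lowercase letter' into a soft hyphen.

    The line break (CRLF, LF or CR) is consumed together with the hyphen,
    so the two halves of the word end up joined.
    """
    return re.sub(r"(?<=[A-Za-z])-(?:\r\n|\n|\r)(?=[a-z])", SOFT_HYPHEN, text)

sample = "hyphen-\nation is kept\n- list item"
cleaned = soften_line_end_hyphens(sample)
```

A hyphen that starts a line, as in the list item above, is left alone because it has no letter in front of it. Compounds that happened to break at their real hyphen would still be joined, so the result needs the same manual check as any other approach in this thread.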
