
Remove hyphenation

New Here, Sep 16, 2018

The PDF-to-Word converter doesn't handle line breaks well. Hyphens at line breaks are kept as hard hyphens. Line breaks that are a result of formatting are kept as hard line breaks. It takes a lot of work in Word afterwards to remove them.

It would be nice to have some options for the conversion process so you could make the conversion a bit more intelligent.


Correct answer

Community Expert, Sep 16, 2018

The fault is with the application that created the PDF. It probably added the hyphens and line-breaks to it, which it shouldn't have. The export simply maintains what it finds in the PDF.

LEGEND, Sep 17, 2018

In an untagged PDF there is no mark to say whether a hyphen is hard or soft. They are the same character, so it's down to guesswork. There are no paragraph marks and no soft or hard returns, just the text where it is. It's all guesswork: sometimes Acrobat guesses the way we would like, sometimes not.
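
To illustrate the kind of guesswork involved, here is a minimal Python sketch. It is not how Acrobat does it. It assumes the exported text is available as a plain string and uses a small, hypothetical word list (`KNOWN_WORDS`) to decide whether a hyphen at a line end was only there for the line break.

```python
import re

# Hypothetical word list for illustration; a real one would be loaded
# from a dictionary file.
KNOWN_WORDS = {"conversion", "hyphenation", "document"}


def join_line_end_hyphens(text, known_words=KNOWN_WORDS):
    """Rejoin words split by a hyphen at a line end, but only when the
    joined result is a known word. Otherwise keep the hyphen and drop
    just the line break."""

    def repl(match):
        left, right = match.group(1), match.group(2)
        joined = left + right
        if joined.lower() in known_words:
            return joined              # hyphen was only there for the break
        return left + "-" + right      # assume a genuine hyphen

    return re.sub(r"(\w+)-\r?\n(\w+)", repl, text)


print(join_line_end_hyphens("The conver-\nsion of a seven-\nday plan"))
```

Any word missing from the list keeps its hyphen, so the guess is only as good as the dictionary behind it.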

Community Expert, Sep 17, 2018

Some PDF creators add line-break characters at the end of lines in a PDF, although it's unnecessary, and that causes the output to be incorrect.

LEGEND, Sep 17, 2018

There isn't really such a thing as a line-break character in an untagged PDF. Some PDF creators do put CR and/or LF characters in a string, which luckily show as nothing at all in most fonts, but in the PDF they aren't line endings or otherwise special.

New Here, Mar 05, 2020

In case anybody is interested, I found a sort of workaround for this problem (and it is definitely a problem). After creating the Word doc, I use Replace (as in Find/Replace) in Word to replace "- " (hyphen, space) with an optional hyphen (More > Special > Optional Hyphen). That swaps the hard hyphen for an optional (discretionary) hyphen: the word still breaks there, but it is now one word, and the discretionary hyphenation stays intact for future purposes. You're going to have to hit the Replace button one match at a time so you can see what you're replacing as you go, because there may be legitimate instances (such as "seven-day basis") that you don't want to change and that "Replace All" would. So pay attention. It's the best option I've figured out so far for this problem.
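
For anyone who would rather do the same review outside Word, here is a rough Python sketch of the idea. It assumes the converted text has been saved as plain text, and it uses the Unicode soft hyphen (U+00AD) as a stand-in for Word's optional hyphen. The function names are made up for this example.

```python
import re

SOFT_HYPHEN = "\u00ad"  # stand-in for Word's optional hyphen (assumption)


def hyphen_space_candidates(text, context=15):
    """Yield (offset, snippet) for every 'hyphen, space' that sits between
    two word characters, so each match can be reviewed by eye."""
    for m in re.finditer(r"(?<=\w)- (?=\w)", text):
        start = max(0, m.start() - context)
        end = min(len(text), m.end() + context)
        yield m.start(), text[start:end]


def replace_selected(text, offsets):
    """Replace 'hyphen, space' with a soft hyphen at the approved offsets
    only. Works from the end so earlier offsets stay valid."""
    for off in sorted(offsets, reverse=True):
        if text[off:off + 2] == "- ":
            text = text[:off] + SOFT_HYPHEN + text[off + 2:]
    return text


sample = "a conver- sion on a seven- day basis"
for offset, snippet in hyphen_space_candidates(sample):
    print(offset, repr(snippet))

# Approve only the first match; "seven- day" is left for manual handling.
fixed = replace_selected(sample, [8])
```

Listing the matches first and replacing only the approved offsets plays the same role as clicking Replace one match at a time instead of Replace All.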

Community Expert, Mar 07, 2020

Try searching for a "hard" hyphen followed by a line break and replacing it with a "soft" hyphen.
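
On plain text, that search-and-replace could look roughly like the Python sketch below. It assumes the soft hyphen is the Unicode character U+00AD, and the function name is just for illustration. Hyphens in the middle of a line are left alone.

```python
import re

SOFT_HYPHEN = "\u00ad"  # Unicode soft hyphen (assumed target character)


def soften_line_end_hyphens(text):
    """Replace a hyphen that is directly followed by a line break (with
    optional spaces or tabs around the break) with a soft hyphen."""
    pattern = r"(?<=\w)-[ \t]*\r?\n[ \t]*(?=\w)"
    return re.sub(pattern, SOFT_HYPHEN, text)


result = soften_line_end_hyphens("hyphen-\nation and seven-day")
print(repr(result))
```

This still turns a genuine hyphen into a soft one when it happens to fall at a line end, so the result needs a quick read-through.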
