Remove hyphenation

New Here ,
Sep 16, 2018

The PDF-to-Word converter doesn't handle line breaks well. Hyphenations at line breaks are kept as hard hyphens. Line breaks that are a result of formatting are kept as hard line breaks. It requires a lot of work in Word afterwards to remove those.

It would be nice to have some options for the conversion process so you could make the conversion a bit more intelligent.

Correct answer by try67 | Most Valuable Participant

The fault is with the application that created the PDF. It probably added the hyphens and line-breaks to it, which it shouldn't have. The export simply maintains what it finds in the PDF.

TOPICS
ExportPDF

Most Valuable Participant ,
Sep 16, 2018

The fault is with the application that created the PDF. It probably added the hyphens and line-breaks to it, which it shouldn't have. The export simply maintains what it finds in the PDF.

Most Valuable Participant ,
Sep 17, 2018

In an untagged PDF there is no mark to say whether a hyphen is hard or soft. They are the same character, so it's down to guesswork. There are no paragraph marks and no soft or hard returns, just the text where it is. It's all guesswork: sometimes Acrobat guesses the way we would like, sometimes not.
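
To illustrate the kind of guess involved, here is a minimal sketch of one possible heuristic. It is not how Acrobat works (nothing in this thread says how Acrobat decides); the function name `join_broken_word` and the `vocabulary` set are hypothetical. It joins the two halves of a word split at a line end only when the joined form is a known word, and otherwise keeps the hyphen.

```python
def join_broken_word(left, right, vocabulary):
    """Guess whether a line-end hyphen was hard or soft.

    left/right are the fragments before and after the hyphen.
    vocabulary is a set of lowercase known words (an assumption:
    a real tool would need a proper dictionary).
    """
    joined = left + right
    if joined.lower() in vocabulary:
        # The joined form is a known word, so treat the hyphen as soft.
        return joined
    # Otherwise assume the hyphen belongs to the word (hard hyphen).
    return left + "-" + right


words = {"conversion", "seven", "day"}
print(join_broken_word("conver", "sion", words))  # joined form is in the set
print(join_broken_word("seven", "day", words))    # "sevenday" is not
```

Any such rule will be wrong some of the time, which is the point made above: the PDF itself does not record the answer.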

Most Valuable Participant ,
Sep 17, 2018

Some PDF creators add line-break characters at the end of lines in a PDF, although it's unnecessary, which causes the output to be incorrect.

Most Valuable Participant ,
Sep 17, 2018

There isn't really such a thing as a line-break character in an untagged PDF. Some PDF creators might nevertheless use CR and/or LF characters in a string, which luckily show as nothing at all in most fonts, but in the PDF they aren't line endings or otherwise special.

New Here ,
Mar 05, 2020

In case anybody is interested, I found a sort of workaround for this problem (and it is definitely a problem). After creating the Word doc, I use Replace (as in Find/Replace) in Word to replace "- " (hyphen space) with an optional hyphen (go to More > Special > Optional Hyphen). This replaces the hard hyphen with an optional (discretionary) hyphen, so the word still breaks at that point but is now one word, and the discretionary hyphenation stays intact for future purposes. You're going to have to hit the Replace button one match at a time so you can see what you're replacing as you go, because there may be legitimate instances (such as "seven-day basis") that you don't want to change and that "Replace All" would also hit, so pay attention. It's the best option I've figured out so far for this problem.
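
For anyone who would rather do this on exported plain text than in Word, the same review-then-replace idea can be sketched in Python. This is only an illustration under stated assumptions: the helper names are made up, the pattern only looks at lowercase ASCII letters, and U+00AD (SOFT HYPHEN) is used as a stand-in for Word's optional hyphen, which may not map one-to-one when the text goes back into Word.

```python
import re

SOFT_HYPHEN = "\u00ad"  # assumption: stands in for Word's optional hyphen

# A hyphen followed by one space, with lowercase letters on both sides,
# for example the "- " in "conver- sion".
CANDIDATE = re.compile(r"(?<=[a-z])- (?=[a-z])")


def find_candidates(text, context=10):
    """Return (position, snippet) pairs so each hit can be reviewed by hand."""
    hits = []
    for match in CANDIDATE.finditer(text):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        hits.append((match.start(), text[start:end]))
    return hits


def replace_approved(text, approved_positions):
    """Replace only the reviewed hits and leave every other hyphen alone."""
    approved = set(approved_positions)

    def swap(match):
        return SOFT_HYPHEN if match.start() in approved else match.group(0)

    return CANDIDATE.sub(swap, text)


sample = "The conver- sion runs on a seven- day basis."
hits = find_candidates(sample)
for position, snippet in hits:
    print(position, repr(snippet))

# Approve only the first hit; "seven- day" keeps its hard hyphen.
fixed = replace_approved(sample, [hits[0][0]])
```

Listing the hits first plays the role of clicking Replace one match at a time: nothing changes until a position has been approved.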

 

 

 

try67
Most Valuable Participant ,
Mar 07, 2020

Try searching for a "hard" hyphen followed by a line-break and replacing it with the "soft" hyphen.
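
On plain text, this narrower search could look roughly like the sketch below. It is only an illustration: the function name is made up, U+00AD (SOFT HYPHEN) is assumed as the "soft" hyphen, and the pattern requires a letter on both sides of the break, so a range such as "10-" at a line end is left alone.

```python
import re

SOFT_HYPHEN = "\u00ad"  # assumption: used here as the "soft" hyphen

# A hyphen at the end of a line, with a letter before it and a letter
# starting the next line. [^\W\d_] matches any letter, including non-ASCII.
LINE_END_HYPHEN = re.compile(r"(?<=[^\W\d_])-[ \t]*\r?\n[ \t]*(?=[^\W\d_])")


def soften_line_end_hyphens(text):
    """Replace each hyphen-plus-line-break with a soft hyphen."""
    return LINE_END_HYPHEN.sub(SOFT_HYPHEN, text)


sample = "The conver-\nsion runs on a seven-day basis, pages 10-\n12."
print(repr(soften_line_end_hyphens(sample)))
```

Because the hyphen must sit directly before a line break, hyphens inside a line such as "seven-day" are not touched. A compound that happened to break at its own hyphen would still be softened, so a quick review of the result is worthwhile.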
