copied text from pdf is pasted without spaces

Guest
Dec 30, 2013

Hi,

I created a FrameMaker book using FM 10 and then generated a PDF from that book.

When I copy the text from the PDF, the copied text is pasted without spaces.

Can anyone help me solve this problem?

Regards

Parinita

TOPICS
Formatting and numbering, PDF output

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Advisor
Dec 30, 2013

I think I've seen this a couple of times, but can't remember what caused it: sorry I can't give you a better answer.

One comment I've seen – different people have identified the same issue with different wording – is

The only thing a PDF reader can see is what letters are approximately on a line. It can't see spaces as a space character, since there is no such thing in a PDF. All it has is smaller and larger gaps between letters. And thanks to kerning or justified text those aren't even consistent. So what PDF readers usually do is to guess what gaps are spaces and what gaps are not. Depending on the algorithm used the results are fairly good or horrible.
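To make the guessing step described in that quote concrete, here is a minimal Python sketch of a gap-based heuristic. Everything in it is an assumption made for illustration: the `Glyph` record, the `place` and `join_glyphs` helpers, and the `gap_ratio` threshold of 0.25 are hypothetical and are not taken from Acrobat or any other PDF reader. It only shows how the same two words can come out with or without a space depending on how wide the gap between them is.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the glyph, in points
    width: float  # advance width of the glyph, in points


def place(words, char_width, word_gap):
    """Lay words out on one line WITHOUT emitting any space glyph.

    Words are separated only by empty horizontal room (word_gap),
    which is the situation described in the quote above.
    """
    glyphs, x = [], 0.0
    for word in words:
        for ch in word:
            glyphs.append(Glyph(ch, x, char_width))
            x += char_width
        x += word_gap
    return glyphs


def join_glyphs(glyphs, font_size, gap_ratio=0.25):
    """Rebuild the line, guessing a space wherever the gap between
    neighbouring glyphs exceeds gap_ratio * font_size (assumed threshold)."""
    out = []
    prev_end = None
    for g in sorted(glyphs, key=lambda g: g.x):
        if prev_end is not None and g.x - prev_end > gap_ratio * font_size:
            out.append(" ")
        out.append(g.char)
        prev_end = g.x + g.width
    return "".join(out)


if __name__ == "__main__":
    words = ["copied", "text"]
    # Gap of 3 pt at 10 pt type is above the 2.5 pt threshold.
    print(join_glyphs(place(words, 5.0, 3.0), font_size=10.0))
    # Gap of 2 pt (tight spacing) is below it, so the words fuse.
    print(join_glyphs(place(words, 5.0, 2.0), font_size=10.0))
```

With the word gap just below the threshold the two words fuse, which is the symptom reported in this thread. Real readers presumably use more elaborate rules, which would explain why results differ from one reader to another.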

There was also one post that suggested PDF/A may give better results.

Of course, if you're the person who created the .pdf from FM 10 files you might get better results copy/pasting from the FM source?

Guest
Dec 30, 2013

Thanks Niels,

I have the source files, so I can always copy and paste the content from them.

The problem is that when a customer copies text from the PDF and pastes it somewhere else, it is pasted without spaces.

I am not able to understand what the problem is??

Community Expert
Dec 30, 2013

> I am not able to understand what the problem is??

The problem is that the PDF may or may not encode the apparent spaces as space characters, particularly at line ends, but also between words and perhaps even between characters. The rendering engine (the PostScript or PDF driver) may have chosen to break "word1 word2" into two strings with two starting coordinates and no U+0020 space character (or any alternative space character) at all.

If you are generating PDFs that must support copy and paste, one workaround might be to:

  • tag those sections with a fixed-pitch font (Courier is apt to be the most reliable),
  • turn off Kerning for the paragraph and/or character tag used,
  • set Spread to 0%,
  • Stretch to 100%,
  • do not use Alignment:Justified,
  • Advanced: Word Spacing 100% 100% 100%, and
  • Automatic Letter Spacing [off].

I've not experimented with any of that. I also have no idea if using any accessibility (Section 508) practices might provide the PDF reader/viewer with more clues for proper plaintext extraction (particularly keeping flows in proper order).

Community Expert
Dec 30, 2013

Yes, it's like that.

Copying from PDF also commonly brings in garbage characters at end of line.

Text copied from PDFs requires careful proof-reading and other touch-ups.

Guest
Jan 21, 2014

I am not a FrameMaker user, but I ran into this problem with some scientific journal articles downloaded in Adobe PDF format. The "copy text to a highlighted Comment" feature in Adobe Acrobat XI (Mac) rendered the text with no spaces. This occurs only with certain PDF files from particular journals. Copying the entire text from the Comment annotation box and pasting it into BBEdit on my Mac (with the Show Invisibles feature on) showed that there were in fact no white space characters at all between the words. This is a big nuisance, since I use this feature for generating annotated summaries.

I did find a fix that works for me in this particular case. I opened the PDF file in question with Preview on the Mac rather than Adobe Acrobat and resaved the file under a new name. Preview rendered the text in the new file such that Adobe Acrobat copies the highlighted text with the expected white space between words. However, Preview removes the existing Bookmarks, and perhaps other metadata that I do not know about, so beware that there may be other differences.

In any event, as remarked elsewhere, the rendering of spaces obviously depends on the PDF reader. Whether this helps in your case, I of course do not know.

It lends support to my axiom that computers allow you to do amazing things, but they do not save you time. My second axiom is that no matter how many times you have performed an operation successfully in the past, there is no guarantee that it will work the next time.

-j
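The manual check described in this post (pasting into an editor and looking for white space characters) can be roughly automated. The following is a minimal Python sketch, not an established tool: the function names and the `min_ratio` and `min_length` thresholds are assumptions chosen for illustration. The input is any string, for example text pasted from the PDF or returned by a library such as pypdf (`PdfReader(path).pages[n].extract_text()`), which is not needed to run the sketch itself.

```python
def space_ratio(text):
    """Return the fraction of characters in text that are whitespace."""
    if not text:
        return 0.0
    return sum(ch.isspace() for ch in text) / len(text)


def looks_space_stripped(text, min_ratio=0.05, min_length=40):
    """Heuristic flag for passages that were probably copied without spaces.

    Ordinary prose contains a noticeable share of whitespace, so a long
    passage with almost none is suspicious. Both thresholds are
    assumptions and may need tuning for other languages or content.
    """
    return len(text) >= min_length and space_ratio(text) < min_ratio


if __name__ == "__main__":
    good = "When I copy the text from the PDF, the copied text is pasted without spaces."
    bad = good.replace(" ", "")  # simulate the symptom from this thread
    print(looks_space_stripped(good))
    print(looks_space_stripped(bad))
```

Running a check like this on each page before sending a PDF to customers could show which files or pages are affected, although it says nothing about the cause.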

New Here
Jul 28, 2017

Yep, opening the PDF in Preview fixed this issue for me!

Explorer
Jan 07, 2016

1. If you want the copied text to come out as continuous text, go to Edit PDF and click Edit.

2. Boxes should appear around the text. You can now copy the text, and the whole passage should be copied without the space that the PDF shows after each line.

I have attached two screenshots showing how to do it.

New Here
Feb 16, 2016

If you convert the PDF to a Word document (.docx) and then save that document as a PDF, the spaces are present when you copy from the new version.

New Here
Oct 03, 2019

[FIX: use Mendeley PDF preview function]


I encountered the same problem with one PDF file (out of some 200 I've copied and pasted content from over the past 2 months).

What solved it for me was opening the mentioned file in Mendeley. Once I paste the same passage to a .docx, the words are separated by spaces like in the PDF.

Judging from the other comments, this kind of logic seems to have worked for others as well, with other PDF readers. Mendeley might not be your best choice as it's a bibliographic organizer, not a PDF reader. 

Cheers!
