Hi,
I created a FrameMaker book in FM 10 and generated a PDF from it.
When I copy text from the PDF and paste it elsewhere, it is pasted without spaces.
Can anyone help me solve this problem?
Regards
Parinita
I think I've seen this a couple of times, but can't remember what caused it: sorry I can't give you a better answer.
One comment I've seen – different people have identified the same issue with different wording – is
The only thing a PDF reader can see is what letters are approximately on a line. It can't see spaces as a space character, since there is no such thing in a PDF. All it has is smaller and larger gaps between letters. And thanks to kerning or justified text those aren't even consistent. So what PDF readers usually do is to guess what gaps are spaces and what gaps are not. Depending on the algorithm used the results are fairly good or horrible.
There was also one post that suggested PDF/A may give better results.
Of course, if you're the person who created the .pdf from FM 10 files you might get better results copy/pasting from the FM source?
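To make the gap-guessing in the quoted comment concrete, here is a minimal Python sketch. It is not how any particular PDF reader works: the `Glyph` class, the `layout` helper, and the `gap_ratio` threshold are all made up for illustration. It only shows that the same glyph positions can come out with or without spaces depending on the threshold the extractor picks.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the glyph on the line
    width: float  # advance width of the glyph


def rebuild_line(glyphs, gap_ratio=0.3):
    """Join glyphs left to right, guessing a space wherever the gap
    between two glyphs exceeds gap_ratio * average glyph width."""
    if not glyphs:
        return ""
    glyphs = sorted(glyphs, key=lambda g: g.x)
    avg_width = sum(g.width for g in glyphs) / len(glyphs)
    out = [glyphs[0].char]
    for prev, cur in zip(glyphs, glyphs[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > gap_ratio * avg_width:
            out.append(" ")
        out.append(cur.char)
    return "".join(out)


def layout(text, char_width=5.0, space_width=2.5):
    """Hypothetical helper: place characters on a line and drop the
    space characters, leaving only gaps (as the quote describes)."""
    glyphs, x = [], 0.0
    for ch in text:
        if ch == " ":
            x += space_width  # a gap only, no glyph is stored
        else:
            glyphs.append(Glyph(ch, x, char_width))
            x += char_width
    return glyphs


print(rebuild_line(layout("copy and paste")))                 # gaps read as spaces
print(rebuild_line(layout("copy and paste"), gap_ratio=0.6))  # gaps read as kerning
```

With the default threshold the gaps are treated as spaces; with the stricter `gap_ratio=0.6` the same gaps are treated as kerning and the words run together, which matches the symptom described in this thread.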
Thanks Niels,
I have the source files, so I can always copy and paste the content from them.
The problem is that when a customer copies text from the PDF and pastes it, the text is pasted without spaces.
I am not able to understand what the problem is??
> I am not able to understand what the problem is??
The problem is that the PDF may or may not have the apparent spaces encoded as space characters, particularly at line ends, but also between words and perhaps even between characters. The rendering engine (PostScript or PDF driver) may have chosen to break "word1 word2" into two strings with two starting coordinates and no U+0020 space character (or any alternative space character) at all.
If you are generating PDFs that must support copy&paste, one workaround might be to:
I've not experimented with any of that. I also have no idea if using any accessibility (Section 508) practices might provide the PDF reader/viewer with more clues for proper plaintext extraction (particularly keeping flows in proper order).
Yes, it's like that.
Copying from PDF also commonly brings in garbage characters at end of line.
Text copied from PDFs requires careful proof-reading and other touch-ups.
I am not a FrameMaker user, but I discovered this problem with some scientific journal articles downloaded in PDF format. The "copy text to a highlighted Comment" feature in Adobe Acrobat XI (Mac) rendered the text with no spaces. This occurs only with certain PDF files from particular journals. Copying the entire text from the Comment annotation box and pasting it into BBEdit on my Mac (with the show invisibles feature turned on) showed that there were in fact no whitespace characters at all between the words. This is a big nuisance, since I use this feature for generating annotated summaries.
I did find a fix that works for me in this particular case. I opened the PDF file in question with Mac Preview rather than Adobe Acrobat and resaved the file under a new name. Preview rendered the text in the new file such that Adobe Acrobat copies the highlighted text with the expected whitespace between words. However, Preview removes the existing Bookmarks and perhaps other metadata that I do not know about, so beware that there may be other differences.
In any event, as remarked elsewhere, the rendering of spaces obviously depends on the PDF reader. Whether this helps in your case, I of course do not know.
This lends support to my axiom that computers allow you to do amazing things, but they do not save you time. My second axiom is that no matter how many times you have performed an operation successfully in the past, there is no guarantee that it will work next time. -j
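For anyone without an editor that shows invisibles, a rough equivalent of that check can be scripted. This is only a sketch: `whitespace_report` is a hypothetical helper name, and it inspects text you have already pasted into a string, not the PDF itself.

```python
import unicodedata


def whitespace_report(text):
    """Count each kind of whitespace character in the pasted text and
    return (counts, length of the longest run without whitespace)."""
    counts = {}
    for ch in text:
        if ch.isspace():
            # Control characters such as newline have no Unicode name,
            # so fall back to the code point.
            name = unicodedata.name(ch, f"U+{ord(ch):04X}")
            counts[name] = counts.get(name, 0) + 1
    longest = max((len(token) for token in text.split()), default=0)
    return counts, longest


pasted = "Textcopiedfromtheproblemfile"
print(whitespace_report(pasted))
```

An empty count together with a very long unbroken run suggests that the copied text really contains no space characters, rather than the spaces being lost later by the application you paste into.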
Yep, opening the PDF in Preview fixed this issue for me!
If you want to copy the PDF text as continuous text:
1. Go to Edit PDF and click Edit.
2. Boxes should appear around the text. You can now copy the text, and the whole text should be copied without the extra space after each line that is shown in the PDF.
I sent two screenshots on how you can do it too.
If you convert the PDF to a Word document (.docx) and then save that as a PDF again, the spaces are present when you copy from the new version.
[FIX: use Mendeley PDF preview function]
I encountered the same problem with one PDF file (out of some 200 I've copied and pasted content from over the past 2 months).
What solved it for me was opening the mentioned file in Mendeley. Once I paste the same passage to a .docx, the words are separated by spaces like in the PDF.
Judging from the other comments, this kind of logic seems to have worked for others as well, with other PDF readers. Mendeley might not be your best choice as it's a bibliographic organizer, not a PDF reader.
Cheers!