How can we preserve space characters when exporting to the PDF format?
The document contains programming code listings, which people should be able to copy and paste into the editor/IDE to run them. But for this to be possible, all of the leading line spaces have to be preserved.
Can you show one screenshot from InDesign with invisible characters showing, plus one screenshot from Adobe Reader where the issue comes up? Maybe the best strategy is to provide form fields with the contents? However, then you would have to format the available text with Acrobat Pro's form field tools.
Thank you for your reply!
This is what it looks like in InDesign:
The font is Consolas (fixed char width) and the text is colored to match the code syntax highlighting in the Sublime Edit.
After exporting to PDF and trying to copy the code, you can see that the spaces are not preserved:
It looks like there is just a single space.
This is what it looks like when we copy it from the PDF file and paste it into Sublime Edit:
This is how it is supposed to look (the bottom part shows the spaces):
Basically, we need these space characters to be there for readers to be able to copy the code and paste it into the code editor.
"Maybe the best strategy is to provide form fields with the contents?" - I'm quite new to InDesign, so I'm not sure what that means. I'll google it a bit.
"However, then you would have to format the available text with Acrobat Pro's form field tools." - We cannot afford to reformat the text. The book has almost 700 pages, and many of the pages contain code inserts colored to match syntax highlighting. Redoing this by hand is not really an option (we rendered this code as HTML; we did not color it by hand). We basically just want to keep the spaces. They are there in InDesign.
How should we go about it?
Thank you in advance.
PDF is ABSOLUTELY unsuited for this and always has been. It is not the right solution for distributing anything with technical meaning using copy/paste. InDesign cannot make PDF suitable, no matter what options you use.
What's your PDF reading software?
Did you try this with tabs instead of blank characters?
FWIW: Even if my suggestion worked in one PDF reading app, it could fail in another.
Using form fields would require Adobe Reader on either Windows 10 or Mac OS X.
It is not about my reader, but what people are going to use to read it, so we can expect about anything. The most popular reader is a web browser (as in my case) since it does not require anything additional to be installed, and everyone has a web browser. The ebook is an addition to a printed book. We have no control over how people are going to read it.
So... it sounds like there's no way to leave the spaces in the text? Why is that (I'm really trying to understand this)? Also, tabs are not an option, the indentation in Python is made by 4 spaces (PEP 8 guidelines).
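To illustrate why the indentation matters so much for Python listings, here is a minimal sketch. The sample listing and the helper names (`collapse_leading_spaces`, `compiles`) are hypothetical, and the helper only imitates the lossy copy/paste described above, where each run of leading spaces ends up as a single space.

```python
import re

# A hypothetical listing, indented with 4 spaces per level as PEP 8 recommends.
ORIGINAL = (
    "def greet(name):\n"
    "    if name:\n"
    "        return 'Hello, ' + name\n"
    "    return 'Hello'\n"
)


def collapse_leading_spaces(source: str) -> str:
    """Imitate a lossy copy/paste: shrink each run of leading spaces to one."""
    lines = [re.sub(r"^ +", " ", line) for line in source.splitlines()]
    return "\n".join(lines) + "\n"


def compiles(source: str) -> bool:
    """Return True if Python accepts the source, False on a syntax error."""
    try:
        compile(source, "<listing>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


PASTED = collapse_leading_spaces(ORIGINAL)
print(compiles(ORIGINAL))  # the listing as typeset
print(compiles(PASTED))    # the listing after the lossy copy/paste
```

The pasted version no longer compiles because the nesting levels can no longer be told apart, so a reader would have to re-indent every listing by hand.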
"Why is that?" Because spaces in text may be replaced with moving the cursor in a PDF. It can be more efficient, and it certainly is a requirement with justified text.
(Your code is not justified – but why create separate output routines for that when you can also use a single routine?)
So the PDF does not contain "one or two or four spaces", it contains text with a smaller or larger move inside. Spaces at the start are simply ignored; because you have to supply a starting point for the text anyway, why not start with the first visible character? It's all about efficiency, and originally the PDF format was designed to correctly represent a design on screen and in print.
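One rough way to picture this: the page stores positioned runs of visible glyphs, and the reading software has to guess where the spaces were when you copy text. The sketch below is a toy model under that assumption, not how any particular reader works; the `Run` type, the `extract_line` helper, and the gap threshold are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Run:
    x: float      # horizontal start position of the run, in points
    text: str     # visible glyphs only; no space characters are stored
    width: float  # advance width of the run, in points


def extract_line(runs: list[Run], space_width: float) -> str:
    """Rebuild a line of text, guessing one space per noticeable gap."""
    out = []
    cursor = None  # right edge of the previous run; None before the first run
    for run in sorted(runs, key=lambda r: r.x):
        if cursor is not None and run.x - cursor > 0.5 * space_width:
            out.append(" ")  # any gap, however wide, becomes a single space
        out.append(run.text)
        cursor = run.x + run.width
    return "".join(out)


# Hypothetical line "        return total" set in a monospaced font whose
# glyphs are 6 pt wide: the indentation exists only as a 48 pt x offset.
line = [
    Run(x=48.0, text="return", width=36.0),
    Run(x=90.0, text="total", width=30.0),
]
print(repr(extract_line(line, space_width=6.0)))
```

In this model the word gap survives as one guessed space, while the indentation is just the starting position of the first run and never makes it into the copied text.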
That said: there are 2 options that may help you, although InDesign does not allow immediate access to these. A PDF may contain an "ActualText" entry for any text fragment, which should be the preferred contents when copying (with the right software – your intended browser usage is actually a bad omen, and Acrobat Pro may possibly perform better even with your current file).
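For reference, the PDF specification attaches "ActualText" to a marked-content sequence (`/Span ... BDC` up to `EMC`) in the page's content stream. The sketch below only builds such a fragment as a string. The helper names are hypothetical, the drawing operators are a made-up placeholder, and it does not edit a real PDF; since InDesign does not expose this, the export would have to be post-processed with some other tool, and whether a given reader honours the entry when copying is not guaranteed.

```python
def pdf_literal(text: str) -> str:
    """Encode ASCII text as a PDF literal string, escaping backslashes and parentheses."""
    if not text.isascii():
        # Non-ASCII text needs a different string encoding; out of scope here.
        raise ValueError("this sketch only handles ASCII text")
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def wrap_with_actual_text(draw_ops: str, actual_text: str) -> str:
    """Wrap drawing operators in a /Span marked-content sequence with ActualText."""
    return (
        f"/Span <</ActualText {pdf_literal(actual_text)}>> BDC\n"
        f"{draw_ops}\n"
        "EMC"
    )


# Placeholder operators that draw the visible glyphs of one listing line.
ops = "(return total) Tj"
fragment = wrap_with_actual_text(ops, "        return total")
print(fragment)
```

The point of the design is that the drawn glyphs stay exactly as exported, while the replacement text carries the eight leading spaces for software that chooses to use it.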
The other option is kind of a guess. What will InDesign do when the space character is actually visible? It should then be required to output that literal character! Normally I would suggest just trying it with my very own IndyFont 🙂 as all it needs is a font with a small dot or dash in the space character. But, off the top of my head, I think IF ignores a custom space because it ought to be a rare use case, and the chances of messing up space handling would be too large.
Still: it might be worth locating a visible-space font elsewhere and just testing to see what happens.
Ok, I kind of understand this, but we're also using the monospaced font and when you use it, you never want to justify it. This, however, does not change the mechanics, you just simply do not justify it.
I understand your entire explanation, but on the other hand a space is a character like any other, just one we call whitespace, and technically it could be retained.
"your intended browser usage is actually a bad omen" - again, it's not my usage, but how people will most likely read it. And it is not bad in my opinion, as we are talking about consumers who want to read. We treat the version for print differently. So Acrobat Pro is not an option: we cannot tell people to use it if they want to read the book.
The font-related solution - but wouldn't InDesign ignore the spaces anyway? It has to be a space character, not another character that looks like a space. So even if we "paint" a space, it is still a space and is going to be treated the same way, right? It cannot be some other character that merely looks like a space, because when you copy and paste it, it must be a space.
Tabs are really no option at all, they do not exist in PDF as a concept. What you see is all there is: the words are what you see. The spacing may be done in any number of ways, one of which is to put a space character, but that's often not the way. You have assumed, as many others have - but wrongly - that PDF is just a wrapper around the text, which will be unaltered.
"The spacing may be done in any number of ways, one of which is to put a space character, but that's often not the way." - it is the only way in the programming world. Well... maybe next to tabs, which are used in some other languages.
"You have assumed, as many others have - but wrongly - that PDF is just a wrapper around the text, which will be unaltered." - ok, but a space is a character like other characters, right? Why is there no option to keep it? We have so many paragraph-level and character-level options to control different behaviours. I don't really see how a space, and the ability to keep it, could be this problematic. Just theorizing here, as I have already learned that what I thought was a simple thing is impossible to do. 🙂
The user would need to use Acrobat or Reader to use the buttons. You can add a message to your PDF that would only appear if the PDF were opened outside of Acrobat or Reader. The message would say something like: This PDF must be opened in Acrobat or the free Adobe Reader for full functionality.
We'll think about it. It is at least a possible solution to our problem.