Hi. Using ID 20.5 under Win 11.
I've posted previously, but want to make sure I'm clear on my question:
Question:
Does InDesign include any way to insert an HTML bookmark containing specific text at a specific location in the InDesign source material prior to HTML output, so that ID outputs the bookmark as HTML at that specific location?
Background:
For example, I want to insert a bookmark such as "custom-bookmark" at the beginning of a specific paragraph -- just like I might insert a PDF bookmark (and so obtain HTML output reading id="custom-bookmark" at that location) so that another web page (NOT UNDER MY CONTROL and not part of my document) can include an <a href="somefile.html#custom-bookmark">go here</a> and display my standalone page.
Here's the use-case: I'm delivering an online HTML version of an existing manual for a client's SaaS app, and the programmers want to call into the docs to display specific content by referring to the specific bookmarks in the manual that I place there. This is a critical requirement. I cannot figure out how to enter such a bookmark*. I'm not talking about doing fancy HTML layout work with an inappropriate tool -- I'm outputting an existing InDesign text (just not as print or PDF).
*Yes, I can enter such a bookmark manually, but only after generating the HTML. There might be 100 bookmarks, but it can be done. (a) Have you seen the HTML code that ID generates? Each word is a separate item. Seems... odd. (b) Once done, if there were any revisions that resulted in new output, all the bookmarks would have to be redone... manually. Not a good approach. That's why I want to insert such bookmarks in the ID source. I have tried using the Bookmarks window and the Hyperlinks window... and while I may be doing something wrong, the HTML output appears to ignore both.
Thanks as always to the community.
-j
I remember seeing your question the last time you posted it - and since I will go a long, long way out of my way to avoid exporting HTML from InDesign, I thought that maybe one of our other regular posters would be able to write up a good answer for you.
(a) Have you seen the HTML code that ID generates?
Yep! Hence my reluctance to use it. Are you exporting "HTML5" or "HTML (Legacy)"?
Each word is a separate item. Seems... odd.
Not my experience. I mean, the "Legacy" HTML output can be dodgy sometimes, but I'm looking at a bunch of perfectly normal <p> tags. The HTML5 output is more full of seemingly spurious <span> tags, but it's not quite so bad as "each word is its own element".
Does InDesign include any way to insert an HTML bookmark containing specific text at a specific location in the InDesign source material prior to HTML output, so that ID outputs the bookmark as HTML at that specific location?
Not without doing some post-export editing. At least, I haven't found a method yet. I see that I can add a Hyperlink Destination of the "Text Anchor" variety, and I can name that destination "custom-bookmark", but upon Legacy HTML export, the custom-bookmark designation is lost:
<a id="_idTextAnchor000" target="_blank"></a>
On the other hand, if I style that text anchor with a unique character style, let's say it's an empty character style called "custom-bookmark" then the resultant output is
<span class="custom-bookmark"><a id="_idTextAnchor000" target="_blank">
and it does this reliably enough that if I were to give each desired hyperlink destination its own uniquely named character style, then I can pop the HTML open in Notepad++ and use a regex to transform all such bookmarks throughout the file into
<span class="custom-bookmark"><a id="custom-bookmark" target="_blank">
Haven't actually tested it yet but the regex would look something like this:
Find: (span class=")(.+?)("><a id=")(_idTextAnchor\d\d\d)(")
Replace All with: $1$2$3$2$5
So no, I can't figure out a way to do it without doing post-export editing of HTML, but it needn't be manual; you could automate almost all of the heavy lifting.
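For anyone who would rather script the Legacy find/replace than run it by hand, here is a minimal Python sketch of the same substitution. It is untested against real InDesign output. The function name `rename_anchor_ids` and the sample markup are illustrative only, and it assumes each text anchor sits directly inside a span whose class is exactly one character style name (a name that is also usable as an HTML id, so no spaces).

```python
import re

# Same idea as the Notepad++ pattern above, with two small assumptions:
# [^"]+ keeps the class capture inside one attribute value, and \d+
# accepts any number of digits after _idTextAnchor.
ANCHOR_RE = re.compile(r'(span class=")([^"]+)("><a id=")(_idTextAnchor\d+)(")')


def rename_anchor_ids(html: str) -> str:
    """Copy the wrapping span's class name into the anchor's id."""
    # Group 2 (the class name) is written twice; group 4 (the generated id) is dropped.
    return ANCHOR_RE.sub(r'\1\2\3\2\5', html)


if __name__ == "__main__":
    # Hypothetical sample modeled on the Legacy output quoted above.
    sample = ('<span class="custom-bookmark">'
              '<a id="_idTextAnchor000" target="_blank"></a></span>')
    print(rename_anchor_ids(sample))
```

Running it over the exported file after every export would mean revisions in InDesign don't require redoing the bookmarks by hand.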
Edit: I can see that the syntax is different in the HTML5 export, but if you're exporting HTML5 and want to pursue this kind of automation, I'm sure that I could cook up another regex find/change operation that would work on HTML5 exports as well.
This is great. I will try it -- but I think that, short of internal tinkering by Adobe, this will be as close to correct as it can be. Post-export processing as you describe is doable; I don't think there will be more than 100 bookmarks required (famous last words) and that empty character style approach should work.
BTW, I am using HTML5 output, and I should have made that plain. I tried the legacy, and it didn't work well. Yes, the words are appropriately grouped within <p> tags, but each word is slathered with code apparently intended to display the text as closely to the designed document as possible. Understandable, I suppose. IMHO, there should be a switch to turn off that behavior -- they do it for ebook output. But it's not my program. Perhaps there is another tool that would work better for this, but ID is what I had.
Thanks again. I will post any results from my experiments.
-j
This process may be a little more complex than I thought, but it may be doable.
<span id="_idTextSpan013" class="_00-x-600-H_cs-nonectata CharOverride-3" style="position:absolute;top:691.59px;left:4857.37px;">molor</span>
It would be possible to do a search-and-replace of some kind to add an id="cs-nonectata" to the collection of all the other stuff packed around that one word. It would probably be best done to the first word of the paragraph -- which I didn't bother to do in this test. Each empty character style might have to be named differently. The ids can't all be named the same or else the outside calling program couldn't specify which one it wanted.
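A rough Python sketch of that search-and-replace, in case it helps. One wrinkle: an HTML element can carry only one id attribute, so this sketch swaps the generated `_idTextSpan###` id for the bookmark name rather than adding a second id. That assumes nothing else in the export depends on the generated id, which is worth checking in the exported CSS and scripts first. The function name `add_bookmark_ids` is made up for illustration.

```python
import re


def add_bookmark_ids(html: str, style_names) -> str:
    """For each character style name, give the first span using it a matching id."""
    for name in style_names:
        # Match a span whose generated id is followed by a class attribute
        # that contains the style name somewhere in its value.
        pattern = re.compile(
            r'<span id="_idTextSpan\d+"( class="[^"]*' + re.escape(name) + r'[^"]*")'
        )
        # count=1: ids must be unique, so only the first matching span is changed.
        html = pattern.sub(
            lambda m, n=name: f'<span id="{n}"{m.group(1)}', html, count=1
        )
    return html


if __name__ == "__main__":
    # The span quoted above from the HTML5 export.
    sample = ('<span id="_idTextSpan013" class="_00-x-600-H_cs-nonectata CharOverride-3" '
              'style="position:absolute;top:691.59px;left:4857.37px;">molor</span>')
    print(add_bookmark_ids(sample, ["cs-nonectata"]))
```

Because only the first match per name is changed, applying the style to the first word of each target paragraph should put the id where the paragraph starts.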
It can be done. Not pleasantly, but it can be done.
Thanks very much for the suggestion. I will mark your answer as right.
-j
All of my experiments were with the HTML Legacy output, which helpfully adds _idTextSpan### to hyperlinks but not to every paragraph. Seems like the HTML5 output isn't quite as helpful - every paragraph has its own _idTextSpan.
Yes, the words are appropriately grouped within <p> tags, but each word is slathered with code apparently intended to display the text as closely to the designed document as possible.
I figure this must be a result of HTML5 output being a repurposing of the code that went into "Publish Online". Also I can imagine that it'd be desirable to have an XSL transform on hand to clean that mess up!
I remain confident that there's some way to make it comparatively pleasant, but I'll admit that the method I've outlined isn't it.
<grin /> It's not just that every paragraph is wrapped in a <span> by ID's HTML5 output. Every WORD is wrapped in a span, and it's a span with a lot of specification. Nonetheless, I think that the string "cs-nonectata" could be searched for and, leaving that untouched, an id="cs-nonectata" could be added. This is a kluge par excellence. It would be so much simpler to insert a bookmark into ID source, and then -- even if the output is... um... muddled -- the id would be present at the correct location.
Forgot totally about XSLT. Maybe that's the after-the-fact automation I should use. This is why they call us technical writers...
Thanks again.
-j
Every WORD is wrapped in a span
I'm not sure why I'm not getting that exact output, but what I'm getting instead is almost as much of a mess.
I'm going to take back what I said about XSLT - maybe all of that extra taggage falls under the category of sleeping dogs, best left undisturbed. Instead, here's a solution that only requires a single post-export regex fix.
1) I made a character style "tagThisWord" that had a character width of 1% and a bright ugly magenta color
2) I typed my fictional programmers' desired HTML bookmark id right into the body text in InDesign
3) I applied the "tagThisWord" style to the bookmark text.
After export, I popped the publication.html open in my preferred text-editor-that-has-regex-support and, for any span that contains the tagThisWord class, I moved the text content of the span into the id of a new empty span, leaving no visible text behind:
Find:
(<span id=".+?".* class=".*tagThisWord.+?>)(.+?)(</span>)
Replace with:
$1$3<span id="$2"></span>
This empties the span that contains the tagThisWord class, and moves the text into the id of a new empty span. My first shot at this method actually removed the _idTextSpan### id and replaced it with your bookmark, but I couldn't decide whether it was better to add more unnecessary unwieldy complexity, or to make changes to the preexisting machine-generated unwieldy complexity, and "just add more stuff!" seems to have won out. A cleaner solution would remove the CSS connected to the tagThisWord character style, but I'm sticking with my stance that it's maybe best to just... let all that stuff lay where it fell?
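If a text editor step is one manual step too many, the same fix can be scripted. This is only a sketch in Python, not something I've run against a real export. It assumes InDesign writes each tagged bookmark as one span with plain text inside (no nested tags), and the name `promote_bookmarks` is just for illustration. The pattern is a slightly stricter variant of the one above, so that it can't run across neighbouring tags on the same line.

```python
import re

TAGGED_SPAN = re.compile(
    r'(<span id="[^"]+"[^>]*\sclass="[^"]*tagThisWord[^"]*"[^>]*>)'  # opening tag
    r'([^<]+)'                                                        # bookmark text
    r'(</span>)'                                                      # closing tag
)


def promote_bookmarks(html: str) -> str:
    """Empty each tagThisWord span and append an empty span whose id is its text."""
    def swap(match):
        bookmark = match.group(2).strip()
        # Keep the original (now empty) span, then add the new id-only span.
        return f'{match.group(1)}{match.group(3)}<span id="{bookmark}"></span>'
    return TAGGED_SPAN.sub(swap, html)


if __name__ == "__main__":
    # Hypothetical sample line; the real export's attributes will differ.
    sample = ('<span id="_idTextSpan021" class="tagThisWord CharOverride-4" '
              'style="position:absolute;top:10px;left:20px;">custom-bookmark</span>')
    print(promote_bookmarks(sample))
```

Spans without the tagThisWord class are left exactly as exported, in keeping with the sleeping-dogs approach.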