Hi:
I was wondering where I could get the official reference for Adobe InDesign Tagged Text. (I've used the Search mechanism, Google, etc., but have had no luck.) A PDF would of course be wonderful! Does anyone have a link to it?
Thanks!
Hi there,
Please check out this help article: Tag content for XML in InDesign.
Hope it helps!
Regards,
Srishti
Is this an official statement, and InDesign will not support importing and exporting tagged text anymore?
"Tagged text" is not XML. Its structure may look like XML, but only (very) superficially, and to the uninitiated only.
The new-and-improved IDML import/export is actually XML, but for simple text import and export, using IDML is massive overkill. Creating and then importing tagged text, in various ways and with various workflows, has been a staple of my workflow for as long as I have used InDesign, even before IDML. It is ideal for quickly adding formatting to large amounts of plain text, and even though it does not support stuff like bookmarks and hyperlinks (not at all, or badly, or at least in too complicated a way), it shines with all plain text formatting.
Thanks shrishtib, but as Jongware correctly notes, tagged text is not XML. As he also correctly notes, tagged text is superior to XML for many purposes (such as mine).
I see that you are an Adobe employee. Is there a link somewhere to documentation for the current version of InDesign Tagged Text? (I'm currently using v13.1.)
Thank you!
Hi all,
You can import and export Tagged Text from InDesign. We've not changed anything recently related to this. Here's a PDF file you can check for more details about InDesign Tagged Text: PDF
Thanks,
Om
Hi. If nothing has changed since IDCS4 (wow!), then this will do nicely. Thanks!
It's a very old tagging system (it dates back to when InDesign was competing with QuarkXPress in the 1990s) and is very little used these days. I've never heard of any changes to it.
Ha, very old but still very fantastic and useful! I wonder why they bother to increment the version number while not changing anything else about it?
90% of every book manuscript I do starts as tagged text (typically out of Word or a database). Doesn't matter if the tagged text is going into ID or Q. I use TT (tagged text) for more than books as well, such as price lists (in conjunction with data merges via Em Software's data merge & TT solutions), posters destined for hospital hallways, etc. I would hate to return to XML for these jobs as it simply isn't as flexible/extensible.
But while Q has updated its TT (slowly over time), it is still way behind what current versions can produce. ID should perhaps update its TT as well to what newer versions are capable of. Heck, both applications need to for that matter.
https://forums.adobe.com/people/MW+Design wrote
90% of every book manuscript I do starts as tagged text (typically out of Word or a database).
Exactly. I have always found that a database-to-tagged-text-to-InDesign workflow is by far better and more efficient than the XML alternative.
Hi all,
I'm currently looking for documentation on the tag cSpecialGlyph, which is not listed in the PDF linked in reply 4.
It seems that at least one tag is missing from the documentation there. I suspect that there are more undocumented tags.
I found one hint about cSpecialGlyph in an Adobe Tech Note from 2012:
Excruciating details about the Adobe Tech Note #5079 update
…There is special syntax for specifying glyphs by CID. The following is used to specify CID+23058:
<cSpecialGlyph:23058><cSpecialGlyph:>
And another hint in an ExtendScript script from about 2009 where in effect the following is written to a temp Tagged Text file:
<cSpecialGlyph:num><0xFFFD><cSpecialGlyph:>
where num is a variable inside a for loop. For every glyph in the loop the same Unicode code point is "applied" (?). Or is this meant as a fallback mechanism in case the special glyph with that number is not available?
Source of the script can be downloaded here:
https://indesignsecrets.com/make-a-font-contact-sheet-in-indesign.php
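For anyone who wants to reproduce the temp file without running the original script, here is a minimal Python sketch of the loop described above. It is not the script's actual code: the function names are made up for illustration, and I am assuming an `<ASCII-WIN>` start tag on the first line and a single space between the glyph runs.

```python
def build_glyph_run(num):
    """Return one cSpecialGlyph run for glyph number `num`.

    The tag is opened with the glyph number, followed by the U+FFFD
    placeholder character, and then closed with an empty value.
    """
    return f"<cSpecialGlyph:{num}><0xFFFD><cSpecialGlyph:>"


def build_tagged_text(glyph_ids, header="<ASCII-WIN>"):
    """Build a small tagged text source with one run per glyph number.

    Assumption: the file starts with an encoding/platform tag such as
    <ASCII-WIN>, and the runs sit in one paragraph separated by spaces.
    """
    runs = " ".join(build_glyph_run(num) for num in glyph_ids)
    return header + "\r\n" + runs + "\r\n"


# Small sample for the first three glyph numbers only.
sample = build_tagged_text(range(1, 4))
print(sample)
```

Writing `sample` to a .txt file and placing it in a text frame should be enough to test the behavior with a handful of glyphs instead of a whole font.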
If <0xFFFD> is meant as a fallback, it does not work as expected.
On the contrary, it leaves the rendered glyphs on a page in a very strange state of existence:
What do I mean by a very strange state?
Three main issues:
1. You cannot use Find/Change with Text or GREP to find this selected character.
What you can find instead is <FFFD> with Text Find or \x{FFFD} with GREP Find.
But, as I said earlier, all visible glyphs (not the blanks) are turned into Unicode code point FFFD.
And Text Find or GREP Find will only find the ones that match the information shown in the Info panel.
2. If you apply a different font to the text the appearance of the glyphs will change.
But in a surprising way!
Just an example. Same document as before. Same text frame. Same story.
The only change is: I applied Minion Pro to the text frame and its text:
3. If I test the contents of the text frame with ExtendScript's contents property, the value is a string in which blank characters and the special character FFFD alternate.
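The third point also explains the first one: if every visible character in the story is the same code point, a text search has nothing to tell the glyphs apart. A small Python sketch of that check follows. The `contents` value is a hypothetical stand-in for what the contents property returns, not data taken from the actual document.

```python
PLACEHOLDER = "\uFFFD"  # the code point reported for every visible glyph


def summarize_contents(contents):
    """Count the visible characters and list the code points they use."""
    visible = [ch for ch in contents if not ch.isspace()]
    return {
        "visible_characters": len(visible),
        "placeholders": visible.count(PLACEHOLDER),
        "distinct_code_points": sorted({f"U+{ord(ch):04X}" for ch in visible}),
    }


# Hypothetical string: five placeholder characters alternating with blanks.
contents = " ".join([PLACEHOLDER] * 5)
summary = summarize_contents(contents)
print(summary)
```

If `distinct_code_points` contains only U+FFFD, the string carries no information about which glyph is which, so Find/Change cannot target a single one.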
If someone likes to have a look into this strange document download it from my Dropbox account as IDML file:
Dropbox - SourceSansPro-Light-CharacterMap-CC-2019.idml - Simplify your life
Regards,
Uwe
Hi Uwe!
I'm leaving for a day of appointments so I don't have time to download/test the document. I did grab the script and had a quick look at it.
I have no idea why that character, the <0xFFFD>, is in the script. In the normal course of things it wouldn't be used in a TT file. It almost seems to be intended both as a test character for whether the file can even be created and as a pseudo placeholder character for the eventual glyph insertion.
But it seems that character would also abut the regular glyph itself once inserted into the TT file. Kinda like what a merge file with a blank field gets. (At least if I'm recalling properly.)
I don't know what the ramifications would be if that character were removed from the script. Maybe it could be replaced with a zero-width space. However, maybe what is happening when selecting a glyph is that ID is also selecting that little invisible bugger too, and so the info is reporting the ID of the "last" selected glyph?
Mike
Hi Mike,
thank you for commenting!
If I do not insert <0xFFFD> with:
<cSpecialGlyph:num><0xFFFD><cSpecialGlyph:>
the temp Tagged Text file will be imported without contents.
Next thing I'm testing:
I will change the script so that the Tagged Text file will not be removed and show some of its contents later.
With a small sample just for one single glyph.
Regards,
Uwe
Hi Mike,
found some explanation about cSpecialGlyph at:
CJKV Information Processing:
Chinese, Japanese, Korean & Vietnamese Computing
by Ken Lunde *
"O'Reilly Media, Inc.", Second Edition 2009
Since this is copyrighted material, I will not quote it here.
It seems there is no official documentation about cSpecialGlyph.
What I will take from this explanation:
I will not work with Tagged Text to do a character map of a given font with InDesign.
Hm. At least not with the cSpecialGlyph tag.
Regards,
Uwe
* Dr Ken Lunde is with Adobe:
CJK Type Blog | CJK Fonts, Character Sets & Encodings. All CJK. #AllOfTheTime.
Uwe, I have used that same script as a basis for a font explorer* and can confirm all you have found out. The document is perfectly usable but somehow the characters are not "real". The reason you find the characters changing with the font, by the way, is because it uses the glyph index instead of the Unicode code point. Compare what you see with the Glyphs panel, sorted by Glyph id, and you'll see it fits.
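To illustrate why the same glyph index shows a different shape after a font change, here is a toy Python sketch. The two glyph orders are invented for the example and do not describe any real font; the only point is that a glyph ID is a position in a font's own glyph list, not a Unicode code point.

```python
# Hypothetical glyph orders: the position in each list is the glyph ID (GID).
FONT_A = [".notdef", "space", "A", "B", "question"]
FONT_B = [".notdef", "space", "exclam", "A", "B"]


def glyph_name(glyph_order, gid):
    """Look up a glyph by its index; fall back to .notdef if out of range."""
    if 0 <= gid < len(glyph_order):
        return glyph_order[gid]
    return ".notdef"


# The same GID resolves to a different glyph in each font.
for gid in (2, 3):
    print(gid, glyph_name(FONT_A, gid), glyph_name(FONT_B, gid))
```

A character stored as a code point would stay "A" in both fonts; a character stored as GID 2 follows whatever each font happens to keep at position 2.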
Since it worked well enough for my purposes I haven't looked beyond this; perhaps you could check what happens when an IDMS snippet contains a raw glyph (i.e., one without a Unicode value, inserted through the Glyphs panel).
*That script goes through an OpenType font, unthreads the GSUB tables, and lists all ligatures and substitutions as their components. Internally, a GSUB works exclusively with raw glyph indexes, so that's why I needed it. (Due to the high working memory requirements, which is a reasonable guess, it only works on the smallest of fonts, though, so no, I won't be posting it. It's also excruciatingly slow. I am still hoping for a good Python interface to solve pretty much all of my high scripting demands...)
Okay, doing that test with an IDMS tells me absolutely nothing.
The contents of regular text get inserted as
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Font">
<Properties>
<AppliedFont type="string">Myriad Pro</AppliedFont>
</Properties>
<Content>test = </Content>
</CharacterStyleRange>
and for an unassigned character you get the same as when using the INX route:
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Font">
<Properties>
<AppliedFont type="string">Calibri</AppliedFont>
<CustomGlyph type="string">$ID/g2155</CustomGlyph>
</Properties>
<Content>�</Content>
</CharacterStyleRange>
-- notice that �? Its code is U+FFFD again, InDesign's personal internal placeholder. Now, what if you change the CustomGlyph value to something that should be a character (in Calibri), like 845, the GID for U+003F "?"? It does not work. Not only do you still get the U+FFFD code again, but, as with INX in the worst-case scenario, you also lose the original GID and so you get the Pink Box of Denial. It works again correctly with a custom CID if – and only if – the character does not have a Unicode assigned to it:
By the way, did anyone else know that you can copy an image off your screen and paste it directly into the Jive Editor? It saves a lot of time otherwise spent saving temporary images such as this one. It was always possible with Mac OS X (Shift+Alt+Cmd+4), and the most recent Windows 10 update has added the shortcut Windows+Shift+S for exactly the same purpose.
https://forums.adobe.com/people/Om+N+Jha wrote
Here's a PDF file you can check for more details about InDesign Tagged Text: PDF
Thank you, Om,
Is this an official Adobe PDF? It looks like one but it’s on this website: https://acdowd-designs.com/. Is there anything on Adobe’s website?
~ Jane
jane-e wrote
…Is this an official Adobe PDF? …
Hi Jane,
yes, this is an official Adobe PDF. Perhaps a little bit outdated.
At least there once was a PDF about Tagged Text that had CS5 in its title instead of CS4 like this one.
I have no high hopes that the tag cSpecialGlyph is explained in that one, because the tag was already available in InDesign CS3.
For all the reasons explained, I would never work with that tag. The results are visually OK, but the "wrong" Unicode code point FFFD will even travel to an exported PDF. So for strictly print-only work, yes, we can use this, but never for distributing digital files like PDFs or InDesign documents.
Here I exported this document done with the script that used Tagged Text import with cSpecialGlyph to PDF/X-4 and used Acrobat's Find feature for character "A". One of the results is rather baffling:
Hi Jongware,
IDMS unfortunately gives no hint about the Unicode code points.
Here is an excerpt of a sample IDMS file:
<Story Self="ud8" AppliedTOCStyle="n" UserText="true" IsEndnoteStory="false" TrackChanges="false" StoryTitle="$ID/" AppliedNamedGrid="n">
<StoryPreference OpticalMarginAlignment="false" OpticalMarginSize="12" FrameType="TextFrameType" StoryOrientation="Horizontal" StoryDirection="LeftToRightDirection" />
<InCopyExportOption IncludeGraphicProxies="true" IncludeAllResources="false" />
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/FontTableStyle">
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
<Properties>
<CustomGlyph type="long">1</CustomGlyph>
</Properties>
<Content>� </Content>
</CharacterStyleRange>
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
<Properties>
<CustomGlyph type="long">2</CustomGlyph>
</Properties>
<Content>� </Content>
</CharacterStyleRange>
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
<Properties>
<CustomGlyph type="long">3</CustomGlyph>
</Properties>
<Content>� </Content>
</CharacterStyleRange>
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
<Properties>
<CustomGlyph type="long">4</CustomGlyph>
</Properties>
<Content>� </Content>
</CharacterStyleRange>
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
<Properties>
<CustomGlyph type="long">5</CustomGlyph>
</Properties>
<Content>� </Content>
</CharacterStyleRange>
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
<Properties>
<CustomGlyph type="long">6</CustomGlyph>
</Properties>
<Content>� </Content>
</CharacterStyleRange>
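The excerpt above can also be checked programmatically. This is only a sketch in Python: the fragment is a shortened, closed-off copy of the first two ranges (the real snippet has more elements around it), the function name is made up, and the placeholder is written as the character reference &#xFFFD; so the code stays plain ASCII.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the first two ranges from the excerpt.
FRAGMENT = """
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/FontTableStyle">
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
    <Properties>
      <CustomGlyph type="long">1</CustomGlyph>
    </Properties>
    <Content>&#xFFFD; </Content>
  </CharacterStyleRange>
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
    <Properties>
      <CustomGlyph type="long">2</CustomGlyph>
    </Properties>
    <Content>&#xFFFD; </Content>
  </CharacterStyleRange>
</ParagraphStyleRange>
"""


def glyph_runs(xml_text):
    """Pair each CustomGlyph value with the code points in its Content."""
    root = ET.fromstring(xml_text.strip())
    runs = []
    for rng in root.iter("CharacterStyleRange"):
        glyph = rng.findtext("Properties/CustomGlyph")
        content = rng.findtext("Content") or ""
        if glyph is not None:
            # Ignore the trailing blank; report only visible characters.
            points = [f"U+{ord(ch):04X}" for ch in content if not ch.isspace()]
            runs.append((glyph, points))
    return runs


for glyph, points in glyph_runs(FRAGMENT):
    print(glyph, points)
```

Each range reports its own CustomGlyph number but the same U+FFFD in Content, which matches the observation that the glyph identity lives only in the property and not in the text itself.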
Regards,
Uwe