[JS/CS6] How can I reach or remove leftover text that's got stuck in the XML Structure

Report · Sep 27, 2013

Sometimes the remainders of deleted text are found at a parent or root level of the XML structure. How can I "find it" from scripting?

Above, a sample image. The root should normally be just "<>", and there can exist hundreds of tagged elements under the root, that must not be removed. The only thing that is to be removed is the text that is showing inside the Root element, since it is not in use anywhere.

To fix it the way I want, I can manually drag the root element to the document, enter the Story Editor (Ctrl+Y) and delete the text. I do it in the Story Editor to avoid deleting all the other child nodes of the root, which easily happens if I just delete the text in the text frame. When I then delete the text frame, after having deleted the top-level text that was "stuck" before, everything looks fine.

I would like to write a script that removes text that has got stuck in the root in the way shown in the image (the root is not placed on the page, so it should be empty in my case).

Setting the contents of the xmlElement to '' results in all sub elements being deleted (not what I want).

Thanks,

Andreas

Report · Oct 01, 2013

1. Remove all the child elements using xmlElement.remove().

2. Empty the root element's content.

3. Untag the root element (see this thread for how to untag the root element).

http://forums.adobe.com/message/4038391

Report · Oct 08, 2013

But wouldn't removing all child elements kind of... remove all child elements?

As I wrote in the question, "there can exist hundreds of tagged elements under the root, that must not be removed. The only thing that is to be removed is the text that is showing inside the Root element."

Report · Oct 08, 2013

This isn't elegant but should work. I can't think of a way to distinguish the 0xFEFFs that "hold" the child elements from other 0xFEFFs you might have in the Root story, so this would leave any of those you had hanging around.

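var root = app.activeDocument.xmlElements[0],
    i, character;
if (root.parentStory.constructor.name === 'XmlStory') {
    // Iterate backwards so removals do not shift the indexes still to be visited.
    for (i = root.xmlContent.characters.length - 1; i >= 0; i--) {
        character = root.xmlContent.characters[i];
        // Keep the 0xFEFF (65279) markers that hold the child elements.
        if (character.contents.charCodeAt(0) !== 65279) {
            character.remove();
        }
    }
}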
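
The filtering idea in the loop above can be tried outside InDesign on a plain array. This is only a sketch: `keepOnlyTagMarkers` and the sample data are hypothetical stand-ins, and a plain array of one-character strings plays the role of `root.xmlContent.characters`.

```javascript
// Plain-JavaScript stand-in for the loop above; no InDesign objects are used.
// `keepOnlyTagMarkers` is a hypothetical helper name, not part of the InDesign DOM.
var TAG_MARKER = "\uFEFF"; // code point 65279, the value tested in the thread

function keepOnlyTagMarkers(chars) {
    // Walk backwards so that removing an item never shifts an index
    // that still has to be visited.
    for (var i = chars.length - 1; i >= 0; i--) {
        if (chars[i] !== TAG_MARKER) {
            chars.splice(i, 1);
        }
    }
    return chars;
}

// Leftover text mixed with two markers that would "hold" child elements.
var sample = ["o", "l", "d", TAG_MARKER, " ", TAG_MARKER, "x"];
var result = keepOnlyTagMarkers(sample.slice());
```

In this sketch only the two marker characters remain in `result`; the original `sample` array is left untouched because a copy is passed in.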

Jeff

Report · Oct 10, 2013

I didn't think of checking the individual characters and using "0xFEFF" to leave the xml elements alone. I was up on the xmlElement level looking for a solution.

This seems to work just fine. It removes all text from the root element, and leaves the children of the root untouched, just as I wanted. No child elements should be removed, so your concern about distinguishing child elements with xml contents from other (empty?) child elements was probably due to a misunderstanding, or to my question being vague.

Thanks!

Report · Oct 10, 2013

"I was up on the xmlElement level looking for a solution."

In a normal XPath implementation, "text()" evaluated from the Root node would get you everything you want to remove here. But

doc.xmlElements[0].evaluateXPathExpression("text()");

gets you the entire story in InDesign.
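
For comparison, here is a rough sketch of what the XPath step "text()" is expected to select: only the text nodes that are direct children of the context node. It uses a plain-object tree rather than InDesign or any XPath library; the node shape and the `directTextNodes` name are assumptions made up for illustration.

```javascript
// Plain-object stand-in for an XML tree; no InDesign or XPath library is involved.
// The node shape ({ type, name, value, children }) is hypothetical.
var rootNode = {
    type: "element",
    name: "Root",
    children: [
        { type: "text", value: "leftover " },
        { type: "element", name: "article", children: [{ type: "text", value: "kept" }] },
        { type: "text", value: "text" }
    ]
};

// Mimics the selection made by "text()": direct text-node children only,
// without descending into child elements.
function directTextNodes(node) {
    var found = [];
    for (var i = 0; i < node.children.length; i++) {
        if (node.children[i].type === "text") {
            found.push(node.children[i].value);
        }
    }
    return found;
}

var leftovers = directTextNodes(rootNode);
```

Here `leftovers` holds the two stray text pieces of the root, while the text inside the `article` child is not selected.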

Report · Mar 22, 2017

Maybe something changed (ID 2015.4 Release 11.4.1.102), but I had to change absqua's code to character.contents.toString().charCodeAt(0); otherwise I get a "character.contents.charCodeAt is not a function" error:

/* REMOVE LEFTOVER TEXT STUCK IN THE XML STRUCTURE
   PROPS: absqua - https://goo.gl/e0drdN
*/
/* console.log - WTFPL licence - https://goo.gl/xUbtH7 */
var console = {
  log: function(msg) {
    $.writeln(msg);
  }
};
var root = app.activeDocument.xmlElements[0],
    i, character;

if (root.parentStory.constructor.name === 'XmlStory') {
  console.log("root.xmlContent.characters.length:" + root.xmlContent.characters.length);
  // Iterate backwards so removals do not shift the indexes still to be visited.
  for (i = root.xmlContent.characters.length - 1; i >= 0; i--) {
    character = root.xmlContent.characters[i];
    console.log(character.contents.toString() + "[" + character.contents.toString().charCodeAt(0) + "]");
    // Keep the 0xFEFF (65279) markers that hold the child elements.
    if (character.contents.toString().charCodeAt(0) !== 65279) {
      character.remove();
    }
  }
}

Report · Mar 23, 2017

I don't think anything changed in InDesign... Probably you just happened to get a hold of a SpecialCharacters enum character.

I can't test this at the moment, but it seems like the comparison could just be

if (character.contents !== "\ufeff") {
...

rather than the charCodeAt test I had in the original post.
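
One way to picture that comparison is a small helper that never calls a string method on a non-string value. This is a sketch that runs outside InDesign: `isTagMarker` is a hypothetical name, and the object below only imitates a SpecialCharacters enumerator value, on the assumption that such a value is not a string.

```javascript
// Hypothetical helper, runnable outside InDesign. In a real script, `contents`
// would be character.contents, which the thread suggests can be either a
// string or a SpecialCharacters enumerator value.
function isTagMarker(contents) {
    // Only a string equal to U+FEFF counts as a marker; any non-string
    // value (assumed here to be an enumerator) is treated as ordinary content.
    return typeof contents === "string" && contents === "\uFEFF";
}

// Stand-ins for the two kinds of values; this object only imitates an enumerator.
var fakeEnumerator = { toString: function () { return "EM_DASH"; } };

var checks = [
    isTagMarker("\uFEFF"),       // marker character
    isTagMarker("a"),            // ordinary text
    isTagMarker(fakeEnumerator)  // non-string contents, no charCodeAt available
];
```

With a check like this, the removal condition in the loop would read `if (!isTagMarker(character.contents))`, which sidesteps the "charCodeAt is not a function" error for non-string contents.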

Jeff

Report · Mar 24, 2017

Hi Jeff,

Your code should work.

I don't know if it's relevant here, but also see this post by Marc Autret:

Re: Search document for SpecialCharacters Enumerator

I tested several comparisons of contents with a selected XML tag marker in InDesign CS6 8.1.0, and also with a selected Note object, which is FEFF as well.

// XML tag mark selected in text frame:
var character = app.selection[0].characters[0];
character.contents === "\ufeff" ; // true
character.contents === "\uFEFF" ; // true
character.contents.charCodeAt(0) === 65279 ; // true
character.contents.charCodeAt(0).toString() === "65279" ; // true
character.contents.charCodeAt(0).toString(16) === "feff" ; // true
character.contents.charCodeAt(0).toString(16).toUpperCase() === "FEFF" ; // true
character.texts[0].contents === "\ufeff" ; // true
character.texts[0].contents === "\uFEFF" ; // true
character.texts[0].contents.charCodeAt(0) === 65279 ; // true
character.texts[0].contents.charCodeAt(0).toString() === "65279" ; // true
character.texts[0].contents.charCodeAt(0).toString(16) === "feff" ; // true
character.texts[0].contents.charCodeAt(0).toString(16).toUpperCase() === "FEFF" ; // true

Regards,
Uwe
