Question

How to determine which column a certain character is in for a text with three columns?

August 31, 2023

I need to extract the index from the main text, and I don't like the indexing function that comes with the system because it's too difficult to use.

The customer wants each index keyword to be followed by the page number and the column it appears in (a, b, or c).

I ran into this today, and I am anxious and exhausted.

Like this:

Overview 168a

Finance 260b

Industry 210c

If I can use a script to determine which column a certain character is in, I can then add a, b, or c to it using regular expressions.

I would appreciate guidance from the experts.

Thank you very much~

Peter Kahrel

Community Expert

Moving goalposts...

> But this can only mention the characters in [], but many do not have [] identification.

It would have been useful if you had mentioned this straight away.

> Character style extraction index needs to be specified.

Which character styles? Be precise and complete.

> Some paragraph styles also need to be extracted.

Again, which?

> Some paragraph styles span an entire paragraph, so there is no need for abc after the page number

Paragraph styles always target entire paragraphs, that's why they're called paragraph styles.

That references to them therefore need no column identifier doesn't make sense to me, but, oh well.

Anyway, you can't expect more in this forum than you got now. Time to hire a script writer and sort out exactly what you want.

dublove (Author)

Legend

Sorry, I have gone too far.

Actually, I only want to mark columns A, B, and C for character or paragraph styles.

I can handle the rest manually.

Peter Kahrel

Community Expert

I am weak. I can't resist indexes. The script below looks for all words wrapped in brackets and stores their locators (page numbers and column ids). That's the easy part. Then it sorts the locators by page number and column id, and sorts the index. That's a bit more involved. Finally, it outputs the index as a single string into a new document.

(function () {
  
  // Set a regex to look for words wrapped in brackets
  
  app.findGrepPreferences = null;
  app.findGrepPreferences.findWhat = '\\[\\K.+?(?=\\])';

  var index = {};
  var index2 = [];
  var stories = app.documents[0].stories.everyItem().getElements();

  // Return an object like this:
  // A term followed by an array of locators,
  // where each locator is an object
  // index[worda]: [{folio: 23, letter: b}]
  // index[wordb]: [{folio: 13, letter: a}, {folio: 15, letter: c}]

  function getIndex (story) {
    var o = {};
    var term;
    var frame;
    var pageNum;
    var columnId;
    
    var terms = story.findGrep();
    for (var i = 0; i < terms.length; i++) {
      try {
        frame = terms[i].parentTextFrames[0];
        pageNum = terms[i].parentTextFrames[0].parentPage.name;
        
        columnId = story.insertionPoints.itemByRange (
          frame.insertionPoints[0].index,
          terms[i].index
        ).textColumns.length-1;
        
        term = terms[i].contents;
        o = {
          folio: pageNum, 
          letter: String.fromCharCode (columnId+97)
        }
        
        // If the term is already in the index,
        // add the page
        if (index[term]) {
          index[term].push(o);
        } else {
          index[term] = [o];
        }
      } catch (_) {
        // Probably found something in overset text
      }
    }
  }

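  // Sort by folio, then by letter.
  // A sort comparator should return a number
  // (negative, zero, or positive), not a boolean,
  // so the letters are compared explicitly.

  function objSort (a, b) {
    return Number(a.folio) - Number(b.folio) ||
      (a.letter < b.letter ? -1 : a.letter > b.letter ? 1 : 0);
  }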

  //------------------------------------------------------
  // Assemble the index

  for (var i = 0; i < stories.length; i++) {
    getIndex (stories[i]);
  }


  // If a term has two or more page references, sort them.
  // Transform the index into an array,
  // and add the term into the elements 
  // so that we can sort the index.

  for (i in index) {
    if (index[i].length > 1) {
      index[i].sort (objSort);
    }
    
    index2.push ({
      term: i,
      pp: index[i],
    });
  }

  // Sort the index by term

  index2.sort (function (a, b) {
    return a.term < b.term ? -1 : a.term > b.term ? 1 : 0;
  });

  // Create output string

  var s = '';
  var j, refs;
  for (i = 0; i < index2.length; i++) {
    s += index2[i].term + ' ';
    refs = index2[i].pp;
    for (j = 0; j < refs.length; j++) {
      s += refs[j].folio + refs[j].letter + ', ';
    }
    s += '\r';
  }

  // Remove line-final commas (ugh)

  s = s.replace(/, (?=\r)/g,'');

  // Now place it somewhere.
  // To be threaded manually

  app.documents.add().textFrames.add ({
    geometricBounds: [0,0,'20cm','20cm'],
    contents: s,
  });

}());
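
The sorting and output steps can be tried outside InDesign with plain objects. What follows is only a minimal sketch, not part of the script above: `compareLocators` and `formatEntry` are hypothetical names, and it assumes page names are Arabic numerals (Roman-numeral folios would need separate handling). Collecting the references in an array and joining them avoids having to strip a trailing comma afterwards.

```javascript
// Hypothetical helpers, written in ES3 style so they also run in ExtendScript.

// Compare two locators by page number first, then by column letter.
// Always returns a number (negative, zero, or positive).
function compareLocators (a, b) {
  var byPage = Number(a.folio) - Number(b.folio);
  if (byPage !== 0) return byPage;
  if (a.letter < b.letter) return -1;
  if (a.letter > b.letter) return 1;
  return 0;
}

// Build one index line from a term and its locators.
// Sorts a copy, so the caller's array is left untouched.
function formatEntry (term, locators) {
  var sorted = locators.slice(0).sort(compareLocators);
  var parts = [];
  for (var i = 0; i < sorted.length; i++) {
    parts.push(sorted[i].folio + sorted[i].letter);
  }
  return term + ' ' + parts.join(', ');
}

var line = formatEntry('Finance', [
  {folio: '15', letter: 'c'},
  {folio: '13', letter: 'b'},
  {folio: '13', letter: 'a'}
]);
// line is "Finance 13a, 13b, 15c"
```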

dublove (Author)

Legend

Thank you.

But this can only mention the characters in [], but many do not have [] identification.

Character style extraction index needs to be specified.

Some paragraph styles also need to be extracted.

Some paragraph styles span an entire paragraph, so there is no need for abc after the page number

Robert at ID-Tasker

Legend

Are you on a PC?

Peter Kahrel

Community Expert

> I don't like the indexing function that comes with the system because it's too difficult to use.

I'd be interested to know why you think it's difficult to use.

What you're after shouldn't be too hard to script: Look for all the words/phrases in brackets and collect the page number they're on and the column index they're in.

Robert at ID-Tasker

Legend

Another way would be to check HorizontalOffset.
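
A rough sketch of how the offset idea could work, with the arithmetic kept separate from InDesign so it can be checked on its own. `columnFromOffset` and `columnLetter` are hypothetical helpers. The sketch assumes equal-width columns, no text-frame inset, an unrotated frame, and that the offset and the frame bounds are measured in the same ruler coordinates; none of that is confirmed for the layout in question.

```javascript
// Hypothetical helper: map a horizontal position to a 0-based column number.
// Assumes equal-width columns, no inset, and an unrotated frame.
//   x           horizontal offset of the character
//   frameLeft   left edge of the text frame
//   frameRight  right edge of the text frame
//   count       number of columns in the frame
//   gutter      space between adjacent columns
function columnFromOffset (x, frameLeft, frameRight, count, gutter) {
  var colWidth = (frameRight - frameLeft - gutter * (count - 1)) / count;
  // One "slot" is a column plus the gutter to its right
  var slot = colWidth + gutter;
  var n = Math.floor((x - frameLeft) / slot);
  // Clamp, in case x lies on the right edge or outside the frame
  if (n < 0) n = 0;
  if (n > count - 1) n = count - 1;
  return n;
}

// 0 -> "a", 1 -> "b", 2 -> "c"
function columnLetter (n) {
  return String.fromCharCode(97 + n);
}

// In InDesign the inputs would presumably be read along these lines:
//   x      = someText.horizontalOffset
//   bounds = frame.geometricBounds            // [top, left, bottom, right]
//   count  = frame.textFramePreferences.textColumnCount
//   gutter = frame.textFramePreferences.textColumnGutter

// Example: frame from x = 15 to x = 195, three columns, gutter of 6
var letter = columnLetter(columnFromOffset(150, 15, 195, 3, 6));
// letter is "c"
```

Unlike counting text columns, this depends on the page geometry, so it would also need care with facing pages and with span or split paragraphs.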

dublove (Author)

Legend

@m1b ~ Hello, could you also help me take a look at this

--------------------

File is here

--------------------

I want to use the directory index and need to know which column 【　】 is in (a or b or c)

For example, this:

Overview 166a

New York 332c

At the moment I manually apply the corresponding character style (】A, 】B, or 】C), and after generating the directory I use regular expressions to mark the entries as a, b, or c.

This is very inefficient.

What I urgently need is a good way to apply the corresponding style quickly.

I cannot insert characters, as this would cause version changes.

(We need to apply a style to the last character of the target for better control in the later stage)

Robert at ID-Tasker

Legend

You won't get column info from the style.

The easiest and most efficient way would be to create a custom script for that...

Laubender

Community Expert

Hi @dublove ,

can you show a typical spread of your layout?

( Hidden characters visible, frame edges visible. One text frame selected so we can see how it is threaded. )

Are there any split or span paragraphs?

Are there any tables that contain text that should be indexed with a, b or c for the specific column?

Oh. And this question is really important to answer:

Does this layout use text frames with three columns?

Or is the text running through three different text frames per page?

Regards,
Uwe Laubender
( Adobe Community Expert )

Peter Kahrel

Community Expert

This one may well be slower, no idea, but it certainly is less typing!

frame = app.selection[0].parentTextFrames[0];

frame.parentStory.insertionPoints.itemByRange (
  frame.insertionPoints[0].index,
  app.selection[0].index
).textColumns.length-1;

Of course, it ignores spans and splits.

P.

Marc Autret

Legend

Thanks Peter, much better solution! itemByRange(x1, x2).textColumns.length is a great trick.

Best,

Marc

Marc Autret

Legend

Hi @dublove,

Given a Text instance (e.g. a word, or some selection), you can use its textColumns[0] property to identify the TextColumn it belongs to. Then, relative to the parent text frame, you need to retrieve the corresponding index to get the column number. (Maybe there is a faster method, but I'm not aware of an intrinsic property of the TextColumn object that would tell its number.)

Here is some basic code for testing the idea:

function getColPos(/*Text*/tx,  a,x,i)
//----------------------------------
// Return the column position (0-based) of a Text instance.
// => uint [OK]  |  -1 [KO]
{
   // Array of column indices.
   // ---
   a = tx.parentTextFrames;
   if( !( a && a.length ) ) return -1;
   a = a[0].textColumns.everyItem().index;

   // Particular column index of `tx`.
   // (-1 deals with overset text.)
   // ---
   x = tx.textColumns[0];
   x = x.isValid ? x.index : -1;

   // Find the column number i (0-based).
   // ---
   for( i=a.length ; i-- && x != a[i] ; );
   return i;
}

// TEST (Assuming a word is selected.)
// =======================================
var tx = app.selection[0];

var colNum = 1 + getColPos(tx);
var sample = "`" + tx.contents + "`";
if( colNum )
   alert( sample + " found in column #" + colNum );
else
   alert( sample + " is not visible" );

Best,

Marc

TᴀW

Legend

Excellent script.

(You don't use x !== a[i]? I've been trying (misguidedly?) to get myself into the habit of using === as much as possible.)

Not that it's relevant to the OP's question so much, but in the case of a frame with span and split columns, the script will give possibly counterintuitive answers since InDesign counts each span and split as a separate column.

I've often tried to puzzle out how to get the index of the "parent" column – the column in which the span or split paragraph resides, but it's been too much of a brain-twister for me: there just seem to be too many permutations because splits can themselves have spans and splits...
