How to use scripts or regexes to remove duplicate entries when creating an index?

Advocate, Aug 31, 2023

As shown in the figure below, I want to convert A into C, possibly by way of B.

It may not be possible to get from A to C in one step, but it should be possible to get from A to B in one step.

How can I do this with a regular expression (GREP) or a script?

The key question is how to find the duplicate 【】 entries.

ABC.jpg

Thank you.

TOPICS
Bug, Feature request, How to, Print, Scripting

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert, Aug 31, 2023

Hi @dublove, hope you are well. You should really attach a sample .indd whenever possible. Then people can check the file rather than guessing exactly how it is set up. (But make sure the demo is a good enough representation of the real document such that solving it in the demo will solve it in the real document.)

- Mark

Advocate, Sep 01, 2023

The ID file is here

Thank you.

By the way, where can I upload files for free and for longer storage?

666.jpg

Community Expert, Sep 01, 2023

Something like this could work. You could then use the built-in Sort Paragraphs script to sort the result. The new text frame is added in the upper-left corner of the page that contains the text frame you select. The paragraph styling could probably be solved with a negative indent or by reconfiguring the tabs. The page numbers might need to be sorted with a custom object array.


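var main = function() {
    var o = {}; // maps each entry name to the list of its page references
    var sel, ps;
    try {
        sel = app.selection[0];
        ps = sel.parentStory;
    } catch (e) {
        alert("Select a text frame and try again.");
        return;
    }
    var pars = ps.paragraphs;
    var n = pars.length;
    var sp;
    for (var i = 0; i < n; i++) {
        // Drop the paragraph return, then split the name from the page reference at the tab
        sp = pars[i].contents.replace(/\r|\n/g, "").split("\t");
        if (sp.length < 2) {
            continue; // no tab in this paragraph, so there is nothing to group
        }
        if (!o.hasOwnProperty(sp[0])) {
            o[sp[0]] = [];
        }
        o[sp[0]].push(sp[1]);
    }
    // One paragraph per name, with its page references joined by em spaces
    var out = "";
    for (var p in o) {
        out += p + "\t" + o[p].join("\u2003") + "\r";
    }
    var tf = sel.parentPage.textFrames.add();
    tf.parentStory.contents = out;
};
main();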
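
The grouping step in the script above does not depend on InDesign, so it can be tried on a few sample lines before running anything on a real document. The following is only a minimal sketch of that step, not the poster's script: the function name mergeEntries and the sample entries are made up for illustration, and it assumes every line has the form name, tab, page reference.

```javascript
// Merge index lines of the form "name<TAB>page" so that each name appears once.
// Pure string logic: no InDesign objects are involved.
function mergeEntries(lines) {
    var groups = {}; // name -> array of page references
    var order = [];  // names in the order they were first seen
    for (var i = 0; i < lines.length; i++) {
        var parts = lines[i].replace(/[\r\n]/g, "").split("\t");
        if (parts.length < 2) {
            continue; // skip lines without a tab
        }
        var name = parts[0];
        if (!Object.prototype.hasOwnProperty.call(groups, name)) {
            groups[name] = [];
            order.push(name);
        }
        groups[name].push(parts[1]);
    }
    var out = [];
    for (var j = 0; j < order.length; j++) {
        // Join the references of one name with em spaces, as the script does
        out.push(order[j] + "\t" + groups[order[j]].join("\u2003"));
    }
    return out;
}

// Hypothetical sample data
var merged = mergeEntries(["Apple\t12a\r", "Pear\t3b\r", "Apple\t40c\r"]);
```

With the sample above, the two Apple lines collapse into a single entry that carries both references, and Pear is left as it was. The result keeps first-seen order, so sorting is still a separate step.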

 

 

Advocate, Jan 24, 2024

I almost forgot about this post.

There is an error in this script.

Sorry, I didn't notice and opened a new post:

https://community.adobe.com/t5/indesign-discussions/can-regularization-automatically-detect-duplicat...

If the duplicates can be removed with a regular expression, that would be fine too.

The alignment would then need to be done manually.

Community Expert, Aug 31, 2023

Are you not using the built-in Index tool? It solves this problem for you. Otherwise, a couple of rounds of GREP find/change could work.

If not, some variation of this regex could help: https://community.adobe.com/t5/indesign-discussions/grep-for-duplicate-lines-and-then-replacing-it/m...

Brighter GREP minds than mine could probably help sort it out.
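
As a rough illustration of what a duplicate-line regex looks like, here is a minimal sketch in plain JavaScript. It is not the expression from the linked thread. It assumes paragraphs end with a carriage return, as InDesign stories do, and it only removes duplicates that sit directly next to each other, so the list has to be sorted first. The function name removeAdjacentDuplicates and the sample text are made up for illustration.

```javascript
// Collapse runs of identical adjacent paragraphs into a single paragraph.
function removeAdjacentDuplicates(text) {
    // ^(.+\r) captures one whole paragraph including its return;
    // \1+ then matches one or more exact copies that follow immediately.
    return text.replace(/^(.+\r)\1+/gm, "$1");
}

// Hypothetical sample: the Apple paragraph appears twice in a row
var cleaned = removeAdjacentDuplicates("Apple\t12\rApple\t12\rPear\t3\r");
```

The same pattern should carry over to the GREP tab of Find/Change as find `^(.+\r)\1+` and change to `$1`, but that is worth checking on a copy of the document first. A last paragraph with no return after it will not be matched as a duplicate.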

 

 

Community Expert, Sep 02, 2023

"Are you not using the built-in Index tool?"

InDesign's index function has many shortcomings, two of which are relevant here. First, you can't have page references with suffixed letters (though that would in itself be relatively easy to script). Second, a topic can't have two references to the same page, not even optionally.

Therefore the following approach is not possible:

1. Create three character styles: 1, 2, and 3.

2. Write a script that creates page references at the words wrapped in brackets and sets the page number style override, matching each item's column number with a character style.

3. Generate the index.

4. In the generated index, look for numbers in character style 1 and add 'a' to the page reference; look for numbers in character style 2 and add 'b'; and so on.
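
The index feature rules this route out, but the suffix rule in step 4 is simple on its own. A minimal sketch, assuming columns are numbered from 1 and map to 'a', 'b', 'c' in order; the function name addColumnSuffix is hypothetical and nothing here touches InDesign's index.

```javascript
// Turn a page number and a column number into a suffixed reference:
// column 1 -> "a", column 2 -> "b", column 3 -> "c", and so on.
function addColumnSuffix(pageNumber, column) {
    if (column < 1 || column > 26) {
        throw new Error("column must be between 1 and 26");
    }
    // 97 is the character code of "a", so 96 + 1 gives "a"
    return String(pageNumber) + String.fromCharCode(96 + column);
}

var ref = addColumnSuffix(12, 1); // "12a"
```

A script that builds the list itself, rather than going through the index feature, could apply this to each bracketed word it finds.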

Advocate, Sep 02, 2023

I tried their regularization and couldn't seem to find anything

 

Your script would not run either.

Community Expert, Sep 02, 2023

"I tried their regularization and couldn't seem to find anything"

 

What do you mean by this?

Community Expert, Sep 02, 2023

What is separating the name from the number; is it a tab or something else? 

 

Advocate, Sep 03, 2023

The separator between them is a right indent tab (~y in GREP).

I didn't understand the regular expressions for finding duplicates.
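
A script that splits each paragraph only at an ordinary tab will find nothing to split when the separator is a right indent tab. Below is a minimal sketch of a split that accepts either character. It rests on an assumption that should be verified in the actual document: that the right indent tab shows up in a paragraph's contents as the character U+0008. The names SEPARATOR and splitEntry, and the sample strings, are made up for illustration.

```javascript
// ASSUMPTION: a right indent tab is stored as U+0008 in the paragraph text.
// An ordinary tab (\t) is accepted as well.
var SEPARATOR = /[\t\u0008]/;

// Split one paragraph into its name and page reference.
// Returns null when the paragraph contains no separator.
function splitEntry(paragraphText) {
    var text = paragraphText.replace(/[\r\n]/g, "");
    var pos = text.search(SEPARATOR);
    if (pos === -1) {
        return null;
    }
    return { name: text.slice(0, pos), ref: text.slice(pos + 1) };
}

// Hypothetical sample paragraph using the assumed right indent tab character
var entry = splitEntry("Apple\u000812a\r");
```

Once every paragraph is split this way, entries with the same name can be grouped, and the duplicates are the names that occur more than once.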
