Skip to main content
Participating Frequently
March 10, 2023
Answered

How to merge duplicate values in array in InDesign?

  • March 10, 2023
  • 2 replies
  • 2774 views

Hi everyone

I am looking for some help, I am new to scripting but I have managed to create a basic script to collect the infomation from a series of tables into one large table with the help of this community, see snippit of code below:


var myDoc = app.activeDocument;
var myFileName = myDoc.name.substr(0,myDoc.name.lastIndexOf("."));
var myTablesArray = new Array;
myTablesArray = myDoc.stories.everyItem().tables.everyItem().getElements();
var numOfHeaderRows = 1;
var numOfBodyRows = myTablesArray.length;
var contentsArray = myTablesArray[0].columns[0].contents;
for (var n=0;n<myTablesArray.length; n++)
{
    contentsArray = contentsArray.concat(myTablesArray[n].columns[1].contents);
};

 

In some of the tables I have duplicate values:

 

Filename, File 1.                                     
Code, 1234
Dimensions, 798 (w) x 798 (h)
Material, Card
Notes, Gloss
Quantity,1
Elevation, Bay 1
Approved, Yes

 

Filename, File 1.                                     
Code, 1234
Dimensions, 798 (w) x 798 (h)
Material, Card
Notes, Gloss
Quantity, 3
Elevation, Bay 2
Approved, Yes

 

Is it possible to merge the duplicates in each column by sorting the array based on the filename and sum the quantity in row 6. Any help or advice would be greatly appriciated!

Thanks

Vincent

<Title renamed by moderator>

 

This topic has been closed for replies.
Correct answer nicosh

hi @Vincent@GYUWorks 

My solution it’s totally different.

I process all the data with arrays, the final goal is to get a final array to populate the final table, so:

1. Populate an array with the columns[1] of every table.

2. Fill an array with all the filenames (file1.pdf, file2.pdf…)

3. Sort and delete duplicates of this array.

4. Group the colums[1] by filename in other array.

5. Do all the process with this last array.

6. Add header content to the final array.

7. Add the final table.

 

Considerations:

. I’m not an expert in scripting, I would say the opposite. So the script, surely, it can be improve to have less lines of code and more efficient. For example, there is a shorter way to delete duplicates in an array ( var newArray = […new Set(array)] ) but due to the next point I couldn’t test it.

 

. I have work with inDesign 2020 (15.1). In this version there are some array methods that doesn’t work. I have found an old thread in this community and a link that solve this problem. That’s why the first lines of my code some methods are implemented in array.prototype.

 

. I don’t know if you delete this lines of ‘prototype’ the script still works.

 

. I’m Spanish, my English is poor.

 

This is my code

//Array methods solution for older versions of indesign
//https://community.adobe.com/t5/indesign-discussions/indexof-is-not-a-function/m-p/9050746
//https://www.weepee.de/forum/indesign/viewtopic.php?f=10&t=56

Array.prototype.indexOf = function(/*any value*/ value, /*int*/startPosition) {
    var l = this.length;
    if (startPosition == null) startPosition=0;
    if (startPosition < 0) startPosition=l+startPosition;
    if (startPosition < 0) startPosition = 0;
    for (var i = startPosition; i<l; i++)
        if (this[i]==value)return i;
    return -1;
  }
  
Array.prototype.filter = function (functionName){
    var a = new Array (this.length);
    var l = this.length;
    var c = -1;
    for (var i = 0; i < l; i++) {
      if(functionName(this[i], i, this))
        a[++c] = this[i];
    }
    a.splice(c+1);
    return a;
}



var myDoc = app.activeDocument;
var myFileName = myDoc.name.substr(0,myDoc.name.lastIndexOf("."));
var myTablesArray = new Array;
myTablesArray = myDoc.stories.everyItem().tables.everyItem().getElements();
var header = myTablesArray[0].columns[0].contents;

//Array with all the table.columns[1]
contents_arr = [];

//Array with all the names of table.columns[#].cells[0]
//filename row
allFiles = [];

//allFiles array but sorted and deleted duplicates
files = [];

//Array with all the table.columns[1] grouped by file name
contentsGrouped_arr = [];

//Final array with the data to populate table
finalTbl_arr = [];

Main()

function Main() {
    //Populate array with every col 1
    populateArrayCol(myTablesArray);

    //Get every filename row content (File1.pdf, File2.pdf...)
    getFilesNames(contents_arr);

    //Delete duplicates and sort
    files = delDup(allFiles);

    //Group all file#.pdf contents
    colOrg(contents_arr);

    //Obtain final table
    finalTbl(contentsGrouped_arr);

    //Add header content to final table
    finalTbl_arr.splice(0, 0, header);

    //Add table to text frame
    addTable();    
}

function populateArrayCol(table_arr) {
    for (var i = 0; i < table_arr.length; i++) {
        var col = table_arr[i].columns[1].contents;
        contents_arr.push(col);
    }
}

function getFilesNames(temp_arr) {
    for (var i = 0; i < temp_arr.length; i++) {
        allFiles.push(temp_arr[i][0]);
    }
}

function delDup(temp_arr){
    tempSort_arr = temp_arr.sort();
    final_arr = tempSort_arr.filter(function(element, index) {
        return tempSort_arr.indexOf(element) === index;
    }); 
    return final_arr;  
}

 

function colOrg(table) {
    var temp_arr = [];
    if (table.length > 0) {
        for (var i = 0; i < files.length; i++) {
            var fileName = files[i];
            var fileNameRegEx = new RegExp(fileName);
            for (k = 0; k < table.length; k++) {
                var temp = table[k][0].match(fileNameRegEx);
                if (temp != null) {
                    temp_arr.push(table[k]);
                }              
            }
            contentsGrouped_arr.push(temp_arr);
            temp_arr = [];
        }
    }
}

function finalTbl(finalTbl_param) {
    var quantity = 0
    var elevationTemp_arr = [];
    var elevation_arr = [];
    for (var i = 0; i < finalTbl_param.length; i++) {
        if (finalTbl_param[i].length > 1){
            for (var k = 0; k < finalTbl_param[i].length; k++) {
                var qtyTemp = Number(finalTbl_param[i][k][5]);
                var file_temp = finalTbl_param[i][k][0];
                quantity = quantity + qtyTemp;
                elevationTemp_arr.push(finalTbl_param[i][k][6]);
            }
            elevation_arr = delDup(elevationTemp_arr);
            finalTbl_param[i][0][5] = quantity.toString();
            finalTbl_param[i][0][6] = elevation_arr.toString();
            var removeCols = finalTbl_param[i].length - 1;
            finalTbl_param[i].splice(1, removeCols);
            finalTbl_arr.push(finalTbl_param[i][0]);
            quantity = 0;
            elevationTemp_arr = [];
        } else {
            finalTbl_arr.push(finalTbl_param[i][0]);
        }
    }
}


function addTable() {
    var selection = app.selection[0];
    var props = {
        headerRowCount: 1,
        columnCount: finalTbl_arr[0].length,
        bodyRowCount: (finalTbl_arr.length - 1)
    }
    var finalTbl = selection.insertionPoints[0].tables.add(undefined, undefined, props);
    var headerRow = finalTbl.rows[0];
    headerRow.contents = finalTbl_arr[0];
    for (var i = 1; i < finalTbl_arr.length; i++) {
        finalTbl.rows[i].contents = finalTbl_arr[i];        
    }    
}

2 replies

nicoshCorrect answer
Inspiring
March 14, 2023

hi @Vincent@GYUWorks 

My solution it’s totally different.

I process all the data with arrays, the final goal is to get a final array to populate the final table, so:

1. Populate an array with the columns[1] of every table.

2. Fill an array with all the filenames (file1.pdf, file2.pdf…)

3. Sort and delete duplicates of this array.

4. Group the colums[1] by filename in other array.

5. Do all the process with this last array.

6. Add header content to the final array.

7. Add the final table.

 

Considerations:

. I’m not an expert in scripting, I would say the opposite. So the script, surely, it can be improve to have less lines of code and more efficient. For example, there is a shorter way to delete duplicates in an array ( var newArray = […new Set(array)] ) but due to the next point I couldn’t test it.

 

. I have work with inDesign 2020 (15.1). In this version there are some array methods that doesn’t work. I have found an old thread in this community and a link that solve this problem. That’s why the first lines of my code some methods are implemented in array.prototype.

 

. I don’t know if you delete this lines of ‘prototype’ the script still works.

 

. I’m Spanish, my English is poor.

 

This is my code

//Array methods solution for older versions of indesign
//https://community.adobe.com/t5/indesign-discussions/indexof-is-not-a-function/m-p/9050746
//https://www.weepee.de/forum/indesign/viewtopic.php?f=10&t=56

Array.prototype.indexOf = function(/*any value*/ value, /*int*/startPosition) {
    var l = this.length;
    if (startPosition == null) startPosition=0;
    if (startPosition < 0) startPosition=l+startPosition;
    if (startPosition < 0) startPosition = 0;
    for (var i = startPosition; i<l; i++)
        if (this[i]==value)return i;
    return -1;
  }
  
Array.prototype.filter = function (functionName){
    var a = new Array (this.length);
    var l = this.length;
    var c = -1;
    for (var i = 0; i < l; i++) {
      if(functionName(this[i], i, this))
        a[++c] = this[i];
    }
    a.splice(c+1);
    return a;
}



var myDoc = app.activeDocument;
var myFileName = myDoc.name.substr(0,myDoc.name.lastIndexOf("."));
var myTablesArray = new Array;
myTablesArray = myDoc.stories.everyItem().tables.everyItem().getElements();
var header = myTablesArray[0].columns[0].contents;

//Array with all the table.columns[1]
contents_arr = [];

//Array with all the names of table.columns[#].cells[0]
//filename row
allFiles = [];

//allFiles array but sorted and deleted duplicates
files = [];

//Array with all the table.columns[1] grouped by file name
contentsGrouped_arr = [];

//Final array with the data to populate table
finalTbl_arr = [];

Main()

function Main() {
    //Populate array with every col 1
    populateArrayCol(myTablesArray);

    //Get every filename row content (File1.pdf, File2.pdf...)
    getFilesNames(contents_arr);

    //Delete duplicates and sort
    files = delDup(allFiles);

    //Group all file#.pdf contents
    colOrg(contents_arr);

    //Obtain final table
    finalTbl(contentsGrouped_arr);

    //Add header content to final table
    finalTbl_arr.splice(0, 0, header);

    //Add table to text frame
    addTable();    
}

function populateArrayCol(table_arr) {
    for (var i = 0; i < table_arr.length; i++) {
        var col = table_arr[i].columns[1].contents;
        contents_arr.push(col);
    }
}

function getFilesNames(temp_arr) {
    for (var i = 0; i < temp_arr.length; i++) {
        allFiles.push(temp_arr[i][0]);
    }
}

function delDup(temp_arr){
    tempSort_arr = temp_arr.sort();
    final_arr = tempSort_arr.filter(function(element, index) {
        return tempSort_arr.indexOf(element) === index;
    }); 
    return final_arr;  
}

 

function colOrg(table) {
    var temp_arr = [];
    if (table.length > 0) {
        for (var i = 0; i < files.length; i++) {
            var fileName = files[i];
            var fileNameRegEx = new RegExp(fileName);
            for (k = 0; k < table.length; k++) {
                var temp = table[k][0].match(fileNameRegEx);
                if (temp != null) {
                    temp_arr.push(table[k]);
                }              
            }
            contentsGrouped_arr.push(temp_arr);
            temp_arr = [];
        }
    }
}

function finalTbl(finalTbl_param) {
    var quantity = 0
    var elevationTemp_arr = [];
    var elevation_arr = [];
    for (var i = 0; i < finalTbl_param.length; i++) {
        if (finalTbl_param[i].length > 1){
            for (var k = 0; k < finalTbl_param[i].length; k++) {
                var qtyTemp = Number(finalTbl_param[i][k][5]);
                var file_temp = finalTbl_param[i][k][0];
                quantity = quantity + qtyTemp;
                elevationTemp_arr.push(finalTbl_param[i][k][6]);
            }
            elevation_arr = delDup(elevationTemp_arr);
            finalTbl_param[i][0][5] = quantity.toString();
            finalTbl_param[i][0][6] = elevation_arr.toString();
            var removeCols = finalTbl_param[i].length - 1;
            finalTbl_param[i].splice(1, removeCols);
            finalTbl_arr.push(finalTbl_param[i][0]);
            quantity = 0;
            elevationTemp_arr = [];
        } else {
            finalTbl_arr.push(finalTbl_param[i][0]);
        }
    }
}


function addTable() {
    var selection = app.selection[0];
    var props = {
        headerRowCount: 1,
        columnCount: finalTbl_arr[0].length,
        bodyRowCount: (finalTbl_arr.length - 1)
    }
    var finalTbl = selection.insertionPoints[0].tables.add(undefined, undefined, props);
    var headerRow = finalTbl.rows[0];
    headerRow.contents = finalTbl_arr[0];
    for (var i = 1; i < finalTbl_arr.length; i++) {
        finalTbl.rows[i].contents = finalTbl_arr[i];        
    }    
}
brian_p_dts
Community Expert
Community Expert
March 10, 2023

A few different way you can roll it. I'd roll it like this, assuming all the tables follow this six-row format: 

 

 

var myDoc = app.activeDocument;
var myFileName = myDoc.name.substr(0,myDoc.name.lastIndexOf("."));
var myTablesArray = new Array;
myTablesArray = myDoc.stories.everyItem().tables.everyItem().getElements();
var numOfHeaderRows = 1;
var numOfBodyRows = myTablesArray.length;
var contentsArray = myTablesArray[0].columns[0].contents;
//let's make an object!
var o = {};
var fn, qty;
for (var n=0;n<myTablesArray.length; n++) {
    //get just the filename from the 0 index of your contents array
    fn = myTablesArray[n].columns[1].contents[0];
    //parse the 5th index of contents into number by using Regular Expression to fine any number of digits in the line
    qty = Number(myTablesArray[n].columns[1].contents[5].match(/\d+/gi)[0]);
    //if the object doesn't have a filename property, create it, and set quant to 0
    if (!o.hasOwnProperty(fn)) {
        o[fn] = 0;
    }
    //then add your number
    o[fn] += qty;
};
//output your key/value pairs
for (var f in o) {
   $.writeln(f + ": Qty >> " + o[f];
}

 

 

Edit: Or are the dupe values in the same table? If so, my method above won't work, but maybe it steers you on the right path. You probably have to do some more parsing of your columns[1] contents values by row to get something right. 

Participating Frequently
March 10, 2023

Hi @brian_p_dts , thanks for the response,

When I run the code I get an error on the following line: Error 21. null is not an object.

qty = Number(myTablesArray[n].columns[1].contents[5].match(/\d+/gi)[0]);

I have included the script and some sample files if that helps

Thanks once again

Vincent

rob day
Community Expert
Community Expert
March 10, 2023

No problem, thanks for all your help!


Hi @Vincent@GYUWorks , Does @brian_p_dts ’s code help you with removing the duplicates? You could add the new table using your existing code, and clean it with this function:

 

 

 

//call the cleanTable function after you add the contents to the table
myMasterTable.contents = contentsArray;
cleanTable(myMasterTable, 5)

/**
* Removes duplicate rows and updates column content
* @ param the table to clean 
* @ param the column number to update
* @ return void 
*/
function cleanTable(t, col){
    var ra = t.rows.everyItem().getElements();
    var fn, cnt, nn;
    var dupes = [];
    for (var i = 0; i < ra.length; i++){
        cnt = 0
        nn = 0
        fn = ra[i].cells[0].contents;
        for (var j = i; j < ra.length; j++){
            if (ra[j].cells[0].contents == fn) {
                cnt = cnt + Number(ra[j].cells[col].contents)
                nn ++
                if (nn > 1) {
                    ra[i].cells[col].contents = cnt.toString()
                    dupes.push(ra[j])
                }
            } 
        }; 
    };
    var n = dupes.length; 
    while (n--) dupes[n].remove();
}

 

 

 

 

This is what I get:

 

Also, I have my application view preferences set to Inches, which was breaking the script because you are assuming Millimeters. Maybe add this:

 

 

 

with(myNewDoc.viewPreferences){
    horizontalMeasurementUnits = MeasurementUnits.MILLIMETERS;
    verticalMeasurementUnits = MeasurementUnits.MILLIMETERS;
}

 

 

 

And I think you want the new table’s row count to be this:

 

 

 

var myMasterTable = myTextFrame.tables.add({columnCount:8, numOfHeaderRows:1, bodyRowCount:numOfBodyRows+1});