Hi.
In the first step I try to load a text-based list to create an index with certain topics. The loading is not the problem:
var myDoc = app.activeDocument;
app.doScript(main, ScriptLanguage.JAVASCRIPT, undefined, UndoModes.ENTIRE_SCRIPT, "Funktionsprozess");
function main() {
    var myList = File.openDialog ("Indexliste laden");
    if (!myList) exit();
    myList.open ('r', undefined, undefined);
    var theText = myList.read();//+'\n';
    theText = theText.replace(/ +\n/g, '\n').replace(/\n+/g, '\n');
    var words = theText //.replace(/\s+/g, '\n');
    var words = theText.split("\n");
    listLength = words.length;
    myList.close();
    function makeMyList() {
        app.documents.everyItem().indexes.everyItem().topics.everyItem().remove()
        newIndex = myDoc.indexes.add()
        for (var i = 0; i<listLength; i++){
            myWord = words[i];
            if (myWord != "") {
                newTopic = myDoc.indexes[0].topics.add (myWord);
                myDoc.indexes[0].topics.add(myWord);
                myDoc.indexes[0].update();
            }
        }
    }
    makeMyList();
}
I had to change the script because I made a mistake (I wanted to use a string instead of an array for the entries) 🙂 Now it works, but I also need the page numbers.
But I'm afraid the imported topics need to be under the references (Verweise) rather than the topic (Thema) to get the page numbers, don't they?
Your next step would be to look for those entries in all documents and create page references at each instance.
There are various scripts around that do that. One example is
https://creativepro.com/files/kahrel/indesign/index_from_wordlist.html
Thanks, Peter. Now I'll try to understand your script 🙂
First I want to use it to complete mine.
In short - you need to do your own search, then add each found instance to the index using pageReferences.add().
https://www.indesignjs.de/extendscriptAPI/indesign-latest/#PageReferences.html#d1e127099__d1e127148
Hi Peter.
I analyzed your script index_topic_list.jsx, but I'm not really any wiser. 🙂
It's quite complex (for me), with lots of nested functions. The crux of the matter, in my opinion, is the index_documents function with its two loops. Unfortunately, I don't understand what's happening there. Here's an example:
duplicates[search_item] ? duplicates[search_item].push (word_list[j]) : duplicates[search_item] = [word_list[j]];
And in the function index_from_list, where most other functions are called, there is also something I don't understand: __count__, what is that good for?
I would need some very simple lines of code to get the page numbers of all imported topics and also to create index entries for them. A very simple script, without any safeguards for all possible errors. Just to understand it first.
The theoretical steps are clear:
- open the list (txt document)
- import all terms from the list
- search for the terms in each document of the book
- determine the page number of the words found
- create a new index based on the terms/their page numbers found in the book documents
Here's something I tried that doesn't work either. 😞
I didn't write the script entirely myself; I used snippets, so I don't understand everything that happens in it :). The comments mark what I understood and what I didn't.
app.doScript(main, ScriptLanguage.JAVASCRIPT, undefined, UndoModes.ENTIRE_SCRIPT, "Funktionsprozess");
//////////// main-function, contains all other functions
function main() {
    var myDoc = app.activeDocument;
    var allOpenDocs = app.documents.everyItem().getElements();
    var allOpenDocsLength = app.documents.length;
    importList();
    //////////// importing the list from an external txt-document
    function importList(){
        //////////// Generation of the dialog to open a list as a source for the index topics
        var myList = File.openDialog ("Indexliste laden");
        if (!myList) exit();
        myList.open ('r', undefined, undefined);
        var theText = myList.read();//+'\n';
        // removing spaces at the end of the paragraph and removing empty lines
        theText = theText.replace(/^$/g, '');
        theText = theText.replace(/ +\n/g, '\n');
        theText = theText.replace(/\n+/g, '\n'); // unfortunately does not do the job
        words = theText.split("\n");
        listLength = words.length;
        myList.close();
        thatsMyDoc();
    }
    //////////// Iterate through all documents
    function thatsMyDoc(){
        for(d = 0; d < allOpenDocsLength; d++){
            thisDoc = allOpenDocs[d];
            // Calling the function with two arguments and storing the array it returns
            var indexEntries = findTextInDocument(thisDoc, words);
            for (var i = 0; i < indexEntries.length; i++) {
                createIndexEntry(thisDoc, indexEntries[i].term, indexEntries[i].page);
            }
        }
    }
    //Search for terms in the documents
    function findTextInDocument(thisDoc, words) {
        // the enumeration value is written in capitals: NothingEnum.NOTHING
        app.findTextPreferences = app.changeTextPreferences = NothingEnum.NOTHING;
        // setting indexEntries as an empty Array
        var indexEntries = [];
        // iterating through each term (list element)
        for (var i = 0; i < listLength; i++) {
            // >>>>>> PROBLEM: if the list isn't clean (it contains empty paragraphs)
            // an error will be displayed here ("Object contains no text to find/replace.")
            app.findTextPreferences.findWhat = words[i];
            //Search in each document for each term
            var foundItems = thisDoc.findText();
            //only continue if something has been found (a length is never below 0)
            if (foundItems.length === 0) {continue;}
            for (var j = 0; j < foundItems.length; j++) {
                // Do not continue working with the entry if it is not on the page
                if (foundItems[j].parentTextFrames[0].parentPage == null) {continue;}
                var checkWord = foundItems[j].select();
                var page = foundItems[j].parentTextFrames[0].parentPage.name;
                // QUESTION: what happens here / What does term: words[i], page: page mean?
                indexEntries.push({ term: words[i], page: page });
            }
        }
        app.findTextPreferences = app.changeTextPreferences = NothingEnum.NOTHING;
        return indexEntries;
    }
    // Create index entries
    function createIndexEntry(thisDoc, term, page) {
        var index = thisDoc.indexes.length > 0 ? thisDoc.indexes[0] : thisDoc.indexes.add();
        var topic = index.topics.itemByName(term);
        if (!topic.isValid) {
            topic = index.topics.add(term);
        }
        // >>>>> PROBLEM: Invalid value for parameter "source" of method "add". Expected text, but received page.
        topic.pageReferences.add(thisDoc.pages.itemByName(page), PageReferenceType.currentPage);
    }
    alert("Index erfolgreich erstellt!");
}
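For the empty-paragraph error flagged in the comments above, one possible approach is to clean the list before searching. The sketch below is only an illustration: `parseWordList` is a hypothetical helper name, not part of InDesign's API, and it assumes the file holds one term per line. It sticks to plain loops and regular expressions because ExtendScript lacks newer conveniences such as `String.prototype.trim`.

```javascript
// Hypothetical helper: turn the raw file text into a clean array of terms.
function parseWordList(text) {
    // Split on Windows, old Mac, or Unix line endings.
    var lines = text.split(/\r\n|\r|\n/);
    var terms = [];
    for (var i = 0; i < lines.length; i++) {
        // Strip leading and trailing whitespace from each line.
        var term = lines[i].replace(/^\s+|\s+$/g, "");
        // Keep only lines that still contain something.
        if (term !== "") {
            terms.push(term);
        }
    }
    return terms;
}
```

In the script above, `words = parseWordList(theText);` could stand in for the three `replace` calls and the `split`, so that `findWhat` never receives an empty string.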
You need to combine findTextInDocument with createIndexEntry.
After you find the texts in InDesign, just use pageReferences.add().
Please check the link I provided again: it describes the parameters you should supply to add(), namely a reference to the text and PageReferenceType.CURRENT_PAGE.
Your second line here:
newTopic = myDoc.indexes[0].topics.add (myWord);
myDoc.indexes[0].topics.add(myWord);
is a duplicate of the one above - so you can remove it.
Oh yes, you're right 🙂 Thanks!
I wanted to shorten it and forgot to remove it...
topicName = ...; // reference to a topic name, e.g. 'amber'
app.findTextPreferences = null;
app.findTextPreferences.findWhat = topicName;
found = thisDoc.findText();
if (found.length > 0) {
    topic = thisDoc.indexes[0].topics.add (topicName);
    for (i = found.length-1; i >= 0; i--) {
        topic.pageReferences.add (found[i], PageReferenceType.CURRENT_PAGE);
    }
}
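The loop above runs from the last match to the first. A likely reason, stated here as an assumption since the snippet does not explain it, is that each pageReferences.add() places an index marker in the text, which would shift the positions of any matches further along. The plain-JavaScript sketch below models the effect with a string; the function names are hypothetical and nothing in it touches InDesign.

```javascript
// Insert a marker at each offset, working from the last offset to the first.
// Earlier offsets stay valid because only text after them has changed.
function insertMarkersBackward(text, offsets, marker) {
    var result = text;
    for (var i = offsets.length - 1; i >= 0; i--) {
        result = result.slice(0, offsets[i]) + marker + result.slice(offsets[i]);
    }
    return result;
}

// The same insertion from first to last, without correcting the offsets.
// Every marker already inserted pushes the later targets along.
function insertMarkersForward(text, offsets, marker) {
    var result = text;
    for (var i = 0; i < offsets.length; i++) {
        result = result.slice(0, offsets[i]) + marker + result.slice(offsets[i]);
    }
    return result;
}

var sample = "amber and amber";   // "amber" starts at offsets 0 and 10
var backward = insertMarkersBackward(sample, [0, 10], "^");
var forward = insertMarkersForward(sample, [0, 10], "^");
```

Here `backward` places both markers directly in front of each "amber", while `forward` puts the second marker one character too early.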
Now I have a working script! 🙂
var myDoc = app.activeDocument;
var allOpenDocs = app.documents.everyItem().getElements();
var allOpenDocsLength = app.documents.length;
var allTopics = [];
app.doScript(scriptTimer, ScriptLanguage.JAVASCRIPT, undefined, UndoModes.ENTIRE_SCRIPT, "Funktionsprozess");
function scriptTimer(){
    var timeDiff = {
        setStartTime:function (){d = new Date(); time = d.getTime();},
        getDiff:function (){d = new Date(); t = d.getTime() - time; time = d.getTime(); return t;}
    };
    // Start timer
    timeDiff.setStartTime();
    // Start main function
    main();
    // get result
    alert("Dauer der Ausführung: " + timeDiff.getDiff() / 1000 + " Sekunden", "IndiSnip /// Script execution time");
}
function main() {
    var myList = File.openDialog ("Indexliste laden");
    if (!myList) exit();
    myList.open ('r', undefined, undefined);
    var theText = myList.read();//+'\n';
    theText = theText.replace(/^$/g, '');
    theText = theText.replace(/ +\n/g, '\n');
    theText = theText.replace(/\n+/g, '\n');
    var words = theText.split("\n");
    listLength = words.length;
    myList.close();
    thatsMyDoc();
    function thatsMyDoc(){
        for(d = 0; d < allOpenDocsLength; d++){
            thisDoc = allOpenDocs[d];
            showActiveDoc();
            // compare the number of indexes (length), not the collection itself;
            // clear the topics of an existing index, otherwise create a new index
            if (thisDoc.indexes.length > 0) {
                thisDoc.indexes.everyItem().topics.everyItem().remove();
            } else {
                newIndex = thisDoc.indexes.add();
            }
            serch4Topics();
            thisDoc.indexes[0].update();
        }
    }
    // snippet by Adobe Community-Member Laubender
    function showActiveDoc(){
        var dlog = new Window("palette");
        dlog.size = [400,50];
        dlog.add("statictext", undefined , "Suche nach Schlagworten im Dokument: "+ thisDoc.name );
        dlog.show();
        // Have a nap:
        $.sleep(50);
        // Closing the dialog:
        dlog.close();
    }
    function serch4Topics(){
        for(var j = 0; j<listLength; j++){
            topicName = words[j]; // iterating through the list, topics one by one
            if (topicName == "") {continue;} // an empty term cannot be searched for
            app.findTextPreferences = null;
            app.findTextPreferences.findWhat = topicName;
            found = thisDoc.findText();
            if (found.length > 0) {
                topic = thisDoc.indexes[0].topics.add (topicName);
                for (i = found.length-1; i >= 0; i--) {
                    // the enumeration value is written in capitals
                    topic.pageReferences.add (found[i], PageReferenceType.CURRENT_PAGE);
                }
            }
        }
    }
}
Running it over nearly 300 pages with a list of more than 200 terms takes about 20 minutes, but that is not a problem.
I would like to eliminate duplicates.
@Peter Kahrel In your script index_topic_list.jsx there are some lines I do not understand, such as:
...
duplicates[search_item]
? duplicates[search_item].push (word_list[j])
: duplicates[search_item] = [word_list[j]];
...
...and other references to "duplicates" in other functions.
Is there another, more simple way to avoid duplicates?
> Is there another, more simple way to avoid duplicates?
The script deals with (tries to, anyway) potentially ambiguous entries, especially in a name index. If the list contains two or more entries for Smith, as in
Smith, John
Smith, James
It looks for 'Smith' in the text and notices that there are more Smiths. So these are special cases.
If you're worried about a full entry occurring twice or more, e.g. 'border collie', then don't worry. There's no need to check for duplicates because InDesign checks for them internally (one of the very few clever features of InDesign's index).
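The conditional expression quoted in the question can be read as an ordinary if/else that groups list entries under a shared search term. The sketch below is a simplified stand-in, not the actual code of index_topic_list.jsx: `groupBySearchItem` is a hypothetical name, and taking the text before the comma as the search term is an assumption made for the Smith example.

```javascript
// Group list entries by the term that would be searched for in the text.
function groupBySearchItem(wordList) {
    var duplicates = {};
    for (var j = 0; j < wordList.length; j++) {
        // Assumption for this sketch: the search term is the part before the comma.
        var searchItem = wordList[j].split(",")[0];
        if (duplicates[searchItem]) {
            // An array already exists for this term: append the entry.
            duplicates[searchItem].push(wordList[j]);
        } else {
            // First entry for this term: start a new one-element array.
            duplicates[searchItem] = [wordList[j]];
        }
    }
    return duplicates;
}

var groups = groupBySearchItem(["Smith, John", "Smith, James", "Jones, Mary"]);
// groups.Smith holds both Smith entries; groups.Jones holds a single entry
```

Any key whose array ends up with more than one entry marks a potentially ambiguous search term.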
> It looks for 'Smith' in the text and notices that there are more Smiths. So these are special cases.
> If you're worried about a full entry occurring twice or more, e.g. 'border collie', then don't worry. There's no need to check for duplicates because InDesign checks for them internally (one of the very few clever features of InDesign's index).
> By @Peter Kahrel
That's nice! 🙂
But mea culpa, I explained it wrong. The source list is hand-made and none of the terms are duplicated. The duplicates I get are the page numbers. Some of the terms appear more than once on a page, so in the index I generate I need the page reference for each term only once...
Duplicate page numbers are not a problem, each page number is printed only once. In fact, you want to keep those 'duplicates' because if an item occurs twice on a page, after some changes in the text they may be on different pages.
You see all instances of 'duplicate' pages in the Index panel, but when you generate the index they're filtered out.
Perfect, thanks! So I'm ready for the next step... 🙂
Just one last question (we'll see). If I want to avoid finding mere parts of a word, what do I have to do in the function serch4Topics()? For example, the (German) word "Bambus" is in the list, but I don't want "Bambustisch" or "Bambusstock" to be found.
All you need to do is enable the whole-words-only preference.
Sorry, but I didn't get it 😞
I searched around the net but didn't find anything, not even in the ExtendScript API. Which preferences? findTextPreferences?
Cool, thanks. Now it's safe. 🙂
Sorry, it's not a preference, but an option. You can use this site to browse InDesign's object model:
https://www.indesignjs.de/extendscriptAPI/indesign-latest/#about.html
Look for findchangetextoptions and you'll hit upon this, which tells you how to set it:
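As a minimal sketch of using that option from a script: `app.findChangeTextOptions.wholeWord` is the setting named in this thread, while `withWholeWord` is a hypothetical helper that switches it on for one search and then restores the earlier value, on the assumption that the option otherwise stays set for later searches. The `app` object is passed in as a parameter, so the function itself is plain JavaScript.

```javascript
// Hypothetical helper: run one search with whole-word matching enabled,
// then put the option back to whatever it was before.
function withWholeWord(app, search) {
    var options = app.findChangeTextOptions;
    var previous = options.wholeWord;
    options.wholeWord = true;
    try {
        // 'search' is a function that performs the find and returns its result.
        return search();
    } finally {
        // Restore the earlier value even if the search throws an error.
        options.wholeWord = previous;
    }
}

// Possible use inside serch4Topics(), after findWhat has been set:
// found = withWholeWord(app, function () { return thisDoc.findText(); });
```

If the option should simply stay on for the whole run, setting `app.findChangeTextOptions.wholeWord = true;` once before the loop would be enough.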
yes, Robert gave me the hint 🙂
Thanks!
(Wow, the ExtendScriptAPI contains a huge amount of information. One has to know where to look for the right ones... 😞 )
If there is an expression in the list that consists of two hyphenated words, will it still work with this setting (app.findChangeTextOptions.wholeWord = true)?
You can try that in the interface: type abc-def in a frame, enter abc-def in the Find what field, enable whole word, and start the search. What happens in the interface is what happens with the script.
Be aware that if the text uses a non-breaking hyphen, and you type a normal hyphen in the Find/Change window, you won't find it.
Thank you Peter, Robert.
The difficulty for me is that I can follow the code for a while, but then I lose focus :). Unfortunately, it is not easy for me to see the logical connections between functions in more complex code. I think that for experts like you this is a piece of cake and can be written between two cigarettes. 🙂 But it takes me weeks to understand it, and even longer to implement it in between other tasks during working hours.
For example:
if (Math.abs(p_ref.sourceText.index - ip)
or when a function call with parameters is assigned as the value of a variable:
var indexEntries = findTextInDocument(thisDoc, words);
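That line calls the function and stores whatever it returns, here an array, in `indexEntries`. Each element of the array is an object literal such as `{ term: words[i], page: page }`, meaning an object with two named properties. The sketch below models this without InDesign: `findTermsOnPages` and the `pages` data are hypothetical stand-ins for `findTextInDocument` and the document.

```javascript
// Return one { term, page } object for every page on which a term occurs.
function findTermsOnPages(pages, terms) {
    var entries = [];
    for (var i = 0; i < terms.length; i++) {
        for (var j = 0; j < pages.length; j++) {
            if (pages[j].text.indexOf(terms[i]) !== -1) {
                // Object literal: property 'term' gets the word, 'page' the page name.
                entries.push({ term: terms[i], page: pages[j].name });
            }
        }
    }
    return entries;
}

var pages = [
    { name: "1", text: "Bambus grows fast." },
    { name: "2", text: "Ahorn and Bambus." }
];
// The function runs first; its return value is what ends up in the variable.
var indexEntries = findTermsOnPages(pages, ["Bambus", "Ahorn"]);
// indexEntries[0].term and indexEntries[0].page read the two properties back
```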
I know what "count" means, but the double underscore confused me.
With your code snippet I will try to keep it simple and solve the issue.
I will look at the references again; it might take some time.
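On the double underscore: `__count__` appears to be a non-standard property that ExtendScript provides for the number of properties an object holds; the underscores are simply part of the name. Treat that description as an assumption to check against the script itself. A portable equivalent, with the hypothetical name `countKeys`, might look like this:

```javascript
// Count the properties an object defines itself (inherited ones are skipped).
function countKeys(obj) {
    var n = 0;
    for (var key in obj) {
        if (Object.prototype.hasOwnProperty.call(obj, key)) {
            n++;
        }
    }
    return n;
}

var duplicates = { Smith: ["Smith, John", "Smith, James"], Jones: ["Jones, Mary"] };
var total = countKeys(duplicates);   // number of distinct search terms stored
```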