
Find hyperlink destination (URL) or anchor name from text source

Community Beginner, Nov 07, 2023

Hi all,

 

First of all, as this is my very first message on this forum, though not the first time I have benefited from your discussions, I would like to thank you all for your help, even if you don't know that you helped me! I discovered scripting a few years ago and managed to develop some useful scripts for me and my team (I work for the French National History Museum journals), and I often found solutions on this forum, and elsewhere too. So thank you all again.

 

Now this is my problem (I did not find any online solution for it on this forum or elsewhere, but perhaps I did not find the right words to search...):

I have an article prepared in InDesign, and in this article there are two kinds of links applied to the text: 1°/ links to external URLs (e.g., https://science.mnhn.fr); and 2°/ links to internal anchors (for bibliography, figures, tables, etc.).

I would like to be able, with a script that uses a GREP expression to find some text, e.g. a specimen number like P01964577...

Emmanuel5E13_0-1699351028881.png

...to get the URL that corresponds to the link applied to this text.

I would also like to be able to "read" the name of the anchor, in the case of a link to an internal target:

Emmanuel5E13_1-1699351167408.png

Nguyễn Kim Đào 2003 (just above) is linked in the article to an anchor (its name: "Nguyễn Kim Đào 2003") that has been placed at the beginning of the corresponding entry in the bibliography:

Emmanuel5E13_2-1699351268787.png

In the same way as previously, I would like to be able, with a script that "reads" the text, to get the anchor name designated by the link on the text of the article.

So, to summarize:

I have a script that searches for a GREP expression, like "P\\d+"; it fills an array variable "res" with the results of the search, as objects.

res[0].contents.toString()

would return (for example) "P00745313"

How do I get the URL (or the anchor name) that is placed on res[0]?

I should add that I visited (many times, it is dark purple in my browser now) this thread: https://community.adobe.com/t5/indesign-discussions/find-hyperlink-text-source-from-destination/td-p...

and what I would like is exactly the opposite!

I hope my explanations are clear enough; please do not hesitate to ask me for clarification if needed, and thank you in advance for your answers, which will be useful to me in any case!

Emmanuel

TOPICS
Scripting

LEGEND, Nov 07, 2023

What is your END GOAL? 

 

If you work on a PC, you can use the free version of my tool to preview and analyse all Hyperlinks, Bookmarks and Cross-References.

 

Community Beginner, Nov 07, 2023

Hi!

I know how to list the hyperlinks of the document, but I need to be able to distinguish the kind of content (for example, P00123445 is a specimen number, whereas Totoro 2022 is a reference call), because I want to apply tags to them in order to export the article in XML format (so the tags will differ depending on whether it is a specimen or a call to a reference).

I tried many scripts but never found one that does this. Some of them list the links of the document (but that's not what I want), and I found many for adding a link to a portion of text (but I have already added the links to the text), etc. So that is the reason for my question above.

Thank you for your answer,

Emmanuel

LEGEND, Nov 07, 2023

If you want to process only some Hyperlinks - and you know which ones, based on their visible part - then you should go through the collection of Hyperlinks and process the ones that meet your requirements.

 

So, use your search phrase on the visible part of the text of each hyperlink to validate. 

 

Community Beginner, Nov 07, 2023 (marked as correct answer)

OK, if this is the only way to find the URL placed on a portion of text, I got it with:

// adapted from https://community.adobe.com/t5/indesign-discussions/script-to-extract-hyperlinks-from-indesign-file-with-page-number/td-p/10919365
//
var tf = app.selection[0]; // the selected text
var links = app.activeDocument.hyperlinks;

for (var a = 0; a < links.length; a++)
{
	try
	{
		var d = links[a].destination.destinationURL; // URL the hyperlink points to
		var m = links[a].source.sourceText; // text the hyperlink is applied to

		// alert the URL when the hyperlinked text is the selected text
		if (d.match(/^http/) && m == tf)
		{
			alert(d);
		}
	} catch (_) {} // skip hyperlinks without a URL destination or a text source
}

But it means that for each occurrence, I have to compare it to the whole collection of links in the article, which makes for a huge number of operations. I thought that there might be another way to get the URL directly from the text, but if it is not possible, I can understand.

I guess that I should do the same for the anchors.

Thank you for your answer,

Emmanuel

LEGEND, Nov 07, 2023

You've misunderstood my idea.

 

In your opening post, you mentioned that you are using the GREP "P\\d+" to find all the texts whose Hyperlinks you want to extract, right?

 

So my suggestion was - iterate through all the Hyperlinks collection and do GREP search/comparison on the visible text part.

 

But you've already got a reply from @m1b.
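
The suggestion above could be sketched roughly as follows: walk the document's hyperlinks once and test the visible text of each source against the pattern. This is only an illustration, not code from the thread. `collectMatchingLinks` is a hypothetical helper name, and the sketch assumes each hyperlink of interest has a text source (`source.sourceText`) and a URL destination (`destination.destinationURL`), as in the script posted earlier.

```javascript
// Hypothetical helper: one pass over doc.hyperlinks, keeping those whose
// visible text matches `pattern` (e.g. /^P\d+$/ for specimen numbers).
function collectMatchingLinks(doc, pattern) {
    var found = [];
    var links = doc.hyperlinks;
    for (var i = 0; i < links.length; i++) {
        try {
            var text = String(links[i].source.sourceText.contents);
            var url = links[i].destination.destinationURL;
            if (pattern.test(text) && typeof url === "string") {
                found.push({ text: text, url: url });
            }
        } catch (e) {
            // Assumed: hyperlinks without a text source or a URL
            // destination fail here and are simply skipped.
        }
    }
    return found;
}

// In InDesign this would be called as:
// var hits = collectMatchingLinks(app.activeDocument, /^P\d+$/);
```

The hyperlinks collection is read only once, whatever the number of matches, which may ease the concern about comparing every occurrence against the whole collection.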

 

Community Beginner, Nov 07, 2023

But I will try the tools you mention, if you tell me where I can find them... 🙂

Community Expert, Nov 07, 2023

Hi @Emmanuel_du_Muséum,

 

"How do I get the URL (or the anchor name) that is placed on res[0]?"

 

You could use the Text.findHyperlinks() method. So:

// get reference to the first hyperlink of the text
var hl = res[0].findHyperlinks()[0];

That's a start at least I hope.

 

For more help, please make two small sample .indd documents: one before the script and the other after the script, so we can look at them both and know *exactly* what you want to do.

- Mark

Community Beginner, Nov 07, 2023

Hello Mark!

so, if I try your example (thanks for it):

var tf = app.selection[0];
var hl = tf.findHyperlinks()[0];
alert (hl);

The alert shows:

Emmanuel5E13_0-1699362156576.png

(I'm sorry, but as you must have realized, I'm not a "true" JavaScript developer, perhaps not even a "false" one, only a desk editor who tests some things and who does not know the InDesign DOM by heart 🙂)

--> What I would like to get in my alert box is the URL placed on "PE00036371".

--

In fact, the script modifies the document as follows, from:

text bla bla bla P00152060 bla bla bla

to:

text bla bla bla <ref="http://coldb.mnhn.fr/catalognumber/mnhn/p/XXXXXXXX">P00152060</ref> bla bla bla

where the URL inside the tag is the link already applied to the portion of text "P00152060".
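
The transformation described above can be illustrated with a small string helper. This is only a sketch: `wrapWithRefTag` is a hypothetical name, the tag syntax is copied from the example above, and writing the result back into the InDesign story is left out.

```javascript
// Hypothetical helper: builds the tagged form of a linked text run,
// following the <ref="URL">text</ref> pattern shown above.
function wrapWithRefTag(text, url) {
    return '<ref="' + url + '">' + text + '</ref>';
}

var tagged = wrapWithRefTag(
    "P00152060",
    "http://coldb.mnhn.fr/catalognumber/mnhn/p/XXXXXXXX"
);
// tagged is now:
// <ref="http://coldb.mnhn.fr/catalognumber/mnhn/p/XXXXXXXX">P00152060</ref>
```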

I hope it is clear enough,

Thanks for your help,

Emmanuel

LEGEND, Nov 07, 2023

And "hl" is a reference to the Hyperlink - now you need to access its properties and do whatever you need to do:

 

https://www.indesignjs.de/extendscriptAPI/indesign-latest/#Hyperlink.html#d1e113288

 

RobertTkaczyk_0-1699363675089.png

 

RobertTkaczyk_1-1699363689569.png

 

And to be more specific:

https://www.indesignjs.de/extendscriptAPI/indesign-latest/#HyperlinkURLDestination.html#d1e119368

 

So:

 

var tf = app.selection[0];
var hl = tf.findHyperlinks()[0];
alert (hl.destination.destinationURL);
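
Note that `destinationURL` exists only for URL destinations; a link to an internal anchor would have a text destination instead, identified by its name. A rough sketch of telling the two apart, assuming `hyperlink` is a true Hyperlink object (one with a `destination` property) and that the destination class can be read from `constructor.name`; `describeDestination` is a hypothetical helper name:

```javascript
// Hypothetical helper: reports what a Hyperlink points to.
// Assumes `hyperlink` is a true Hyperlink (it has a `destination`),
// not a hyperlink source.
function describeDestination(hyperlink) {
    var dest = hyperlink.destination;
    var kind = dest.constructor.name;
    if (kind === "HyperlinkURLDestination") {
        return { kind: "url", value: dest.destinationURL };
    }
    if (kind === "HyperlinkTextDestination") {
        // For a text anchor, the destination's name is the anchor name.
        return { kind: "anchor", value: dest.name };
    }
    // Any other destination class is reported by its class name.
    return { kind: "other", value: kind };
}

// In InDesign, for a Hyperlink object `link`:
// var info = describeDestination(link);
// alert(info.kind + ": " + info.value);
```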

 

 

Community Beginner, Nov 07, 2023

Thanks for the answer, but I've already tested that:

Emmanuel5E13_0-1699365810962.png

(With that code:)

var tf = app.selection[0];
var hl = tf.findHyperlinks()[0];
alert (hl.destination.destinationURL);

But thanks to your first answer, I made a function that searches for all the hyperlinked text I need and adds a tag before it with the URL of its hyperlink (using a loop over all the hyperlinks of the document).

LEGEND, Nov 07, 2023

Because - from the previous screenshot - you have a reference to HyperlinkTextSource:

 

RobertTkaczyk_0-1699366692696.png

 

sorry - missed that.

 

So it's the wrong "type" of object: a hyperlink source, not a Hyperlink.

 

It doesn't look like you can get "destination" from the "source"...

 

That's why - I think - the only solution is to iterate Hyperlinks collection of the Document...
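
One possible way to keep that iteration cheap is to walk the Hyperlinks collection only once and index each hyperlink by the id of its source, then look up the source returned by findHyperlinks(). This is a sketch under assumptions, not a confirmed recipe: it assumes hyperlink sources expose an `id`, and `indexHyperlinksBySourceId` and `hyperlinkForText` are hypothetical helper names.

```javascript
// Hypothetical helper: one pass over doc.hyperlinks, keyed by source id.
function indexHyperlinksBySourceId(doc) {
    var index = {};
    var links = doc.hyperlinks;
    for (var i = 0; i < links.length; i++) {
        try {
            index[links[i].source.id] = links[i];
        } catch (e) {
            // Assumed: a hyperlink whose source cannot be read is skipped.
        }
    }
    return index;
}

// Hypothetical helper: the hyperlink applied to a found text, or null.
function hyperlinkForText(textObj, index) {
    // Per the thread, findHyperlinks() yields hyperlink sources.
    var sources = textObj.findHyperlinks();
    if (sources.length === 0) {
        return null;
    }
    return index[sources[0].id] || null;
}

// In InDesign (res being the GREP results from the original question):
// var index = indexHyperlinksBySourceId(app.activeDocument);
// var link = hyperlinkForText(res[0], index);
// if (link) { alert(link.destination.destinationURL); }
```

Building the index costs one loop over the document's hyperlinks; each GREP result is then resolved with a single property lookup instead of a new loop.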

 

Community Beginner, Nov 07, 2023

That's what I was afraid of! In any case, thank you for your answers, which helped me think through this problem that I have been facing for a long time!
