Participating Frequently
June 13, 2021
Question

Creating index out of URLS


I need to create an automatically generated index from the URLs assigned to words in the text. For example, "London" has the URL https://www.geonames.org/2643743/london.html, so London has to be listed in the index. If "Capital of England" has the same URL assigned, it also has to be listed as London in the index. Thanks for your help.



Community Expert
June 14, 2021

Your description is not entirely clear. Please show how London and its URL occur in the text, and how they should appear in the index. Can a city have more than one URL? 

And 'If  "Capital of England" has the url mentioned above assigned it has be also to be listed as London in the index' -- A script doesn't know that London is the capital of England (it's the capital of the UK, in fact, but never mind that). So how is that relation made explicit in the text so that a script can work it out?

 

Illustrate profusely.

 

P.

Participating Frequently
June 14, 2021

Sorry, I wasn't able to describe it clearly. The URL https://www.geonames.org/2643743/london.html is assigned to the words "London" and "Londonium" in the text.

The script has to recognise London from the URL. London has to appear in the index.

Thank you, P.
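
To illustrate what "recognise London from the URL" could mean in practice, here is a minimal, language-neutral sketch in Python. It is not InDesign scripting code. The function name `topic_from_url` is hypothetical, and it assumes the place name is always the last path segment of the URL, with the file extension dropped and hyphens standing in for spaces.

```python
from urllib.parse import urlparse
import posixpath


def topic_from_url(url: str) -> str:
    """Derive an index topic from the last path segment of a URL.

    Assumption: the place name is the final path segment, e.g.
    ".../2643743/london.html" gives "London".
    """
    path = urlparse(url).path
    # Last path segment, e.g. "london.html"
    slug = posixpath.basename(path.rstrip("/"))
    # Drop the extension, e.g. "london"
    stem, _ext = posixpath.splitext(slug)
    # Assumed convention: hyphens separate words in the slug
    return stem.replace("-", " ").title()


print(topic_from_url("https://www.geonames.org/2643743/london.html"))  # prints "London"
```

Because the topic comes from the URL and not from the linked word, "London" and "Londonium" would both end up under the same topic.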

Community Expert
June 14, 2021

It's becoming a bit clearer. So you want to look for all hyperlinks with a URL destination, and create topics from the URL text sources, and insert an index marker at the text source.

 

All URLs? Only those that contain www.geonames?
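
The answer to that question changes only a filter. As a rough sketch of the logic described above (again plain Python, not InDesign's scripting API), the following groups linked text by a topic derived from each URL and skips URLs that do not match a host filter. The `links` list, the `build_index` name and the `host_filter` parameter are all hypothetical; in a real script the pairs would come from the document's hyperlinks, and the index marker would be inserted at each text source.

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_index(links, host_filter="geonames.org"):
    """Map topic -> list of linked texts, for URLs matching host_filter.

    links: iterable of (linked_text, url) pairs (hypothetical input).
    host_filter: substring the host must contain; pass "" to accept all URLs.
    """
    index = defaultdict(list)
    for text, url in links:
        parts = urlparse(url)
        if host_filter and host_filter not in parts.netloc:
            continue  # skip hyperlinks outside the chosen site
        # Assumption: the topic is the last path segment without extension
        slug = parts.path.rstrip("/").rsplit("/", 1)[-1]
        topic = slug.rsplit(".", 1)[0].replace("-", " ").title()
        index[topic].append(text)
    return dict(index)


links = [
    ("London", "https://www.geonames.org/2643743/london.html"),
    ("Londonium", "https://www.geonames.org/2643743/london.html"),
    ("Capital of England", "https://www.geonames.org/2643743/london.html"),
    ("Adobe", "https://www.adobe.com/"),
]

for topic, texts in build_index(links).items():
    print(topic, "<-", texts)
```

With the sample data, all three linked phrases land under the single topic London, and the non-geonames link is ignored.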