I am working on a project that involves exporting 20-30 books of about 500 pages each to epub.
All these books have large and essential indexes that need to make it into the epub, preferably linked.
Is there any way to have InDesign, or a script, run through the index and the rest of the file and link the pages and page ranges to the proper pages in the book?
This would save our poor assistants (me, myself, and I) so much work; we're talking about at least a thousand index links per book. Medieval European monks would be envious of the work that awaits me if I can't get this done automatically.
Thanks for your attention. I hope someone has a solution!
It can probably be done by scripting, though it would not be trivial. It would help to see a sample book, or at least the index document and one destination document. I assume a hyperlink would suffice? For page ranges, a hyperlink could only point to the first page of the range. Feel free to PM.
Sadly, this entails creating a new index with an indexing program. I already have beautifully created indices that I'm very happy with.
I have now found a halfway solution:
1. Exporting the books to PDF.
2. Using Acrobat to export the PDF to MS Word. This keeps the page numbers, as they are carried into the Word document.
3. Exporting the Word document to ePub with OpenOffice Writer's writer2ePub plugin. This yields an ePub with one xhtml document per page.
4. In Sigil, an ePub editor, using a regex to turn each page number of the book into a self-closing <a> tag with the proper id, and moving it to the top of the page.
5. Merging all the pages that are not part of the index into one xhtml file, and doing the same for the pages that are part of the index.
6. Iterating through the index file, using regexes to find page numbers and link them to the proper anchor inside the book document. There are ins and outs to this that I will gloss over here.
7. Finally, splitting the book up into its logical chapters - usually one xhtml file per chapter, and the same for the notes etc.
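For anyone who would rather script the two regex steps outside Sigil, here is a rough Python sketch of the same idea. It is only an illustration under assumptions that are mine, not taken from the actual files: page numbers sit alone in a paragraph such as <p>123</p>, the anchors get ids like page123, the merged book file is called book.xhtml, and locators in an index entry always follow a comma and a space. The function names are made up for the example. A page range is linked to its first page only.

```python
import re

# Assumption: a page number is the only content of its paragraph, e.g. <p>123</p>.
# Real output from Word / writer2ePub will likely need a different pattern.
PAGE_NUMBER = re.compile(r"<p>\s*(\d+)\s*</p>")

# Assumption: locators in an index entry follow ", " and are either a single
# page (12) or a range with a hyphen or en dash (45-47, 45–47).
PAGE_REF = re.compile(r"(?<=, )(\d+)(?:[-–]\d+)?\b")


def add_page_anchors(xhtml: str) -> str:
    """Turn each stand-alone page-number paragraph into an empty anchor."""
    return PAGE_NUMBER.sub(r'<p><a id="page\1"/></p>', xhtml)


def link_index_entry(entry: str, book_file: str = "book.xhtml") -> str:
    """Wrap every page locator in a link to the anchor of its first page."""
    def make_link(match: re.Match) -> str:
        first_page = match.group(1)
        return f'<a href="{book_file}#page{first_page}">{match.group(0)}</a>'

    return PAGE_REF.sub(make_link, entry)


if __name__ == "__main__":
    print(add_page_anchors("<p>Some text</p><p>123</p>"))
    print(link_index_entry("Monks, 12, 45–47, 301"))
```

Requiring the comma and space before a locator keeps a number at the start of a heading (say, an entry beginning with a year) from being linked, but a number after a comma inside a heading would still be caught, which is one of the ins and outs mentioned above.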
Sadly, because of the PDF export, there is still a lot of cleaning to do, as hyphenation is not understood by Acrobat's PDF reading and exporting system. There will also be headers and/or footers, lots of whitespace, and no styles except (hopefully) italics and bold. And the index now only links to the page, not to the proper paragraph or sentence, which might be possible in an ePub.
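Part of the hyphenation cleanup can probably be scripted too. The sketch below is a guess at what the damage looks like, namely a word split as "medi-" at the end of one line and "eval" at the start of the next; the function name is made up, and the actual Acrobat output may differ (for instance a space instead of a line break). It will also wrongly join genuine compounds such as "well-known" when they happen to break at the hyphen, so the result still needs proofreading.

```python
import re

# Assumption: a broken word appears as letter, hyphen, newline, letter.
# [^\W\d_] matches any Unicode letter (word character that is not a digit or underscore).
LINE_END_HYPHEN = re.compile(r"([^\W\d_])-\n([^\W\d_])")


def dehyphenate(text: str) -> str:
    """Rejoin words that were split by a hyphen at a line break."""
    return LINE_END_HYPHEN.sub(r"\1\2", text)


if __name__ == "__main__":
    print(dehyphenate("Medi-\neval monks would be envious."))
```

Digits are left alone on purpose, so a page range that breaks across lines keeps its hyphen.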
It is definitely a halfway solution, but one that can be automated to some extent within Sigil by using the Saved Searches functionality.