Noyster2
Inspiring
October 17, 2023
Answered

Import HTML to InDesign?

  • October 17, 2023
  • 5 replies
  • 2675 views

I'm creating ebooks, and often the text comes from HTML files. I want to import the text with styles intact (italics, bold, superscripts, etc.). Does anyone know a procedure for this?

Thanks in advance.

This topic has been closed for replies.

5 replies

James Gifford—NitroPress
Legend
October 17, 2023

I just tried saving a relatively simple informational page from Chrome, and then opening it in Word 365 (all I have available at this location). It worked pretty well and even preserved most styles.

I assume you're pulling content from simple pages (i.e., not complex interactive ones), so this might be worth exploring, again with some macros and style importing.

Noyster2 (Author)
Inspiring
October 17, 2023

Weird, I wonder why it did not work for me. It was a long, complex HTML file.

James Gifford—NitroPress
Legend
October 17, 2023

Complex is relative. A page of text and images, regardless of length or number of styles, is never very complex in these terms. It's active pages using PHP and scripting and calls to e-commerce modules that get "complex." Again, my assumption is that you're working with book-like material in the first place.  If you're really recasting active web elements... some different approaches will be needed. 

Joel Cherney
Community Expert (Correct answer)
October 17, 2023

My favorite multitool for these circumstances is pandoc. It has a wide variety of input and output formats, but the relevant output format for you would be .icml - that is, an InCopy file. Pandoc will keep your text styles intact... and that's it. So it may not be the right tool for you, and I honestly don't know if it will handle your superscripts or not. But it will turn epub into something you can place into InDesign with text styling intact. 
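
For anyone who wants to try the pandoc route locally rather than through the online demo, here is a minimal sketch of the HTML-to-ICML conversion wrapped in Python. It assumes pandoc is installed and on your PATH. The function names and the file name in the usage note are hypothetical, chosen only for illustration. The `--standalone` flag asks pandoc for a complete ICML document rather than a fragment. As noted in this thread, how well individual styles survive may vary with the source markup.

```python
import subprocess
from pathlib import Path


def build_pandoc_command(src: str, dest: str) -> list[str]:
    """Return the pandoc argument list for an HTML -> ICML conversion."""
    return [
        "pandoc",
        "--from=html",
        "--to=icml",
        "--standalone",  # write a complete ICML document, not a fragment
        f"--output={dest}",
        src,
    ]


def convert_html_to_icml(src: str) -> Path:
    """Convert one HTML file to an .icml file beside it.

    Assumes the pandoc executable is available on PATH; raises
    subprocess.CalledProcessError if pandoc reports a failure.
    """
    dest = Path(src).with_suffix(".icml")
    subprocess.run(build_pandoc_command(src, str(dest)), check=True)
    return dest
```

Calling `convert_html_to_icml("chapter1.html")` should leave a `chapter1.icml` next to the source, which can then be placed in InDesign with File > Place.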

Noyster2 (Author)
Inspiring
October 17, 2023

This is great. It did not keep italics, but it does handle headings, subscripts, and superscripts. I used the online demo and downloaded .docx files. Thank you!

James Gifford—NitroPress
Legend
October 17, 2023

Bob L, goats are so last millennium. We sacrifice wombats these days. 🙂

That aside, about the only path for this process I can think of is to open the HTML files with Word, tidy up and reformat as necessary, then save as .docx or .rtf for import into InDesign.

Then, most usefully, the original CSS styles should be used as the starting point for EPUB export.

Noyster2 (Author)
Inspiring
October 17, 2023

Thanks Bob (and everyone). I used to do it through Word (import HTML, save as .docx), but my free MS 365 account doesn't want to let me open .html files. Am I doing something wrong? I wouldn't mind paying for the one-time Word software if I knew it had this function.

BobLevine
Community Expert
October 17, 2023

So, this is content on the web or do you have the HTML files? If it's on the web, is it WordPress?

I seem to remember some plugin or script that could handle this, even if it was clunky.

TᴀW
Legend
October 17, 2023

Try this: https://www.id-extras.com/html-import-script

It's free. But you will need to upload the files somewhere -- it only works with HTML files on the web for now.

James Gifford—NitroPress
Legend
October 17, 2023

That may not be a limitation. A lot of e-books are created by web scraping. 😐

Noyster2 (Author)
Inspiring
October 17, 2023

In this case I am employed by the original content creators to repurpose the content. I was afraid someone would think I was swiping someone else's work. Not the case here. 🙂

BobLevine
Community Expert
October 17, 2023

There may be some scripts floating around, but honestly, sacrificing a goat is probably your best bet to get this to work properly. You'd be dependent on honoring CSS styles and mapping them to InDesign styles.

That barely works with Word.
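
To make the "mapping them to InDesign styles" problem concrete, here is a rough sketch using only Python's standard library. It walks an HTML snippet and tags each run of text with a character style name. The style names in `TAG_TO_STYLE` are hypothetical; you would substitute whatever your InDesign document actually uses. It is a deliberate simplification: it only looks at inline tags such as `<em>` and `<sup>`, ignores CSS classes and stylesheets entirely, and lets the innermost tag win when styles are nested.

```python
from html.parser import HTMLParser

# Hypothetical mapping from inline HTML tags to InDesign character style names.
TAG_TO_STYLE = {
    "i": "Italic",
    "em": "Italic",
    "b": "Bold",
    "strong": "Bold",
    "sup": "Superscript",
    "sub": "Subscript",
}


class StyledRunCollector(HTMLParser):
    """Collect (text, style_name) runs; unstyled text gets style None."""

    def __init__(self):
        super().__init__()
        self.runs = []
        self._open_styles = []  # stack of styles for currently open mapped tags

    def handle_starttag(self, tag, attrs):
        if tag in TAG_TO_STYLE:
            self._open_styles.append(TAG_TO_STYLE[tag])

    def handle_endtag(self, tag):
        if tag in TAG_TO_STYLE and self._open_styles:
            self._open_styles.pop()

    def handle_data(self, data):
        # The innermost open mapped tag decides the style of this run.
        style = self._open_styles[-1] if self._open_styles else None
        self.runs.append((data, style))


def styled_runs(html):
    """Return the list of (text, style_name) runs found in an HTML string."""
    parser = StyledRunCollector()
    parser.feed(html)
    parser.close()
    return parser.runs
```

Running `styled_runs("<p>E = mc<sup>2</sup> is <em>famous</em>.</p>")` gives one run per stretch of text, with "2" tagged Superscript and "famous" tagged Italic. Real pages that style text through CSS classes rather than inline tags need considerably more work, which is the hard part described above.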