
Import html to Indesign?

Participant, Oct 17, 2023

I'm creating ebooks, and often the text comes from HTML files. I want to import the text with styles intact (italics, bolds, superscripts, etc.). Does anyone know a procedure for this?

Thanks in advance.

TOPICS
Type
Translate
Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines

1 Correct answer: the pandoc reply from a Community Expert (Oct 17, 2023), which appears in thread order below.

Community Expert, Oct 17, 2023

There may be some scripts floating around but honestly, sacrificing a goat is probably your best bet to get this to work properly. You'd be dependent on honoring CSS styles and mapping them to InDesign styles.

 

That barely works with Word.
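
To make the mapping problem concrete, here is a minimal Python sketch that turns inline HTML tags into (text, style name) runs. The style names in `TAG_TO_STYLE` and the helper `style_runs` are hypothetical, chosen only for illustration. It looks at tag names only, applies just the innermost tag when tags are nested, and ignores CSS classes entirely, which is exactly where the real difficulty lies.

```python
from html.parser import HTMLParser

# Hypothetical mapping from inline HTML tags to InDesign character style names.
TAG_TO_STYLE = {
    "i": "Italic", "em": "Italic",
    "b": "Bold", "strong": "Bold",
    "sup": "Superscript", "sub": "Subscript",
}


class StyleRunParser(HTMLParser):
    """Collect (text, style_name) runs from inline-formatted HTML."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open mapped tags, innermost last
        self.runs = []   # (text, style name or None) in document order

    def handle_starttag(self, tag, attrs):
        if tag in TAG_TO_STYLE:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost open occurrence of this tag, if there is one.
        if tag in self.stack:
            idx = len(self.stack) - 1 - self.stack[::-1].index(tag)
            del self.stack[idx]

    def handle_data(self, data):
        # Only the innermost mapped tag decides the style (a simplification).
        style = TAG_TO_STYLE[self.stack[-1]] if self.stack else None
        self.runs.append((data, style))


def style_runs(html):
    """Return the list of (text, style) runs found in an HTML fragment."""
    parser = StyleRunParser()
    parser.feed(html)
    parser.close()
    return parser.runs
```

Anything styled through a class or a stylesheet rule (for example a `span` made italic by CSS) comes out of this sketch with no style at all, which is why a tag-only approach is rarely enough on real pages.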

People's Champ, Oct 17, 2023

Try this: https://www.id-extras.com/html-import-script

It's free. But you will need to upload the files somewhere -- it only works with HTML files on the web for now.

Community Expert, Oct 17, 2023

That may not be a limitation. A lot of e-books are created by web scraping. 😐

Participant, Oct 17, 2023

In this case I am employed by the original content creators to repurpose the content. I was afraid someone would think I was swiping someone else's work. Not the case here. 🙂

Community Expert, Oct 17, 2023

I should have qualified that post so it didn't look like an accusation. I just see a lot of flat-out crapware "books" slashed together from scraping. A tool optimized for that is... dismaying.

Participant, Oct 17, 2023

Hey TaW, thanks for the tip, but it did not work for me. It says it's 'fetching url', but there is no result when the script finishes running. Has it worked for you?

People's Champ, Oct 17, 2023

I haven't used it for a while. The id-extras.com server was migrated a while ago; I wonder if it's not working because of that. I'll check it out and post back...

Community Expert, Oct 17, 2023

Bob L, goats are so last millennium. We sacrifice wombats these days. 🙂

 

That aside, about the only path I can think of for this process is to open the HTML files in Word, tidy up and reformat as necessary, then save as .docx or .rtf for import into InDesign.

 

Then, most usefully, the original CSS styles should be used as the starting point for EPUB export.

Participant, Oct 17, 2023

Thanks Bob (and everyone). I used to do it through Word (import HTML, save as .docx), but my free MS 365 account doesn't want to let me open .html files. Am I doing something wrong? I wouldn't mind paying for the one-time Word software if I knew it had this function.

Community Expert, Oct 17, 2023

So, this is content on the web or do you have the HTML files? If it's on the web, is it WordPress?

I seem to remember some plugin or script that could handle this, even if it was clunky.

Participant, Oct 17, 2023

I am getting the content from the client's web site. It looks like HTML5.

 

I'm going to try the script TaW suggested now. Will report back.

Community Expert, Oct 17, 2023

Cut and paste from browser to Word is an alternative, if you make judicious use of Word's paste options and macros for cleanup.

Community Expert, Oct 17, 2023

Buy a license for Office 2016 and never update further. Office 365 is... crippleware for office drones. 

Community Expert, Oct 17, 2023

Duly noted. Now, where to find a wombat?

Community Expert, Oct 17, 2023

My favorite multitool for these circumstances is pandoc. It has a wide variety of input and output formats, but the relevant output format for you would be .icml - that is, an InCopy file. Pandoc will keep your text styles intact... and that's it. So it may not be the right tool for you, and I honestly don't know if it will handle your superscripts or not. But it will turn epub into something you can place into InDesign with text styling intact. 
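
For anyone who wants to run this locally rather than through the online demo, a minimal Python sketch of the HTML-to-ICML conversion might look like the following. It assumes pandoc is installed and on the PATH; the function names and the file name `chapter1.html` are hypothetical. The `--standalone` flag is included because pandoc otherwise emits only a fragment, and a complete file is what you would place in InDesign.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_icml_command(source, target=None):
    """Build the pandoc argument list for an HTML -> ICML conversion."""
    src = Path(source)
    # Default output: same name as the input, with an .icml extension.
    dst = Path(target) if target else src.with_suffix(".icml")
    return [
        "pandoc",
        "--standalone",      # write a complete ICML file, not a fragment
        "--from", "html",
        "--to", "icml",
        "--output", str(dst),
        str(src),
    ]


def convert_html_to_icml(source, target=None):
    """Run pandoc; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_icml_command(source, target), check=True)
```

Calling `convert_html_to_icml("chapter1.html")` would write `chapter1.icml` next to the source. Whether every style survives depends on how the HTML marks it up, so treat the result as a starting point to check, not a guarantee.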

Participant, Oct 17, 2023

This is great. It did not keep italics, but it does handle headings, subscripts, and superscripts. I used the online demo and downloaded .docx files. Thank you!

Participant, Oct 18, 2023

Correction, it did retain italics and I was able, with a minimum of fussing, to redefine the imported style sheets to my specs. Thanks very much for pointing me towards this useful tool.
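
Since it is easy to misjudge whether a style made it through, one option is to count the inline formatting tags in the source HTML first and compare that against what shows up after import. The sketch below is only an illustration: the tag set and the helper name `count_inline_tags` are assumptions, and it will not see formatting applied through CSS classes.

```python
from collections import Counter
from html.parser import HTMLParser

# Inline tags worth checking after an import (an assumed, non-exhaustive set).
INLINE_TAGS = {"i", "em", "b", "strong", "sup", "sub"}


class InlineTagCounter(HTMLParser):
    """Count opening occurrences of the inline formatting tags above."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in INLINE_TAGS:
            self.counts[tag] += 1


def count_inline_tags(html):
    """Return a dict of tag name -> number of occurrences in the HTML."""
    counter = InlineTagCounter()
    counter.feed(html)
    counter.close()
    return dict(counter.counts)
```

If the source reports, say, a few dozen `em` tags and the placed text shows no italics at all, that points at the conversion step; if the count is zero, the italics were probably coming from a stylesheet instead.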

Community Expert, Oct 17, 2023

I just tried saving a relatively simple informational page from Chrome, and then opening it in Word 365 (all I have available at this location). It worked pretty well and even preserved most styles.

 

I assume you're pulling content from simple pages (i.e., not complex interactive ones), so this might be worth exploring, again with some macros and style importing.

Participant, Oct 17, 2023

Weird, I wonder why it did not work for me. It was a long, complex HTML file.

 

Community Expert, Oct 17, 2023

Complex is relative. A page of text and images, regardless of length or number of styles, is never very complex in these terms. It's active pages using PHP and scripting and calls to e-commerce modules that get "complex." Again, my assumption is that you're working with book-like material in the first place.  If you're really recasting active web elements... some different approaches will be needed. 
