Hi, I'm new to InDesign scripting and only wish I had discovered it sooner! I absolutely love it! However, I'm working on a script which is about 80% complete now, but I have ground to a halt. I now need to parse a simple HTML string and format it in ID.
I'm working with JavaScript on CS5.
The input string would look something like:
<p>Here is a sample <b>bold</b> format, <i>italic</i> string.</p>
I'd need to support the tags p, b, i, dl, dd, and dt.
I'm guessing converting the p tags to \r is not an issue, but I'm unsure how to apply a style to part of a paragraph. Also, would I need to create a style for each style tag that I want to support?
I've been reading about XML support, but I'm confused as to how I'd use it for this problem. Would it even apply, given that the input is neither a complete HTML document nor a complete XML string?
Any tips / code would be greatly appreciated, thanks!
If you don't care about nesting, you should be able to do a simple parse with a few GREPs. If you need to take nesting into account, your best bet is probably to use a state machine to parse it...
Google state machines for more info...
Harbs
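As a rough illustration of the "simple parse" route for input without nesting, here is a minimal sketch in plain JavaScript. The function name `parseFlatHtml` and the `{ text, runs }` return shape are made up for this example and are not part of any InDesign API. It assumes the only tags present are p, b, and i, and that b and i never contain other tags.

```javascript
// Hypothetical helper (not an InDesign API): flat, non-nested parse of <p>, <b>, <i>.
// Returns the plain text plus a list of character ranges that need formatting.
function parseFlatHtml(html) {
    var text = "";
    var runs = [];
    // Either a <b>/<i> element with no tags inside it, or a lone <p> / </p> tag.
    var re = /<(b|i)>([^<]*)<\/\1>|<\/?p>/g;
    var last = 0;
    var m;
    while ((m = re.exec(html)) !== null) {
        // Copy the untagged text that precedes this match.
        text += html.substring(last, m.index);
        if (m[1]) {
            // Inline element: remember where its content lands in the output.
            runs.push({ tag: m[1], start: text.length, end: text.length + m[2].length });
            text += m[2];
        } else if (m[0] === "</p>") {
            // End of paragraph becomes a hard return; <p> itself adds nothing.
            text += "\r";
        }
        last = re.lastIndex;
    }
    text += html.substring(last);
    return { text: text, runs: runs };
}
```

Each run's `start` is inclusive and `end` is exclusive, relative to the start of the returned text. In InDesign, after inserting `text` into a story, something like `story.characters.itemByRange(start, end - 1).appliedCharacterStyle = someStyle` should apply a character style to one run, where `someStyle` is whatever style you choose to map to b or i; treat that line as untested and adjust the offsets if the story already contains text.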
Yeah, that's what I did by way of experimenting with HTML Input/Output: write a state machine. It doesn't need to be very complicated either. If you can be (quite) sure your HTML is properly formatted (well-formed XHTML is best), all you have to do is "push" each open tag, then process the last one on "pop".
Since InDesign text works per paragraph, I think you'd best scan back for an "open P" tag (or, in the case of improperly formatted HTML, any other block element), then insert all of the text and apply formatting.
It's quite fun to write this -- up to a certain point, anyway.
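To make the push/pop idea concrete, here is a small sketch of a tag stack in plain JavaScript. It is not the script described above; `parseNestedHtml` and its return shape are invented for illustration. It assumes well-formed input, treats p, dt, and dd as paragraphs ended by a hard return, and silently ignores any tag it does not know.

```javascript
// Hypothetical sketch: tokenizer plus a stack of open inline tags.
// Assumes well-formed input; throws on mismatched or unclosed inline tags.
function parseNestedHtml(html) {
    var inlineTags = { b: true, i: true };
    var blockTags = { p: true, dl: true, dt: true, dd: true };
    var stack = [];  // open inline tags, each with the text offset where it started
    var runs = [];   // finished ranges: { tag, start (inclusive), end (exclusive) }
    var text = "";
    // A token is either a tag (optional slash, name, optional attributes) or plain text.
    var re = /<(\/?)([a-z]+)[^>]*>|[^<]+/gi;
    var m;
    while ((m = re.exec(html)) !== null) {
        if (!m[2]) {
            text += m[0]; // plain text token
            continue;
        }
        var name = m[2].toLowerCase();
        var closing = m[1] === "/";
        if (blockTags.hasOwnProperty(name)) {
            // Closing p, dt, dd ends a paragraph; dl is only a container.
            if (closing && name !== "dl") {
                text += "\r";
            }
        } else if (inlineTags.hasOwnProperty(name)) {
            if (!closing) {
                stack.push({ tag: name, start: text.length });
            } else {
                var open = stack.pop();
                if (!open || open.tag !== name) {
                    throw new Error("Mismatched </" + name + ">");
                }
                runs.push({ tag: open.tag, start: open.start, end: text.length });
            }
        }
    }
    if (stack.length > 0) {
        throw new Error("Unclosed <" + stack[stack.length - 1].tag + ">");
    }
    return { text: text, runs: runs };
}
```

Because inner tags are popped first, nested ranges come out innermost first, so applying them in order lets an outer style and an inner style overlap on the same characters. How bold plus italic should combine into one character style is a separate decision this sketch leaves open.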
A while ago I wrote this script:
http://www.ixta.com/InDesign/scripts/html2charstyle.html
It does not support the mentioned block-level tags, though.
http://www.w3.org/TR/CSS2/visuren.html#block-boxes
For those you would add paragraph styles and hard returns, and don't forget to support the class attributes.
Dirk
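One possible way to handle class attributes is to turn each opening tag into a style name. The sketch below is not taken from the linked script; the function `styleNameForTag` and the "tag.class" naming scheme are assumptions made purely for illustration, and only the first class name is used.

```javascript
// Hypothetical naming scheme: "<tag>" or "<tag>.<class>" becomes the style name.
// Block-level tags map to paragraph styles, everything else to character styles.
function styleNameForTag(openTag) {
    var blockTags = { p: true, dl: true, dt: true, dd: true };
    // Tag name followed by optional attributes; closing tags do not match.
    var m = /^<([a-z][a-z0-9]*)\b([^>]*)>$/i.exec(openTag);
    if (!m) {
        return null;
    }
    var name = m[1].toLowerCase();
    // Accept class="..." or class='...'.
    var cls = /\bclass\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(m[2]);
    var raw = cls ? (cls[1] || cls[2] || "") : "";
    // Use only the first class name when several are listed.
    var first = /\S+/.exec(raw);
    var className = first ? first[0] : "";
    return {
        kind: blockTags.hasOwnProperty(name) ? "paragraph" : "character",
        name: className ? name + "." + className : name
    };
}
```

In InDesign the result could then be looked up with `doc.paragraphStyles.itemByName(name)` or `doc.characterStyles.itemByName(name)`, checking `isValid` before applying it and creating the style if it is missing. That lookup is only a suggestion and has not been tested on CS5.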