Fixing truncated HTML
Hi all,
I'm having an issue which is proving very hard to solve, and I can't find a solution anywhere. I'm SURE that someone must have had this issue before:
User-entered text is formatted with a JavaScript WYSIWYG text editor (tinyMCE), saved into a db and displayed on some pages - that works great. The problem is that one page shows an excerpt (the first 100 or 500 characters) of each entry, and if the text is truncated just after an open tag, the formatting runs on across the rest of the whole page.
For example, if the break comes after a <strong> tag and before the closing </strong>, the whole rest of the page will appear bold. I need some way to track the opening tags and their positions, make sure each one has a matching close tag (excluding self-closing tags like <img />), and write in any missing close tags in the correct position. This is the example HTML I'm using; it could be chopped off at any point:
<div class="outerHolder">
<h3>Test <a>heading</a></h3>
<p>This is a paragraph, with a <strong>bold</strong> word in. </p>
<ul>
<li>list item 1</li>
<li>list item 2</li>
<li>list item 3</li>
</ul>
<table>
<tr>
<td>tablecell1</td>
<td>tablecell2</td>
<td>tablecell3</td>
</tr>
</table>
</div>
I'm sure the solution is somewhere along the lines of building an array of open tags and then an array of close tags, and then comparing them somehow. I just can't see a way to get the positions correct.
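To make the idea concrete, here is a rough sketch of the kind of thing I imagine, written in Python purely for illustration. Instead of two separate arrays it keeps a single stack of open tags: each opening tag is pushed, each closing tag pops its match, and whatever is left at the end gets closed in reverse order, which is what should put the closing tags in the right position. The names close_open_tags and VOID_TAGS and the simple regular expression are my own assumptions, not a tested solution; it does not handle comments, script blocks, or a literal < or > inside text or attribute values.

```python
import re

# Assumption: elements that never take a closing tag (list may be incomplete).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# A complete tag: optional "/", the tag name, then everything up to ">".
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>")


def close_open_tags(fragment: str) -> str:
    """Append closing tags for every element still open in a truncated fragment."""
    # If the cut landed in the middle of a tag (e.g. '<a href="x'), drop that stub.
    last_lt = fragment.rfind("<")
    if last_lt > fragment.rfind(">"):
        fragment = fragment[:last_lt]

    stack = []  # names of currently open elements, innermost last
    for match in TAG_RE.finditer(fragment):
        closing, name, rest = match.group(1), match.group(2).lower(), match.group(3)
        # Skip void elements and XHTML-style self-closing tags like <img />.
        if name in VOID_TAGS or rest.rstrip().endswith("/"):
            continue
        if closing:
            # Pop back to the matching opener; ignore stray closing tags.
            if name in stack:
                while stack.pop() != name:
                    pass
        else:
            stack.append(name)

    # Close what is still open, innermost element first.
    return fragment + "".join(f"</{name}>" for name in reversed(stack))


if __name__ == "__main__":
    print(close_open_tags("<p>This is a paragraph, with a <strong>bo"))
```

Note that this only repairs the tags after the cut has been made. If the excerpt length is meant to count visible characters rather than markup, the truncation step itself would also need to skip over tags, which this sketch does not attempt.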
