Problem with getting word count in TLF text
Hi,
I want to get the word count of my TLF text, but I am not able to handle the case of spaces.
I am using the findNextWordBoundary method of ParagraphElement, as shown below:
private function countWords( para : ParagraphElement ) : void
{
    var wordBoundary:int = 0;
    var prevBoundary:int = 0;
    while ( wordBoundary != para.findNextWordBoundary( wordBoundary ) )
    {
        // If the value is greater than 1, then it's a word, otherwise it's a space.
        if ( para.findNextWordBoundary( wordBoundary ) - wordBoundary > 1 )
        {
            wordCount += 1;
        }
        prevBoundary = wordBoundary;
        wordBoundary = para.findNextWordBoundary( wordBoundary );
        // If the value is greater than 1, then it's a word, otherwise it's a space.
        if ( wordBoundary - prevBoundary > 1 )
        {
            var s:String = para.getText().substring( prevBoundary, wordBoundary );
            lenTotal += s.length;
        }
    }
}
I have two issues here.
Suppose my string is, for example: Hi, I am writing in "TLF". and I want to get its word count.
1) Take the token Hi, from that string. para.getText().substring( prevBoundary, wordBoundary ) returns Hi, i.e. without the comma. The same happens with "TLF": each " is treated as a separate word instead of the whole "TLF" being treated as a single word. Why doesn't it scan up to the next space? To me that would be the ideal behavior: everything up to the next space should count as one word.
2) I have applied the condition if ( wordBoundary - prevBoundary > 1 ) to detect spaces, i.e. if the difference is <= 1, the span is treated as a space. But with this check I miss single-character words. For example, in "Hi, This is a string" the word 'a' is ignored too.
I could have added, alongside the length check, a test that the string between prevBoundary and wordBoundary is " " (i.e. a space), but that is still a problem, because single-character words such as a, & and I would then be ignored.
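To make clear what I mean by "compute till spaces", here is a rough, untested sketch of the behavior I am after. It skips findNextWordBoundary entirely and just counts runs of non-whitespace characters in the paragraph text. countWordsByWhitespace is a name I made up for illustration; it is not part of TLF, and the sketch assumes that splitting on whitespace alone is an acceptable definition of a word.

```actionscript
// Hypothetical helper (not a TLF API): counts whitespace-delimited tokens
// in a paragraph. Intended to live in the same class as countWords above.
private function countWordsByWhitespace( para : ParagraphElement ) : int
{
    var text:String = para.getText();
    // \S+ matches each maximal run of non-whitespace characters, so
    // punctuation stays attached to its word (Hi, and "TLF". are one
    // token each) and single-character words such as a, & and I are kept.
    var tokens:Array = text.match( /\S+/g );
    // Guard against a null result in case nothing matched.
    return tokens ? tokens.length : 0;
}
```

For the example string Hi, I am writing in "TLF". I would expect this to give 6, which is the count I am looking for.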
So, now I am stuck with this issue and need some help from you guys.
Thanks
