getPageNumWords in Acrobat SDK is not detecting words with '-' and '_'
We are using the Acrobat SDK to identify and link words in our PDF files. Using the getPageNumWords method we get the number of words on each page, and we use the quads and rect of each word to create the link.

    for (int page = 0; page < numPages; page++) { // for each page
        // number of words on this page
        object objNumWords = COMUtils.invokeMethod(jso, "getPageNumWords", page);
        if (objNumWords == null)
            throw new PDFProcessingException("Acrobat API Error. Cannot access doc.getPageNumWords()");
        int numWords = ConvertUtils.getInt(objNumWords);
        // Other logic goes here
    }

When the PDF file contains a word like ABCD-EFGH or ABCD_EFGH, it is counted and returned as two separate words, ABCD and EFGH, instead of one word.
Is it a bug in Acrobat SDK or are we not using it as it is intended?
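In case the splitting turns out to be by design, here is a minimal sketch of a workaround we are considering: re-joining the pieces ourselves. It assumes (not verified) that fetching each word through getPageNthWord with bStrip set to false keeps the trailing '-' or '_' on the first piece, so that a piece ending in one of those characters can be glued to the next one. WordJoiner and Join are hypothetical names made up for this illustration; they are not part of the Acrobat SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the Acrobat SDK.
public static class WordJoiner
{
    // Characters that glue two adjacent pieces into one word (our assumption).
    private static readonly char[] Joiners = { '-', '_' };

    // rawWords: the page's words in order, with punctuation kept
    // (assumed to come from getPageNthWord(page, n, false)).
    // Returns each merged word plus the first and last original word index.
    public static List<(string Text, int First, int Last)> Join(IReadOnlyList<string> rawWords)
    {
        var result = new List<(string Text, int First, int Last)>();
        int i = 0;
        while (i < rawWords.Count)
        {
            string text = rawWords[i];
            int first = i;
            // Keep appending the next piece while the text ends directly
            // with a joiner character (no whitespace after it).
            while (i + 1 < rawWords.Count
                   && text.Length > 0
                   && Array.IndexOf(Joiners, text[text.Length - 1]) >= 0)
            {
                i++;
                text += rawWords[i];
            }
            result.Add((text.Trim(), first, i));
            i++;
        }
        return result;
    }
}
```

The First and Last indexes could then be used to fetch the quads of every piece (for example with getPageNthWordQuads) and combine them into one link rectangle. A piece followed by whitespace after the hyphen, as in "ABCD - EFGH", is left alone.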
BTW, we are using Acrobat SDK 1.1.
What am I missing?
Thanks,
Tippu
