getPageNthWord combines pairs of words instead of reporting each word individually
When I use getPageNthWord to read the words in a PDF, wherever a sentence wraps to the next line, the last word on one line and the first word on the next line are returned concatenated as a single word.
For example, assume the line wraps after the 7:
6 CFR 210.5(c), 7
CFR
Six words should be returned:
6 CFR 210.5 c 7 CFR
Instead, only 5 words are returned:
6 CFR 210.5 c 7CFR
The 7 is concatenated with the CFR, which is incorrect and makes searching fail when I search for "7 CFR".
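To make the failure concrete, here is a minimal sketch in plain JavaScript of the kind of adjacent-word search that breaks. The name `containsPhrase` and the two hard-coded word lists are made up for illustration; the lists only mirror the example above and are not real API output.

```javascript
// Hypothetical helper (not part of the Acrobat API): returns true if the
// words of `phrase` appear as consecutive entries in `words`.
// The comparison is exact and case-sensitive.
function containsPhrase(words, phrase) {
  const target = phrase.trim().split(/\s+/);
  for (let start = 0; start + target.length <= words.length; start++) {
    let match = true;
    for (let k = 0; k < target.length; k++) {
      if (words[start + k] !== target[k]) {
        match = false;
        break;
      }
    }
    if (match) return true;
  }
  return false;
}

// Word lists copied from the example above.
const expectedWords = ["6", "CFR", "210.5", "c", "7", "CFR"];
const actualWords = ["6", "CFR", "210.5", "c", "7CFR"];

console.log(containsPhrase(expectedWords, "7 CFR")); // true
console.log(containsPhrase(actualWords, "7 CFR"));   // false
```

With the six-word list the phrase is found; with the five-word list it is not, because "7CFR" equals neither "7" nor "CFR".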
I assume this is a bug in the SDK. The call below is from the JavaScript example I am using:
object[] getPageNthWordParam = { p, i };
word = (String)T.InvokeMember(
    "getPageNthWord",
    BindingFlags.InvokeMethod |
    BindingFlags.Public |
    BindingFlags.Instance,
    null, jsObj, getPageNthWordParam);
If indeed this is a bug, where can I report it?
Is there a workaround, such as disabling sentence wrapping?
Or is there another way to read individual words besides getPageNthWord?
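A possible stopgap on the search side, sketched in plain JavaScript under stated assumptions: ignore word boundaries altogether by joining the returned words and stripping whitespace from the search phrase. `findPhraseLoose` is a hypothetical helper, not an Acrobat API, and it does not fix the word list itself. It also has an obvious drawback: it can produce false positives, since "7 CFR" would also match inside "17 CFR".

```javascript
// Hypothetical helper (not part of the Acrobat API): finds `phrase` in
// `words` while ignoring word boundaries, so a merged token such as
// "7CFR" still matches "7 CFR".
// Returns the index of the word in which the match starts, or -1.
function findPhraseLoose(words, phrase) {
  const needle = phrase.replace(/\s+/g, "");
  if (needle.length === 0) return -1;

  // Concatenate all words and remember the offset where each one starts.
  const starts = [];
  let joined = "";
  for (const w of words) {
    starts.push(joined.length);
    joined += w;
  }

  const pos = joined.indexOf(needle);
  if (pos === -1) return -1;

  // Map the character position back to a word index.
  let idx = 0;
  while (idx + 1 < starts.length && starts[idx + 1] <= pos) idx++;
  return idx;
}

// Merged word list copied from the example in this post.
const mergedWords = ["6", "CFR", "210.5", "c", "7CFR"];
console.log(findPhraseLoose(mergedWords, "7 CFR")); // 4
```

The returned index points at the merged word, so the caller can still locate the hit on the page, but a real fix or a supported setting would be preferable.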
As it stands I cannot use my program, and it would be a lot easier to convert the PDF to DOCX and not use Acrobat at all.
