Hello,
I have written a plugin that extracts the highlighted text from a PDF document.
However, the text I get from the PDF document keeps its original formatting, for example line breaks.
I want the text without any formatting.
Here is my code for getting the highlight-annotated text:
The string stream contains the text with all of its formatting.
How can I get the text without any formatting?
Thanks
1. You do not check the string length before strcpy. This is a serious bug.
2. If you don't want newlines you can strip them or change them to spaces.
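Both points above can be handled in one pass. The following is only a rough sketch, not the original poster's code: copy_without_newlines is a hypothetical helper name, and it assumes the line breaks really are \r or \n bytes in a NUL-terminated C string. It never writes more than dst_size bytes, and it turns each line break into a space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of any SDK): copies src into dst,
   replacing CR and LF with spaces. dst_size is the full capacity of
   dst, including room for the terminating NUL. Returns the number of
   characters written, not counting the terminator. */
size_t copy_without_newlines(char *dst, size_t dst_size, const char *src)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);
    if (dst_size == 0) {
        return 0;               /* no room even for the terminator */
    }
    /* Stop one byte early so the terminator always fits. */
    while (src[i] != '\0' && i + 1 < dst_size) {
        char c = src[i];
        dst[i] = (c == '\n' || c == '\r') ? ' ' : c;
        i++;
    }
    dst[i] = '\0';
    return i;
}
```

If the source is longer than the buffer, the copy is truncated rather than overflowing, which is the check that strcpy alone does not give you.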
I am not able to replace the new line with any other character.
I am comparing each word with \n and \r, but it seems that the PDF uses a different character for a new line.
I don't see \n or \r in the text when I compare.
Please tell me how I can strip them.
Thanks.
What are the hex values?
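In case it helps, here is one way to look at the hex values, as a minimal sketch. hex_dump is a hypothetical helper name, and it assumes the extracted text is a NUL-terminated C string. It writes each byte as two hex digits followed by a space, so any unexpected byte where the line break appears becomes visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes each byte of text into out as two
   uppercase hex digits followed by a space. Stops when text ends or
   out is full. Returns the number of bytes of text that were dumped. */
size_t hex_dump(const char *text, char *out, size_t out_size)
{
    size_t i = 0;
    size_t used = 0;

    assert(text != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    /* Each byte needs three characters ("XX ") plus the terminator. */
    while (text[i] != '\0' && used + 4 <= out_size) {
        /* Cast to unsigned char so bytes >= 0x80 are not sign-extended. */
        snprintf(out + used, out_size - used, "%02X ", (unsigned char)text[i]);
        used += 3;
        i++;
    }
    return i;
}
```

Dump the string around the place where the line break shows up, print the result with printf or view it in the debugger, and post the bytes you see there.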
Sorry, I did not understand. Which hex values do you mean?
Hello,
I did some debugging. The hex value is 0x85.
You mentioned an ellipsis (...) in some other post, I think? If so, we should consider that in Windows-1252, which is often loosely called Latin1, 0x85 is an ellipsis (a single character for three dots). It is not a layout character.
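If the goal is plain ASCII output, one option is to expand that byte into three dots. This is only a sketch under the assumption that the text really is Windows-1252 encoded, which the thread does not confirm. expand_ellipsis is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies src into dst, replacing every 0x85 byte
   (ellipsis, assuming Windows-1252 text) with three ASCII dots.
   dst_size is the full capacity of dst including the terminator.
   Output is truncated rather than overflowed. Returns the number of
   characters written, not counting the terminator. */
size_t expand_ellipsis(char *dst, size_t dst_size, const char *src)
{
    size_t used = 0;
    size_t i;

    assert(dst != NULL && src != NULL);
    if (dst_size == 0) {
        return 0;
    }
    for (i = 0; src[i] != '\0'; i++) {
        if ((unsigned char)src[i] == 0x85) {
            if (used + 3 >= dst_size) {
                break;          /* no room for "..." plus terminator */
            }
            dst[used++] = '.';
            dst[used++] = '.';
            dst[used++] = '.';
        } else {
            if (used + 1 >= dst_size) {
                break;          /* no room for one char plus terminator */
            }
            dst[used++] = src[i];
        }
    }
    dst[used] = '\0';
    return used;
}
```

The output can be up to three times as long as the input, so size the destination buffer accordingly.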