Participant
February 6, 2023
Question

How to get the character '±' correctly?


I use the function 'AVDocGetPageText' to read page text that contains the '±' character, but it keeps returning '?' in its place.

My code is as follows:

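// Build a text selection over the rectangle on page 0 and make it the current selection.
PDTextSelect textSelect = PDDocCreateTextSelect(pdDoc, 0, &rect);
ASBool ret = AVDocSetSelection(m_avDoc, ASAtomFromString("Text"), textSelect, true);
// Request the selection as "Text"; TextSelectProc receives each buffer.
ASAtom format = ASAtomFromString("Text");
string title = "";
AVDocGetPageText(m_avDoc, vecPages[i], textSelect, format, TextSelectProc, &title);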

void TextSelectProc(ASAtom format, void *buf, AVTBufferSize bufLen, void *clientData)
{
    // Inspecting the memory at buf shows the byte 0x3f ('?') where '±' is expected.
}


2 replies

MikelKlink
Participating Frequently
February 7, 2023

You have chosen "Text" as format. Thus, TextSelectProc is called twice for your selection. Have you verified that on both calls the '±' character is transformed to the replacement character?

Participant
February 7, 2023

Yes, I checked both calls, and each one returned 0x3f.

Legend
February 7, 2023

Maybe there is a problem with the encoding inside the PDF. Are you able to copy and paste the text, including the plus-minus character, into other apps? It would also help if you could share the PDF.

Thom Parker
Community Expert
February 6, 2023

Have you tested with other selections to ensure that other selected text is returned correctly?  Did you check the entire buffer to ensure that ASCII text is returned?  Have you checked the same character codes using a different method?  
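One way to check the entire buffer is to dump every byte as hex rather than looking at a single memory location. The following is only a minimal sketch: `HexDump` is a hypothetical helper, not part of the Acrobat SDK, and it assumes the `buf` and `bufLen` arguments of the callback describe a plain byte buffer.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not an Acrobat SDK function): formats a raw buffer
// as space-separated, two-digit uppercase hex bytes.
//
// For reference, '±' is U+00B1:
//   Latin-1 / Windows-1252 : B1
//   UTF-8                  : C2 B1
//   UTF-16 big-endian      : 00 B1
// A byte of 3F is the ASCII question mark '?'.
std::string HexDump(const void* buf, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(buf);
    std::string out;
    char tmp[4];
    for (std::size_t i = 0; i < len; ++i) {
        std::snprintf(tmp, sizeof tmp, "%02X", static_cast<unsigned>(p[i]));
        if (i != 0) out += ' ';
        out += tmp;
    }
    return out;
}
```

Inside the callback, something like `HexDump(buf, static_cast<std::size_t>(bufLen))` could be logged on every call, so the bytes from each call can be compared against the values listed in the comment.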

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
Participant
February 7, 2023

Thank you for your reply. Please check the attached file.

The memory returned is shown below

The character returned by AVDocGetPageText is incorrect.

Thom Parker
Community Expert
February 7, 2023

It looks like several special characters may be off. I have no idea why.
