PDTextSelect Object not Getting Special Characters like (≤ Ω Β ∞ ≠ ≥).

Community Beginner ,
Jul 19, 2018

When I extract the selected text, some special characters such as (≤ Ω Β ∞ ≠ ≥) come out as junk characters.

Please suggest how I can get all of the characters.

Thanks.

TOPICS
Acrobat SDK and JavaScript

Adobe Employee ,
Jul 20, 2018

Can you post the actual code fragment you are using?

Community Beginner ,
Jul 20, 2018

TextSelect = AVPageViewTrackText(pageView, xHit, yHit, NULL);
PDDoc pdDoc = AVDocGetPDDoc(AVAppGetActiveDoc());
PDPage pdPage = AVPageViewGetPage(pageView);
int iPage = PDPageGetNumber(pdPage);
BKMCRot = PDPageGetRotate(pdPage);

if (TextSelect != NULL)
{
    PDTextSelectEnumText(TextSelect, ASCallbackCreateProto(PDTextSelectEnumTextProc, BKMCTextEnumProc), NULL);
    PDTextSelectEnumQuads(TextSelect, ASCallbackCreateProto(PDTextSelectEnumQuadProc, BKMCTextEnumQuadProc), NULL);
    AVPageViewHighlightText(pageView, TextSelect);
    ASBool bselection = AVDocSetSelection(AVAppGetActiveDoc(), ASAtomFromString("BMCreatorText"), TextSelect, true);
}

ACCB1 ASBool ACCB2  BKMCTextEnumProc(void* procObj, PDFont pdFont, ASFixed size, PDColorValue Color, char *buff, ASInt32 asLen)
{
    //for getting Font Size
    int iFontSize = FixedRoundToInt16(size);
    csFontSize.Format(L"%d", iFontSize);

    //For Getting Text Color
    long Textcolor = CPDFLink::GetRGBFromPDColor(*Color);
    long lRValue, lGValue, lBValue;
    CString csRVal, csGVal, csBVal;
    CColor::COLORREFToRGB(Textcolor, lRValue, lGValue, lBValue);
    csRVal.Format(L"%d", lRValue);
    csGVal.Format(L"%d", lGValue);
    csBVal.Format(L"%d", lBValue);
    csColorValue = (L"R=") + csRVal + (" G=") + csGVal + (" B=") + csBVal;

    //for Getting Font name
    char fontNameBuf[PSNAMESIZE];
    PDFontGetName(pdFont, fontNameBuf, PSNAMESIZE);
    csFontname = (CString)fontNameBuf;

    //For multiple words we need to add each time.
    CString csChar;
    for (int iIndex = 0; iIndex < asLen; iIndex++)
    {
        char cBuff = buff[iIndex];
        if (cBuff != 13 && cBuff != 10)
        {
            csChar = cBuff;
            csBKMCKeyword += csChar;
        }
    }
    buff = "";
    return true;
}

Adobe Employee ,
Jul 20, 2018

Two things…

1 – Use PDTextSelectEnumTextUCS, as that will return UCS (aka Unicode) encoded information, so you can be sure to get all text in a standardized fashion.

2 – Be careful with CString, as (IIRC) it’s not great for arbitrary encodings.
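
To illustrate the first point, here is a minimal sketch of reading the buffer as 16-bit code units instead of single bytes. It assumes the UCS callback delivers big-endian 16-bit code units, optionally preceded by a byte-order mark; that byte order is an assumption to verify against the SDK reference. DecodeUcsBuffer is a hypothetical helper, not an SDK call, and it uses only standard C++ so it can be tried outside a plug-in.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the Acrobat SDK): rebuilds 16-bit code
// units from a raw byte buffer. Assumes big-endian byte order and an
// optional leading byte-order mark (0xFE 0xFF).
std::u16string DecodeUcsBuffer(const char* buff, std::size_t byteLen)
{
    std::u16string text;
    std::size_t i = 0;

    // Skip a big-endian byte-order mark if one is present.
    if (byteLen >= 2 &&
        static_cast<unsigned char>(buff[0]) == 0xFE &&
        static_cast<unsigned char>(buff[1]) == 0xFF)
    {
        i = 2;
    }

    // Combine each pair of bytes into one code unit; a trailing odd byte
    // is ignored.
    for (; i + 1 < byteLen; i += 2)
    {
        const unsigned hi = static_cast<unsigned char>(buff[i]);
        const unsigned lo = static_cast<unsigned char>(buff[i + 1]);
        const char16_t unit = static_cast<char16_t>((hi << 8) | lo);

        // Drop CR and LF, as the posted callback does.
        if (unit != u'\r' && unit != u'\n')
            text.push_back(unit);
    }
    return text;
}
```

Inside the enumeration callback, a call such as DecodeUcsBuffer(buff, asLen) could stand in for the per-byte loop (assuming asLen is a byte count), and the result would be appended to a wide string rather than narrowed through a char.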

Community Beginner ,
Jul 23, 2018

Thanks for the reply.

How can I extract special characters using a PDWordFinder object?

When I extract "≤ Ω Β ∞ ≠ ≥", it is displayed as "= O . 8 . =".

I am using the following code:

ACCB1 ASBool ACCB2 SearchTextBasedonSelectedFont(PDWordFinder wObj, PDWord wInfo, ASInt32 pgNum, void* clientData)
{
    CString csText;// = "";
    CString csColor, csBlueText;
    COLORREF wordColor = NULL;
    char buf[256];
    bool NonAlphaNum = false, LeadingPunc = false, LeadingSpace = false;
    ASInt32 liStyleIndex;
    bool bcolorValue = false;
    static int FirstOccurence = 0;
    PDStyle pdWordStyle;
    PDColorValueRec color;
    color.space = PDDeviceRGB;
    color.value[0] = color.value[1] = color.value[2] = color.value[3] = fixedZero;
    liStyleIndex = 0;
    ASFixedQuad quad;
    ASInt16 llAttr, liNumQuads;
    long llCurSequence;
    llCurSequence = ++(*(long*)clientData);

    try
    {
        //Get the word color
        if ((pdWordStyle = PDWordGetNthCharStyle(wObj, wInfo, liStyleIndex)) != NULL)
        {
            PDStyleGetColor(pdWordStyle, &color);
        }

        // To get the word in buffer
        PDWordGetString(wInfo, buf, 256);
        csText = buf;
        PDStyleGetFont(pdWordStyle);
        PDStyle aoPDWordStyle = PDWordGetNthCharStyle(wObj, wInfo, 0);
        liNumQuads = PDWordGetNumQuads(wInfo);
    }
}

Thanks..

Most Valuable Participant ,
Jul 24, 2018

If you use the UCS word finder, you will get a Unicode word string. This must be treated as an array of WCHAR. You need to understand Unicode encoding.
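
As a small illustration of what working with 16-bit units involves, the sketch below encodes UTF-16 code units as UTF-8 using only standard C++. Utf16ToUtf8 is a hypothetical helper, not part of the Acrobat SDK, and it assumes the word has already been copied into a std::u16string (on Windows, WCHAR has the same 16-bit width).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encodes UTF-16 code units as UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string Utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        char32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF)
        {
            // High surrogate: combine it with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF)
            {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            }
            else
            {
                cp = 0xFFFD;
            }
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
        {
            cp = 0xFFFD; // stray low surrogate
        }

        if (cp < 0x80)
        {
            out += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The point of the sketch is that a character such as ≤ (U+2264) occupies one 16-bit unit but three UTF-8 bytes, so copying it through a single char cannot preserve it.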

Community Beginner ,
Jul 24, 2018

Thank you for the reply.

I will look into Unicode encoding concepts.

Community Beginner ,
Jul 25, 2018

Hi Test Screen,

I am now using PDDocCreateWordFinderUCS to create the word finder. Then I get the ASText from each PDWord, and then I encode the ASText to get all of the characters.

Is this the correct approach?
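
One way to check whether this approach preserves the characters is to print the code points of each extracted word and compare them with what is expected (for example U+2264 for ≤). The sketch below is a hypothetical debugging helper in standard C++; it assumes the word is already available as UTF-16 code units, and how those are obtained from the ASText is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical debugging helper: lists the code points of a UTF-16 string
// as "U+XXXX" items separated by single spaces.
std::string DescribeCodePoints(const std::u16string& text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        unsigned long cp = text[i];

        // Combine a valid surrogate pair into one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < text.size() &&
            text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF)
        {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (text[i + 1] - 0xDC00);
            ++i;
        }

        char item[16];
        std::snprintf(item, sizeof(item), "U+%04lX", cp);
        if (!out.empty())
            out += ' ';
        out += item;
    }
    return out;
}
```

If the extraction is lossless, the string "≤ Ω ∞" without the spaces should be reported as U+2264 U+03A9 U+221E; seeing U+003D U+004F U+0038 instead would mean the text was narrowed to "=O8" somewhere before this point.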
