Acrobat/Reader: CosStringValue returns garbage when the file name contains Russian letters

Question

Hello, I am trying to retrieve an attachment's name, but when the name contains Russian letters the Adobe functions return garbage.

Usage:

ACCB1 ASBool ACCB2 AdobePluginHelper::GetEmbeddedFiles(CosObj obj, CosObj value, void *clientData)
{
    auto adobe = CAdobePlugin::GetAdobeMethods();
    auto attachedFiles = (vector<string>*)clientData;

    PDFileAttachment fileAttachment = adobe->PdFileAttachmentFromCosObj(value);

    //Grab the file's name using the Cos object dictionary and the File Specification String key.
    ASTCount len = 0;
    std::string sFileName(adobe->CosStringValueFromCosObject(adobe->CosDictObjGet(value, adobe->AtomFromString("F")), &len));

    if (!sFileName.empty())
    {
        attachedFiles->push_back(sFileName);
    }

    return true;
}

void AdobePluginHelper::GetAttachedFiles(PDDoc document, vector<string>& attachedFiles)
{
    auto adobe = CAdobePlugin::GetAdobeMethods();

    PDNameTree nameTree = adobe->PdDocGetNameTree(document, adobe->AtomFromString("EmbeddedFiles"));
    
    if (adobe->PdNameTreeIsValid(nameTree))
    {
        //Apply the enum function to the nametree so it can iterate through, extracting the attachments.
        adobe->PdNameTreeEnum(nameTree, &GetEmbeddedFiles, &attachedFiles);
    }
}

char* Implementation::Adobe::CosStringValueFromCosObject(CosObj obj, ASTCount* nBytes)
{
    return CosStringValue(obj, nBytes);
}

CosObj Implementation::Adobe::CosDictObjGet(CosObj dict, ASAtom key)
{
    return CosDictGet(dict, key);
}

In other cases it works fine.

Karl Heinz Kremer · Accepted Answer

Why are you not using the PDFileAttachment.PDFileAttachmentGetFileName() method? It returns an ASText object, which you can then process using the correct encoding.

Test Screen Name · Answer

Also: a Cos string holding text is in PDFDocEncoding, which has no Cyrillic in its character table, so Cyrillic could never appear that way. Most likely the name is stored as Unicode (UTF-16BE with a leading FE FF byte order mark), as described under “PDF strings” in the PDF Reference. But I agree: where there is an ASText API, that will be easier.
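If you do stay with the raw Cos string, a rough sketch of the decoding step could look like the following. It assumes the bytes start with the FE FF byte order mark and are UTF-16BE; the helper names `AppendUtf8` and `PdfUtf16TextStringToUtf8` are made up for this illustration and are not part of the Acrobat SDK. Note that it takes an explicit length: UTF-16BE data contains zero bytes, so building a `std::string` from the `char*` alone stops at the first one.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Appends one Unicode code point to `out`, encoded as UTF-8.
static void AppendUtf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper (not an SDK call): converts a PDF text string that
// begins with the FE FF byte order mark from UTF-16BE to UTF-8.
// Returns false when the marker is missing or the byte count is odd; in that
// case the string is presumably PDFDocEncoding and needs different handling.
bool PdfUtf16TextStringToUtf8(const char* bytes, std::size_t len, std::string& utf8)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(bytes);
    if (len < 2 || (len % 2) != 0 || p[0] != 0xFE || p[1] != 0xFF)
        return false;

    utf8.clear();
    for (std::size_t i = 2; i + 1 < len; i += 2) {
        std::uint32_t cp = (static_cast<std::uint32_t>(p[i]) << 8) | p[i + 1];

        if (cp >= 0xD800 && cp <= 0xDFFF) {
            // Surrogate range: combine a valid high/low pair, otherwise
            // substitute U+FFFD (replacement character).
            std::uint32_t low = 0;
            if (cp <= 0xDBFF && i + 3 < len)
                low = (static_cast<std::uint32_t>(p[i + 2]) << 8) | p[i + 3];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                i += 2; // the low surrogate has been consumed as well
            } else {
                cp = 0xFFFD;
            }
        }
        AppendUtf8(utf8, cp);
    }
    return true;
}
```

In the enumeration callback from the question, this would be called with the pointer returned by CosStringValue and the byte count it writes to `len`, rather than constructing `std::string` from the pointer alone.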
