Use PaperCapture API in .NET

Report · Jul 08, 2010

Hi all,

I would like to know how to use the PaperCapture API from either VB.NET or C#. Please help me with this.

Rgds,
Parthasarathy.S

Report · Jul 09, 2010

PaperCapture.api is the old name for the OCR capabilities in Acrobat implemented via a plugin. There is no access to it from VB/C#.

Report · Jul 11, 2010

So there is no way to run OCR programmatically in VB.NET or C#?

Report · Jul 12, 2010

Using ONLY VB/C#, that is correct.

Report · Jul 12, 2010

Just tell me how to do that. Please help with some samples.

Report · Jul 12, 2010

In the Acrobat SDK there is documentation and samples around the AVCommand set of APIs.

Report · Jul 12, 2010

Just tell me how to add these APIs to my .NET code. Please give some samples.

Report · Jul 13, 2010

I don't know how many times I have to repeat this - so let's try this one last time.

You CAN NOT call Acrobat's OCR from .NET code.

You can ONLY call it from C/C++ code, when written as a plugin to Acrobat.

Report · Jul 06, 2011

Hi dear Acrobat-API freaks,

from C# you can call OCR like this:

acroAppC.MenuItemExecute("Cpt:CapturePages");

But this only pops up the dialog. There is no way to automate it further.

I am about to write an Acrobat plug-in, "Bangert-OCR", in C++ (based on the SDK samples), which I call from C#:

acroAppC.MenuItemExecute("ADBE:Bangert-OCR");

For the plug-in, that call works.

But I cannot find the C/C++ OCR API, only this one, and there I cannot find any C/C++ OCR functionality.

What I can find is the PaperCapture plug-in with its various DLLs (OCRLibraryInf.dll etc.) under D:\Program Files\Adobe\Acrobat 9.0\Acrobat\plug_ins\PaperCapture

In those DLLs I found the following exported functions:

Perhaps the do_ocr function could do the job. Where is the documentation for using those functions in C++?

Best regards

Axel Arnold Bangert - Herzogenrath 2011

Report · Jul 06, 2011

The only way to automate OCR in Acrobat is via a plugin (written in C/C++). That can use the AVCommand APIs to do its work.

Report · Jul 06, 2011

Dear Mr. "Irosenth",

in the C++ PlugIn the OCR-Dialog Call is:
ACCB1 void ACCB2 MyPluginCommand(void *clientData)
{
    AVCommand volatile cmd = NULL;
    ASAtom cmdName;
    cmdName=ASAtomFromString("Cpt:CapturePages");
    cmd=AVCommandNew(cmdName);
    AVCommandStatus status = AVCommandExecute (cmd);
}

in C# Windows Forms Application the same OCR Dialog Call is:

acroAppC.MenuItemExecute("Cpt:CapturePages");

That works in both cases. But in both cases there is no documented way to automate it further.

But if we use, in C++, the undocumented Acrobat PaperCapture function do_ocr, which resides in drs832.dll and which you can see in picture 2 of my post above, we can do the job and automate the OCR completely. What we need are the parameters of this do_ocr function.

Best regards

Axel Arnold Bangert - Herzogenrath 2011

Report · Jul 07, 2011

Please review the SDK documentation for the AVCommand APIs and how to use them to automate OCR.

Bringing up the dialog is NOT the correct way.

If a function is undocumented, that means it's UNSUPPORTED as well...

Report · Jul 07, 2011

Dear Mr. Ironsenth,

are you telling me that there is a way to pass the necessary parameters for cmdName=ASAtomFromString("Cpt:CapturePages"); by passing the parameters as shown in the SDK help?

Only a general example for ConfigureCommandParameters is shown here.

But there it says:

"...An AVCommand object's parameter set is specific to each command..."

And I can't find the specific parameter set properties for the command Cpt:CapturePages anywhere.

There are three categories of parameters:

1. input : required
2. configuration : optional
3. AVCommand : optional

Naturally I would have to pass the parameters listed below, but where can I find the exact properties of the required input parameters and of the configuration parameters (data type? pointer? etc.)?

input para -> pages : current page
config para -> ocr language : german
config para -> pdf outputstyle : searchable image
config para -> downsample images : 150 dpi

This should be documented in the "Acrobat and PDF Library API Overview", as Adobe does for many commands, for example GeneralInfo:

kDocInfoCmdKeyTitle : ASText
kDocInfoCmdKeySubject : ASText
kDocInfoCmdKeyAuthor : ASText
kDocInfoCmdKeyKeywords : ASText
kDocInfoCmdKeyBinding : ASText
kDocCmdKeyLeaveAsIs : ASCab

But there is no entry for CapturePages. Why was it left out?

Best regards

Axel Arnold Bangert - Herzogenrath 2011

P.S.:

This code (for the "GeneralInfo" command) runs without any error, but it does not show the programmatically changed docTitleValue or docSubjectValue. The dialog pops up with the unchanged original values:

#ifndef MAC_PLATFORM
#include "PIHeaders.h"
#endif

const char* MyPluginExtensionName = "ADBE:Bangert-OCR";

ACCB1 ASBool ACCB2 PluginMenuItem(char* MyMenuItemTitle, char* MyMenuItemName,
                                  bool bUnderAcrobatSDKSubMenu);

ACCB1 ASBool ACCB2 MyPluginSetmenu()
{
    return PluginMenuItem("Bangert-OCR", "ADBE:Bangert-OCR", true);
}

ACCB1 void ACCB2 MyPluginCommand(void *clientData)
{
    PDDoc pddoc;
    AVDoc avDoc = AVAppGetActiveDoc();
    PDDoc pdDoc = AVDocGetPDDoc (avDoc);

    ASAtom cmdName;
    AVCommand cmd;
    cmdName = ASAtomFromString ("GeneralInfo");
    cmd = AVCommandNew (cmdName);

    ASCab inputs = ASCabNew();
    ASCabPutPointer (inputs, kAVCommandKeyPDDoc, PDDoc, pdDoc, NULL);
    if (kAVCommandReady == AVCommandSetInputs (cmd, inputs))
    {
        AVAlertNote("Input erfolgreich");
    }
    ASCabDestroy (inputs);

    ASCab config = ASCabNew();
    // kAVCommandUIErrorsOnly); kAVCommandUISilent);
    ASCabPutInt (config, "UIPolicy", kAVCommandUIInteractive);
    if (kAVCommandReady == AVCommandSetConfig (cmd, config))
    {
        AVAlertNote("Config erfolgreich");
    }
    ASCabDestroy (config);

    const char *docTitleValue = "Titel";
    const char *docSubjectValue = "Betreff";
    const char *docAuthorValue = "Autor";
    const char *docKeywordsValue = "Stichwoerter";
    const char *docBindingValue = "Bindung";

    ASCab params = ASCabNew();
    ASText text = ASTextNew();
    ASTextSetEncoded(text, docTitleValue,(ASHostEncoding)PDGetHostEncoding());
    ASCabPutText (params, docTitleValue, text);
    text = ASTextNew();
    ASTextSetEncoded(text, docSubjectValue,(ASHostEncoding)PDGetHostEncoding());
    ASCabPutText(params, docSubjectValue, text);
    ASCab newLeaveAsIs = ASCabNew();
    ASCabPutBool(newLeaveAsIs, false, false );
    ASCabPutCab(params, false, newLeaveAsIs);
    if (kAVCommandReady == AVCommandSetParams(cmd, params))
    {
        AVAlertNote("Params erfolgreich");
    }
    ASCabDestroy (params);

    AVCommandStatus status = AVCommandExecute (cmd);
    return;
}

ACCB1 ASBool ACCB2 MyPluginIsEnabled(void *clientData)
{
    return true;
}

Report · Jul 18, 2011

Dear Mr. Ironsenth,

the only way to find out the certainly undocumented parameters of the AVCommand -> cpt:CapturePages would be to try to get the parameters from a running instance:

// Invoke AVCommandGetParams
params = ASCabNew();
AVCommandGetParams(cmd,params);

But in effect this shows NOTHING !!!! So what about your incredible repetition of ... bla, bla, bla ... AVCommand.

I think that you really know nothing about this topic.

Axel Arnold Bangert - Herzogenrath 2011

Report · Jul 18, 2011

Cpt:CapturePages is the MENU ITEM. It is NOT the name of the AVCommand.

The AVCommand's name is "PaperCapture", as documented in the SDK.
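
Building on the plug-in code posted earlier in this thread, a call to the command under that name could look roughly like the sketch below. It is only a sketch under assumptions: it has to be built as an Acrobat plug-in against the SDK headers, the function name RunPaperCapture is invented for illustration, and nothing in this thread confirms that the silent UI policy actually suppresses the OCR settings dialog. No command-specific parameters are set, because their keys are not given here.

```cpp
#ifndef MAC_PLATFORM
#include "PIHeaders.h"
#endif

// Hypothetical helper (name invented for this sketch): runs the
// "PaperCapture" AVCommand on the document in the active window.
static void RunPaperCapture()
{
    AVDoc avDoc = AVAppGetActiveDoc();
    if (avDoc == NULL)
        return;                     // no document open, nothing to OCR
    PDDoc pdDoc = AVDocGetPDDoc(avDoc);

    // Use the AVCommand name, not the menu item name "Cpt:CapturePages".
    AVCommand cmd = AVCommandNew(ASAtomFromString("PaperCapture"));
    if (cmd == NULL)
        return;                     // command not registered

    // Inputs: tell the command which document to work on.
    ASCab inputs = ASCabNew();
    ASCabPutPointer(inputs, kAVCommandKeyPDDoc, PDDoc, pdDoc, NULL);
    AVCommandStatus status = AVCommandSetInputs(cmd, inputs);
    ASCabDestroy(inputs);

    // Configuration: request no UI. ASSUMPTION: whether PaperCapture
    // honors this for its settings dialog is not confirmed in this thread.
    if (status == kAVCommandReady)
    {
        ASCab config = ASCabNew();
        ASCabPutInt(config, "UIPolicy", kAVCommandUISilent);
        status = AVCommandSetConfig(cmd, config);
        ASCabDestroy(config);
    }

    if (status == kAVCommandReady)
        status = AVCommandExecute(cmd);

    if (status != kAVCommandDone)
        AVAlertNote("PaperCapture did not complete");

    AVCommandDestroy(cmd);
}
```

Such a function would be called from the plug-in's menu callback (MyPluginCommand in the code above), which in turn can be triggered from C# via MenuItemExecute.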

Report · Jul 18, 2011

Dear Mr. Ironsenth,

thank you - it works without any parameters. This is the working code I am using. I can now call the plug-in from C# with acroAppC.MenuItemExecute("ADBE:Bangert-OCR");

But I can't find the documentation in the SDK. Could you please post the exact link to the corresponding "PaperCapture" SDK documentation here?

Axel Arnold Bangert - Herzogenrath 2011

Report · Jul 20, 2011

I believe the possible parameters are provided in one of the header files
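
Besides searching the headers, the keys a command exposes can also be listed at run time. A rough sketch, assuming it runs inside a plug-in built against the SDK and that the command actually fills the cabinet passed to AVCommandGetParams (it may well leave it empty); the names DumpCabKey and DumpPaperCaptureParams are invented for illustration:

```cpp
#ifndef MAC_PLATFORM
#include "PIHeaders.h"
#endif

// Called once per key in the cabinet; shows the key name in an alert box.
static ACCB1 ASBool ACCB2 DumpCabKey(ASCab theCab, const char* theKey,
                                     ASCabValueType itsType, void* clientData)
{
    AVAlertNote(theKey);
    return true;                    // continue with the next key
}

// Hypothetical helper: lists the parameter keys of the "PaperCapture" command.
static void DumpPaperCaptureParams()
{
    AVCommand cmd = AVCommandNew(ASAtomFromString("PaperCapture"));
    if (cmd == NULL)
        return;                     // command not registered

    ASCab params = ASCabNew();
    AVCommandGetParams(cmd, params);    // the command fills the cabinet

    ASCabEnumProc proc = ASCallbackCreateProto(ASCabEnumProc, &DumpCabKey);
    ASCabEnum(params, proc, NULL);
    ASCallbackDestroy((void*)proc);

    ASCabDestroy(params);
    AVCommandDestroy(cmd);
}
```

If no alert appears, the cabinet came back empty, and the header files remain the only place to look.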

Report · Jul 20, 2011

Hi,

I'm working with this API but I have a problem: every time I try to open a PDF, Acrobat asks me to set the OCR settings.

Can anyone help me run OCR without opening this dialog?

I'm using C++.

Report · Jul 20, 2011

You don't mention HOW you are calling Acrobat, so we can't really help you

Report · Jul 20, 2011

Hi SDK-man,

at the moment I do it with SendKeys, but that is naturally not good coding, because it works at most 90% of the time:

...
acroAppC.Show();
System.Windows.Forms.SendKeys.Send("{ENTER}");
acroAppC.MenuItemExecute("ADBE:Bangert-OCR");
...

That is why I am searching for the parameters. 'Chief Ironside' said we could find them in a header, so we'll search there.

Best regards

Axel Arnold Bangert - Herzogenrath 2011

Report · Jul 20, 2011

I use the same code that AxelArnoldBangert put in the last post, something like this:

AVDoc avDoc = AVAppGetActiveDoc();
PDDoc pDDoc = AVDocGetPDDoc(avDoc);
ASAtom cmdName;
AVCommand cmd;
cmdName=ASAtomFromString("PaperCapture");
cmd=AVCommandNew(cmdName);
AVCommandStatus status =AVCommandExecute (cmd);

Report · Jul 20, 2011

What version of Acrobat are you using?

Report · Jul 20, 2011

Acrobat 9

Report · Jul 20, 2011

Thanks, my friend AxelArnoldBangert,

but I need more help from you. I'm a Java developer and I don't have any knowledge of C++, so I can't understand how I can use your nice code!

So please help me find out where the System.Windows package is and how I can use it, because VS can't see it.

Many thanks to you.

Report · Jul 20, 2011

Hi SDK,

in Java that would be something like this:

// needs: import java.awt.event.KeyEvent;
public void keyPressed(KeyEvent e)
{
    int key = e.getKeyCode();
    if (key == KeyEvent.VK_ENTER)
    {
        System.out.println("ENTER pressed");
    }
}

Best regards

Axel Arnold Bangert - Herzogenrath 2011