How to batch convert PDF to XML or flat text

New Here, May 09, 2016

How do I batch convert PDFs to XML or flat text? I have about 450 files that I want to convert to XML or flat text data. I have Adobe Acrobat Pro 10.0.

I tried using an Action with the output set to "Don't save changes", and I also tried saving to a different folder, but neither is working.

I used the "Execute JavaScript" command with this code:

this.exportXFAData({ cPath: this.path.replace(/\pdf$/i, "xml"), bXDP: false});

Using "Don't save changes": there was no visible change, no timestamp change, and no new files were generated.

Using "Save to folder": after each run, it generated another PDF in the folder, not an XML or text document.

Topics: Acrobat SDK and JavaScript, Windows
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert, May 10, 2016

Are these LCD (LiveCycle Designer) forms? Are there any error messages in the JS Console when you run the Action?

New Here, May 10, 2016

No, these are not forms; they are just PDFs. There are no errors; it is just not creating the files as expected.

LEGEND, May 10, 2016

Have you carefully read the Acrobat JS API Reference entry for exportXFAData? You may need to specify the aPackets parameter, which also means changing bXDP to true.
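
For reference, a minimal sketch of such a call. It assumes the document really is an XFA (LiveCycle) form, since exportXFAData has nothing to export otherwise. The helper name exportXfaPackets is hypothetical, the "datasets" packet is only an example choice, and the .xdp extension follows the API reference's note that the output file should be an XDP when bXDP is true.

```javascript
// Hypothetical helper: export selected XFA packets from an open XFA form.
// `doc` is the Acrobat Doc object (pass `this` from an "Execute JavaScript" step).
function exportXfaPackets(doc, packets) {
    // Swap the trailing .pdf (any case) for .xdp, since bXDP is true.
    var outPath = doc.path.replace(/\.pdf$/i, ".xdp");
    doc.exportXFAData({
        cPath: outPath,
        bXDP: true,        // aPackets is only honored when bXDP is true
        aPackets: packets  // e.g. ["datasets"]; an example choice, not a requirement
    });
    return outPath;
}

// Only runs inside Acrobat, where the global `app` object exists.
if (typeof app !== "undefined") {
    exportXfaPackets(this, ["datasets"]);
}
```

On a PDF with no XFA data this would not be expected to produce a useful file, which matches the behavior described in the question.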

LEGEND, May 10, 2016

Did you try to run your line of JavaScript in the Acrobat JavaScript console on an open form created with Acrobat with completed form fields?

It worked without errors for me.

New Here, May 10, 2016

I am not sure what you mean by "open form created with Acrobat with completed form fields." If you are asking whether my PDFs were generated from an Adobe form, then no. My PDFs were created from emails. I am trying to recover the history of some reporting that was only sent by email over the last year and a half. I need to get them all combined in any readable format, and I will write some code around that to get the data into the ideal format for my JavaScript webpage.

When I actually run this...

First file...

After it processes, I just get another PDF file instead of the XML file.

Community Expert, May 10, 2016

If the file doesn't contain any actual form fields, then you can't extract any form data from it.

What you can do is convert the entire file to a text file using the saveAs method, setting the cConvID parameter to "com.adobe.acrobat.plain-text".
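
A minimal sketch of that suggestion, meant for the "Execute JavaScript" step of an Action. The helper name saveAsPlainText is hypothetical, and writing the .txt file next to the original PDF is an assumption about where the output should go.

```javascript
// Hypothetical helper: save the open PDF as plain text next to the original.
// `doc` is the Acrobat Doc object (pass `this` from an "Execute JavaScript" step).
function saveAsPlainText(doc) {
    // Swap the trailing .pdf (any case) for .txt.
    var outPath = doc.path.replace(/\.pdf$/i, ".txt");
    doc.saveAs({
        cPath: outPath,
        cConvID: "com.adobe.acrobat.plain-text" // conversion ID named in the reply above
    });
    return outPath;
}

// Only runs inside Acrobat, where the global `app` object exists.
if (typeof app !== "undefined") {
    saveAsPlainText(this);
}
```

Because the script writes the text file itself, the Action's own output setting can stay on "Don't save changes".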

New Here, May 11, 2016

Can you give me step-by-step instructions? I have some coding background, but not with JavaScript. I found the solution I have been attempting by searching the internet; it came from another question on the Adobe site, and I just tried it.

I actually just want one single readable file with all the data, and I will pull out what I need from it. Whether it's text or XML doesn't really matter.

Here is what my PDFs look like. They were extracted from the emails in my inbox.

LEGEND, May 11, 2016

Why not just use the "Save as" and use the "XML 1.0" format option?

You can do this in an Action by setting the output options.
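
For anyone who prefers a script over the output options, a rough equivalent can be sketched with saveAs. The helper name saveAsXml is hypothetical, and the conversion ID "com.adobe.acrobat.xml-1-00" is an assumption about how the "XML 1.0" format is registered; printing app.fromPDFConverters in the JavaScript console shows the IDs an installation actually offers.

```javascript
// Assumed conversion ID for the "XML 1.0" export format; verify it against
// the list in app.fromPDFConverters before relying on it.
var XML_CONV_ID = "com.adobe.acrobat.xml-1-00";

// Hypothetical helper: save the open PDF as XML next to the original.
// Returns the output path, or null if the converter is not available.
function saveAsXml(doc, availableConverters) {
    if (availableConverters.indexOf(XML_CONV_ID) === -1) {
        return null; // this installation does not list the assumed converter
    }
    var outPath = doc.path.replace(/\.pdf$/i, ".xml");
    doc.saveAs({ cPath: outPath, cConvID: XML_CONV_ID });
    return outPath;
}

// Only runs inside Acrobat, where the global `app` object exists.
if (typeof app !== "undefined") {
    saveAsXml(this, app.fromPDFConverters);
}
```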

New Here, May 11, 2016

I am sure I could. I am just really inexperienced here, and I really could use your help. Is that what I type in the window? Unfortunately, I have this wonderful tool available to me, but I do not really know how to use it and I don't really know JavaScript. My experience has been Unix-based programming, which is WAYYYYYY different.


"SAVE AS XML1.0"

If you could provide a sample, that would be a great help.

LEGEND, May 12, 2016

Create a new Action. Select the "Save & Export" commands, then select the "Save" option. In the "Output Options" pop-up, select "Export file(s) to alternate format", then select "XML 1.0" from the "Export to:" drop-down box.

[Attached screenshot: exportXML.jpg]
