How do you collect text data from filled forms in local PDF files?

New Here ,
Oct 04, 2020

I have about 80 local PDF files containing forms that students have filled in. I would like to extract the text data from them so that I can easily score their answers. How can I do that with the latest Acrobat Pro? I need to do it on local files.

Correct answer by try67 | Most Valuable Participant

You didn't mention your version of Acrobat, but it can be done using the Merge Data Files into Spreadsheet command, which is under Tools - Prepare Form (and then under More Form Options, in some versions).

TOPICS
PDF forms

Adobe Employee ,
Oct 05, 2020

Hi there,

We are sorry for the trouble. As described, you want to extract data from the filled PDF form.

Please try the following steps and see if that helps:

  1. In Acrobat, open the response file and select the data to export.
  2. In the left navigation panel, click Export, and then choose Export Selected.
  3. In the Select Folder To Save File dialog box, specify a name, location, and file format (CSV or XML) for the form data, and click Save.

For more information, please look at the help page https://helpx.adobe.com/in/acrobat/using/collecting-pdf-form-data.html#export_user_data_from_a_respo...

Regards,

Amal

New Here ,
Oct 05, 2020

The PDF files were collected via a web form as file attachments, so the individual users have not submitted the form. In this case, how do I create and initialize the response file you mentioned? Thank you very much for your help.

Most Valuable Participant ,
Oct 05, 2020

You didn't mention your version of Acrobat, but it can be done using the Merge Data Files into Spreadsheet command, which is under Tools - Prepare Form (and then under More Form Options, in some versions).

New Here ,
Oct 05, 2020

Thank you very much. It is what I was looking for and it worked, but all the Japanese characters in the form fields are broken after exporting to a CSV file.

Most Valuable Participant ,
Oct 05, 2020

The encoding of the file created is UTF8, which might not cover Japanese characters. In order to do that you would need to use some other tool, I'm afraid. Maybe try exporting files as TXT or FDF files, and then merge them using a different utility. Another option is to use a script to do it, instead of the built-in Merge Data Files command.
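
As a rough illustration of the "merge them using a different utility" idea, here is a minimal sketch in plain JavaScript (not Acrobat-specific). It assumes each exported text is tab-delimited with one header line and one value line; the function names and the sample strings are hypothetical and not part of any Acrobat API.

```javascript
// Split one exported text into its field names and values.
// Assumption: line 1 holds tab-separated names, line 2 the matching values.
function parseExport(text) {
  var lines = text.replace(/^\uFEFF/, "").split(/\r?\n/);
  return {
    names: lines[0].split("\t"),
    values: (lines[1] || "").split("\t")
  };
}

// Merge several exports into one table: one shared header, one row per file.
// Fields missing from a file are left empty in that file's row.
function mergeExports(texts) {
  var records = texts.map(parseExport);
  var header = [];
  records.forEach(function (rec) {
    rec.names.forEach(function (name) {
      if (header.indexOf(name) === -1) header.push(name);
    });
  });
  var rows = records.map(function (rec) {
    return header.map(function (name) {
      var idx = rec.names.indexOf(name);
      return idx === -1 || rec.values[idx] === undefined ? "" : rec.values[idx];
    }).join("\t");
  });
  return [header.join("\t")].concat(rows).join("\r\n");
}

// Hypothetical sample data standing in for two exported files.
var merged = mergeExports([
  "Name\tAnswer1\r\nStudent A\t日本語 Japanese 日本語",
  "Name\tAnswer2\r\nStudent B\tyes"
]);
```

Reading the files from disk and writing the merged result (as UTF-8) is left out here, since that part depends on the environment the script runs in.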

New Here ,
Oct 05, 2020

Thank you again. The text encoding looks to be UTF-8, because I could extract the field text by using PyPDF2, which is a Python module for handling PDF files. For the moment, PyPDF2 is good enough for my purpose, but your suggestion to use the native Acrobat functionality was much easier, except for the Japanese character problem.

If I find a fix for my problem, I will post it in this thread for someone else.

Most Valuable Participant ,
Oct 05, 2020

Can you share a sample file with fields that have Japanese text in them?

New Here ,
Oct 05, 2020

Here is a sample file.

https://www.dropbox.com/s/faupq7447hb84b9/sample.pdf?dl=0

"Answer1" and "Answer2" should be "日本語 Japanese 日本語" but they are converted to "... Japanese ...".
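
For what it's worth, UTF-8 itself can represent this sample text, so the loss presumably happens somewhere else in the export. A quick check in plain JavaScript, outside Acrobat, assuming an environment that provides `TextEncoder` and `TextDecoder` (for example Node.js or a browser):

```javascript
// Encode the sample string to UTF-8 bytes and decode it again.
var sample = "日本語 Japanese 日本語";
var bytes = new TextEncoder().encode(sample);            // always UTF-8
var decoded = new TextDecoder("utf-8").decode(bytes);

// If UTF-8 could not hold these characters, the round trip would differ.
var roundTrips = decoded === sample;
console.log(roundTrips, bytes.length);
```

Each of the six kanji takes three bytes in UTF-8, while the ASCII part takes one byte per character.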

try67 LATEST
Most Valuable Participant ,
Oct 05, 2020

When exporting it in UTF-8 explicitly it does seem to work correctly. I guess the default encoding is just plain ANSI, then. You can use this code I wrote to export it properly (you can run it from the JS Console, or from an Action, or something like that):

 

// Collect the name and value of every field, skipping buttons and signatures.
var names = [];
var values = [];
for (var i=0; i<this.numFields; i++) {
	var f = this.getField(this.getNthFieldName(i));
	if (f==null) continue;
	if (f.type=="button" || f.type=="signature") continue;
	names.push(f.name);
	values.push(f.valueAsString);
}

// Write a tab-delimited header line and value line into a temporary
// attachment as UTF-8, export that attachment, then remove it again.
var doName = this.documentFileName.replace(/\.pdf$/i, "_data.txt");
this.createDataObject(doName, "");
var s = names.join("\t") + "\r\n" + values.join("\t");
this.setDataObjectContents(doName, util.streamFromString(s, "utf-8"));
this.exportDataObject(doName);
this.removeDataObject(doName);
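
One thing to watch with the tab-delimited output above: a multiline text field, or a value containing a tab, would break the two-line layout. A minimal sketch of a guard, assuming it is acceptable to replace those characters with spaces; `cleanCell` and `toTsv` are hypothetical helper names, not part of the Acrobat API, and the sample values are made up.

```javascript
// Replace characters that would split a cell or a row in tab-delimited text.
function cleanCell(value) {
  return String(value).replace(/\r\n|\r|\n/g, " ").replace(/\t/g, " ");
}

// Build the same header-line plus value-line layout, with cleaned cells.
function toTsv(names, values) {
  return names.map(cleanCell).join("\t") + "\r\n" + values.map(cleanCell).join("\t");
}

// Hypothetical example: the second value contains a line break and a tab.
var line = toTsv(["Name", "Answer1"], ["Student A", "first line\nsecond\tpart"]);
```

In the script above, the result of `toTsv(names, values)` could be used in place of the string assigned to `s`.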
