
Filling out a form with metadata from another PDF

New Here, Apr 30, 2024

Hi! 

I work in product documentation and for every publication that we produce, we need to send a Print Specification PDF to the printer. The print specification needs to have the publication's title, page numbers, publication number and edition (found in the footer and last page), project, date, and a screenshot of the cover of the publication. 

I already get the date automatically, as well as the filename, which is the article number + edition. I also managed to add a button that inserts an image: it prompts me to select an image file, I select the publication PDF, and only the cover gets inserted, which works great.

However, I would like to be able to select the publication PDF file and extract all the other fields, hopefully at the same time. The Project needs to be inserted manually.

Is there a way to do this with Acrobat Pro?

TOPICS: Acrobat SDK and JavaScript, Windows
Community Expert, May 02, 2024

Hi,
That is possible by running an Action Wizard action from the publication file, which will fill in the specification file.

@+

Explorer, May 21, 2024

Yes, you can automate the extraction of metadata such as publication title, page numbers, publication number, and edition from a PDF using Acrobat Pro, along with a script to streamline the process. Here’s a step-by-step guide to achieve this:

Step 1: Prepare Your Environment

  1. Open Acrobat Pro: Launch Adobe Acrobat Pro on your computer.
  2. Access JavaScript Console: Go to Tools > JavaScript > JavaScript Console.

Step 2: JavaScript for Metadata Extraction

You can use JavaScript in Acrobat Pro to extract metadata. Below is a sample script that extracts the title, number of pages, and other specific text if they follow a pattern or are in certain locations within the PDF.

 

 

 

// Open the console in Acrobat Pro and paste the following script:

// Build the text of one page. getPageNthWord returns a single word,
// so the words are collected one by one and joined with spaces.
function getPageText(doc, pageNum) {
    var words = [];
    var wordCount = doc.getPageNumWords(pageNum);
    for (var j = 0; j < wordCount; j++) {
        // Passing false asks Acrobat to keep punctuation, so "Number:" keeps its colon
        words.push(doc.getPageNthWord(pageNum, j, false));
    }
    return words.join(" ");
}

// Function to get document metadata
function getDocumentMetadata(doc) {
    var docTitle = doc.documentFileName; // Use filename as title if no metadata title
    var numPages = doc.numPages; // Total number of pages
    var publicationNumber = "";
    var edition = "";

    // Try to get the title from the metadata if available
    if (doc.info.Title) {
        docTitle = doc.info.Title;
    }

    // Loop through the pages to find specific text for publication number and edition
    for (var i = 0; i < numPages; i++) {
        var pageText = getPageText(doc, i);
        if (pageText) {
            // \s+ tolerates the extra spaces introduced when the words are joined
            var match = pageText.match(/Publication\s+Number:\s*(\S+)/);
            if (match) {
                publicationNumber = match[1];
            }

            match = pageText.match(/Edition:\s*(\S+)/);
            if (match) {
                edition = match[1];
            }

            // Break if both fields are found
            if (publicationNumber && edition) {
                break;
            }
        }
    }

    console.println("Title: " + docTitle);
    console.println("Total Pages: " + numPages);
    console.println("Publication Number: " + publicationNumber);
    console.println("Edition: " + edition);
}

// Run the function on the active document ("this" in the console)
getDocumentMetadata(this);
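
Before running anything in Acrobat, the two patterns from the script above can be tried on a plain string. The following is only a sketch: the sample text and the helper names `firstMatch` and `parsePublicationInfo` are made up for illustration, and the patterns use `\s+` so that they are a little more tolerant of spacing. It uses `console.log`, which Node.js provides; inside Acrobat the equivalent call is `console.println`.

```javascript
// Hypothetical sample text; the real wording depends on the publication's layout.
var sampleText = "Installation Guide Publication Number: 1234567 Edition: 3";

// Return the first capture group of `pattern` in `text`, or "" when nothing matches.
function firstMatch(text, pattern) {
    var match = text.match(pattern);
    return match ? match[1] : "";
}

// Collect both values in one object so they can be copied into form fields later.
function parsePublicationInfo(text) {
    return {
        publicationNumber: firstMatch(text, /Publication\s+Number:\s*(\S+)/),
        edition: firstMatch(text, /Edition:\s*(\S+)/)
    };
}

var info = parsePublicationInfo(sampleText);
console.log(info.publicationNumber + " / " + info.edition);
```

If either value comes back empty, the label in the pattern probably does not match the wording in the publication, which is the first thing to adjust.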

Step 3: Run the Script

  1. Open the JavaScript Console: Press Ctrl + J to open the JavaScript Console in Acrobat Pro.
  2. Paste the Script: Copy and paste the script above into the console.
  3. Execute the Script: Select the whole script in the console, then press Ctrl + Enter (or Enter on the numeric keypad) to run it.

Step 4: Review the Output

The console will display the extracted information:

  • Title: The title of the document.
  • Total Pages: The total number of pages in the document.
  • Publication Number: Extracted from the document if a specific pattern is found.
  • Edition: Extracted from the document if a specific pattern is found.

Step 5: Automate Image Insertion

Since you already have a button to insert an image, you can combine this functionality with the metadata extraction script. Unfortunately, Adobe Acrobat Pro does not allow full automation of all tasks with JavaScript alone due to security restrictions, but you can streamline the process as much as possible.

Additional Tips:

  1. Refine Text Extraction: Adjust the text extraction part to fit the exact pattern or position of your publication number and edition.
  2. Batch Processing: For batch processing, you might need to look into more advanced scripts or third-party tools like Python with PyPDF2 or similar libraries.
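
As a rough illustration of the first tip, the labels can be kept in one place and turned into patterns, so that adapting the script to another layout means editing only the label text. Everything here is an assumption made for the example: the label wording, the sample text, and the helper names `escapeRegExp`, `buildLabelPattern` and `extractByLabels`.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a case-insensitive pattern that matches the label, an optional colon,
// and captures the next run of non-space characters.
function buildLabelPattern(label) {
    var words = label.split(/\s+/);
    var escaped = [];
    for (var i = 0; i < words.length; i++) {
        escaped.push(escapeRegExp(words[i]));
    }
    return new RegExp(escaped.join("\\s+") + "\\s*:?\\s*(\\S+)", "i");
}

// Hypothetical labels; replace them with the wording used in your footer.
var labels = { publicationNumber: "Publication No.", edition: "Edition" };

// Apply every label to the text and return the captured values ("" if missing).
function extractByLabels(text, labelMap) {
    var result = {};
    for (var key in labelMap) {
        if (labelMap.hasOwnProperty(key)) {
            var match = text.match(buildLabelPattern(labelMap[key]));
            result[key] = match ? match[1] : "";
        }
    }
    return result;
}

var found = extractByLabels("Publication No. 4471-B   edition: 2", labels);
console.log(found.publicationNumber + " / " + found.edition);
```

Escaping the label matters because a label such as "No." contains a dot, which would otherwise match any character.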

Example for Further Customization:

If your publication number and edition are always in the footer or specific pages, refine the script to target those areas. Here’s an enhanced part of the script for targeting specific pages:

 

 

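for (var i = numPages - 1; i >= 0; i--) { // Assuming footer info is on the last pages
    // getPageNthWord returns one word at a time, so assemble the page text first
    var words = [];
    for (var j = 0; j < this.getPageNumWords(i); j++) {
        words.push(this.getPageNthWord(i, j, false)); // false keeps punctuation such as ":"
    }
    var pageText = words.join(" ");
    var footerMatch = pageText.match(/Publication\s+Number:\s*(\S+)\s+Edition:\s*(\S+)/);
    if (footerMatch) {
        publicationNumber = footerMatch[1];
        edition = footerMatch[2];
        break;
    }
}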
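
The backwards search can also be tried outside Acrobat by treating each page as a plain string. This is a minimal sketch under that assumption; the function name `findFooterInfo` and the three sample pages are invented for the example.

```javascript
// Search the page texts from the last page backwards and stop at the first
// page whose text contains both labels in the expected order.
function findFooterInfo(pageTexts) {
    var pattern = /Publication\s+Number:\s*(\S+)\s+Edition:\s*(\S+)/;
    for (var i = pageTexts.length - 1; i >= 0; i--) {
        var match = pageTexts[i].match(pattern);
        if (match) {
            return { page: i, publicationNumber: match[1], edition: match[2] };
        }
    }
    return null; // nothing found on any page
}

// Hypothetical three-page document, one string per page.
var pages = [
    "Cover page",
    "Chapter 1 Publication Number: 1111 Edition: 1",
    "Back page Publication Number: 2222 Edition: 4"
];

var footer = findFooterInfo(pages);
console.log(footer ? footer.publicationNumber + " / " + footer.edition : "not found");
```

Returning `null` when nothing matches lets the calling code decide whether to leave the form field empty or ask for manual input.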

 

 

By following these steps and customizing the script as needed, you can automate much of the metadata extraction process for your print specification PDF.

 

New Here, May 22, 2024

This seems amazing! Thank you very much. I will try to use it and come back with the info 🙂 I am just learning how to automate things within Acrobat and the use of scripts, and I get excited to test them. 

Community Expert, May 21, 2024

Here is the proposal I prepared (I was waiting for a reply from the requester).
You have to create an Action Wizard action with this script:

// Open the specification form, located in the same folder as the publication
var otherDoc=app.openDoc("Specifications.pdf",this);
// Copy the document information into the form fields
otherDoc.getField("title").value=this.info.Title;
otherDoc.getField("fileName").value=this.documentFileName;
otherDoc.getField("date").value=util.printd("dd-mm-yyyy",new Date());
otherDoc.getField("project").value="Not found in the document!";
// Use the first page of the publication as the cover image
otherDoc.getField("frontPage").buttonImportIcon(this.path);
// Size of the first page, converted from points to millimetres
var pt2mm=25.4/72;
var aRect=this.getPageBox();
otherDoc.getField("dimensions").value=(Number(aRect[2])*pt2mm).toFixed(1)+" x "+(Number(aRect[1])*pt2mm).toFixed(1)+" mm";
otherDoc.getField("nbPages").value=this.numPages;
// Scan the words of the last page for the publication number and the edition
var p=this.numPages-1;
for (var i=0; i<this.getPageNumWords(p); i++) {
	console.println("OK : "+this.getPageNthWord(p, i, true)); // debug output, can be removed
	try {
		if (this.getPageNthWord(p, i, true)=="No" && this.getPageNthWord(p, i+1, true)=="Publication") {
			otherDoc.getField("article").value=this.getPageNthWord(p, i+2, true);
		} else if (this.getPageNthWord(p, i, true)=="Edition") {
			otherDoc.getField("edition").value=this.getPageNthWord(p, i+1, true)+" "+this.getPageNthWord(p, i+2, true);
			break;
		}
	} catch(e) {}
}
// Save the filled form under a new name, then close the publication
otherDoc.saveAs({
	cPath: otherDoc.path.replace(/\.pdf$/i," ("+this.info.Title+" - "+util.printd("dd mmmm yyyy",new Date())+").pdf")
});
this.closeDoc();
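
Two small pieces of the script above, the point-to-millimetre conversion and the output file name, can be checked on their own outside Acrobat. This is only a sketch: the helper names `formatPageSize` and `buildOutputPath` are invented here, the sample path is hypothetical, and the rectangle is assumed to be in the order [left, top, right, bottom], which is what the script's use of `aRect[2]` and `aRect[1]` suggests.

```javascript
var POINTS_TO_MM = 25.4 / 72; // 1 point = 1/72 inch, 1 inch = 25.4 mm

// rect is assumed to be [left, top, right, bottom] in points.
// Subtracting the opposite edges also handles boxes that do not start at 0.
function formatPageSize(rect) {
    var width = Math.abs(rect[2] - rect[0]) * POINTS_TO_MM;
    var height = Math.abs(rect[1] - rect[3]) * POINTS_TO_MM;
    return width.toFixed(1) + " x " + height.toFixed(1) + " mm";
}

// Insert " (title - date)" before the .pdf extension of the form's path.
// A replacer function is used so a "$" in the title is not treated specially.
function buildOutputPath(specPath, title, dateText) {
    var suffix = " (" + title + " - " + dateText + ").pdf";
    return specPath.replace(/\.pdf$/i, function () { return suffix; });
}

// US Letter is 612 x 792 points.
console.log(formatPageSize([0, 792, 612, 0]));
// Hypothetical path and title.
console.log(buildOutputPath("/c/docs/Specifications.pdf", "Manual", "21 May 2024"));
```

The original one-liner multiplies `aRect[2]` and `aRect[1]` directly, which gives the same result whenever the page box starts at the origin.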

The specification and publication files are in the same folder for this example.

Open a publication file.


Open the Action Wizard tool, then click the action you created.


Add all the other publication files for which you need to generate a specification file.


Then click on "Start". The action will generate all specification files.


The script must be adapted to your actual needs and to the actual layout of the publication files.
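
One possible way to make the word scan easier to adapt is to pass the label words and the number of words to read as parameters, and to test the logic on a plain array first. This is a sketch, not the method used in the action above: the helper names `indexOfSequence` and `valueAfterLabel`, the label order, and the sample words are all assumptions.

```javascript
// Return the index where the sequence `labelWords` starts in `words`, or -1.
function indexOfSequence(words, labelWords) {
    for (var i = 0; i + labelWords.length <= words.length; i++) {
        var matches = true;
        for (var j = 0; j < labelWords.length; j++) {
            if (words[i + j] !== labelWords[j]) {
                matches = false;
                break;
            }
        }
        if (matches) {
            return i;
        }
    }
    return -1;
}

// Return up to `count` words that follow the label, joined by spaces,
// or "" when the label is not present.
function valueAfterLabel(words, labelWords, count) {
    var start = indexOfSequence(words, labelWords);
    if (start === -1) {
        return "";
    }
    var from = start + labelWords.length;
    return words.slice(from, from + count).join(" ");
}

// Hypothetical words of a last page, already stripped of punctuation.
var lastPageWords = ["Publication", "No", "1234567", "Edition", "3", "2024"];

var article = valueAfterLabel(lastPageWords, ["Publication", "No"], 1);
var edition = valueAfterLabel(lastPageWords, ["Edition"], 2);
console.log(article + " / " + edition);
```

Because `slice` never reads past the end of the array, no try/catch is needed when a label happens to be the last word on the page.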

@+

New Here, May 22, 2024

It would be generous to call me a beginner when it comes to scripting, so I must admit I felt a bit lost with your first answer. I really really appreciate the lengthy, clear, and specific help here. Thank you very much! I will attempt it and get back on the results. 
