Split document by Content and save with content filename

New Here, Apr 02, 2018

I'm a total novice with Acrobat JavaScript and I'm trying to split a large PDF of invoices into separate invoice files, each named by its unique invoice number.

On each page the invoice number comes directly after the words "Invoice No." and is just a 6-digit number. However, the words "Invoice No." do not appear at the same word position on each page, so I am confused about using getPageNthWord, because the index differs every time. Can anyone help me with a script for this?

TOPICS
Acrobat SDK and JavaScript , Windows

Community Expert, Apr 02, 2018

What do you mean by "it differs all the time"? You provide it with a page number and an index number and it will return a word.

There's no guarantee as to the order of those words, though, which is what makes such scripts quite tricky.

I have a lot of experience in developing tools that do exactly what you described, so if you're interested I could create it for you (for a small fee). I'll need to see some sample pages, though. You can email them to me at try6767 at gmail.com.

New Here, Apr 03, 2018

What I mean is that there is a header above the invoice number with the name and address, and depending on the size of the address, the invoice number gets shifted along, so it never comes after the same number of words on each page. It does appear on the same line of each page, however, if that helps.

LEGEND, Apr 02, 2018

The obvious approach (not necessarily successful) is to call getPageNthWord for each word in turn, checking each one. Once you see “Invoice” followed by “No” (no dot), there is a chance the next word is the number. This is not hard to try if you are a programmer experienced in creating algorithms and turning them into code. Don’t expect to be able to do this by googling...
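
The word-by-word check described above could look roughly like the sketch below. The function name `findInvoiceNumber` and its `doc` parameter are hypothetical, introduced only for illustration; `getPageNumWords` and `getPageNthWord` are the Acrobat Doc methods. It assumes the label is extracted as the two words "Invoice" and "No", which may not hold for every PDF, since word order and splitting are not guaranteed.

```javascript
// Sketch: return the word that follows "Invoice" + "No" on one page, or null.
// `doc` is an Acrobat Doc object (for example `this` in the JavaScript console).
function findInvoiceNumber(doc, pageIndex) {
    var wordCount = doc.getPageNumWords(pageIndex);
    for (var w = 0; w + 2 < wordCount; w++) {
        // The third argument (bStrip) is left at its default of true, so
        // punctuation is stripped and "No." is expected to come back as "No".
        var first = doc.getPageNthWord(pageIndex, w);
        var second = doc.getPageNthWord(pageIndex, w + 1);
        if (first === "Invoice" && second === "No") {
            // Candidate only: the caller should still check it looks like a number.
            return doc.getPageNthWord(pageIndex, w + 2);
        }
    }
    return null; // label not found on this page
}
```

From the console this would be called as `findInvoiceNumber(this, 0)` for the first page. A `null` result means the label was not found in that form, so the page needs a closer look.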

New Here, Apr 04, 2018

This is what I have so far. It works if the invoice number is the 25th word, but that position changes on each page. Any help appreciated. The words preceding the actual number are always "I N V O I C E No."

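try {
    for (var i = 0; i < this.numPages; i++) {
        // The word index is hard-coded, so this only works when the
        // invoice number sits at this exact position on the page
        var j = 25;
        var invoice_no = this.getPageNthWord(i, j);

        // Save page i as its own file, named after the extracted word
        this.extractPages({
            nStart: i,
            cPath: "/c/temp/" + invoice_no + ".pdf"
        });
    }
}
catch (e) { console.println("Aborted: " + e) }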

Community Expert, Apr 04, 2018

You can't use the word number, as that's not always the same, as you wrote.

Instead, you need to use another loop to iterate over all the words in each page, looking for the ones before the text you're interested in.
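
A minimal sketch of that inner loop, combined with the page-splitting loop from the earlier post, might look like the following. The helper names (`getPageWords`, `findInvoiceNumberInWords`, `splitByInvoiceNumber`) are hypothetical. It assumes the letter-spaced label "I N V O I C E No." is extracted as single-letter words followed by "No", and that the number is the very next word; neither is guaranteed, so pages that do not match are skipped and reported rather than saved under a wrong name.

```javascript
// Collect every word on a page, in the order Acrobat reports them.
function getPageWords(doc, pageIndex) {
    var words = [];
    var count = doc.getPageNumWords(pageIndex);
    for (var w = 0; w < count; w++) {
        words.push(doc.getPageNthWord(pageIndex, w));
    }
    return words;
}

// Return the 6-digit word that follows "INVOICE" + "No", or null if not found.
function findInvoiceNumberInWords(words) {
    for (var w = 1; w + 1 < words.length; w++) {
        if (words[w] !== "No") continue;
        // Join up to 7 preceding words so that both "Invoice" and the
        // letter-spaced "I N V O I C E" end up as a string ending in "INVOICE".
        var label = words.slice(Math.max(0, w - 7), w).join("").toUpperCase();
        if (label.slice(-7) === "INVOICE" && /^\d{6}$/.test(words[w + 1])) {
            return words[w + 1];
        }
    }
    return null;
}

// Save each page as <outputFolder><invoice number>.pdf.
// Returns the indexes of pages where no invoice number was found.
function splitByInvoiceNumber(doc, outputFolder) {
    var skipped = [];
    for (var p = 0; p < doc.numPages; p++) {
        var invoiceNo = findInvoiceNumberInWords(getPageWords(doc, p));
        if (invoiceNo === null) {
            skipped.push(p); // do not save a page under a guessed name
            continue;
        }
        doc.extractPages({ nStart: p, nEnd: p, cPath: outputFolder + invoiceNo + ".pdf" });
    }
    return skipped;
}
```

It would be run as `splitByInvoiceNumber(this, "/c/temp/")`, using the folder from the earlier post. Saving with `cPath` normally requires a privileged context such as the JavaScript console or a batch action. Checking the candidate against a 6-digit pattern is a design choice to avoid naming files after an unrelated word when the extraction order is not what was expected.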
