Using JavaScript to crop a PDF, then OCR the cropped area to extract a number

Aug 31, 2017

I've been able to use the getPageNthWord function to extract invoice numbers from most PDF documents, but now I am encountering image documents that have varying numbers of words (post-OCR) before the desired invoice information, because different addresses cause the word count to change. I was once told that you can crop the document to the rough area where the invoice number should be, OCR that area, and then grab the invoice number. I think I can handle the OCR part and the grab-the-nth-word part. What I can't find are any examples of cropping with JavaScript and then restoring the document to its original full size. I need to process batches of documents this way.
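For anyone after the same thing, here is a minimal sketch of the crop-and-restore step, assuming the Acrobat JavaScript Doc methods getPageBox and setPageBoxes behave as documented. The helper names cropAllPages and restoreAllPages and the example rectangle are made up for illustration, and the OCR step is left out. Whether OCR and getPageNthWord actually honor the crop box is something I have not confirmed, so treat that as an assumption to test on one document first.

```javascript
// Sketch only: cropAllPages / restoreAllPages are hypothetical helper names.
// Rectangles are [left, top, right, bottom] in points (rotated user space),
// the same shape that doc.getPageBox returns.

// Crop every page to `rect` and return the original crop boxes
// so they can be put back afterwards.
function cropAllPages(doc, rect) {
    var saved = [];
    for (var p = 0; p < doc.numPages; p++) {
        saved.push(doc.getPageBox({ cBox: "Crop", nPage: p }));
        doc.setPageBoxes({ cBox: "Crop", nStart: p, nEnd: p, rBox: rect });
    }
    return saved;
}

// Restore each page's crop box from the array returned by cropAllPages.
function restoreAllPages(doc, saved) {
    for (var p = 0; p < saved.length; p++) {
        doc.setPageBoxes({ cBox: "Crop", nStart: p, nEnd: p, rBox: saved[p] });
    }
}

// Possible use inside Acrobat (the rectangle is an assumed example area):
//   var saved = cropAllPages(this, [350, 750, 600, 650]);
//   ... run OCR and getPageNthWord here ...
//   restoreAllPages(this, saved);
```

Saving the boxes page by page, rather than assuming one page size, is meant to keep the restore correct for batches where page sizes differ between documents.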