Participant
September 11, 2022
Question

BAD_PDF Error in Extract API


See the attached PDF. The Extract API is throwing an error, “BAD_PDF - Unable to extract content”, with no additional information. What is wrong with the PDF? How can I program my API calls to fix the error, or to skip the pages that trigger it and keep processing? Thanks.


1 reply

Participant
October 26, 2022

I ran into a similar issue. Looking at the attached PDF, I noticed that the pages containing tables have a different width from the other pages. In that case the API is quite likely to fail. A workaround is to preprocess the PDF before passing it to the API: split it into sub-parts so that every page within a given sub-part has the same size, then send each sub-part separately. This may resolve the issue.
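
Here is a rough sketch of that preprocessing step in Python, in case it helps. It rests on a few assumptions of mine: I use the third-party pypdf library (nothing in this thread names a tool), I compare pages by their MediaBox and ignore page rotation, and `extract_fn` is a hypothetical placeholder for however you call the Extract API on one file. The helper names are made up for illustration. The last function also covers the "keep processing" part of the question by recording a failure for a sub-part and moving on to the next one.

```python
from pathlib import Path


def group_pages_by_size(sizes, tolerance=1.0):
    """Group consecutive page indices whose (width, height) match.

    `sizes` is a list of (width, height) tuples in PDF points.
    A page joins the current run if it is within `tolerance` points
    of the first page of that run; otherwise it starts a new run.
    """
    runs = []
    for index, (width, height) in enumerate(sizes):
        if runs:
            first_w, first_h = sizes[runs[-1][0]]
            if abs(width - first_w) <= tolerance and abs(height - first_h) <= tolerance:
                runs[-1].append(index)
                continue
        runs.append([index])
    return runs


def split_pdf_by_page_size(pdf_path, out_dir, tolerance=1.0):
    """Write one PDF per run of same-sized pages and return the new paths."""
    # Imported here so the grouping helper works without pypdf installed.
    from pypdf import PdfReader, PdfWriter  # third-party: pip install pypdf

    reader = PdfReader(pdf_path)
    sizes = [
        (float(page.mediabox.width), float(page.mediabox.height))
        for page in reader.pages
    ]

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    chunk_paths = []
    for number, run in enumerate(group_pages_by_size(sizes, tolerance)):
        writer = PdfWriter()
        for index in run:
            writer.add_page(reader.pages[index])
        chunk_path = out_dir / f"chunk_{number:03d}.pdf"
        with open(chunk_path, "wb") as handle:
            writer.write(handle)
        chunk_paths.append(chunk_path)
    return chunk_paths


def extract_chunks(chunk_paths, extract_fn):
    """Run `extract_fn` on each sub-part; record failures and keep going.

    `extract_fn` is a hypothetical stand-in for your own Extract API call.
    """
    results, failures = {}, {}
    for path in chunk_paths:
        try:
            results[path] = extract_fn(path)
        except Exception as exc:  # e.g. a BAD_PDF error surfaced by your client
            failures[path] = exc
    return results, failures
```

You would call `split_pdf_by_page_size("input.pdf", "chunks")` and pass the returned paths to `extract_chunks` along with your own extraction function. Anything that ends up in `failures` tells you which sub-part the API rejected, which at least narrows down the problem pages.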