See the attached PDF. The Extract API is throwing an error, "BAD_PDF - Unable to extract content", with no additional information. What is wrong with the PDF? How can I program the API to fix the error, or to skip the pages that cause it and keep processing? Thanks.
I ran into a similar issue. Looking at the attached PDF, I noticed that the pages containing tables have a different width from the other pages. In such cases, the API is likely to fail. A workaround is to preprocess the PDF before passing it to the API: split it so that each sub-document contains only pages of the same size. This may resolve the issue.
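To illustrate the preprocessing idea, here is a minimal sketch in Python. It assumes the third-party `pypdf` library for reading and writing pages; the function names, the `tolerance` parameter, and the output file naming are my own and not part of the Extract API. It groups consecutive pages that share the same size and writes each group to its own file.

```python
def group_pages_by_size(sizes, tolerance=0.5):
    """Group consecutive page indices whose (width, height) match.

    sizes is a list of (width, height) tuples in PDF points.
    tolerance is an assumed allowance for rounding differences.
    """
    groups = []
    for index, (width, height) in enumerate(sizes):
        if groups:
            # Compare against the first page of the current group.
            first_w, first_h = sizes[groups[-1][0]]
            if abs(width - first_w) <= tolerance and abs(height - first_h) <= tolerance:
                groups[-1].append(index)
                continue
        groups.append([index])
    return groups


def split_pdf_by_page_size(input_path, output_prefix):
    """Write one PDF per run of same-sized pages; return the output paths."""
    # Assumption: pypdf is installed (pip install pypdf).
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(input_path)
    sizes = [
        (float(page.mediabox.width), float(page.mediabox.height))
        for page in reader.pages
    ]

    output_paths = []
    for part, indices in enumerate(group_pages_by_size(sizes), start=1):
        writer = PdfWriter()
        for i in indices:
            writer.add_page(reader.pages[i])
        path = f"{output_prefix}_part{part}.pdf"  # hypothetical naming scheme
        with open(path, "wb") as handle:
            writer.write(handle)
        output_paths.append(path)
    return output_paths
```

Each resulting file could then be sent to the API separately, with the call wrapped in a `try`/`except` so that a part that still fails with BAD_PDF is logged and skipped while the remaining parts continue. I have not verified that this removes the error for every PDF.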