Cannot concatenate string error thrown whenever I try to extract tables

Question

I am using the PDF extraction function provided here on my own PDF files: https://developer.adobe.com/document-services/docs/overview/pdf-extract-api/howtos/extract-api/

Whenever I try to extract tables, the program throws this error:

INFO:adobe.pdfservices.operation.pdfops.extract_pdf_operation:All validations successfully done. Beginning ExtractPDF operation execution
ERROR:adobe.pdfservices.operation.internal.api.cpf_api:Failed in parsing Extract Result
Traceback (most recent call last):
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_pdf_api.py", line 52, in download_and_save
    extract_data_parser.parse()
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_parser.py", line 180, in parse
    self.ed_zipper.add_rendition_data(rendition_output)
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_zipper.py", line 28, in add_rendition_data
    file_name = rdata.file_name + rdata.rendition_extension
TypeError: can only concatenate str (not "NoneType") to str
ERROR:root:Exception encountered while executing operation
Traceback (most recent call last):
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_pdf_api.py", line 52, in download_and_save
    extract_data_parser.parse()
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_parser.py", line 180, in parse
    self.ed_zipper.add_rendition_data(rendition_output)
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_zipper.py", line 28, in add_rendition_data
    file_name = rdata.file_name + rdata.rendition_extension
TypeError: can only concatenate str (not "NoneType") to str

Extracting just text appears to work fine, but as soon as I add tables to the extraction, it breaks. This is a problem because I am using this API almost exclusively for its table extraction. Does anyone know a fix for this?

Cameron25433801mqlc · Accepted Answer

I found a solution! It turns out you need to specify the TableStructureType by adding .with_table_structure_format(TableStructureType.###) to your ExtractPDFOptions, where ### is the table output format you want.
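
For anyone wondering why the missing setting surfaces as a TypeError: the traceback shows the SDK building a file name with rdata.file_name + rdata.rendition_extension. A plausible reading (an assumption based only on the traceback, not on the SDK source) is that rendition_extension stays None when no table format has been chosen. The sketch below reproduces that failure with a hypothetical stand-in class; RenditionData, build_file_name, try_build, the file name "fileoutpart0" and the ".csv" extension are all made up for illustration and are not the SDK's own objects or values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RenditionData:
    """Hypothetical stand-in for the SDK's rendition object (not the real class)."""
    file_name: str
    # Assumption: this stays None when no table structure format was set.
    rendition_extension: Optional[str]


def build_file_name(rdata: RenditionData) -> str:
    # Same expression as the failing line in the traceback.
    return rdata.file_name + rdata.rendition_extension


def try_build(rdata: RenditionData) -> str:
    # Return the file name, or a readable description of the TypeError.
    try:
        return build_file_name(rdata)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Without an extension the concatenation fails, as in the question.
print(try_build(RenditionData("fileoutpart0", None)))
# With an extension (illustrative value) the file name is built normally.
print(try_build(RenditionData("fileoutpart0", ".csv")))
```

If that reading is right, setting the table structure format gives the SDK an extension to append, which would explain why the error disappears once the option is added.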
