Cannot concatenate string error thrown whenever I try and extract tables

Jul 28, 2022

I am using the PDF extraction sample provided here on my own PDF files: https://developer.adobe.com/document-services/docs/overview/pdf-extract-api/howtos/extract-api/

Whenever I try to extract tables, the program throws this error:

INFO:adobe.pdfservices.operation.pdfops.extract_pdf_operation:All validations successfully done. Beginning ExtractPDF operation execution
ERROR:adobe.pdfservices.operation.internal.api.cpf_api:Failed in parsing Extract Result
Traceback (most recent call last):
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_pdf_api.py", line 52, in download_and_save
    extract_data_parser.parse()
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_parser.py", line 180, in parse
    self.ed_zipper.add_rendition_data(rendition_output)
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_zipper.py", line 28, in add_rendition_data
    file_name = rdata.file_name + rdata.rendition_extension
TypeError: can only concatenate str (not "NoneType") to str
ERROR:root:Exception encountered while executing operation
Traceback (most recent call last):
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_pdf_api.py", line 52, in download_and_save
    extract_data_parser.parse()
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_parser.py", line 180, in parse
    self.ed_zipper.add_rendition_data(rendition_output)
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_zipper.py", line 28, in add_rendition_data
    file_name = rdata.file_name + rdata.rendition_extension
TypeError: can only concatenate str (not "NoneType") to str

Extracting just text appears to work fine, but as soon as I add tables to the elements to extract, it breaks. This is a problem because I am using this API almost entirely for its table extraction. Does anyone know a fix for this?
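
For anyone reading the traceback: the last frame can be reproduced on its own, without the SDK. This is only a minimal sketch, assuming that rendition_extension is None when the failing line runs. The RenditionData class and the "fileoutpart0" name below are hypothetical stand-ins for illustration, not the SDK's actual class or data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RenditionData:
    """Hypothetical stand-in for the object seen in the traceback."""
    file_name: str
    rendition_extension: Optional[str] = None  # assumption: left unset


def rendition_file_name(rdata: RenditionData) -> str:
    # Same expression as the failing line in extract_data_zipper.py
    return rdata.file_name + rdata.rendition_extension


# With no extension set, str + None raises the TypeError from the log.
try:
    rendition_file_name(RenditionData("fileoutpart0"))
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# With an extension set, the concatenation succeeds.
fixed_name = rendition_file_name(RenditionData("fileoutpart0", ".csv"))

print(error_message)
print(fixed_name)
```

If that assumption holds, the error is less about the PDF itself and more about the extract options not telling the SDK which file type to write the tables as.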

Jul 29, 2022

I found a solution! It turns out you need to specify the TableStructureType by adding .with_table_structure_format(TableStructureType.###), where ### is the table format you want, to your ExtractPDFOptions builder.
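
Roughly, the options could be built like the sketch below. The import paths follow the layout of the Python SDK samples from around the time of this post, and TableStructureType.CSV is just my assumption for the format; check both against the SDK version you have installed. The function name build_table_extract_options is made up for this example.

```python
def build_table_extract_options():
    """Build ExtractPDFOptions that request tables with an explicit format.

    Imports live inside the function so this file still loads where the
    Adobe PDF Services SDK is not installed. Module paths are assumed from
    the SDK samples and may differ between SDK versions.
    """
    from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_pdf_options import (
        ExtractPDFOptions,
    )
    from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_element_type import (
        ExtractElementType,
    )
    from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_renditions_element_type import (
        ExtractRenditionsElementType,
    )
    from adobe.pdfservices.operation.pdfops.options.extractpdf.table_structure_type import (
        TableStructureType,
    )

    return (
        ExtractPDFOptions.builder()
        # Extract both text and table elements
        .with_elements_to_extract([ExtractElementType.TEXT, ExtractElementType.TABLES])
        # Ask for table renditions to be written out
        .with_element_to_extract_renditions([ExtractRenditionsElementType.TABLES])
        # The missing piece: say which format the tables should be saved in
        # (CSV is an assumed choice for this example)
        .with_table_structure_format(TableStructureType.CSV)
        .build()
    )
```

The returned object would then be passed to the extract operation in place of the options you were building before (in the samples this is done with set_options on the ExtractPDFOperation instance).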
