Participant
July 28, 2022
Answered

Cannot concatenate string error thrown whenever I try and extract tables


I am using the PDF extraction function provided here on my own PDF files: https://developer.adobe.com/document-services/docs/overview/pdf-extract-api/howtos/extract-api/

 

Whenever I try to extract tables, the program throws this error:

INFO:adobe.pdfservices.operation.pdfops.extract_pdf_operation:All validations successfully done. Beginning ExtractPDF operation execution
ERROR:adobe.pdfservices.operation.internal.api.cpf_api:Failed in parsing Extract Result
Traceback (most recent call last):
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_pdf_api.py", line 52, in download_and_save
    extract_data_parser.parse()
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_parser.py", line 180, in parse
    self.ed_zipper.add_rendition_data(rendition_output)
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_zipper.py", line 28, in add_rendition_data
    file_name = rdata.file_name + rdata.rendition_extension
TypeError: can only concatenate str (not "NoneType") to str
ERROR:root:Exception encountered while executing operation
Traceback (most recent call last):
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_pdf_api.py", line 52, in download_and_save
    extract_data_parser.parse()
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_parser.py", line 180, in parse
    self.ed_zipper.add_rendition_data(rendition_output)
  File "C:\Users\camer\anaconda3\lib\site-packages\adobe\pdfservices\operation\internal\service\extract_data_zipper.py", line 28, in add_rendition_data
    file_name = rdata.file_name + rdata.rendition_extension
TypeError: can only concatenate str (not "NoneType") to str

 

Extracting just text appears to work fine, but as soon as I add tables to the extraction, it breaks. This is a problem because I am using this API almost exclusively for its table extraction. Does anyone know a fix for this?
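
The last frame of the traceback concatenates rdata.file_name and rdata.rendition_extension, and the message says the right-hand operand is None. A minimal sketch of that failure follows. RenditionData, build_file_name, and try_build are hypothetical stand-ins written for illustration, not the SDK's real classes; only the concatenation expression is taken from the traceback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RenditionData:
    """Hypothetical stand-in for the SDK's internal rendition object."""
    file_name: str
    # None models the case where no extension was resolved for the rendition.
    rendition_extension: Optional[str]


def build_file_name(rdata: RenditionData) -> str:
    # Same expression as the failing line shown in the traceback.
    return rdata.file_name + rdata.rendition_extension


def try_build(rdata: RenditionData) -> str:
    """Return the file name, or a description of the TypeError if it fails."""
    try:
        return build_file_name(rdata)
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_build(RenditionData("table_0", ".csv")))  # extension present
    print(try_build(RenditionData("table_0", None)))    # extension missing
```

With a string extension the concatenation succeeds; with None it raises the same kind of TypeError as in the log, which suggests the SDK had no extension to use for the table rendition.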


    Cameron25433801mqlc | Author | Correct answer
    Participant
    July 29, 2022

    I found a solution! It turns out you need to specify the table structure type by adding .with_table_structure_format(TableStructureType.###) to your ExtractPDFOptions builder, where ### is the table output format you want.
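
For readers applying this fix, here is a rough sketch of what the options might look like. Only with_table_structure_format, TableStructureType, and ExtractPDFOptions come from the answer above. The import paths, the ExtractElementType members, and the CSV default are assumptions based on the layout of the Python SDK samples from that period, so check them against your installed version. build_table_extract_options is a hypothetical helper name.

```python
def build_table_extract_options(table_format_name: str = "CSV"):
    """Build ExtractPDFOptions with an explicit table output format.

    The SDK imports live inside the function so this file still loads
    on machines where the SDK package is not installed.
    """
    # Assumed module paths (SDK 2.x style); adjust to your SDK version.
    from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_pdf_options import (
        ExtractPDFOptions,
    )
    from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_element_type import (
        ExtractElementType,
    )
    from adobe.pdfservices.operation.pdfops.options.extractpdf.table_structure_type import (
        TableStructureType,
    )

    # Assumption: TableStructureType exposes a member with this name (e.g. CSV).
    table_format = getattr(TableStructureType, table_format_name)

    return (
        ExtractPDFOptions.builder()
        # Assumed element types: extract both text and tables.
        .with_elements_to_extract([ExtractElementType.TEXT, ExtractElementType.TABLES])
        # The call named in the answer: state the table format explicitly.
        .with_table_structure_format(table_format)
        .build()
    )
```

Keep any other builder calls your script already makes (for example, rendition settings copied from the how-to page) and pass the resulting options object to the extract operation as before.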