January 15, 2025
Question

Extract Text while excluding tables data

Good morning!

When using the PDF Services API to extract text from a PDF, is it possible to exclude the text related to tables?

# Create parameters for the job
extract_pdf_params = ExtractPDFParams(
    elements_to_extract=[ExtractElementType.TEXT],
)

The text and data related to tables are already extracted separately into CSVs, so ideally the JSON containing the extracted text would not include the tables and their data a second time.
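
If the API has no option for this, one possible workaround is to filter the structured JSON after extraction. The sketch below assumes the output has a top-level `elements` list whose entries carry a `Path` (for example `//Document/Table/TR/TD/P`) and an optional `Text`. The helper name `text_outside_tables` and the sample data are hypothetical and only for illustration.

```python
import json


def text_outside_tables(structured_data):
    """Return the Text of every element whose Path has no Table segment.

    Assumes the Extract output layout: {"elements": [{"Path": ..., "Text": ...}]}.
    """
    kept = []
    for element in structured_data.get("elements", []):
        text = element.get("Text")
        if text is None:
            # Elements such as figures may carry no Text at all
            continue
        # Path segments look like "Table" or "Table[2]"
        segments = [s for s in element.get("Path", "").split("/") if s]
        if any(s == "Table" or s.startswith("Table[") for s in segments):
            continue
        kept.append(text)
    return kept


# Hypothetical sample mimicking the assumed JSON layout
sample = json.loads("""
{
  "elements": [
    {"Path": "//Document/H1", "Text": "Title"},
    {"Path": "//Document/Table/TR/TD/P", "Text": "cell"},
    {"Path": "//Document/P[2]", "Text": "Body"},
    {"Path": "//Document/Table[2]/TR/TH/P", "Text": "header"},
    {"Path": "//Document/Figure"}
  ]
}
""")

filtered = text_outside_tables(sample)
```

With a real result, `sample` would be replaced by the parsed contents of the JSON file from the extraction output. This only hides table text on the client side and does not change what the API returns.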