
Cannot Extract Data from PDF: BAD PDF

New Here, Mar 17, 2023

Hi,

I'm trying to extract text from a PDF with the PDF Extract API, but I'm running into trouble.

Here is my code and the output.

Thank you.

=========

extract_pdf_operation = ExtractPDFOperation.create_new()
zip_file = input_pdf[0][:-5] + '.zip'
source = FileRef.create_from_local_file(file)

extract_pdf_operation.set_input(source)

extract_pdf_options: ExtractPDFOptions = ExtractPDFOptions.builder() \
    .with_element_to_extract(ExtractElementType.TEXT) \
    .build()
extract_pdf_operation.set_options(extract_pdf_options)

result: FileRef = extract_pdf_operation.execute(execution_context)
suc_list.append(zip_file)

=========

     79 try:
---> 80     response = polling2.poll(
     81         lambda: http_client.process_request(http_request=http_request,
     82                                             success_status_codes=[HTTPStatus.OK, HTTPStatus.ACCEPTED],
     83                                             error_response_handler=PlatformApi.handle_error_response),
     84         check_success=is_correct_response,
     85         step=0.5,
     86         timeout=10 * 60
     87     )
     88     logging.debug(f'Total polling time, Latency(ms): {(datetime.now() - start_time).microseconds / 1000}')
     89     return response

File ~/opt/anaconda3/envs/cathay/lib/python3.8/site-packages/polling2.py:203, in poll(target, step, args, kwargs, timeout, max_tries, check_success, step_function, ignore_exceptions, poll_forever, collect_values, log, log_error)
    199         LOGGER.log(log_error, "poll() ignored exception %r", e)
    201 else:
    202     # Condition passes, this is the only "successful" exit from the polling function
--> 203     if check_success(val):
    204         return val
    206 values.put(last_item)

File ~/opt/anaconda3/envs/cathay/lib/python3.8/site-packages/adobe/pdfservices/operation/internal/api/platform_api.py:64, in PlatformApi.status_poll.<locals>.is_correct_response(response)
     62 if status == JobStatus.FAILED:
     63     job_error_response = PlatformApiResponse(status,content.get('error')).get_error_response()
---> 64     raise ServiceApiException(job_error_response.get('message'), ResponseUtil.
     65                               get_request_tracking_id_from_response(response, False), job_error_response
     66                               .get('status'), job_error_response.get('code'))
     67 return status == JobStatus.DONE

ServiceApiException: description =BAD_PDF - Unable to extract content.;

 

Adobe Employee, Mar 17, 2023

The error is pretty clear. You've got a bad PDF on your hands. Are you sure that it was a PDF file that you uploaded? If it was, can you share the file? 
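
One quick way to answer the first question locally, before uploading anything, is to look at the first bytes of the file: a PDF starts with a `%PDF-` header. The following is only a minimal sketch of that check. `looks_like_pdf` is a hypothetical helper written for this thread, not part of the Adobe SDK, and the 1024-byte search window is an assumption that tolerates a little leading junk before the header.

```python
PDF_MAGIC = b"%PDF-"  # header marker that a PDF file starts with


def looks_like_pdf(path, search_window=1024):
    """Return True if the PDF header marker appears in the first bytes of the file.

    Hypothetical helper for illustration only. It checks the header, not
    whether the rest of the file is a valid, readable PDF.
    """
    with open(path, "rb") as handle:
        head = handle.read(search_window)
    return PDF_MAGIC in head


def split_by_header(paths):
    """Split a list of file paths into (probably PDF, definitely not PDF)."""
    ok, rejected = [], []
    for path in paths:
        (ok if looks_like_pdf(path) else rejected).append(path)
    return ok, rejected
```

For example, `split_by_header(input_pdf)` could be run on the list of input files before the extract loop, so that files that are not PDFs at all (for instance an HTML error page saved with a `.pdf` extension) are skipped. A file can pass this check and still be rejected by the service, so sharing the failing file is still the most direct way to get it diagnosed.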
