Participant
March 17, 2023
Question

Cannot Extract Data from PDF: BAD PDF


Hi,

I'm trying to extract a PDF with the API, but I'm having trouble.

Here is my code and the output.

Thank you.

=========

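# Imports for the Adobe PDF Services Python SDK classes used below
from adobe.pdfservices.operation.io.file_ref import FileRef
from adobe.pdfservices.operation.pdfops.extract_pdf_operation import ExtractPDFOperation
from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_element_type import ExtractElementType
from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_pdf_options import ExtractPDFOptions

# file, input_pdf, execution_context and suc_list are defined earlier in my script (not shown)
extract_pdf_operation = ExtractPDFOperation.create_new()
zip_file = input_pdf[0][:-5] + '.zip'
source = FileRef.create_from_local_file(file)

extract_pdf_operation.set_input(source)

# Extract text elements only
extract_pdf_options: ExtractPDFOptions = ExtractPDFOptions.builder() \
    .with_element_to_extract(ExtractElementType.TEXT) \
    .build()
extract_pdf_operation.set_options(extract_pdf_options)

# Run the job; the traceback below is raised from this call
result: FileRef = extract_pdf_operation.execute(execution_context)
suc_list.append(zip_file)

=========
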
     79 try:
---> 80     response = polling2.poll(
     81         lambda: http_client.process_request(http_request=http_request,
     82                                             success_status_codes=[HTTPStatus.OK, HTTPStatus.ACCEPTED],
     83                                             error_response_handler=PlatformApi.handle_error_response),
     84         check_success=is_correct_response,
     85         step=0.5,
     86         timeout=10 * 60
     87     )
     88     logging.debug(f'Total polling time, Latency(ms): {(datetime.now() - start_time).microseconds / 1000}')
     89     return response

File ~/opt/anaconda3/envs/cathay/lib/python3.8/site-packages/polling2.py:203, in poll(target, step, args, kwargs, timeout, max_tries, check_success, step_function, ignore_exceptions, poll_forever, collect_values, log, log_error)
    199         LOGGER.log(log_error, "poll() ignored exception %r", e)
    201 else:
    202     # Condition passes, this is the only "successful" exit from the polling function
--> 203     if check_success(val):
    204         return val
    206 values.put(last_item)

File ~/opt/anaconda3/envs/cathay/lib/python3.8/site-packages/adobe/pdfservices/operation/internal/api/platform_api.py:64, in PlatformApi.status_poll.<locals>.is_correct_response(response)
     62 if status == JobStatus.FAILED:
     63     job_error_response = PlatformApiResponse(status,content.get('error')).get_error_response()
---> 64     raise ServiceApiException(job_error_response.get('message'), ResponseUtil.
     65                               get_request_tracking_id_from_response(response, False), job_error_response
     66                               .get('status'), job_error_response.get('code'))
     67 return status == JobStatus.DONE

ServiceApiException: description =BAD_PDF - Unable to extract content.;


1 reply

Adobe Employee
March 17, 2023

The error is pretty clear. You've got a bad PDF on your hands. Are you sure that it was a PDF file that you uploaded? If it was, can you share the file?
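
One quick way to answer the "was it really a PDF?" question before uploading is to look for the `%PDF-` header that PDF files carry near the start. The snippet below is only a rough local sanity check, not part of the Adobe SDK: the function name `looks_like_pdf` and the 1024-byte probe window are assumptions made for illustration. A file that passes can still be rejected by the Extract API for other reasons, so treat a pass as "not obviously wrong" rather than "valid".

```python
from pathlib import Path

# Every PDF is expected to carry this marker at (or very near) the start of the file.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(path, probe_bytes=1024):
    """Return True if a PDF header appears within the first `probe_bytes` bytes.

    Hypothetical helper for illustration; it does not validate the rest of the file.
    """
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        head = fh.read(probe_bytes)
    return PDF_MAGIC in head


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py file1.pdf file2.pdf ...
    for name in sys.argv[1:]:
        verdict = "looks like a PDF" if looks_like_pdf(name) else "NOT a PDF (or unreadable)"
        print(f"{name}: {verdict}")
```

If this reports that the file is not a PDF (for example an HTML error page or a renamed Word document saved with a `.pdf` extension), that would be consistent with the BAD_PDF error; if it passes, sharing the file is still the next step.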