Participant
June 12, 2021
Question

Convert response from PDF Extract API to image and CSV files

Hello All,

I am using the PDF Extract API by making a REST call from Postman. The call is successful, and I am able to poll and get the contentAnalyserResponse in text format. How do I get this as separate files that can be downloaded to my disk as images and/or CSV/JSON files for text and tables? Can I receive the response as object(s) that can be written to different files with the proper extensions? I remember reading somewhere, though I cannot locate it now, that the response can be received as a single downloadable zip file with all the separate files in a fixed folder structure.

If anyone has tried or achieved this, could you please share how you did it?

Thanks and regards,

Adi

2 replies

Participant
August 7, 2021

Hi, 

I'm at the same spot right now and could use some guidance.

From looking at the JSON, it appears that I'm getting good results, but now I need to get it displayed properly.

I'm beginning to think that I can't render the response in Postman and need to install the SDK to visualize the extraction. I'm hoping to send the GET responses directly into a database to display on a website.

Any input on next steps would be greatly appreciated.

Raymond Camden
Community Manager
August 9, 2021

If you are getting the JSON, then you are good - I mean, as far as I can help. How you use the JSON depends on what you're building. But at that point, it's outside the API/SDK and in your hands in terms of what you do with the JSON. Right?
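
As one illustration of using the JSON downstream, for example loading it into a database that a website reads from, here is a minimal Python sketch. It assumes the extraction JSON has a top-level "elements" list whose entries carry "Path", "Page", and "Text" keys; those field names, the `store_elements` helper, and the table layout are assumptions for illustration and should be checked against the JSON you actually receive.

```python
import sqlite3


def store_elements(conn, extract_json):
    """Insert text-bearing elements into a simple table; return the row count.

    Assumes (hypothetically) that extract_json looks like
    {"elements": [{"Path": ..., "Page": ..., "Text": ...}, ...]}.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS elements (path TEXT, page INTEGER, text TEXT)"
    )
    rows = [
        (el.get("Path"), el.get("Page"), el.get("Text"))
        for el in extract_json.get("elements", [])
        if "Text" in el  # skip entries without text, such as figures
    ]
    conn.executemany("INSERT INTO elements VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

A site could then query the table by page or path instead of re-parsing the JSON on every request.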

Participant
August 9, 2021

Thanks for the response!

I suspect I'm missing something simple.

I'm getting a lot of black triangles with white question marks.

A character encoding issue? My source PDF is in Icelandic, by the way. It seems that a lot of the response is usable and correct, but I don't know how to get around this...

Short example below

"cpf:outputs":{"elementsRenditions":[{"cpf:location":"fileoutpart0","dc:format":"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"},{"cpf:location":"fileoutpart1","dc:format":"image/png"}],"elementsInfo":{"cpf:location":"jsonoutput","dc:format":"application/json"}}}
--Boundary_86985_669621865_1628280852738
Content-Type: application/octet-stream
Content-Disposition: form-data; name="fileoutpart1"

�PNG
[remaining raw PNG bytes omitted; they display as unreadable characters when shown as text]
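
The question-mark glyphs in the sample above are what raw image bytes look like when they are displayed as text: bytes that are not valid UTF-8 get shown as the replacement character U+FFFD. A minimal Python sketch of the difference follows. It assumes the JSON part is UTF-8 encoded (an assumption, not something confirmed in this thread), and the helper names are hypothetical.

```python
import json

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(payload: bytes) -> bool:
    """True if the payload is binary PNG data that should be saved, not printed."""
    return payload.startswith(PNG_SIGNATURE)


def decode_json_part(payload: bytes) -> dict:
    """Decode a JSON part, assuming it is UTF-8 encoded."""
    return json.loads(payload.decode("utf-8"))


def as_displayed_text(payload: bytes) -> str:
    """Mimic a viewer that renders every byte sequence as UTF-8 text."""
    return payload.decode("utf-8", errors="replace")
```

Under that assumption, Icelandic letters such as Þ and ð in the JSON part survive a UTF-8 decode intact, while the PNG part has to be written to disk as bytes rather than treated as text.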

Raymond Camden
Community Manager
June 14, 2021

It should be a multipart form response. I haven't done that myself with Postman, but does that give you some help?
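
To make the multipart suggestion concrete, here is a minimal Python sketch that splits a raw multipart body into its named parts and writes each one to disk with an extension chosen from its format. It assumes the body is available as bytes, that parts are separated by CRLF line endings, and that the boundary string is taken from the response's Content-Type header. The `split_multipart` and `save_parts` helpers and the `EXTENSIONS` table are hypothetical names for illustration, not part of any Adobe SDK.

```python
import re
from pathlib import Path

# Extensions for the formats seen in the sample response; extend as needed.
EXTENSIONS = {
    "application/json": ".json",
    "image/png": ".png",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": ".xlsx",
}


def split_multipart(body: bytes, boundary: str) -> dict:
    """Map each part's name to its raw payload bytes (assumes CRLF line endings)."""
    delimiter = b"\r\n--" + boundary.encode("ascii")
    parts = {}
    # Prefix a CRLF so the first delimiter matches the same way as the others.
    for chunk in (b"\r\n" + body).split(delimiter)[1:]:
        if chunk.startswith(b"--"):  # closing delimiter: no more parts
            break
        headers, _, payload = chunk.partition(b"\r\n\r\n")
        match = re.search(rb'\bname="([^"]*)"', headers)
        if match:
            parts[match.group(1).decode("ascii")] = payload
    return parts


def save_parts(parts: dict, formats: dict, out_dir: str) -> list:
    """Write each part into out_dir; formats maps part name to MIME type."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, payload in parts.items():
        ext = EXTENSIONS.get(formats.get(name, ""), ".bin")
        target = out / (name + ext)
        target.write_bytes(payload)  # bytes, so images are not corrupted
        written.append(target)
    return written
```

The `formats` mapping can be built from the "cpf:location" and "dc:format" pairs listed under "cpf:outputs" in the JSON shown earlier in the thread, for example `{"fileoutpart1": "image/png"}`.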