How to get PDF buffer when loading the pdf with a URL

Explorer, Jan 13, 2021

As the title says: how can I get the PDF buffer when the PDF is loaded from a URL?

Thanks in advance.

TOPICS
How to, PDF Embed API


1 Correct Answer

Adobe Community Professional, Jan 20, 2021

I think I understand but let me check. The title in the metadata isn't the text that would be recognized by a human as the title of the document if they were reading it on screen. If that's the case, take a look at our PDF Tools / Extract API. You can read about it here: https://medium.com/adobetech/extract-content-structure-from-pdfs-using-ai-powered-adobe-pdf-extract-api-1593ad6b79b5

Adobe Community Professional, Jan 13, 2021

Can you elaborate on your question?

Explorer, Jan 13, 2021

Hi, thanks for your reply.

Basically, I used the previewPDF function of the window.AdobeDC.View class to render a PDF from a URL.

After the PDF loads, I want to get the raw ArrayBuffer of the loaded PDF so that I can do some processing on it.

So I wonder whether there is an API for requesting the raw PDF buffer.

I know that annotationManager.removeAnnotationsFromPDF returns the PDF buffer, but to use that function I have to set "IncludePDFAnnotations" to true, which I don't want because the additional save button will then appear.

Explorer, Jan 13, 2021

Sorry, the API call should be previewFile().

Adobe Community Professional, Jan 19, 2021

At this time, you can't just get the PDF arrayBuffer whenever you want it. As you noticed, only a few API calls will give that to you and there currently isn't any way to write the modified buffer back without reloading the PDF. My suggestion would be to get the PDF via fetch and then pass the content to AdobeDC.View as a Promise after you've done whatever processing you need to do on the PDF.
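
The fetch-first approach described above could look roughly like the sketch below. It is a minimal illustration, not Adobe's reference code: `loadPdfBuffer`, `previewFromBuffer`, and the injectable `fetchImpl` parameter are hypothetical names introduced here. It assumes `adobeDCView` is an `AdobeDC.View` instance created elsewhere and that `previewFile` accepts `content: { promise }` where the promise resolves to an ArrayBuffer.

```javascript
// Fetch the PDF ourselves so the raw bytes are available before the viewer sees them.
// `fetchImpl` is a hypothetical parameter, injectable only to keep the sketch testable.
async function loadPdfBuffer(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Failed to fetch PDF: HTTP ${response.status}`);
  }
  return response.arrayBuffer();
}

// Hand the (optionally processed) buffer to the viewer wrapped in a Promise.
// Assumes `adobeDCView` is an AdobeDC.View instance constructed elsewhere.
function previewFromBuffer(adobeDCView, buffer, fileName, viewerConfig = {}) {
  return adobeDCView.previewFile(
    {
      content: { promise: Promise.resolve(buffer) },
      metaData: { fileName },
    },
    viewerConfig
  );
}
```

In use, one would call `loadPdfBuffer(url)`, run whatever processing is needed on the returned ArrayBuffer, and then pass the same buffer to `previewFromBuffer(adobeDCView, buffer, "sample.pdf")`, so the file is downloaded only once.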

Explorer, Jan 19, 2021

Yes, actually this is what I have right now.

Adobe Community Professional, Jan 19, 2021

I'm curious to hear what preprocessing you are doing. Is it something that makes sense to add to Embed API? 

Explorer, Jan 19, 2021

Basically, I want to parse the raw PDF data to extract useful information, such as the title.

Adobe Community Professional, Jan 20, 2021

Ah - Ok - That's easy. Look at the getPDFMetadata call to get the title, number of pages etc. 
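
Reading the metadata might look like the sketch below. It assumes `previewFilePromise` is the Promise returned by `adobeDCView.previewFile(...)` and that the resolved viewer exposes `getAPIs()` and `getPDFMetadata()`. The helper name `readPdfMetadata` is hypothetical, and the exact fields of the result (title, page count, and so on) should be checked against the current Embed API documentation.

```javascript
// Resolve the viewer, ask it for its API object, then request the PDF metadata.
// Assumes `previewFilePromise` comes from adobeDCView.previewFile(...).
async function readPdfMetadata(previewFilePromise) {
  const adobeViewer = await previewFilePromise;
  const apis = await adobeViewer.getAPIs();
  return apis.getPDFMetadata();
}
```

Note that this returns whatever title is stored in the PDF's metadata, which may differ from the title a reader sees on the first page.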

Explorer, Jan 20, 2021

I need to parse the actual content to find the real title. Much of the time the metadata gives the wrong one.

Adobe Community Professional, Jan 20, 2021

I think I understand but let me check. The title in the metadata isn't the text that would be recognized by a human as the title of the document if they were reading it on screen. If that's the case, take a look at our PDF Tools / Extract API. 

 

You can read about it here https://medium.com/adobetech/extract-content-structure-from-pdfs-using-ai-powered-adobe-pdf-extract-...

 
