
Does PDF Extract API support hyperlink extraction?

Community Beginner, Oct 06, 2023

Hi Support Community,
I have a question about the PDF Extract API.

Does it support extracting hyperlinks from PDF content? If so, could you share a sample JSON output that contains a hyperlink?

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Beginner, Oct 06, 2023

Update: I tried it in the Extract API demo. It identified the hyperlink as a Reference element, but it did not provide the hyperlink's URL.
See the snapshot below:

[Image: BhupeshMOHALI_0-1696598122469.png]

We also need to extract the hyperlink URLs, so could you please clarify this?

Community Expert, Oct 06, 2023

Yes. It supports hyperlink extraction. Try the attached file. You'll see the hyperlinks on the right that are represented like this...

{
    "Bounds": [
        427.7200012207031,
        355.3280029296875,
        524.5761413574219,
        377.3710021972656
    ],
    "Font": {
        "alt_family_name": "Clean",
        "embedded": true,
        "encoding": "Custom",
        "family_name": "Adobe Clean",
        "font_type": "Type1",
        "italic": false,
        "monospaced": false,
        "name": "FHQAMC+AdobeClean-Bold",
        "subset": true,
        "weight": 700
    },
    "HasClip": false,
    "Lang": "en",
    "Page": 0,
    "Path": "//Document/Aside/P[2]/Reference",
    "Text": "(<https://www.adobe.io/apis/documentcloud/dcsdk/>)Adobe Acrobat Services › (<https://www.adobe.io/apis/documentcloud/dcsdk/pdf-extract.html>)Adobe PDF Extract API › ",
    "TextSize": 9,
    "attributes": {
        "LineHeight": 11
    },
    "elementId": 15
},
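In the element above, each URL appears inside the "Text" value as "(<URL>)" immediately before its link text. A minimal sketch of pulling those pairs out in Python follows. It assumes the Extract output is a JSON object with a top-level "elements" array and that links are always encoded this way; the function names and the second sample element are hypothetical and only for illustration.

```python
import re

# Assumed encoding, taken from the sample element above: "(<URL>)" followed by
# the link text. Link text that itself contains "(" would be cut short.
LINK_PATTERN = re.compile(r"\(<(https?://[^>\s]+)>\)([^(]*)")

# Matches paths such as ".../Reference" or ".../Reference[2]".
REFERENCE_PATH = re.compile(r"/Reference(\[\d+\])?$")


def extract_links(element):
    """Return (url, link_text) pairs found in one element's "Text" field."""
    text = element.get("Text", "")
    # Strip spaces and the "›" breadcrumb separator around the link text.
    return [(url, label.strip(" \u203a")) for url, label in LINK_PATTERN.findall(text)]


def references_without_url(structured):
    """Return the Path of every Reference element whose Text carries no URL."""
    missing = []
    for element in structured.get("elements", []):
        path = element.get("Path", "")
        if REFERENCE_PATH.search(path) and not extract_links(element):
            missing.append(path)
    return missing


# Trimmed sample: the first element mirrors the JSON above; the second is a
# made-up Reference element with no URL, like the case reported in this thread.
sample = {
    "elements": [
        {
            "Path": "//Document/Aside/P[2]/Reference",
            "Text": "(<https://www.adobe.io/apis/documentcloud/dcsdk/>)Adobe Acrobat Services \u203a "
                    "(<https://www.adobe.io/apis/documentcloud/dcsdk/pdf-extract.html>)Adobe PDF Extract API \u203a ",
        },
        {"Path": "//Document/P/Reference", "Text": "click here "},
    ]
}

for url, label in extract_links(sample["elements"][0]):
    print(label, "->", url)
print("References without a URL:", references_without_url(sample))
```

Running the second helper over your own output is a quick way to list the Reference elements where a link was detected but no URL was returned.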

 

Community Beginner, Oct 08, 2023

Thanks for your response, Joel. Yes, I can see that hyperlinks are extracted properly from the sample PDF you shared. However, I am not seeing the same behavior with my sample PDF. Could you please check the attached PDF?

Community Beginner, Oct 15, 2023

Hi @Joel Geraci,
Any update on the above?

Community Expert, Oct 16, 2023

No. Generally when I submit test files, I don't get updates.

Community Beginner, Oct 17, 2023

Hi @Joel Geraci,

When I tried it on my end, the hyperlinks were not extracted. I am also not clear on this part of your response: "Generally when I submit test files, I don't get updates." Where did you submit the test files?

I want to understand why it was not able to extract hyperlinks from the sample pdf that I shared earlier.

Community Expert, Oct 18, 2023

"I want to understand why it was not able to extract hyperlinks from the sample pdf that I shared earlier."

It's an AI; we don't know why it does what it does. We just have to train it more when it gets things wrong.

New Here, Mar 05, 2024

Hi Joel, hope you are well. Is there any update on this since your post? Has the model been trained a little bit since? I'm experiencing the same issue. Thanks, Tim
