Participating Frequently
October 6, 2023
Question

Does PDF Extract API support hyperlink extraction?


Hi Support Community,
I have a question about the PDF Extract API.

Does it support extracting hyperlinks from PDF content? If so, could you share a sample JSON output that includes a hyperlink?


    2 replies

    Joel Geraci
    Community Expert
    October 6, 2023

    Yes, it supports hyperlink extraction. Try the attached file. You'll see the hyperlinks on the right, represented like this:

    {
        "Bounds": [
            427.7200012207031,
            355.3280029296875,
            524.5761413574219,
            377.3710021972656
        ],
        "Font": {
            "alt_family_name": "Clean",
            "embedded": true,
            "encoding": "Custom",
            "family_name": "Adobe Clean",
            "font_type": "Type1",
            "italic": false,
            "monospaced": false,
            "name": "FHQAMC+AdobeClean-Bold",
            "subset": true,
            "weight": 700
        },
        "HasClip": false,
        "Lang": "en",
        "Page": 0,
        "Path": "//Document/Aside/P[2]/Reference",
        "Text": "(<https://www.adobe.io/apis/documentcloud/dcsdk/>)Adobe Acrobat Services › (<https://www.adobe.io/apis/documentcloud/dcsdk/pdf-extract.html>)Adobe PDF Extract API › ",
        "TextSize": 9,
        "attributes": {
            "LineHeight": 11
        },
        "elementId": 15
    },
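For readers who want to pull the URLs out of an element like the one above, here is a minimal Python sketch. It assumes that links always appear inside the "Text" value in the `(<URL>)label` form shown in this sample; that is inferred from this one element, not from a documented format. The names `LINK_PATTERN` and `extract_links` are hypothetical helpers, not part of any Adobe SDK.

```python
import re

# Assumption: a link is written as "(<URL>)label", as in the sample "Text" value above.
LINK_PATTERN = re.compile(r"\(<(?P<url>[^>]+)>\)(?P<label>[^(]*)")


def extract_links(text):
    """Return (url, label) pairs found in an element's "Text" string."""
    links = []
    for match in LINK_PATTERN.finditer(text):
        # Trim whitespace and the breadcrumb separator seen in the sample.
        label = match.group("label").strip(" \u203a")
        links.append((match.group("url"), label))
    return links


# The "Text" value copied from the sample element (\u203a is the breadcrumb arrow).
sample_text = (
    "(<https://www.adobe.io/apis/documentcloud/dcsdk/>)Adobe Acrobat Services \u203a "
    "(<https://www.adobe.io/apis/documentcloud/dcsdk/pdf-extract.html>)Adobe PDF Extract API \u203a "
)

for url, label in extract_links(sample_text):
    print(label, "->", url)
```

The label is taken as everything up to the next opening parenthesis, so a label that itself contains "(" would be cut short; a stricter pattern would be needed for that case.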

    Bhupesh76
    Author
    Participating Frequently
    October 9, 2023

    Thanks for your response, Joel. Yes, I can see that hyperlinks are extracted properly in the sample PDF you shared. However, I am not seeing the same behavior with my own sample PDF. Could you please check the attached PDF?

    Bhupesh76
    Author
    Participating Frequently
    October 16, 2023

    Hi @Joel Geraci,
    Any update on the above?

    Bhupesh76
    Author
    Participating Frequently
    October 6, 2023

    Update: I tried it in the Extract API demo. It identified the hyperlink as a Reference element, but it did not provide the hyperlink's URL.
    See the snapshot below:

    We also need to extract the hyperlink URLs, so could you please clarify this?
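One way to check which Reference elements in an Extract result carry a URL is sketched below in Python. It assumes the result is a JSON object with a top-level `elements` array, and that a URL, when present, shows up in "Text" in the `(<URL>)` form seen in the sample element earlier in the thread. The function name `audit_references` and the inline data are hypothetical; elements 16 and 17 are made up purely for illustration.

```python
import re

# Assumption: a URL, when present, is embedded in "Text" as "(<URL>)".
URL_MARKER = re.compile(r"\(<[^>]+>\)")


def audit_references(structured_data):
    """Return (elementId, has_url) for every element whose Path ends in Reference."""
    report = []
    for element in structured_data.get("elements", []):
        last_segment = element.get("Path", "").rsplit("/", 1)[-1]
        # Drop an index suffix such as "Reference[2]" before comparing.
        if last_segment.split("[", 1)[0] != "Reference":
            continue
        has_url = bool(URL_MARKER.search(element.get("Text", "")))
        report.append((element.get("elementId"), has_url))
    return report


# Hypothetical stand-in for a parsed Extract result; only element 15 mirrors the thread.
sample_output = {
    "elements": [
        {"elementId": 15, "Path": "//Document/Aside/P[2]/Reference",
         "Text": "(<https://www.adobe.io/apis/documentcloud/dcsdk/>)Adobe Acrobat Services"},
        {"elementId": 16, "Path": "//Document/P/Reference[2]", "Text": "Link text only"},
        {"elementId": 17, "Path": "//Document/P", "Text": "Plain paragraph"},
    ]
}

for element_id, has_url in audit_references(sample_output):
    print(element_id, "URL present" if has_url else "URL missing")
```

A report like this only shows where the URL is absent from the output; it does not explain why a given PDF's links are tagged as Reference without a URL.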