Adobe Extract API: hyperlink is returned as a separate element from the string it is in
Hello,
I am using the Document Extract REST API to parse PDFs that include hyperlinks. From my testing, it looks like the API returns each hyperlink as a separate element from the sentence that contains it. As far as I can tell, there is no indication of where the hyperlink was located in the string, so I cannot reassemble it: the extraction leaves no placeholder symbol and eats the trailing space.
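To illustrate what I mean by reassembling, here is a minimal Python sketch. It assumes the output JSON has an "elements" list whose entries carry "Path" and "Text" keys, and that a hyperlink shows up under its paragraph's path (the "/Link" segment, the sample data, and the helper names are hypothetical, not taken from the documentation). It only groups the pieces per paragraph; it cannot recover where inside the sentence the link was, which is exactly the problem.

```python
import re

# Hypothetical sample shaped like the extracted elements. The "Path"/"Text"
# keys and the "/Link" child segment are assumptions for illustration only.
SAMPLE_ELEMENTS = [
    {"Path": "//Document/P", "Text": "See the for details."},
    {"Path": "//Document/P/Link", "Text": "docs"},
    {"Path": "//Document/P[2]", "Text": "Second paragraph."},
]

# Matches everything up to and including the first P or P[n] path segment.
_PARAGRAPH = re.compile(r"^(.*?/P(?:\[\d+\])?)(?:/.*)?$")


def paragraph_path(path):
    """Return the enclosing paragraph path, or the path itself if none."""
    match = _PARAGRAPH.match(path)
    return match.group(1) if match else path


def group_by_paragraph(elements):
    """Group (path, text) pairs by enclosing paragraph, in document order."""
    groups = {}
    for element in elements:
        text = element.get("Text")
        if text is None:
            continue  # skip elements that carry no text
        key = paragraph_path(element["Path"])
        groups.setdefault(key, []).append((element["Path"], text))
    return groups


if __name__ == "__main__":
    for paragraph, parts in group_by_paragraph(SAMPLE_ELEMENTS).items():
        print(paragraph, parts)
```

With grouping like this I can tell which paragraph a link belongs to, but not the offset within the paragraph text, so the original sentence still cannot be rebuilt.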
Below is the content analyzer request I have been using. The documentation has given me no clue as to how to either make it ignore hyperlinks or otherwise indicate where they were extracted from.
{
  "cpf:engine": {
    "repo:assetId": "urn:aaid:cpf:58af6e2c-1f0c-400d-9188-078000185695"
  },
  "cpf:inputs": {
    "documentIn": {
      "cpf:location": "InputFile0",
      "dc:format": "application/pdf"
    },
    "params": {
      "cpf:inline": {
        "elementsToExtract": [
          "text",
          "tables"
        ]
      }
    }
  },
  "cpf:outputs": {
    "elementsInfo": {
      "cpf:location": "jsonoutput",
      "dc:format": "application/json"
    },
    "elementsRenditions": {
      "cpf:location": "fileoutpart",
      "dc:format": "text/directory"
    }
  }
}
Any help would be appreciated!
