Is the font color available in the Adobe Extract API's output? I could not find it in the response or in the schema JSON linked here.
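One way to double-check a response for a color field is to search the parsed JSON recursively for matching key names. This is a minimal sketch, assuming the response has already been loaded into Python objects; the `find_keys` helper and the `sample` fragment are made up for illustration and do not reproduce the real Extract schema.

```python
def find_keys(node, needle, path="$"):
    """Return the paths of all keys whose name contains `needle` (case-insensitive)."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if needle.lower() in key.lower():
                hits.append(child)
            # Keep descending: nested objects may hold further matches.
            hits.extend(find_keys(value, needle, child))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            hits.extend(find_keys(item, needle, f"{path}[{i}]"))
    return hits


# Hypothetical fragment for illustration only, not real Extract output.
sample = {
    "elements": [
        {"Text": "Hello", "Font": {"name": "ExampleSans", "weight": 400}, "TextSize": 12.0}
    ]
}

color_hits = find_keys(sample, "color")
font_hits = find_keys(sample, "font")
print(color_hits, font_hits)
```

An empty result for `"color"` on a real response would be consistent with the field not being emitted at all, rather than being nested somewhere unexpected.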
You aren't missing anything. At this time we do not detect or output the color of the text. I've already made this feature request and I know what I'm looking for, but I'm curious to hear what you are looking for. Do you want the color of the text drawn by the text operator or did you want the perceived color of the text on the page? For example, suppose some black text is drawn on the page but, later in the PDF display list, it's covered by a 50% transparent white box. Do you want the color to be reported as black or as gray?
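To make the black-versus-gray distinction concrete, here is a minimal sketch of the arithmetic, assuming plain "normal" alpha blending over an opaque backdrop and colors expressed as RGB values between 0 and 1. The function name `composite_normal` is hypothetical, and a real PDF renderer handles many more cases (blend modes, color spaces, groups).

```python
def composite_normal(backdrop, source, alpha):
    """Blend `source` over an opaque `backdrop`, channel by channel.

    result = alpha * source + (1 - alpha) * backdrop
    """
    return tuple(alpha * s + (1.0 - alpha) * b for b, s in zip(backdrop, source))


black_text = (0.0, 0.0, 0.0)   # color set by the text operator
white_box = (1.0, 1.0, 1.0)    # box painted later in the display list
perceived = composite_normal(black_text, white_box, 0.5)
print(perceived)  # (0.5, 0.5, 0.5), a mid gray
```

The drawn color stays black, while the perceived color works out to a mid gray, which is why the two answers can differ for the same text.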
"I've already made this feature request"
Good to know that a feature request is already in place.
"Do you want the color of the text drawn by the text operator or did you want the perceived color of the text on the page?"
PDFs are an interesting file format 🙂. Given that the requirement is to be able to republish the content, having access to the perceived color would help.
Hi @Joel Geraci
We are trying to achieve what you have done in the Acrobat Reader Liquid Mode. We have tested with different PDF files and noticed that the text is shown in its original styling in Liquid Mode. But the Extract API output doesn't provide font files, colors, etc.
Adobe advertises that Liquid Mode is what you can achieve with the Extract API. So the question is: how can we achieve that? Are there other tools that are supposed to be used along with the Extract API?
Thanks,
While Extract API and the Liquid Mode engine share some common code, they aren't the same thing. Unfortunately, at this time, Extract still does not output the color of the text with the styling but I'll ping the product team again to see if I can get the request prioritized. I can't make any promises though.
That said, I am curious to know where you saw the representation that "Liquid mode is what you can achieve with the Extract API" because I need to get that corrected or at least clarified.
Hi,
We just went back to check some of the articles and videos we were researching, and we didn't find such a statement. I think that was an assumption on our part, because it was advertised as being so powerful and as using Adobe Sensei AI for extracting the data, etc. The Extract API still does an outstanding job, and you would definitely succeed in extracting accessible content out of a PDF with it (because you don't need styling there).
But we really need to achieve what you did with Liquid Mode. So the question is: should we look at other tools (link) and do more research, or are you planning to make some changes so that styling can be extracted from PDFs too?
Thanks,