I would like to automatically identify and edit the headlines of PDF text documents (newspaper articles). As the font size of headlines is usually larger than that of the body text, I could identify the beginning and end of a headline by watching for font size changes between characters. Unfortunately, I have not yet found a JavaScript method for detecting font sizes, although this information is shown in the UI, even for proprietary fonts. Can anybody help?
Use the "doc.getPageNthWordQuads()" method. It returns 4 points, one for each corner of the bounding box of a word on the page. The height of this box is approximately the font size.
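As a rough sketch of how that could look in practice: the helper below walks the words on one page and collects those whose bounding box is taller than a threshold. The function names (quadHeight, findLargeWords) and the minHeight parameter are made up for illustration, and the code assumes unrotated text and that each quad is an array of 8 numbers (4 x,y pairs). It only relies on getPageNumWords, getPageNthWord and getPageNthWordQuads, so treat it as a starting point rather than a finished solution.

```javascript
// Height (in points) of a word's bounding box, taken from its first quad.
// Assumption: the text is not rotated, so height = max y - min y.
// The corner order inside the quad does not matter for this calculation.
function quadHeight(quads) {
  var q = quads[0];            // first quad of the word
  var ys = [];
  for (var j = 1; j < q.length; j += 2) {
    ys.push(q[j]);             // odd indexes hold the y coordinates
  }
  return Math.max.apply(null, ys) - Math.min.apply(null, ys);
}

// Collect the words on a page whose box height is at least minHeight.
// minHeight is a hypothetical threshold; pick it from your own documents.
function findLargeWords(doc, pageNum, minHeight) {
  var result = [];
  var count = doc.getPageNumWords(pageNum);
  for (var i = 0; i < count; i++) {
    var h = quadHeight(doc.getPageNthWordQuads(pageNum, i));
    if (h >= minHeight) {
      result.push({
        index: i,
        word: doc.getPageNthWord(pageNum, i, true),
        height: h
      });
    }
  }
  return result;
}
```

In the Acrobat console you would call it with something like findLargeWords(this, 0, 18) for the first page. Consecutive indexes in the result are likely to belong to the same headline.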
Many thanks for this advice. It works perfectly.
You can't edit static text using a script, though. Not directly, anyway.