How can I detect the font size of a character in a PDF document using JavaScript?

Mar 10, 2018

I would like to automatically identify and edit the headlines of PDF text documents (newspaper articles). Since headlines are usually set in a larger font size than the body text, I could identify the beginning and end of a headline by scanning for font size changes between characters. Unfortunately, I have not yet found a JavaScript method for detecting font sizes, even though this information is shown in the UI, even for proprietary fonts. Can anybody help?

TOPICS
Acrobat SDK and JavaScript, Windows

Mar 10, 2018

Use the "doc.getPageNthWordQuads()" function. It returns the quads for a word on the page: each quad holds 4 points (8 numbers, an x and a y per point), one for each corner of the word's bounding box. The height of this box is approximately the font size.
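
A rough sketch of how this could look in practice, not a tested solution for every PDF. It assumes each quad comes back as an array of 8 numbers (four x/y corner pairs) and that the box height is close enough to the font size for comparing headline and body text. The helper names `quadHeight`, `wordSizesOnPage` and `headlineCandidates` are made up for this example; only `getPageNumWords`, `getPageNthWord` and `getPageNthWordQuads` are Acrobat Doc methods.

```javascript
// Height of one quad. A quad is 8 numbers: four (x, y) corner points.
// Taking min/max over the y values avoids assuming a corner order.
function quadHeight(quad) {
  var ys = [quad[1], quad[3], quad[5], quad[7]];
  return Math.max.apply(null, ys) - Math.min.apply(null, ys);
}

// Collect every word on a page together with its approximate font size.
// `doc` is an Acrobat Doc object (for example `this` in the JavaScript console).
function wordSizesOnPage(doc, pageNum) {
  var result = [];
  var count = doc.getPageNumWords(pageNum);
  for (var i = 0; i < count; i++) {
    var quads = doc.getPageNthWordQuads(pageNum, i);
    if (!quads || quads.length === 0) {
      continue; // no geometry reported for this word
    }
    result.push({
      index: i,
      word: doc.getPageNthWord(pageNum, i, true),
      size: quadHeight(quads[0]) // only the first quad is used here
    });
  }
  return result;
}

// Hypothetical filter: keep words whose box height reaches a threshold.
function headlineCandidates(words, minSize) {
  return words.filter(function (w) {
    return w.size >= minSize;
  });
}
```

In the Acrobat console something like `headlineCandidates(wordSizesOnPage(this, 0), 18)` would list the large words on the first page. The threshold of 18 is only a guess and would need tuning per document, and a word split across lines may report more than one quad, which this sketch ignores.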

Mar 10, 2018

Many thanks for this advice. It works perfectly.

Mar 10, 2018

You can't edit static text using a script, though. Not directly, anyway.
