Relating the co-ordinates in bounds in JSON output with actual location in PDF

September 25, 2021

The text "Find object locations" is 181px from the left in the PDF; however, the JSON output from the PDF Extract API returns this:

{
      "Bounds": [
        108.02000427246094,
        692.2299957275391,
        246.02609252929688,
        708.1900024414062
      ],
      "ClipBounds": [
        108.02000427246094,
        692.2299957275391,
        246.02609252929688,
        708.1900024414062
      ]
}

From what I understand, 108 is the left coordinate of the text's bottom-left corner.

However, as per the PDF it is 180px.

Can you help me understand the relation here?

Note: Input PDF has been attached.

Thanks!

input.pdf

PDF Extract API


Test Screen Name

Legend

Pixels are not a unit used in PDF. Screen size is irrelevant to the internals. Refer to the PDF Reference for page units: most likely the unit is 1/72 inch, with the origin tied to the media box (which is not always the visible corner).
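
To make the unit point concrete, here is a minimal sketch of the conversion from PDF units to screen pixels, assuming the default unit of 1/72 inch. The function name `points_to_pixels` and the DPI values are illustrative assumptions; the thread does not say what resolution the viewer renders at.

```python
POINTS_PER_INCH = 72.0  # assumed default PDF unit: 1/72 inch


def points_to_pixels(value_pt: float, dpi: float) -> float:
    """Convert a length in PDF points to pixels at the given rendering DPI."""
    return value_pt * dpi / POINTS_PER_INCH


left_pt = 108.02000427246094  # first entry of "Bounds" in the question

# 72, 96 and 120 DPI are assumed example resolutions, not values from the thread.
for dpi in (72, 96, 120):
    print(f"{dpi} DPI: {points_to_pixels(left_pt, dpi):.1f} px")
```

At an assumed 120 DPI, 108.02 units come out at roughly 180 px, which is close to the figure measured in the viewer. That may explain the mismatch, but it is only a guess until the viewer's actual resolution is known.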

Nikhil Ranka (Author)

Known Participant

Thanks for sharing the info @Test Screen Name. Would be a great time saver.

Since the media box is not always the visible corner, is there any other approach to translate the co-ordinates for republishing? The math appears simple, but since the media box is not visible, it gets tricky.

Also, is the media box different for different PDFs?

Test Screen Name

Legend

The crop box should give you the visible rectangle. In the absence of a crop box, the media box is used. The media box can differ from one PDF to the next (it is set per page), but that doesn't matter if you can find what it is. A media box can (but rarely does) start several inches from the coordinate origin. Study of the PDF Reference may give you more insight.
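
The translation described here can be sketched as follows. This is a rough illustration, not the API's own method: it assumes `Bounds` is ordered `[left, bottom, right, top]` with y growing upward, that the unit is 1/72 inch, and that the page is not rotated. The names `bounds_to_top_left` and `PixelRect` are hypothetical, and the US Letter box is a placeholder for the real crop box (or media box) read from the PDF.

```python
from typing import NamedTuple, Sequence


class PixelRect(NamedTuple):
    x: float       # distance from the left edge of the visible box
    y: float       # distance from the top edge of the visible box
    width: float
    height: float


def bounds_to_top_left(
    bounds: Sequence[float],
    visible_box: Sequence[float],
    dpi: float = 72.0,
) -> PixelRect:
    """Map [left, bottom, right, top] in PDF units to a top-left-origin pixel rect.

    visible_box is the crop box, or the media box when no crop box exists,
    given as [x0, y0, x1, y1] in the same PDF units.
    """
    left, bottom, right, top = bounds
    box_x0, _box_y0, _box_x1, box_y1 = visible_box
    scale = dpi / 72.0  # assumes the default unit of 1/72 inch
    return PixelRect(
        x=(left - box_x0) * scale,   # shift by the box's left edge
        y=(box_y1 - top) * scale,    # flip the axis: PDF y grows upward
        width=(right - left) * scale,
        height=(top - bottom) * scale,
    )


bounds = [108.02000427246094, 692.2299957275391,
          246.02609252929688, 708.1900024414062]
letter_box = [0.0, 0.0, 612.0, 792.0]  # assumed page box; read the real one from the PDF
print(bounds_to_top_left(bounds, letter_box, dpi=96))
```

Subtracting the box's own corner, rather than assuming it sits at (0, 0), is what handles the rare case of a box that starts away from the coordinate origin.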

Nikhil Ranka (Author)

Known Participant

Here is the screenshot from the PDF viewer. Zoom is at 100%.
