Get page numbers that contain rotated text

New Here, Feb 12, 2016

Hello,

I searched the Acrobat JavaScript documentation, but I could not find anything about getting the rotation of text.

In practice, I need code for the JavaScript console in Acrobat Pro that finds pages containing rotated text, rotated by 90 or 270 degrees. Basically, I need the page numbers of landscape pages that are shown as portrait. I tried the getPageRotation method, but the PDFs were converted with the pages in portrait, so they report no rotation. That is why I need a function that recognizes rotated text.

Is this possible? Thanks!

TOPICS: Acrobat SDK and JavaScript, Windows

Community Expert, Feb 12, 2016

You can use the method getPageNthWordQuads of the doc object.

New Here, Feb 12, 2016

I am not sure how to use this. Could you please provide more details on how I should proceed? If I loop through a really big document and search all the words, Acrobat would crash.

Community Expert, Feb 12, 2016

Assuming that all the text on a page is oriented the same way, there is no need to loop over all the text in the document. The "quads" are the four points that mark the corners of the bounding box around the word. Create a document with four pages, with every page using a different text orientation. Then run the following code in your JavaScript console:

// Both arguments are zero-based: page index, then word index.
// Word index 0 selects the first word on the page.
var q = this.getPageNthWordQuads(0, 0);
console.println(q.toSource());
q = this.getPageNthWordQuads(1, 0);
console.println(q.toSource());
q = this.getPageNthWordQuads(2, 0);
console.println(q.toSource());
q = this.getPageNthWordQuads(3, 0);
console.println(q.toSource());

This will print the quads for the first word on all four pages. When you now analyze the four points in each quad, you will find that they are arranged in a certain way (this assumes that you use a word with a width that is greater than its height). Let's assume you get this output (simplified to integers):

[[85, 744, 115, 744, 85, 728, 115, 728]]

This means that P1 has an x component of 85, and a y component of 744, P2 is 115 and 744, and so on. You can see that the y components of P1 and P2 are the same, and P1.x is less than P2.x. You do this for all four pages, and you will have rules to find the text orientation on each page. To make sure that you don't accidentally pick a first word like "I", I would not just use one word, but maybe the first 3 and use a 2 out of 3 rule to determine the text orientation.
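
The rules described above could be sketched roughly as follows. This is only an illustration, not a tested Acrobat script: the names quadAngle and majorityAngle are made up for this example, and the code assumes that P1 to P2 runs along the reading direction of the word, as in the sample output. Check that assumption against the four-page test document before relying on it.

```javascript
// Sketch: classify the text direction from one word quad.
// Assumption: quad = [x1, y1, x2, y2, x3, y3, x4, y4] and the edge P1 -> P2
// follows the reading direction of the word (true for the sample output).
function quadAngle(quad) {
  var dx = quad[2] - quad[0];
  var dy = quad[3] - quad[1];
  // Snap the direction of P1 -> P2 to the nearest multiple of 90 degrees.
  var steps = Math.round(Math.atan2(dy, dx) / (Math.PI / 2));
  return ((steps * 90) % 360 + 360) % 360; // 0, 90, 180 or 270
}

// Majority vote over a few sampled words, e.g. a "2 out of 3" rule.
function majorityAngle(quads) {
  var counts = {};
  var best = null;
  for (var i = 0; i < quads.length; i++) {
    var a = quadAngle(quads[i]);
    counts[a] = (counts[a] || 0) + 1;
    if (best === null || counts[a] > counts[best]) {
      best = a;
    }
  }
  return best; // null when no quads were supplied
}
```

Because the angle comes from the direction of one edge rather than from comparing width and height, a narrow word such as "I" should not flip the result, as long as the point order assumption holds.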

New Here, Feb 12, 2016

@khkremer - Thanks for the response. This would work, but I am not sure how I can access the coordinates. Should I convert them into variables and work with them in a for loop?

Community Expert, Feb 12, 2016

Take a look at the output of the toSource() method from above. You'll see that what Doc.getPageNthWordQuads() returns is an array of arrays of numbers. This means that you can display the x component of the first point using this:

console.println(q[0][0]);

So to get the y component, you would do this:

console.println(q[0][1]);

This is just standard JavaScript, there is nothing specific to Acrobat here. You may want to invest some time to learn about how JavaScript works.
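
As a minimal sketch of working with those indexes, the eight numbers of one quad can be unpacked into named points, and the box size computed from them. The helper names quadToPoints and wordBoxSize are hypothetical, and the code assumes the point order seen in the sample output (P1 and P2 on one edge, P3 and P4 on the opposite edge).

```javascript
// Sketch: unpack one quad [x1, y1, x2, y2, x3, y3, x4, y4] into points.
function quadToPoints(quad) {
  return {
    p1: { x: quad[0], y: quad[1] },
    p2: { x: quad[2], y: quad[3] },
    p3: { x: quad[4], y: quad[5] },
    p4: { x: quad[6], y: quad[7] }
  };
}

// Euclidean distance between two points.
function distance(a, b) {
  var dx = b.x - a.x;
  var dy = b.y - a.y;
  return Math.sqrt(dx * dx + dy * dy);
}

// Width is measured along P1 -> P2, height along P1 -> P3 (assumed order).
function wordBoxSize(quad) {
  var p = quadToPoints(quad);
  return {
    width: distance(p.p1, p.p2),
    height: distance(p.p1, p.p3)
  };
}
```

In the Acrobat console, the input would be q[0] from the earlier call, for example wordBoxSize(q[0]).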

Community Expert, Feb 12, 2016

Maybe instead of looking at the coordinates of the text just look at the size of the pages. That's much easier to do.

You can use the getPageBox method for that.
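
A rough sketch of that idea follows. It only helps when the page dimensions themselves differ between portrait and landscape pages. The function names are invented for this example, the doc parameter stands for the Doc object (this in the console), and the rectangle is assumed to hold x values at indexes 0 and 2 and y values at indexes 1 and 3.

```javascript
// Sketch: decide whether a page box is wider than it is tall.
// Assumption: rect has x coordinates at [0] and [2], y coordinates at [1] and [3].
function isLandscapeBox(rect) {
  var width = Math.abs(rect[2] - rect[0]);
  var height = Math.abs(rect[1] - rect[3]);
  return width > height;
}

// Collect 1-based page numbers whose crop box is landscape.
function landscapePages(doc) {
  var pages = [];
  for (var p = 0; p < doc.numPages; p++) {
    if (isLandscapeBox(doc.getPageBox("Crop", p))) {
      pages.push(p + 1);
    }
  }
  return pages;
}
```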

New Here, Feb 12, 2016

The pages are in portrait mode. I already tried this. Even when the text is rotated by 90/270 degrees, the page size is the same as for a regular portrait page.

Community Expert, Feb 12, 2016

In that case looking at the text quads is probably the only option, but it should be sufficient to look at a single word on each page.
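
Put together, a page scan that inspects only a few words per page might look like the sketch below. It is an untested illustration: findRotatedPages, quadAngle and the maxWords parameter are hypothetical names, doc stands for the Doc object (this in the console), and the code assumes that P1 to P2 of each quad follows the reading direction of the word.

```javascript
// Snap the direction of the quad edge P1 -> P2 to 0, 90, 180 or 270 degrees.
// Assumption: P1 -> P2 follows the reading direction of the word.
function quadAngle(quad) {
  var steps = Math.round(
    Math.atan2(quad[3] - quad[1], quad[2] - quad[0]) / (Math.PI / 2)
  );
  return ((steps * 90) % 360 + 360) % 360;
}

// Return 1-based numbers of pages where most sampled words read sideways.
// maxWords limits how many words are checked per page (default 3).
function findRotatedPages(doc, maxWords) {
  var limit = maxWords || 3;
  var rotated = [];
  for (var p = 0; p < doc.numPages; p++) {
    var n = Math.min(doc.getPageNumWords(p), limit);
    var sideways = 0;
    for (var w = 0; w < n; w++) {
      var quads = doc.getPageNthWordQuads(p, w);
      if (!quads || quads.length === 0) {
        continue; // no geometry for this word, skip it
      }
      var angle = quadAngle(quads[0]);
      if (angle === 90 || angle === 270) {
        sideways++;
      }
    }
    // Strict majority of the sampled words; pages without words are skipped.
    if (n > 0 && sideways * 2 > n) {
      rotated.push(p + 1);
    }
  }
  return rotated;
}
```

In the console this could be called as console.println(findRotatedPages(this).join(", ")). Passing 1 as the second argument checks a single word per page, which keeps the work per page constant even for very large documents.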

New Here, Feb 12, 2016

Yes, but I would need to work on the JavaScript syntax to build the function. Can regular expressions be used in the Acrobat Pro console? After finding a rotated word, whichever word that is (this is why I ask about regular expressions), I would like to skip the rest of the words and go to the next page. I still have to search for more documentation. I am not sure how I can work with the coordinates to calculate the width and height of the word box.

Community Expert, Feb 12, 2016

There's no need for regular expressions. This is a purely mathematical procedure. See Karl's reply (#6) for a good starting point on how to do it.
