Logical reading order vs page content order in Acrobat

Explorer, Oct 11, 2018

I have a PDF with a tagged structure that defines the logical reading order by assigning an MCID to each text object. The text objects in the page content are placed in a different order.

Clause 14.7.1 of ISO 32000 states:

"A PDF document’s logical structure shall be stored separately from its visible content, with pointers from each to the other. This separation allows the ordering and nesting of logical elements to be entirely independent of the order and location of graphics objects on the document’s pages."

Based on the above, the logical order defined by the tags does not have to reflect the order of objects in the page content.

Acrobat DC (and older versions too) seems to have a problem with this: in the attached file it does not highlight all of the text on the page, and it cannot copy the text in the correct reading order.

The following is a link to the file:

Dropbox - accessible.pdf

Is this a bug in Acrobat or am I missing something?

The file I'm testing is minimal and contains the following tag structure. The MCIDs of the text elements go from 0 to 8 from top to bottom, which you can check against the page content below.

Screenshot 2018-10-11 at 14.00.16.png

And the page content looks like this:

/P <</MCID 8>>BDC
BT
/F0 10 Tf
0.71 0 0 0.523223 429.119995 652.188049 Tm
( THE)Tj
0.714286 0 0 0.523223 446.160004 652.188049 Tm
( PUBLIC)Tj
ET
EMC

/P <</MCID 7>>BDC
BT
/F0 10 Tf
0.56 0 0 0.523223 419.040009 652.188049 Tm
( IN)Tj
ET
EMC

/P <</MCID 5>>BDC
BT
/F0 10 Tf
0.777143 0 0 0.545972 297.22287 651.764954 Tm
(BEFORE)Tj
0.533333 0 0 0.545972 325.200012 651.764954 Tm
( IT)Tj
0.48 0 0 0.545972 334.799988 651.764954 Tm
( IS)Tj
0.68 0 0 0.523223 343.440002 651.948059 Tm
( FILED)Tj
0.73 0 0 0.523223 367.919983 651.948059 Tm
( FOR)Tj
ET
EMC

/P <</MCID 6>>BDC
BT
/F0 10 Tf
0.8 0 0 0.523223 390.23999 651.948059 Tm
(RECORD)Tj
ET
EMC

/P <</MCID 4>>BDC
BT
/F0 10 Tf
0.795556 0 0 0.545972 254.373337 651.764954 Tm
(PROPERTY)Tj
ET
EMC

/P <</MCID 1>>BDC
BT
/F0 10 Tf
0.693333 0 0 0.545972 165.600006 651.524963 Tm
( AN)Tj
ET
EMC

/P <</MCID 3>>BDC
BT
/F0 10 Tf
0.546667 0 0 0.545972 217.439987 651.524963 Tm
( IN)Tj
0.744 0 0 0.545972 227.279999 651.524963 Tm
( REAL)Tj
ET
EMC

/P <</MCID 2>>BDC
BT
/F0 10 Tf
0.728889 0 0 0.545972 178.080002 651.524963 Tm
( INTEREST)Tj
ET
EMC

/P <</MCID 0>>BDC
BT
/F0 10 Tf
0.822222 0 0 0.545972 121.199997 651.284973 Tm
(TRANSFERS)Tj
ET
EMC

========

Acrobat selects the text like this (SelectAll):

Screenshot 2018-10-11 at 14.04.12.png

Acrobat copies the text as follows:

THE PUBLIC IN

BEFORE IT IS FILED FOR

RECORD

PROPERTY AN IN REAL INTERESTTRANSFERS
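
For comparison, here is a minimal Python sketch of what the text would look like if it were assembled in tag order rather than content order. It is only an illustration, not how Acrobat or any PDF library works: it assumes the structure tree references the MCIDs in ascending order (as described above), uses an abbreviated copy of the page content with the Tf and Tm operators dropped, and pulls out the strings with simple regular expressions instead of a real content stream parser. The names CONTENT, text_chunks, logical_text and content_order_text are made up for this example.

```python
import re

# Abbreviated copy of the page content above (font and matrix operators omitted).
CONTENT = r"""
/P <</MCID 8>>BDC BT ( THE)Tj ( PUBLIC)Tj ET EMC
/P <</MCID 7>>BDC BT ( IN)Tj ET EMC
/P <</MCID 5>>BDC BT (BEFORE)Tj ( IT)Tj ( IS)Tj ( FILED)Tj ( FOR)Tj ET EMC
/P <</MCID 6>>BDC BT (RECORD)Tj ET EMC
/P <</MCID 4>>BDC BT (PROPERTY)Tj ET EMC
/P <</MCID 1>>BDC BT ( AN)Tj ET EMC
/P <</MCID 3>>BDC BT ( IN)Tj ( REAL)Tj ET EMC
/P <</MCID 2>>BDC BT ( INTEREST)Tj ET EMC
/P <</MCID 0>>BDC BT (TRANSFERS)Tj ET EMC
"""

# One marked-content sequence: <</MCID n>> BDC ... EMC
MARKED = re.compile(r"<<\s*/MCID\s+(\d+)\s*>>\s*BDC(.*?)EMC", re.S)
# A simple literal string shown with Tj (no escapes or nested parentheses).
SHOWN = re.compile(r"\(([^()\\]*)\)\s*Tj")


def text_chunks(stream):
    """Return (mcid, text) pairs in the order they appear in the stream."""
    pairs = []
    for match in MARKED.finditer(stream):
        text = "".join(SHOWN.findall(match.group(2)))
        pairs.append((int(match.group(1)), text.strip()))
    return pairs


def content_order_text(stream):
    """Join the chunks in page content order."""
    return " ".join(text for _, text in text_chunks(stream))


def logical_text(stream):
    """Join the chunks sorted by MCID (assumed to equal the tag order here)."""
    return " ".join(text for _, text in sorted(text_chunks(stream)))


print(content_order_text(CONTENT))
# THE PUBLIC IN BEFORE IT IS FILED FOR RECORD PROPERTY AN IN REAL INTEREST TRANSFERS
print(logical_text(CONTENT))
# TRANSFERS AN INTEREST IN REAL PROPERTY BEFORE IT IS FILED FOR RECORD IN THE PUBLIC
```

The first line matches what Acrobat copies (apart from line breaks and spacing), while the second is the order I would expect from the tags. In a real file the order comes from walking the structure tree, not from sorting MCID numbers.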

Thanks

Jozef

Explorer, Oct 24, 2018

Yes, this excerpt from the ISO standard seems to say this is OK. But we know it is not OK from an accessibility standpoint.

It is true that the page display and the order of objects in the Tags, Reading Order and Content panels do not always match. But the Tags panel should match the display, that is, the actual "how you read the page" order.

If you consider reflow, the Reading Order and Content panels should also match the "how you read it" order.

Your PDF does not resemble the output from most applications capable of generating PDF.

How are you creating this PDF?
