PDF Extract - Incorrect paragraph order after a paragraph spans onto the next page
I'm not sure if this is the right forum to report an issue with PDF Extract.
When a paragraph continues onto the next page, Extract correctly captures the rest of that paragraph from the next page. However, immediately after it, Extract places a paragraph that belongs to a different section further down (that is, under a different header in the JSON output). The section where that paragraph actually belongs is left empty, so the extracted structure of the PDF is inaccurate.
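To make the symptom easier to spot, here is a minimal Python sketch that lists headers with nothing under them in the Extract JSON. It assumes the output has a top-level `elements` array whose entries carry `Path` and `Text` fields, and that header paths end in `H1`, `H2`, and so on. The `empty_sections` helper and the sample data are made up for illustration and are not taken from my actual file.

```python
import re

# Assumption: header elements have a Path whose last segment is H, H1, H2, ...
# optionally followed by an index such as [2].
HEADING = re.compile(r"/H\d*(\[\d+\])?$")

def empty_sections(elements):
    """Return the text of headers that have no text element before the next header."""
    empty = []
    pending = None  # header seen, but no body text under it yet
    for el in elements:
        path = el.get("Path", "")
        if HEADING.search(path):
            if pending is not None:
                empty.append(pending)
            pending = el.get("Text", "").strip()
        elif el.get("Text"):
            pending = None  # this header has at least one body element
    if pending is not None:
        empty.append(pending)
    return empty

# Hypothetical data shaped like the problem described above: the paragraph
# that belongs to "Section B" shows up under "Section A" instead.
sample = [
    {"Path": "//Document/H1", "Text": "Section A "},
    {"Path": "//Document/P", "Text": "Paragraph continued from the previous page "},
    {"Path": "//Document/P[2]", "Text": "Paragraph that belongs to Section B "},
    {"Path": "//Document/H1[2]", "Text": "Section B "},
    {"Path": "//Document/H1[3]", "Text": "Section C "},
    {"Path": "//Document/P[3]", "Text": "Body of Section C "},
]

print(empty_sections(sample))  # → ['Section B']
```

With a real file, the list would come from loading the JSON and passing its `elements` array to the function.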
I can provide a sample PDF if your developers need it to troubleshoot.
