I need to convert a PDF to Excel, but the columns and tabs in the PDF produce many merged cells and many blank columns. Besides not separating the columns correctly, the export often puts several separate lines together in the same cell. I'm starting to think that Adobe Acrobat Pro DC has limitations here. Is there no way to define the points at which a column break should be forced, and to avoid creating many useless columns? Something like Text to Columns in Excel with fixed width, where we define the positions of the column breaks when importing text?
PDF does not contain columns, rows, formats, styles, or other aspects of word processing or spreadsheet file formats.
This is because PDF is decidedly not a word processing or spreadsheet file format or something "like" one of those.
(see ISO 32000 for what PDF "is")
What can optimize the export of PDF page content is to start with a well-formed tagged PDF (ISO 14289-1, PDF/UA-1 compliant).
Without that, the export is what it is, and one performs whatever content cleanup is needed using the native application for the export file (MS Word or Excel).
Be well...
That makes no sense. We want to take data in a PDF and export it to an Excel file that uses columns. The export function has to make some decisions about where the columns are located. Telling us that "it is what it is" is not helpful. A better engine is needed to correctly parse the PDF data into the correct columns. I have a PDF file with multiple pages with exactly the same layout, and some of them parse correctly to Excel columns while others do not. I'm writing this 7 years after the initial post, so this is still a problem.
It's certainly possible to write your own algorithm to export the data in the desired format (using a script, plugin or stand-alone application), but it's not possible to change the way Acrobat does it.
If you try to do it yourself you'll see the complexity of such an endeavor and understand why it's not so simple to create such an algorithm that works with multiple layouts.
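To give a feel for the simplest case, here is a minimal Python sketch. It assumes the page text has already been saved as plain text in which the columns line up at fixed character positions, and that you supply the break positions yourself, much like Excel's fixed-width Text to Columns. The names `split_fixed_width` and `lines_to_csv` and the sample data are hypothetical, and this is not how Acrobat does its export.

```python
import csv
import io


def split_fixed_width(line, breaks):
    """Split one text line into cells at the given character positions."""
    edges = [0, *sorted(breaks), None]
    return [line[start:end].strip() for start, end in zip(edges, edges[1:])]


def lines_to_csv(lines, breaks):
    """Convert fixed-width text lines to CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in lines:
        if line.strip():  # skip blank lines
            writer.writerow(split_fixed_width(line, breaks))
    return buffer.getvalue()


# Made-up sample rows, padded to widths of 10, 4 and 8 characters
# as a stand-in for text extracted from a PDF.
rows = [("Item", "Qty", "Price"), ("Widget", "2", "9.50"), ("Gadget", "10", "12.00")]
sample = ["{:<10}{:>4}{:>8}".format(*row) for row in rows]

# Column breaks chosen by hand at character positions 10 and 14.
print(lines_to_csv(sample, breaks=[10, 14]))
```

This only works while every row uses the same break positions. Wrapped cells, proportional fonts and layouts that shift from page to page are where the real complexity starts.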
I agree with Anthony. Adobe HAS changed the way the export to Excel works over the years. We still have users using Acrobat Pro 2017, and that version works as intended. Instead of exporting one giant column with all the rows combined into one row of text (like the current version)... it will export each row in a table as its own row.
Adobe changed something and they need to change it back... what is the point of paying for this feature if it doesn't work?
I don't have the knowledge or skills to write my own algorithms to export data. Nobody is saying this is not complex, but doesn't Adobe hire software engineers or train them to write what seems to be much more complex image processing software than is needed to convert columns of data in a PDF to an Excel format? It makes no sense that 7 years later this is still not available.
Hey Anthony, I work from PDF to CSV daily. One other option: after you have redacted text and images and removed what you don't need, go to Excel and import the data from the PDF. There is no sure way to parse columns until you get to Excel.
Yeah, me too, but this doesn't help if the merged cells are all jacked, which they frequently are. The issue isn't conversion format so much as putting things in columns that match the visual displays, even with page headers/subtotal rows.
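For anyone who wants to experiment with matching the visual layout, here is a rough Python sketch. It assumes you already have each word with its position as `(x, y, text)` tuples from some PDF text extraction tool, with `y` increasing down the document, and that you choose the column boundaries by hand. The function names, the tolerance value and the sample words are all made up for illustration.

```python
from bisect import bisect_right


def words_to_rows(words, boundaries, row_tolerance=2.0):
    """Group (x, y, text) words into rows, with one cell per column.

    boundaries: sorted x positions at which the 2nd, 3rd, ... column starts.
    row_tolerance: words within this vertical distance share a row.
    """
    lines = []  # each entry: [reference_y, [(x, text), ...]]
    for x, y, text in sorted(words, key=lambda w: w[1]):
        if lines and abs(y - lines[-1][0]) <= row_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])

    table = []
    for _, line_words in lines:
        cells = [[] for _ in range(len(boundaries) + 1)]
        for x, text in sorted(line_words):  # left to right within the row
            cells[bisect_right(boundaries, x)].append(text)
        table.append([" ".join(cell) for cell in cells])
    return table


def drop_repeated_headers(table):
    """Keep the first row and drop later rows identical to it."""
    if not table:
        return []
    header = table[0]
    return [header] + [row for row in table[1:] if row != header]


# Made-up words from two pages; the header repeats on the second page.
words = [
    (72, 10, "Item"), (200, 10, "Qty"), (300, 10, "Price"),
    (100, 24.5, "widget"), (72, 24, "Blue"), (205, 24, "2"), (296, 24, "9.50"),
    (72, 810, "Item"), (200, 810, "Qty"), (300, 810, "Price"),
    (72, 824, "Gadget"), (201, 824, "10"), (292, 824, "12.00"),
]

for row in drop_repeated_headers(words_to_rows(words, boundaries=[180, 280])):
    print(row)
```

Because each word goes into a column by its x position alone, cells never get merged. Subtotal rows would still need their own rule, for example filtering on the text in the first cell.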
Hey guys, I had the same problem and thought, why not try drawing some lines in the PDF file with the drawing tool to mark the correct columns? For some reason it works. When your file has a repetitive pattern, you can Ctrl+C and Ctrl+V the grid onto every page. Maybe there is a better way, but for my 15 pages it worked okay.
This is a major shortcoming of Adobe's PDF to Excel functionality. It does this even with well-formed, clean PDFs.
Here is one solution:
It works, but it's a roundabout way of doing what should be a simple feature for Adobe to include. Hope they add it soon.
Thanks for the suggestion. It solved my problems nicely.
Hey Adobe, if we can do this manually, why don't you at a minimum automate it for us? Please be attentive to your well-paying customers.