Skip to main content
Inspiring
September 9, 2011
Answered

cfspreadsheet file size problem

  • September 9, 2011
  • 1 reply
  • 4706 views

When I am reading in a list of Excel spreadsheets, sorted by size,

with <cfspreadsheet>, everything goes smoothly until the tag hits a file

bigger than 2MB.  That is when I get this error:

An error occurred while reading the Excel: org.apache.poi.hssf.record.RecordInputStream$LeftoverDataException: Initialisation of record 0xFF left 6 bytes remaining still to be read..

This happens consistantly at a file size of 2MB.  The files are Excel 2003.  The only important factor I can see is that I have quite a few columns in my spreadsheet.  The columns go out all the way to "CX".

Does <cfspreadsheet> have some sort of limitation when it comes to file size and/or columns?

TIA

Zach

    This topic has been closed for replies.
    Correct answer -__cfSearching__-

    I can remove records from the beginning, middle, or end, and it won't load until the file is smaller than 2MB.  And yes, you are right, 2MB isn't very many records (just over 2000, in my case).  Remember, that the records have a length of the last column being "CX", which is very long.

    As far as malformed files, that logic would mean that I wouldn't be able to use cfspreadsheet on any of the files.  The tag works great on the files less than 2MB, that I downloaded directly from the portal.

    As it stands right now, I have to chop up the files into 1000-record "bite site" smaller files.  After doing that, cfspreadsheet breezes right through them.

    When I first saw this problem a few months back, I poured hours into searching it up and down on Google and found very little.  What I did find was mainly about problems and bugs in the Java code (nothing having to do with CF, just the Java function(s) that CFSpreadsheet uses apparently).  All of these hits on Google being years old.


    I do not know what is causing the problem. But having read much bigger files, with more columns, I am not yet convinced it is solely a size problem.

    - Can you read any files over 2MB or ones with a similar number of rows and columns?

    - Any odd characters or charsets involved?

    - Any idea what program produced the files? Either from viewing the metadata or the headers?

    When I first saw this problem a few months back, I poured

    hours into searching it up and down on Google and found very

    little. 

    Did you ever try running the files through POI's BiffViewer?

    1 reply

    Inspiring
    September 9, 2011

    Hard to say. But I tend to doubt size is the cause seeing as how it is only 2MB.

    - Can you post a sample file?

    - Were all of the files created by Excel 2003 or a different program?

    Inspiring
    September 9, 2011

    Unfortunately, I cannot post a sample because it is chock full of

    Personal Health Information (PHI).  I wish I could.  Even going through and trying to replace the PHI with dummy info would take quite a bit of time, not to mention dangerous if I missed something.

    The files are all created in Excel 2003 format by one of the State of Colorado's websites.  (Actually, the quote from the website is "This report is available in Excel format.")  The files are not ".xlsx" for 2007 and 2010.  We just download them from their site.

    The reason I have leaned towards file size (or size in general) is because of the error message talking about "6 bytes".  If I remove records from the file to reduce its total size below 2MB, it loads just fine.  Is this error message familiar to anyone?

    Inspiring
    September 16, 2011

    OK, hidden "crappy" data or something 'else' is the

    answer. 

    Out of curiosity, did the poi dump crash or tell you anything you could report to the POI team?

     

    Thank you to everyone for helping me with this!  Sorry so

    anticlimatic.

    lol. Someone actually mentioned using that exact same fix on one of the POI threads. I was going to suggest that, but thought you had already tried it in your previous experiments. Wish I had suggested it now! But at least you have temporarily worked around the problem


    To tell you the truth, after having the epiphany of just re-saving the files in Excel, I never got around to doing the Biff Viewer testing.  One of these days I might.  Heaven knows the State won't be fixing their side of this equation soon...

    Thanks again!

    Zach