Anyone run into a problem with calls to CFINDEX after having applied HotFix 21? I'm getting the following stack after the hotfix (looks like during the indexing of a Word or Excel file):
at org.apache.poi.xwpf.usermodel.XWPFRun.text(XWPFRun.java:1001)
at org.apache.poi.xwpf.usermodel.XWPFRun.toString(XWPFRun.java:971)
at org.apache.poi.xwpf.usermodel.XWPFParagraph.getText(XWPFParagraph.java:215)
at org.apache.poi.xwpf.usermodel.XWPFTable.<init>(XWPFTable.java:117)
at org.apache.poi.xwpf.usermodel.XWPFDocument.onDocumentRead(XWPFDocument.java:152)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:166)
at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:118)
at org.apache.poi.xwpf.extractor.XWPFWordExtractor.<init>(XWPFWordExtractor.java:59)
at coldfusion.tagext.search.MSDocument.readDocx(MSDocument.java:164)
at coldfusion.tagext.search.SolrUtils.getSolrDocument(SolrUtils.java:734)
at coldfusion.tagext.search.SolrUtils.addDocument(SolrUtils.java:1349)
at coldfusion.tagext.search.IndexTag.doUpdate(IndexTag.java:675)
at ............
CFTRY/CFCATCH/CFDUMP should be a little more precise in providing error information.
I tried cfindex with Update 21 and it works well. Can you create a new collection and try it?
Here's the code that is triggering the Site-Wide Error:
<CFINDEX ACTION = "UPDATE"
AUTOCOMMIT = "false"
COLLECTION = "#Variables.Collection[idx].CollectionName#"
EXTENSIONS = ".*"
CUSTOM1 = ""
CUSTOM2 = ""
CUSTOM3 = "#Variables.Collection[idx].FilesBaseDirRel#"
KEY = "#Variables.Collection[idx].FilesBaseDirAbs#"
LANGUAGE = "ENGLISH"
TYPE = "PATH"
URLPATH = "#Variables.Collection[idx].FilesBaseDirRel#">
Message - #cfcatch.message#
Type - #CFCATCH.TYPE#
Tag Content - <cfdump var="#cfcatch.tagcontext#">
Fairly straightforward: it simply indexes all files in a certain directory. This directory contains HTML, Word, Excel, PowerPoint and PDF documents. I backed out HotFix 21, returned to HotFix 20, and all works fine, so I'm fairly certain the problem is in HotFix 21.
CFTRY/CFCATCH does not work, and the error is not being logged in either the Application Log or the Exception Log.
So the best I could do is put <cfdump var="#error#"> in the Site-Wide Error handler, which gives me the error in the attached JPG.
IF the error is caused by the CFINDEX, and the CFTRY/CFCATCH isn't triggering, it's because your site-wide error handler in application.cfc is catching the error before your CFTRY. Comment out the site-wide error handler, and instead of CFOUTPUTing the cfcatch.message, just CFDUMP var="#cfcatch#" to get everything.
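For anyone following along, here is a minimal sketch of that suggestion: wrap the CFINDEX call in CFTRY/CFCATCH and dump the entire cfcatch structure instead of just the message. The three cfset values are hypothetical placeholders (the thread does not show how Variables.Collection is populated), and there is no guarantee that CFCATCH will actually fire for this particular error.

```cfml
<!--- Hypothetical values for illustration only; replace with your own collection settings --->
<cfset collectionName = "docs">
<cfset filesBaseDirAbs = "C:\inetpub\wwwroot\docs">
<cfset filesBaseDirRel = "/docs">

<cftry>
    <cfindex action="update"
             collection="#collectionName#"
             type="path"
             key="#filesBaseDirAbs#"
             urlpath="#filesBaseDirRel#"
             extensions=".*"
             language="English">
    <cfcatch type="any">
        <!--- Dump the whole structure (message, detail, type, tagContext, stack trace)
              rather than outputting only cfcatch.message --->
        <cfdump var="#cfcatch#" label="cfcatch">
    </cfcatch>
</cftry>
```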
BTW, I am also DoD.
What is at line 255 of fullIndex.cfm? Is that the CFINDEX tag? It looks like some kind of parsing error. Could there be something in an Excel or Word document that might cause the parser to flip out?
Line 255 is
URLPATH = "#Variables.Collection[idx].FilesBaseDirRel#">
This is the end of the CFINDEX tag.
It is my theory that something in an Excel, Word, PowerPoint or PDF file is triggering the error. But notice that I said all is fine (CFINDEX works) if I go back to HotFix 20 vice 21. So the problem is not one of the files; it's the HotFix.
If I take the Site-Wide Error handler out at the server level, all I get is a white screen with nothing in it and no Exceptions are caught in the server Exception Log. So the only way to get anything is by doing what I did - CFDUMP #ERROR# at the Site-Wide Error handler.
Yeah, the error line number is typically the last line of whatever tag; this is more noticeable with CFQUERY - usually the last line of the SQL is the error line number indicated.
I'm not doubting your assertion that HF21 is related. But it does strike me as odd that a hotfix is potentially involved with the issue. I don't know what is in any of the hotfixes (perhaps priyanks97293812 knows?), but I cannot imagine what could be changed where parsing works flawlessly in one iteration and not at all in the next.
Then again, when Java 1.7u25 was working flawlessly for Solr collections and 1.7u31 broke it, it took a while to discover that Oracle had closed many network accesses (as a security precaution) that were commonly open in previous versions. So, who knows?
But I'm _really_ puzzled by the CFTRY conundrum. If it triggers anything at all (like your #cfcatch.message#), then it should trigger enough for the CFDUMP to display everything. When you get the blank screen, did you do a "View Source" to see what the last line of HTML was before it errored out? Perhaps it started to display the dump, then something interrupted that? IDK. It's very weird.
If you look into the backup folders created by each HotFix (the backup is used in case the currently installed HotFix needs to be backed out or uninstalled), you can determine what is being changed. I tracked the problem down to changes in the Apache POI (Poor Obfuscation Implementation) Java libraries, which are used to read and write Microsoft file formats (Word, Excel, PowerPoint, etc.). There are considerable changes to the libraries. Unfortunately, it looks like the Solr indexer is calling a class or method that is no longer present in the new libraries when it attempts to open one of these Microsoft files. I'm not sure what to do next, except stick with HotFix 20. The only thing is, eventually my Cyber Security folks are going to force me to go to HotFix 21, and then I'm broken. Hopefully someone at Adobe is reading this so they can come up with a fix soon (priyanks97293812)...
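In case it helps anyone repeat that comparison, here is a rough sketch that lists the POI jars in the live lib folder and in a HotFix backup folder side by side. Both directory paths are hypothetical placeholders, since the actual locations depend on the install and are not given in this thread. It only compares file names, sizes and dates; it does not inspect the classes inside the jars.

```cfml
<!--- Both paths are hypothetical placeholders; substitute the lib folder
      and the HotFix backup folder from your own installation --->
<cfset currentLibDir = "C:\ColdFusion10\cfusion\lib">
<cfset backupLibDir = "C:\ColdFusion10\cfusion\hf-updates\backup\lib">

<!--- List only the POI jars in each folder, sorted by name --->
<cfdirectory action="list" directory="#currentLibDir#" filter="poi*.jar"
             name="currentJars" sort="name ASC">
<cfdirectory action="list" directory="#backupLibDir#" filter="poi*.jar"
             name="backupJars" sort="name ASC">

<!--- Differences in name, size or date point at jars replaced by the HotFix --->
<cfdump var="#currentJars#" label="POI jars currently in use" show="name,size,dateLastModified">
<cfdump var="#backupJars#" label="POI jars in the HotFix backup" show="name,size,dateLastModified">
```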
Is this happening with a specific PDF, DOCX or XLSX file, or does it fail with any document? I tried many formats and everything works as expected. HotFix 21 is a security update; no bug fixes were shipped with it.
I've narrowed it down to .docm and .xlsm files. These are Word documents and Excel spreadsheets with built-in macros. I can provide you with an example file if required. Just provide me instructions on how, since this interface does not provide this capability.
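Until there is a fix, one possible stopgap (an untested assumption on my part, not something confirmed in this thread) would be to replace EXTENSIONS=".*" with an explicit list that leaves out the macro-enabled .docm and .xlsm types, so the indexer never tries to open them. The cfset values below are hypothetical placeholders, and the extension list is only a guess at what the directory contains.

```cfml
<!--- Hypothetical values for illustration only --->
<cfset collectionName = "docs">
<cfset filesBaseDirAbs = "C:\inetpub\wwwroot\docs">
<cfset filesBaseDirRel = "/docs">

<!--- Explicit extension list: .docm and .xlsm are deliberately omitted,
      so those files would simply not be searchable until the bug is fixed --->
<cfindex action="update"
         collection="#collectionName#"
         type="path"
         key="#filesBaseDirAbs#"
         urlpath="#filesBaseDirRel#"
         extensions=".htm, .html, .doc, .docx, .xls, .xlsx, .ppt, .pptx, .pdf"
         language="English">
```

The obvious trade-off is that any content in the skipped files drops out of search results.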
Could you please send an email to email@example.com with the files, and please mention Priyank in the subject. Remove any content you think is sensitive before sharing it.
I will take a look right now. Meanwhile, let me also create docm and xlsm files.
Thanks for providing the docm file. I am able to reproduce the issue with Update 21; with Update 20, it works.
I am logging a bug and will keep you posted.
Thank you so much for your assistance...
I am encountering the same issue with CF 10 Update 21. Please let me know when you have a fix for the issue.
Have there been any updates on this? I am running into the same issue, except that rolling back to CF 10 Update 20 breaks Solr completely. At least with Update 21 the search works; I just can't index anything new.