I am curious whether anyone else has encountered a problem where ColdFusion 2023 looks in <CFHOME>\wwwroot instead of an SMB file share, or fails to load a file from an SMB share. We are running multiple CF instances and this happens to both instances at the same time. We have found "Tomcat is down or network problems. Part of the response has already been sent to the client" errors in the isapi_redirect.log, starting at the same time in both CF instances' connectors.
When our servers are under load our IIS worker threads will periodically use all CPU, fail IIS application pool pings, recycle and then be fine. I cannot find anything in the logs other than errors in the application.log file in the CF instances saying it was looking into the wwwroot folder or it couldn't find the file on the SMB share. Along with this the isapi_
Looking at the HTTPERR logs on our servers, we find client_reset and queue_filled errors when this happens, which seems reminiscent of this Server Fault article (https://serverfault.com/questions/737619/application-pool-failling-with-client-reset-errors-in-httpe...).
We have all the latest updates and have confirmed that all the permissions are set for the ColdFusion services, IIS, and the SMB share. During off-peak times everything is fine and no errors are logged. We have two identical servers behind a load balancer; the problem only occurs on one server at a time, and which one is affected is completely random.
Looking for anyone else who may have experienced this problem or could offer pointers where to look. All other logs and request tracing show nothing out of the ordinary.
Chad, I think you have at least two different issues.
First, as for the tomcat down messages, it may simply be that CF is (or other cf requests are) bogged down, such that attempts by iis to pass in subsequent cf requests hang up.
As for the files requested via the share, note that the cf web server connector will indeed look in the cf wwwroot for a file if it's not found in the web server (iis or apache) web root. As for why it may not find them there, perhaps the share sometimes becomes inaccessible, such that cf is the victim here.
As for the first problem, you need to use some sort of diagnostic tool to see what's going on within cf. There are several such tools, from meager built-in ones like the command line cfstat tool, to the more powerful cf pmt (free and optionally installed for cf2018 and above), or the equally powerful fusionreactor (not free but which works with any cf version). There are still other jvm monitoring tools that CAN help, though they tend to not know anything about cf specifically.
I have a talk on all such monitoring options for cf (and lucee) available at carehart.org/presentations. I can also assist directly in finding and resolving the problem, using any of those tools (though perhaps none may need to be added). More on that at carehart.org/consulting.
And of course I and others may be able to offer more here, perhaps based on any reply you offer. I realize you may have hoped for a simple prescription to ease the pain, but I suspect more effort will be required.
Maybe you can answer one thing for me.
We are operating multiple CF instances that both have the same issue at the exact same time. Is there anything shared between multiple instances of CF on the same server? To my understanding, each instance is completely independent of the others.
They can share many things:
The fact that both instances hang up at the same time only further supports these two broad possibilities being where the issue is.
And again a cf monitoring or diagnostic capability could show EXACTLY what cf is hung up trying to do.
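As a rough illustration of what "seeing what cf is hung up on" can look like without a full monitoring tool, here is a minimal Java sketch that scans a saved thread dump and lists the threads whose stacks mention a given class or method. It assumes the dump uses the text layout produced by the JDK's jstack (a quoted thread name on the header line, then indented "at ..." frames). The class name ThreadDumpScan and the method threadsWithFrame are made up for this example and are not part of CF or of any monitoring product.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of ColdFusion or any monitoring tool.
class ThreadDumpScan {

    /** Returns the names of threads whose stack contains frameText (assumes jstack-style text). */
    static List<String> threadsWithFrame(String dump, String frameText) {
        List<String> hits = new ArrayList<>();
        String current = null;    // name of the thread whose stack is being read
        boolean matched = false;  // so each thread is listed at most once
        for (String line : dump.split("\\R")) {
            if (line.startsWith("\"")) {
                // Header line: the thread name is the text between the first pair of quotes.
                int end = line.indexOf('"', 1);
                current = end > 0 ? line.substring(1, end) : line;
                matched = false;
            } else if (current != null && !matched
                    && line.trim().startsWith("at ") && line.contains(frameText)) {
                hits.add(current);
                matched = true;
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: java ThreadDumpScan <dump-file> <frame-text>");
            return;
        }
        String dump = new String(Files.readAllBytes(Paths.get(args[0])));
        List<String> hits = threadsWithFrame(dump, args[1]);
        System.out.println(hits.size() + " thread(s) with a frame containing \"" + args[1] + "\":");
        hits.forEach(System.out::println);
    }
}
```

Taking a few dumps several seconds apart during a hang and comparing the results shows whether the same threads stay stuck in the same place. A real monitoring tool does this for you and adds the request details.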
Sorry I wasn't clear in my last response. I was referring more to the internal workings of ColdFusion specifically. Would there be any sharing of Tomcat resources, for example? I know that add-on services like Solr and PDF generation are shared, but I wanted to confirm that there are indeed no core shared resources between instances.
There are indeed none. Now that that's clarified, can you please reconsider the rest of what I said in my first two replies? You seem to be trying to find a different path. What's your thinking?
To be clear, I help people solve this very sort of problem every day. It's unfortunate that CF can seem to many to be a black box, with mysterious knobs and dials that seem to need to be tweaked. (I'm not saying that's what you're reflecting. I'm saying it's a common approach.) Instead, CF is a process like any other, which is more often than not influenced by forces outside of it.
I'll say again, the hangups you experienced will have some explanation, with any of many potential causes, and job 1 is to use diagnostic tools to find and understand that cause, and then it can be resolved. Any other approach could amount to guesswork. It doesn't need to be that way, which is why I'm trying to show you a different path, which has nearly always worked when followed.
The reason I was asking is that I am not sure ColdFusion is the problem. As it happens to an entire server and multiple instances at the same time, I was looking at higher-level things that could be affecting it, such as network traffic and OS-level issues. I do have diagnostic tools in place; however, they have not fully revealed the problem as of yet.
I have finally been able to get something out of our diagnostic tools, and it looks like CFM and CFC requests are getting hung up in java.io.WinNTFileSystem. That would point to an issue with our SMB share or with the communication with the file store server, though I am not sure which yet.
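One way to test that theory outside of CF is to time a plain file-metadata call against the share from the same server. On Windows, java.io.File calls such as exists() are served by java.io.WinNTFileSystem, so a slow or stuck share should show up here too. This is only a minimal sketch: the class name SharePathProbe, the method timeExists, the 2-second timeout and the 5-sample loop are all arbitrary choices for illustration, and the path to probe is whatever you pass on the command line.

```java
import java.io.File;
import java.time.Instant;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical probe for illustration only; names and timings are assumptions.
class SharePathProbe {

    /** Returns elapsed milliseconds for File.exists(), or -1 if it did not finish within timeoutMs. */
    static long timeExists(File target, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "share-probe");
            t.setDaemon(true); // a call stuck on the share must not keep the JVM alive
            return t;
        });
        Callable<Boolean> check = target::exists;
        long start = System.nanoTime();
        Future<Boolean> result = pool.submit(check);
        try {
            result.get(timeoutMs, TimeUnit.MILLISECONDS);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (TimeoutException e) {
            return -1; // the metadata call is still blocked
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Pass the UNC path or mapped path of a file on the share as the first argument.
        File target = new File(args.length > 0 ? args[0] : ".");
        for (int i = 0; i < 5; i++) {
            long ms = timeExists(target, 2000);
            System.out.println(Instant.now() + " " + target + " " + (ms < 0 ? "TIMEOUT" : ms + " ms"));
            Thread.sleep(1000);
        }
    }
}
```

Running it under the same account as the CF service, during both peak and off-peak periods, should indicate whether the delay is in reaching the share at all rather than in anything CF-specific.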
There you go. 🙂