We have browsed many threads on the subject of tuning the CF11 (and CF10) IIS connectors, in particular "ColdFusion 10 instance/Tomcat dying at predictable intervals (white screen of death)",
and we have applied the suggested changes to tune the connector:
worker.cfusion.connection_pool_timeout = 60
<Connector port="8014" protocol="AJP/1.3" redirectPort="8447" tomcatAuthentication="false" maxThreads="250" connectionTimeout="60000" />
The above has fixed the issues/failures we had under load (503 errors, white screens).
But now we are getting 502 errors in the isapi_redirect.log as follows:
[Mon Dec 01 18:35:53.717 2014] [20632:12484] [info] ajp_connection_tcp_get_message::jk_ajp_common.c (1308): (cfusion) can't receive the response header message from tomcat, tomcat (127.0.0.1:8014) has forced a connection close for socket 1088
[Mon Dec 01 18:35:53.717 2014] [20632:12484] [error] ajp_get_reply::jk_ajp_common.c (2260): (cfusion) Tomcat is down or network problems. Part of the response has already been sent to the client
[Mon Dec 01 18:35:53.717 2014] [20632:12484] [error] ajp_service::jk_ajp_common.c (2735): (cfusion) sending request to tomcat failed (unrecoverable), (attempt=1)
[Mon Dec 01 18:35:53.717 2014] [20632:12484] [error] HttpExtensionProc::jk_isapi_plugin.c (2612): service() failed with http error 502
These are popping up once every minute or so. But the site is running ok.
The log file is growing (obviously) and I would appreciate help on how to diagnose these errors (and whether these are worrying or not...)
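One way to see how often these really occur is to count the 502 lines per minute in isapi_redirect.log. Below is a minimal Python sketch, assuming the log lines look like the ones pasted above. The names parse_line and count_502_per_minute are made up for illustration and are not part of any connector tooling.

```python
import re
from collections import Counter
from datetime import datetime

# Month names as they appear in the log timestamps (assumed to be English).
MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Example line:
# [Mon Dec 01 18:35:53.717 2014] [20632:12484] [error] HttpExtensionProc::... http error 502
LINE_RE = re.compile(
    r"^\[\w{3} (\w{3}) (\d{1,2}) (\d{2}):(\d{2}):(\d{2})\.\d+ (\d{4})\] "
    r"\[\d+:\d+\] \[(\w+)\] (.*)$")


def parse_line(line):
    """Return (timestamp, level, message) or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    mon, day, hh, mm, ss, year, level, message = m.groups()
    month = MONTHS.get(mon)
    if month is None:
        return None
    when = datetime(int(year), month, int(day), int(hh), int(mm), int(ss))
    return when, level, message


def count_502_per_minute(lines):
    """Count '[error] ... http error 502' lines, grouped by minute."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, level, message = parsed
        if level == "error" and "http error 502" in message:
            counts[when.replace(second=0)] += 1
    return counts
```

Passing an open file handle for the log to count_502_per_minute() should work, since it only iterates over lines. Only the final "service() failed with http error 502" line of each group is counted, so each failed request is counted once.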
You do not mention manually updating the CF11 IIS connector, although doing so is mentioned in the other post you refer to. The CF11 update 2 IIS ISAPI connector file is 388 KB (397,312 bytes) and dated Monday, 1 September 2014.
I wonder whether AJP-APR (aka Tomcat Native) rather than AJP-BIO would be worth a try?
Please note that I have not come across that error myself, so the guidance on the connector DLL and AJP-APR is offered as is; it may or may not help.
This other thread may be of interest. If the large ISAPI log is a problem, methods of rolling the log are mentioned there.
We have the same issue (and error messages) on CF10 update 14. Adobe always say to ignore them, but we are not 100% sure whether it's affecting end users or not. Correlating the isapi log with IIS's web logs can be helpful; for example, we have found in some cases that these errors occur when visiting bots POST to .php and .asp pages (which do not exist). Can you see what happens in your logs at the times shown in the isapi log?
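For anyone wanting to script that correlation, here is a rough Python sketch. It assumes the IIS logs are in W3C extended format with a "#Fields:" header that includes date and time, and that IIS writes those times in UTC while the isapi log uses server local time (hence the utc_offset_hours parameter). The function names are hypothetical, purely for illustration.

```python
from datetime import datetime, timedelta


def parse_iis_log(lines):
    """Parse W3C-format IIS log lines into dicts keyed by the #Fields names."""
    fields = []
    entries = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # Column names for the data rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if "date" not in row or "time" not in row:
            continue
        row["when"] = datetime.strptime(
            row["date"] + " " + row["time"], "%Y-%m-%d %H:%M:%S")
        entries.append(row)
    return entries


def requests_near(entries, error_time, utc_offset_hours=0, window_seconds=2):
    """Return IIS entries within window_seconds of an isapi log timestamp.

    error_time is the local time from the isapi log; utc_offset_hours is the
    server's offset from UTC (an assumption the caller must supply).
    """
    target = error_time - timedelta(hours=utc_offset_hours)
    window = timedelta(seconds=window_seconds)
    return [e for e in entries if abs(e["when"] - target) <= window]
```

Looking at cs-method, cs-uri-stem and sc-status for the matching rows (if those fields are enabled in your IIS logging) should show whether the 502s line up with real user requests or with bot traffic.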
What is clear, to us at least, is that no amount of "tuning" will get rid of these particular 502 errors, as we've tried just about everything. The worker.cfusion.connection_pool_size=250 should really be 500, unless you are running multiple sites under high traffic, so Adobe tell us. We never got these errors on other connectors; the earlier connectors seemed much more stable, and also the error reporting is much more verbose in the later connectors. We cannot alter the log level at all for some reason - it just shows warn and info events, despite only "error" being enabled, for example. The people at Apache tell me that behaviour is wrong and doesn't occur in their connector.
Thanks for the info. I agree that this looks bad and being told "just ignore the errors" isn't very reassuring.
I've updated to 500 (from 250) but no difference. And correlating with IIS logs doesn't help either (I had tried...)
Thank you also for offering help. I have checked the connector and it is indeed 388 KB from 1 Sept 2014.
I took the actions on that thread (manually modified the two files and pasted the lines in my post) and I also applied log rotation (I was the one clicking "Like" on your comment there)
Now what you say about using AJP-APR instead of AJP-BIO is something I haven't read about anywhere and sounds interesting. How would I go about trying this? What are the advantages? Any pointers for how to do that?
The native Apache connectors for Windows are available here: Index of /dist/tomcat/tomcat-connectors/jk/binaries/windows
I've never tried the native connectors myself. If you do try, please let us know how you get on.
Regarding AJP APR.
If you open the CF10/CF11 coldfusion-error.log you will notice something like this:
INFO: The APR based Apache Tomcat Native library which allows optimal
performance in production environments was not found on the java.library.path etc
INFO: Initializing ProtocolHandler ["ajp-bio-8012"]
The following is cut and pasted from the Tomcat documentation, where you can read the full detail:
The Apache Tomcat Native Library provides portable API for features not found in contemporary JDK's. It uses Apache Portable Runtime (APR) as operating system abstraction layer and allows optimal performance in production environments.
How to change CF from BIO to APR?
-Download the Tomcat Native Windows binaries from http://tomcat.apache.org/download-native.cgi
-Extract tcnative-1.dll to the CF10/CF11 \cfusion\lib directory
-Note: for 64-bit, use the tcnative-1.dll in \bin\x64 of the download rather than the 32-bit one
-Restart the CF10/CF11 application service
As a result, coldfusion-error.log now says, for example:
INFO: Loaded APR based Apache Tomcat Native library 1.1.32.
INFO: Initializing ProtocolHandler ["ajp-apr-8012"]
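If you want to confirm which handler is in use without reading the log by eye, a small Python sketch like the following could do it. It assumes the "Initializing ProtocolHandler" lines look exactly like the ones above; ajp_handlers is a made-up helper name.

```python
import re

# Matches e.g.  INFO: Initializing ProtocolHandler ["ajp-bio-8012"]
HANDLER_RE = re.compile(r'Initializing ProtocolHandler \["ajp-(\w+)-(\d+)"\]')


def ajp_handlers(lines):
    """Return a list of (implementation, port) for each AJP handler found."""
    found = []
    for line in lines:
        m = HANDLER_RE.search(line)
        if m:
            found.append((m.group(1), int(m.group(2))))
    return found
```

After the restart, the last entry returned for your AJP port should read "apr" rather than "bio" if the DLL was picked up.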
Post again to say whether or not that helps with the 502 error message. It probably will not assist tribule's thread, since I have seen the same ISAPI log details with APR present.
I'll definitely look into this native/APR alternative, and at least experiment on our Dev server.
Regarding Production, I am more hesitant to jump to it, and unfortunately I only get these 502s in Prod under load.
Do you think this native/APR mechanism is going to be part of a subsequent update of CF11 (so an Adobe-stamped, supported mechanism)?
If not, any idea why they didn't use it? Any dangers/impacts?
Sorry for my ignorance of these connectors considerations...
It is best to try something new like APR on a dev environment before production. On dev you can use Apache JMeter to push some load against CF.
I don't think APR is officially supported by Adobe for CF. Strangely, the CF11 beta had APR present but the release did not. I have been using APR on CF10 since update 5 and also have it on production CF11 update 2.
Perhaps some monitoring of AJP-BIO or AJP-APR with the JDK (Java Development Kit) tools JConsole or JMC (Java Mission Control) might yield some interesting results. However, that does mean some work: adding JMX (Java Management Extensions) settings to the JVM arguments and watching the Catalina (aka Tomcat) threads.