Clustered CF Instances Hang, Difficult to Restart
Hello all. It seems we have run into another issue with our newly deployed ColdFusion 2021 servers. We experienced a similar error back with CF8 and with earlier update levels of CF10. The issue appeared to be resolved in later updates of CF10, but with CF2021 it seems to be back.
We have three physical bare-metal servers, each running four clustered instances of ColdFusion. All of these are behind a Fortinet load balancer. While the CF instances on each box are named the same (cfusion1 - cfusion4), each box's CF cluster is on a different port to eliminate multicast confusion between machines. We have changed channelSendOptions to 6 in the server.xml files in order to reduce the number of "Session Already Invalidated" error messages in the coldfusion-error.log files.
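For anyone checking our settings: as I read the Tomcat cluster documentation, channelSendOptions is a bit mask, so 6 should mean USE_ACK (2) plus SYNCHRONIZED_ACK (4), i.e. synchronous replication with acknowledgement. Here is a small Python sketch of that decoding. The flag values are taken from Tomcat's org.apache.catalina.tribes.Channel constants, and I am assuming they are unchanged in the Tomcat bundled with CF2021:

```python
# Flag values as documented for org.apache.catalina.tribes.Channel
# (assumption: unchanged in the Tomcat version bundled with CF2021).
SEND_OPTIONS = {
    0x01: "BYTE_MESSAGE",
    0x02: "USE_ACK",
    0x04: "SYNCHRONIZED_ACK",
    0x08: "ASYNCHRONOUS",
    0x10: "SECURE",
    0x20: "UDP",
    0x40: "MULTICAST",
}


def decode_send_options(value):
    """Return the names of the flags set in a channelSendOptions value."""
    return [name for bit, name in sorted(SEND_OPTIONS.items()) if value & bit]


if __name__ == "__main__":
    print(decode_send_options(6))  # our setting
    print(decode_send_options(8))  # Tomcat's documented default
```

If that reading is right, each replication message waits for its peers to acknowledge it, which may be relevant to an instance that stalls while joining the cluster under load.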
While we don't have much of a problem restarting instances when the server has been removed from the load balancer, we do have difficulty restarting instances even under moderate load. The CF instance appears to start fine from the command line, but it never starts taking traffic, and the CFIDE/administrator for that instance will not load. When we try to stop the hung instance, we get an error:
[root@Node1 ~]# /opt/ColdFusion2021/cfusion1/bin/coldfusion start
Starting ColdFusion 2021 server ...
======================================================================
ColdFusion 2021 server has been started.
ColdFusion 2021 will write logs to /opt/ColdFusion2021/cfusion1/bin/../logs/coldfusion-out.log
======================================================================
[root@Node1 ~]# /opt/ColdFusion2021/cfusion1/bin/coldfusion stop
Stopping ColdFusion 2021 server, please wait
Jul 22, 2021 10:37:43 PM com.adobe.coldfusion.launcher.Launcher stopServer
SEVERE: Shutdown Port 8007is not active. Stop the server only after it is started.
ColdFusion 2021 server has been stopped
[root@Node1 ~]# /opt/ColdFusion2021/cfusion1/bin/coldfusion start
Starting ColdFusion 2021 server ...
======================================================================
ColdFusion 2021 server has been started.
ColdFusion 2021 will write logs to /opt/ColdFusion2021/cfusion1/bin/../logs/coldfusion-out.log
======================================================================
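The SEVERE line suggests the shutdown port (8007) never opened after the first start. Below is a minimal Python sketch of how that could be checked right after a start. It assumes the port is bound on 127.0.0.1, which may not match the bind address in our server.xml, and the helper name port_is_open is just illustrative:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8007 is the shutdown port named in the error above.
    # Add the other instances' ports here; they are not shown in this post.
    checks = [("cfusion1 shutdown port", 8007)]
    for label, port in checks:
        state = "listening" if port_is_open("127.0.0.1", port) else "not listening"
        print(f"{label} ({port}): {state}")
```

Running this a minute or so after "coldfusion start" should at least show whether the launcher got far enough to open its ports, or stalled before that point.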
There doesn't appear to be any useful information in the coldfusion-error.log, or in the logs of its peers.
Do I need to make any adjustments to the Tomcat cluster timeouts, perhaps? Am I missing some other best practice for clustering CF instances? Any suggestions on how to troubleshoot this further?
Thanks for any advice,
-Tony
