The ColdFusion Application Server service can sometimes just freeze. It does not fail, it just stops working.
Restarting the service fixes this. It can occur when a server reboots after overnight patching, or it can just stop in the middle of the day. There are often several days between occurrences.
(1) Does anyone know why this happens?
(2) I'm thinking of making the service Automatic (Delayed Start) instead of Automatic (in case it was waiting for other services to start in the reboot cases), BUT are there other Automatic-start services that depend on this service?
If this does not resolve the issue, we will run a regular job to just restart this service each day.
Hi Paul,
Are there any errors or warnings in the coldfusion-out or coldfusion-error logs when it is in the not-responding state?
Perhaps include details of the update level, whether you have upgraded the Java that CF runs on, and the operating system environment.
Regards, Carl.
Hi Carl,
This is a brand new 2023 install (ColdFusion 2023 Standard – version 2023.0.0.330468) on a Windows 2019 server (Version 10.0.17763 Build 17763) with Java version "17.0.6" 2023-01-17 LTS.
There are no details in the error logs, and event viewer shows no issues.
Thanks
Paul
"no errors in logs" meant the Windows logs.
Hi Paul,
From logs this is likely the issue: GC overhead limit exceeded.
What settings do you have for minimum and maximum heap? What garbage collector is in use? Both can be found under CF Admin > Server Settings > Java and JVM: the memory size settings (which determine the amount of memory that the JVM can use for programs and data) and the JVM Arguments section.
Best, Carl.
A few thoughts that I hope might help:
1) Fwiw, the GC overhead msg is more a warning than an error (despite cf/the jvm showing it with the error logging category). It's saying that Gc's are running often but the heap used value is not falling much. To be clear, it's not the same as an outofmemory error, though it is a harbinger that one could be coming in time. That would be in the coldfusion-error.log, but you don't show any.
Even so, could there be a memory (heap) problem? Sure, and Carl's right to ask what your cf heap max size is. Since you say you installed the new cf on a new machine, when you brought over any cf admin settings, you may have left this at the default of 1024mb. If your old cf was larger, see if changing that makes a difference.
2) All that said, I have to question your asserting that CF is freezing. Each of your logs shows that you are bringing down CF. (If it was frozen, you would find it would not stop, and you'd have to kill the process. There would be NOTHING in the log about that sort of kill. You'd only see new lines when CF was starting.)
So by freezing do you mean no CF pages will run? Does that include CF admin pages, and are you 100% sure of that? You could have problems where only pages coming in via IIS or Apache are failing to run, but not the CF admin (which runs on its own web server).
Or it may not even be ALL pages, but only those of some app, or those using some db, etc.
3) Most important, when it comes to knowing WHETHER many requests are running slow, and WHY, that's where some cf monitor is vital, whether the cf pmt or fusionreactor (jvm tools can help but they don't tend to show urls of running requests but instead focus on low level jvm metrics).
4) Finally, as for your question about delayed autostart, that's a wholly different matter. If you find on box reboots that Cf is not coming up, using the delay MAY help, but it only delays it by less than a minute. It would be great if we might be able to tell it to start a few mins after the reboot. You COULD try to make it "dependent on" another service, but that's an iffy proposition.
Instead, the focus might be better put on either why CF is slow to come up (there can be configuration matters, and a cf monitor can help with that as well) or why the box is slow to restart (different possible explanations).
5) I appreciate that challenges like this can be frustrating, and even mysterious. But I help people solve them daily. You need not just put up with it. Diagnostics should be able to help you find resolution, and if you learn more, let us know. Or I can help directly, if interested. It's possible we might find and resolve this problem in less than an hour together.
But sure, do consider first and foremost whether it may be a heap size problem. (I'll say that normally if the heap fills and you get OOM errors, then the CPU for CF tends to go to 100%, as major GCs happen over and over but don't recover much garbage. And in that case you'll find you can't stop CF, as I noted above.) But let's see what you find.
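To make the "GCs running often but not recovering much" idea concrete, here is a minimal Java sketch that estimates how much of a JVM's lifetime has gone to garbage collection, using the standard `java.lang.management` beans. The class and method names (`GcOverhead`, `percent`, `totalGcMillis`) are made up for illustration. Note the assumptions: it reports on the JVM it runs in (not on ColdFusion's, unless the same calls are made from inside CF or over JMX), and it averages over the whole uptime, whereas the JVM's own overhead check looks at recent collections. HotSpot's parallel collector documents that check as tripping when roughly 98% of time goes to GC while less than 2% of the heap is recovered.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {
    /** Percentage of elapsed time spent in GC; 0 when no time has elapsed. */
    public static double percent(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / elapsedMillis;
    }

    /** Accumulated collection time across all collectors in this JVM, in ms. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead since JVM start: %.2f%%%n",
                percent(totalGcMillis(), uptime));
    }
}
```

A figure that stays in the low single digits is usually unremarkable; one that climbs steadily while used heap barely drops is the pattern described above. A monitor such as the CF PMT or FusionReactor shows the same thing over time without any code.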
OK - taking it one step at a time. What should I increase the JVM heap size to? The server has 8 GB RAM and this is the only application running on it (apart from Windows Server 2019).
Minimum JVM Heap Size (in MB) 256 Maximum JVM Heap Size (in MB) 1024
If there were only one appropriate answer, you wouldn't have fields to fill in. We'd all have to know a lot more about your application. Without that information, all I can do is guess. My guess is that you could increase the max heap size to 2048 and your min heap size to 1024. We're kind of limited by only having 8 GB RAM installed. The min heap size, in theory, should reduce the number of requests from the VM to allocate more memory for itself. Ideally, this is what you'd use instrumentation (like @Charlie Arehart 's old company makes, FusionReactor) to find out.
As for delayed autostart, the only time I've run into a need for that is when one service depends on another. For example, CF may depend on your database service. There's a DependsOnService key you can add to the registry, just search for "registry dependsonservice" for more info.
Dave Watts, Eidolon LLC
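For anyone wondering what those two admin fields translate to: the minimum and maximum heap sizes correspond to the standard JVM flags `-Xms` and `-Xmx`. As a sketch of the guess above (1024 MB minimum, 2048 MB maximum), and assuming the usual arrangement where ColdFusion keeps its JVM arguments on the `java.args` line of its `jvm.config` file, the relevant part would look something like this. Only the two heap flags are shown; the rest of the line is left out and should stay as it is.

```
-Xms1024m -Xmx2048m
```

Changing the values through the CF Administrator is the safer route, and either way CF needs a restart before the new sizes take effect.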
Hi Paul,
As Dave said, set the sizes bigger. The values he mentioned are a good start.
Can the values be set to match your load better? Probably. Don't want to pay for good tools like the FusionReactor mentioned? Then use some free tools: enable JMX within the JVM and use JMC. Perhaps JVM or GC file logging will get you the information you need for CF stability.
As noted, free tools like JMC, JConsole, etc. will not display issues down to the CFM or CFC called, but they will give some good clues as to JVM stress points.
You don't mention the GC (garbage collector) in use. That has significant influence on memory.
Cheers, Carl.
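As a rough sketch of what "enable JMX" and "GC file logging" can look like on Java 17, these are standard JVM flags that would be appended to the existing JVM arguments. The port number (9010) and the log file name (gc.log) are placeholders chosen for illustration, not values from this thread, and turning off authentication and SSL is only reasonable on a box where that port is not reachable by anyone else.

```
# Assumption: added to the existing JVM arguments, not replacing them.
# Port 9010 and gc.log are illustrative placeholders.
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=9010
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Xlog:gc*:file=gc.log:time,uptime:filecount=5,filesize=10m
```

With JMX on, JMC or JConsole can connect to that port and chart heap use and GC activity; the `-Xlog` line writes rotating GC logs that can be read after the fact.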
Some clarifications based on the responses so far from each of the 3 of you:
Finally, Paul, if you "just need this solved", I'd offered yet another option in the form of direct consulting help, as indeed the other guys may as well.
Thanks for all the information. I will start by increasing the JVM heap.
For the other question, I am not a Java person, so cannot find the type of Garbage Collector in use for the JVM. But everything is the default for the ColdFusion 2023 install on a Windows 2019 server. The java version is 17.0.6 2023-01-17 LTS.
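For readers in the same position, one way to see which garbage collector a JVM is actually using is to ask the JVM itself. The following is a minimal Java sketch using the standard `java.lang.management` API; the class and method names (`JvmInfo`, `collectorNames`, `maxHeapMb`) are invented for illustration. It reports on the JVM it runs in, so to learn about ColdFusion's JVM the same calls would need to run inside CF, or the same beans be read over JMX with a tool like JConsole.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class JvmInfo {
    /** Names of the collectors active in the JVM this code runs in. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Maximum heap this JVM will attempt to use, in MB (driven by -Xmx). */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMb());
        System.out.println("Collectors:    " + collectorNames());
        // The flags the JVM was started with, including any -XX:+Use...GC choice.
        System.out.println("JVM arguments: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

The collector names identify the GC: names beginning with "G1" indicate G1, while "PS Scavenge" and "PS MarkSweep" indicate the parallel collector. A `-XX:+Use...GC` entry in the JVM arguments (also visible in the CF Administrator's JVM Arguments field) answers the same question without any code.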
For information - it looks like doubling the JVM heap size has resolved these issues.
Before
Minimum JVM Heap Size (in MB) : 256
Maximum JVM Heap Size (in MB) : 1024
After
Minimum JVM Heap Size (in MB) : 512
Maximum JVM Heap Size (in MB) : 2048
Thanks everyone for all your help.
Paul
Good to hear it's resolved. I'm curious: had you been able to look at your previous cf install, to see what THAT max heap value had been? If it had been the same 2gb, that would help explain why you'd need to have changed this new install from the 1gb default.
That was the first point in each of my replies, though granted I'd offered still more given other info you'd shared and questions you'd asked. In any case, good that just doubling it (as Dave proposed) proved sufficient in your case.
The old server is now gone so I cannot check it. But when I set it up originally I would not have changed the JVM heap size defaults (since at the time I did not know what they did).
The old server was OK for over a year before these issues started to appear. It was set up in 2021 with CF2018 as a result of a site migration. The CF2023 upgrade had issues from the start. Could that be from how some users were using the applications, e.g. bigger searches? That would have impacted the CF2018 setup at the end and CF2023 from the start. I just don't know.
Thanks
Paul
Yes, Paul, very likely. Or unexpected automated traffic, etc.
Hey Paul,
Like the other readers, I'm glad to hear the increase has helped.
There is likely more you can do, since this is the Java Virtual Machine. It can be worth a little more effort to enable some free tooling (JMX or GC logging), then check for resourcing matters (with JMC or JConsole, or by reading the log file) to see if there is a stress point that can be tuned.
Perhaps you're just pleased with it working; time is money and you've got better things to do.
Cheers, Carl.
All ideas are welcome. At this time I am happy it is working. Currently I have a year's backlog of things I should do on the ColdFusion apps (plus all the other apps I support), but I have added it to our backlog to look at if we (ever) get time. 🙂
Thanks again.