Coldfusion Memory Maxing
Copy link to clipboard
Copied
Good Afternoon Everyone,
I'm really hoping someone can help me out here, because we are all stumped.
Lets start with the basics, we are currently running Coldfusion 8.0.1.195765
Our Java version is 1.6.0.10
The server is a Linux box, 8gig ram, with a dual core xeon processor running at 3.00GHz each.
We are running 2 instances of Coldfusion on our server, but one instance is an SSL instance with low traffic. The main instances, set up with memory settings in the table below, has approximately 25 sites. Traffic for the sites is not huge (at the time of the information capture there
Heap Memory Usage - Committed1230 MB
Heap Memory Usage - Initial0.00 MB
Heap Memory Usage - Max1600
MB Heap Memory Usage - Used494 MB
JVM - Free Memory736 MB
JVM - Max Memory1600 MB
JVM - Total Memory1230 MB
JVM - Used Memory864 MB
Memory Pool - Code Cache - Used14.5 MB
Memory Pool - PS Eden Space - Used141 MB
Memory Pool - PS Old Gen - Used353 MB
Memory Pool - PS Perm Gen - Used114 MB
Memory Pool - PS Survivor Space - Used0.00 MB
Non-Heap Memory Usage - Committed133 MB
Non-Heap Memory Usage - Initial23.2 MB
Non-Heap Memory Usage - Max560 MB
Non-Heap Memory Usage - Used128 MB
I've also taken a screen shot of what our Coldfusion monitor is showing us for the main instance
http://www.miramar.ca/ScreenCap2.jpg
Currently at hte time of the screen shot, there were about 1900 active sessions. Now as a bit of a background on that, i've got a session manager that sets the session timeout for any search bots to about 5 seconds. even then, the max session timeframe for most sites on the server is about 1hour. As a further measure, every morning between 4 - 5AM, all sessions are deleted from the instance, so that the session count is brought back down to 0. Most of the sites also store most of there CFC's in application scope, to try to reduce the size of active sessions (unfortunately there are still a few legacy sites which use big session files).
Now are you can see from the screen shot, we see constantly a 300 - 500MB spike in memory (slowly rises and then drops i'm assuming when CF does memory recycling) unfortunately, the memory slowly climbs, until eventually the JRun instance spikes at 100% CPU, and max memory, and requires use to restart the instance (usually about every 3 - 4 days).
As an 2nd wierd part, our second instance of that server, which is an SSL layer, only has 2 active sites running, with very minimal traffic (any given time we see 5 - 6 visitors) and yet we see the same spikes in memory in the JRun for that instance.
if ANYONE can give us any ideas on what to try it would be greatly appreciated!!!
Thank you in advance,
Richard L
Programmer
Miramar Design Studio Inc.
Copy link to clipboard
Copied
Hello Richard,
The CF Monitor graphic is showing the Heap Memory usage that being New or Young Generation and Tenure or Old Generation together as such you are not seeing Permanent Generation. I think It looks like a Major or Full Garbage Collection is occurring every one to two minutes (large saw tooth). Over the top of the large saw tooth are little saw tooths which would be Minor GC’s. Not having PermGen in the picture means you could be missing clues as to the regular Full GC and the memory climbing then being CPU bound (I guess not shown in the graphic). Of interest is the CF Mon help which states in part “Memory usage information displayed in the Server Monitor is estimated and may vary from the actual memory usage” so you may not be seeing all the details you need to rectify the matter.
To get a better idea of how the memory buffers for New Old and Perm Gen are going along with some details on the GC’s it maybe handy to enable JVM logging. Reply if your now sure how to enable JVM logging or no one else has another idea to try. JVM logging will not in itself fix the problem however will likely show where the problem is and provide some idea of what to change to rectify problem.
Perhaps of interest would be your JVM args in Jrun/bin/jvm.config?
HTH, Carl.
Copy link to clipboard
Copied
Hi Carl,
Thanks for the help. We are going to look in to a few things you suggested, and here is a copy of the java arguments.
Hope you have a few ideas.
# Arguments to VM
java.args=-server -Xmx1800m -Dsun.io.useCanonCaches=false -XX:MaxPermSize=512m -XX:+UseParallelGC -Xbatch -Djava.awt.headless=true -Dcoldfusion.rootDir={application.home}/ -Djava.security.policy={application.home}/servers/cfusion/cfusion-ear/cfusion-war/WEB-INF/cfusion/lib/coldfusion.policy -Djava.security.auth.policy={application.home}/servers/cfusion/cfusion-ear/cfusion-war/WEB-INF/cfusion/lib/neo_jaas.policy
Thank you very much,
Richard L
Copy link to clipboard
Copied
Hi Richard,
notes on JVM logging:
Backup your JVM.CONFIG
Edit JVM.CONFIG to include these details -
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc -Xloggc:cfjvmGC.log
Eg
# Arguments to VM
java.args=-server -Xmx1800m -Dsun.io.useCanonCaches=false -XX:MaxPermSize=512m -XX:+UseParallelGC –Xbatch -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc -Xloggc:cfjvmGC.log -Djava.awt.headless=true -Dcoldfusion.rootDir={application.home}/ -Djava.security.policy={application.home}/servers/cfusion/cfusion-ear /cfusion-war/WEB-INF/cfusion/lib/coldfusion.policy -Djava.security.auth.policy={application.home}/servers/cfusion/cfusio n-ear/cfusion-war/WEB-INF/cfusion/lib/neo_jaas.policy
Restart CF service (daemon / process).
Some details from the log file generated will be of interest particularly leading up to the server fail in 4 -5 days time.
I suspect part of the problem might be what appears (from the CF Mon graphic so vague guessing here) to be a small size for NewGen that being too many java objects are being tenured to OldGen which subsequently need to be Full GC’ed rather than dying in a Minor GC pass.
Of interest a couple months back I did a talk about such matters at CFMEETUP (http://www.meetup.com/coldfusionmeetup/) site:
http://experts.adobeconnect.com/p55663036/
Something to note on your Args to VM, do not have any minimum or initial settings. Sometimes the minimum details are important as the maximum. So keep in mind you may want -Xms1024 and -XX:PermSize=256m tho the values might need touching a bit this way or that.
HTH, Carl.
Copy link to clipboard
Copied
This is a bit odd from the JVM details listed early on:
Memory Pool - PS Eden Space - Used141 MB
Memory Pool - PS Survivor Space - Used0.00 MB
NewGen is made of Eden Space and two Survivor Spaces. Strange how one SS is listed and it is 0Mb? Tho maybe that matches up with what I said about wild guess after looking at CFMon memory diagram and suspect a NewGen issue.
Copy link to clipboard
Copied
If I read your screenshot right, you have Memory Monitoring enabled. That iby itself increases the overhead sufficiently to make your instance crash eventually because it keeps some references preventing objects from being garbage collected. I would recommend disabling that and then forcing an explicit restart of the application son your server. You can still monitor overall memory consumption without Memory Monitoring.
Copy link to clipboard
Copied
Hello Jochem,
You are right, we did have Memory Monitoring enabled, but normally we leave it turned off. The problem was happening with and without memory management enabled. we monitor our memory consumption in a few different ways using various coldfusion code and 3rd party apps, which we only activate when we start running into issues. We've gone through enabling and dissabling all these and the issues remain so we've eliminated these.
Richard
Copy link to clipboard
Copied
Hi,
Ditto for turning Monitor off when not actively using it.
Same goes for once JVM logging has either helped or not, turn it off so it is not an overhead or accidental file system filler creating a big log file.
Copy link to clipboard
Copied
So as an update, i turned off all the monitoring so that we could get a shot of what goes on without it a little better, right now it's low traffic time for our server (8:15pm) there are currently about 370 active sessions on the server, it's been pretty stead at that level for about an hour now, and watching the memory monitor shows the same spikes (although climbing over an extra 2 minutes now) but the unfortunate part is that the Total Memory being isn't being released it's staying at about 1,160mb and keeps slowly climbing as well, this afternoon about 2 it was about 1,060 with 2000 sessions, and now it's 1,160 with 370 sessions, and the memory used is showing the same drops in memory about 300mb per spike.
Luckly with the lower sessions the used memory is a little lower (only going as high as about 600mb).
Copy link to clipboard
Copied
Hi Richard,
What you are noticing there with the memory is interesting details and is good to keep readers informed as to what you are sensing is happening. The forum thread can get ugly to read with long log details pasted in. If you have JVM logging on do you want to Private Message me what is currently in cfjvmGC.log? For interested readers and yourself I could paste in to the thread what I consider are the salient details.
HTH, Carl.
Copy link to clipboard
Copied
Hi Richard,
Something I missed that could help. Seeing the CF has been maxing out - some details in the Jrun\logs\coldfusion-out.log (might be named different and in a different place on Li/U nix) could show what the problem is and provide clue what parameter to adjust. Look in relevant log file at the time and just preceding CF maxing out for errors that are not normally showing up in other time periods of the log file.
HTH, Carl.
Copy link to clipboard
Copied
Good Morning,
I was able to find the event lots for the instance of coldfusion, and here are some pieces i've found.
I see a lot of this:
03/03 18:15:09 error Cannot create cookie: expires = Sat, 23-Feb-2041 23:15:09 GMT
03/03 18:15:09 error Cannot create cookie: path = /
03/03 18:15:09 error Cannot create cookie: expires = Sat, 23-Feb-2041 23:15:09 GMT
03/03 18:15:09 error Cannot create cookie: path = /
03/03 20:22:17 error Cannot create cookie: expires = Sun, 24-Feb-2041 01:22:17 GMT
03/03 20:22:17 error Cannot create cookie: path = /
03/03 20:22:17 error Cannot create cookie: expires = Sun, 24-Feb-2041 01:22:17 GMT
03/03 20:22:17 error Cannot create cookie: path = /
and also a few of Broke pipe messages
02/03 15:23:54 error Error in getRealPathFromConn
java.net.SocketException: Broken pipe
But the last entry before the last server lock up was almost 9 hours before.
Copy link to clipboard
Copied
Hi,
I do not know what the cookie or broken pipe means perhaps others do?
Was expecting to see some "java.lang.OutOfMemoryError" messages in there. Perhaps I am not on the right track with the maxing out problem. One can only be wrong and all we have done is wasted some time then if not on right track to resolve you can discount JVM as being issue or cause. Still I think the JVM log output will be interesting whether right or wrong on attributing to CF maxing out.
Regards, Carl.
Copy link to clipboard
Copied
Hi Carl,
My Networker just got around today to trying to insert that extra bit of java logging you mentioned earlier in this stream, and when he entered it, the instance of coldfusion wouldn't start up, kept giving the following error
Exception in thread "main" java.lang.NoClassDefFoundError: /Xbatch
Caused by: java.lang.ClassNotFoundException: .Xbatch
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
root@prod-www3:~# Exception in thread "main" java.lang.NoClassDefFoundError: /Xbatch
Caused by: java.lang.ClassNotFoundException: .Xbatch
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
Copy link to clipboard
Copied
Hi Richard,
Got to be careful pasting things to and from this form. I think there is a funny minus sign in there and spaces
#see the funny - infront of Xbatch and space in "cfusion/cfusio n-ear/" "cfusion-ear /cfusion-war/"
java.args=-server -Xmx1800m -Dsun.io.useCanonCaches=false -XX:MaxPermSize=512m -XX:+UseParallelGC –Xbatch -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc -Xloggc:cfjvmGC.log -Djava.awt.headless=true -Dcoldfusion.rootDir={application.home}/ -Djava.security.policy={application.home}/servers/cfusion/cfusion-ear /cfusion-war/WEB-INF/cfusion/lib/coldfusion.policy -Djava.security.auth.policy={application.home}/servers/cfusion/cfusio n-ear/cfusion-war/WEB-INF/cfusion/lib/neo_jaas.policy
#edited tho user beware
java.args=-server -Xmx1800m -Dsun.io.useCanonCaches=false -XX:MaxPermSize=512m -XX:+UseParallelGC -Xbatch -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc -Xloggc:cfjvmGC.log -Djava.awt.headless=true -Dcoldfusion.rootDir={application.home}/ -Djava.security.policy={application.home}/servers/cfusion/cfusion-ear/cfusion-war/WEB-INF/cfusion/lib/coldfusion.policy -Djava.security.auth.policy={application.home}/servers/cfusion/cfusion-ear/cfusion-war/WEB-INF/cfusion/lib/neo_jaas.policy
That is one of the reasons why I just gave add the log bit separately as well as for example and said to backup.
HTH, Carl.
PS tho I edit the form there are still spaces appearing when I post it. Suggest use your working JVM args and just paste in or type in the logging additions:
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc -Xloggc:cfjvmGC.log
Copy link to clipboard
Copied
As another update,
We tweeked a few settings in our administrator with regards to mail spooling, and cleaned out some of the larger log files in coldfusion, and did a fresh reboot of our server with monitoring left off this time, and we are getting the following in our graphs.
http://www.miramar.ca/ScreenCap3.jpg
Far more smaller spikes along with one larger spike every say 20 minutes this time.
Also at the time of this screen shot i have the following memory profile.
Heap Memory Usage - Committed250 MB
Heap Memory Usage - Initial0.00 MB
Heap Memory Usage - Max1600 MB
Heap Memory Usage - Used170 MB
JVM - Free Memory80.1 MB
JVM - Max Memory1600 MB
JVM - Total Memory250 MB
JVM - Used Memory1520 MB
Memory Pool - Code Cache - Used10.2 MB
Memory Pool - PS Eden Space - Used39.3 MB
Memory Pool - PS Old Gen - Used130
MB Memory Pool - PS Perm Gen - Used116 MB
Memory Pool - PS Survivor Space - Used0.00 MB
Non-Heap Memory Usage - Committed130 MB
Non-Heap Memory Usage - Initial23.2 MB
Non-Heap Memory Usage - Max560 MB
Non-Heap Memory Usage - Used126 MB
Carl, i passed on the suggestion you made about the jvm scripts and he's going to attempt it tonight again and see if he can get it working again.
Hope this helps a bit.
Richard
Copy link to clipboard
Copied
Hi,
>Far more smaller spikes along with one larger spike every say 20 minutes this time
That is totally different JVM behaviour from first capture. Now Full GC every 20 minutes instead of every 1 -2 minutes. Part of Full GC is a Java stop all processing so I imagine your application is running a lot better with Java not doing a part stop all every minute or two.
The smaller spikes are minor GC’s. Lots of Minor GC per Full GC is normal behaviour.
Would like the light blue (total memory) a bit higher and not giving up memory once it has taken it. Sometime an initial setting for heap would be good eg Xms1024m (value might need altering). If you can imagine the Java talking to the hardware layer getting some memory then giving it back and a few minutes latter getting it again has got to have a performance hit as Java has to make all these decisions.
>Memory Pool - PS Survivor Space - Used0.00 MB
The Survivor Space is still reporting a bit odd there I would expect it to use some memory.
HTH, Carl.
Copy link to clipboard
Copied
Hi Richard,
I have installed that CFM JVM Memory Monitor on one of my CF production systems. Interesting I get values other than 0Mb for Survivor Space. EG:
Heap Memory Usage - Committed 510 MB
Heap Memory Usage - Initial 512 MB
Heap Memory Usage - Max 986 MB
Heap Memory Usage - Used 226 MB
JVM - Free Memory 284 MB
JVM - Max Memory 986 MB
JVM - Total Memory 510 MB
JVM - Used Memory 702 MB
Memory Pool - Code Cache - Used 8.40 MB
Memory Pool - PS Eden Space - Used 15.9 MB
Memory Pool - PS Old Gen - Used 207 MB
Memory Pool - PS Perm Gen - Used 158 MB
Memory Pool - PS Survivor Space - Used 2.94 MB
Non-Heap Memory Usage - Committed 264 MB
Non-Heap Memory Usage - Initial 194 MB
Non-Heap Memory Usage - Max 304 MB
Non-Heap Memory Usage - Used 167 MB
HTH, Carl.

