We recently launched a website that has caused our CF 8.x server to
continuously run out of memory. Under load testing in development and
in production, JRun's memory profile just keeps rising and rarely
seems to release any memory before eventually hitting its max heap
size (as defined in CFadmin). This is happening on the production
server without much traffic on the site, apart from the Yahoo, MSN
and Google bots, which are indexing the site's content fairly
aggressively.
Originally we thought we had one of the usual problems: a coding
mistake causing an infinite loop, too much data being loaded into
session scope, general site application errors, or the site spawning
too many sessions (e.g., from search engines trawling all site pages
and links). The site is an online museum collections database, and
there are literally thousands of links throughout the application, as
users drill down into the site and browse the collection by various
topical trees.
We did:
* An intensive code review and subsequent fixes (the CF log
files show basically no application errors now)
* Logged pages that were taking a lot of time, and optimized the
code and SQL business logic accordingly.
* Added a robots.txt, a site XML file and a special content
indexing .cfm page for search engines, and added index/no-follow
directives to site pages to keep bots out of the website except on
pages we wanted them to index.
* Optimized the session management on the site, to minimize
the memory footprints of user sessions, and to also eliminate the
possibility of search engine bots causing CF to set a new session
on every request.
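To make the last bullet concrete, the bot handling amounts to something like the sketch below. It is plain Java rather than our actual CFML, and the class name, the user-agent substrings and the timeout values are all illustrative assumptions, not what is deployed.

```java
import java.util.Locale;

// Illustrative sketch only: names, markers and timeouts are assumptions.
final class BotSessionSketch {
    // Hypothetical substrings to look for in the User-Agent header.
    private static final String[] BOT_MARKERS = {
        "googlebot", "slurp", "msnbot", "bot", "crawler", "spider"
    };

    // Returns true when the user agent is missing or contains a bot marker.
    static boolean looksLikeBot(String userAgent) {
        if (userAgent == null || userAgent.trim().isEmpty()) {
            return true; // assumption: treat a missing user agent as a bot
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String marker : BOT_MARKERS) {
            if (ua.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    // Bots get a very short session so repeated requests do not pile up
    // long-lived sessions; regular visitors get a normal timeout.
    static int sessionTimeoutSeconds(String userAgent) {
        return looksLikeBot(userAgent) ? 2 : 1200;
    }

    public static void main(String[] args) {
        String ua = "msnbot/1.1 (+http://search.msn.com/msnbot.htm)";
        System.out.println(looksLikeBot(ua) + " " + sessionTimeoutSeconds(ua));
    }
}
```

The idea is only that a crawler hitting thousands of links should not leave thousands of full-length sessions behind.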
Still no luck.
We have two remaining things we are looking at:
* Further SQL query optimization: some of the queries can
return a few thousand records (displayed via a typical web paging
navigation system, i.e., next/previous N records). We're looking at
using SQL 2005's record paging functionality to further reduce the
amount of data that gets loaded into memory on each request
(although one would think CF would eventually do garbage collection
to release this, no? Especially if you're not caching the queries
explicitly, and the number of cached queries in CFADMIN is set to a
low number?)
* fileexists(): we're using CF's fileexists() function in
several places in the application to detect whether an artifact
image exists and, if not, to display a placeholder image and/or
generate one on the fly (for each artifact in the database there
can be up to five different images of various sizes: the
application auto-generates some of the image versions, with the
client only uploading the primary artifact image when they add new
artifacts to the database). Even though we've created separate
directories for the various image types, this still means that CF
has to run the fileexists() function on folders that have
thousands of artifact image files in them. I'm wondering if this
could be causing some of our memory problems? Does the fileexists()
function basically do recursion on the directory that it scans, and
could this be causing server issues?
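For the paging idea in the first bullet, this is roughly what we have in mind: a minimal sketch in plain Java that works out the row range for a page and pairs it with a SQL 2005 ROW_NUMBER() query. The table and column names (artifacts, artifact_id, title) and the class name are hypothetical placeholders, not our real schema.

```java
// Illustrative sketch only: table, column and class names are assumptions.
final class PagingSketch {
    // First row (1-based) of the requested page.
    static int firstRow(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        return (page - 1) * pageSize + 1;
    }

    // Last row (1-based, inclusive) of the requested page.
    static int lastRow(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        return page * pageSize;
    }

    // Parameterized query: the two ? placeholders take firstRow and lastRow,
    // so only one page of records is returned to the application.
    static String pagedQuery() {
        return "SELECT artifact_id, title FROM ("
             + " SELECT artifact_id, title,"
             + " ROW_NUMBER() OVER (ORDER BY title, artifact_id) AS row_num"
             + " FROM artifacts"
             + ") AS numbered"
             + " WHERE row_num BETWEEN ? AND ?"
             + " ORDER BY row_num";
    }

    public static void main(String[] args) {
        System.out.println(firstRow(3, 25) + " to " + lastRow(3, 25));
        System.out.println(pagedQuery());
    }
}
```

With this approach each request would pull back one page of rows instead of a few thousand, whatever CF does or doesn't do about garbage collecting the larger result sets.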
Also, the server was completely stable before we published
this new site to it. All other sites on it are developed by us, so
there isn't any third party code to worry about. Testing in our
development/staging environment generates identical problems
(running a custom search bot on it, using Microsoft's stress
testing tool, other...). I've also noticed on our staging server
that recursion and memory/stack overflow errors are occasionally
returned to browsers as developers are working on their projects
and testing. I haven't seen that specific error in a browser on
production, but that could just be a reflection of the number of
times during the course of a day we're looking at staging versus
production. There are of course a lot of other sites on the
dev/staging box, so it could be unrelated.
Server/App Specs:
* Dual Quadcore Dell Servers
* 4 gigs RAM in server
* Mirrored RAID (Ultra SCSI 320, not SATA)
Web Server:
* Win 2003, latest service pack
* CF 8,0,0,176276
* Native SQL Server database connection (cf datasource
basically using default datasource settings, with the exception of
the Allowed SQL Permits)
Database Server:
* Win 2003, latest service pack
* SQL 2005 Standard Edition, latest service pack
CF Settings:
* CF's JVM has 512 megs min heap, 1024 megs max, maxpermsize
set to 256 megs
* CF is already configured to minimize other possible memory
usage (max number of simultaneous requests is 12, and the numbers
of cached queries, templates etc. have been lowered below the
defaults to see if that would help, which it doesn't)
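To compare what the JVM reports against those heap settings, a small probe like the one below could be logged at intervals. It is a sketch in plain Java using only the standard Runtime API; the class name and output format are made up for illustration, and it is not something we currently run.

```java
// Illustrative sketch only: class name and output format are assumptions.
final class HeapSnapshotSketch {
    private static final long MB = 1024L * 1024L;

    // Heap currently in use: committed heap minus the free part of it.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // One-line summary of used, committed and maximum heap in megabytes.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return String.format("used=%d MB, committed=%d MB, max=%d MB",
                usedBytes() / MB, rt.totalMemory() / MB, rt.maxMemory() / MB);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the used figure never drops after a garbage collection, that would suggest something is still holding references, rather than the collector simply not having run yet.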
Any ideas/suggestions?
Thanks in advance,
Sean