• Global community
    • Language:
      • Deutsch
      • English
      • Español
      • Français
      • Português
  • 日本語コミュニティ
    Dedicated community for Japanese speakers
  • 한국 커뮤니티
    Dedicated community for Korean speakers
Exit
0

CFC method memory problem

Guest
Apr 02, 2009 Apr 02, 2009

Copy link to clipboard

Copied

Hi all,

I use CFCs a lot, so when I was creating a tool to migrate some database tables, I built the actual data transfer function as a method of a CFC. The basic code just reads a block of 1000 records from the old db, does some minor manipulation of the data, and then inserts the records into the new db a record at a time inside a cfloop. Lather, rinse, repeat, until all of the 500K+ records have been transferred.

I was surprised to find that this leads to Java out of memory errors around the 200Kth record (given my current JVM memory settings). Having run across a similar problem when sending broadcast emails from a CFC method, I simply moved the code out of the CFC method and into the calling cfm page, and it ran through all of the records without any problems.

The only explanation I can think of for this behavior is that when the code is inside a CFC method, the query object used to read the blocks of 1000 records keeps getting reallocated rather than reused, so when it gets to the point of trying to store 200 blocks of 1000 records in memory, it runs out of room. If that's the case, it would seem to be a significant bug that needs to be fixed.

Does anyone know if that's what is really going on?

Thanks,
Russ

Views

1.5K

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND ,
Apr 02, 2009 Apr 02, 2009

Copy link to clipboard

Copied

rchinoy wrote:
>
> Does anyone know if that's what is really going on?
>


I have never heard of this problem, and I doubt you are the first to
move a lot of data around with a CFC.

But could it also be a bug in your code? Where you think the CFC is
using the same variable memory, but is actually creating new ones?
Seeing some example code would have really helped with this.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Guest
Apr 02, 2009 Apr 02, 2009

Copy link to clipboard

Copied

To be honest, I hope it IS a problem with my code, but I'm just not seeing it. Here's the code:

<cffunction name="ImportProducts" returntype="void">

<cfset var OldInfo = "">
<cfset var StartRecID = 0>
<cfset var GetRecsCount = 1000>

<!--- Truncate the new table --->
<cfquery datasource="dsNew">
TRUNCATE TABLE products
</cfquery>

<!--- Load the old info --->
<cfquery name="OldInfo" datasource="dsOld" maxrows="#GetRecsCount#">
SELECT
products.*,
products_description.products_description,
products_description.products_featured,
products_description.products_name,
products_description.products_viewed
FROM
products
LEFT JOIN products_description ON
(products.products_id = products_description.products_id)
WHERE
products.products_id > #StartRecID#
ORDER BY
products.products_id
</cfquery>

<cfloop condition="OldInfo.RecordCount GT 0">

<!--- Keep track of where we left off --->
<cfset StartRecID = OldInfo.products_id[OldInfo.RecordCount]>

<!--- Insert the old info into the new db --->
<cfloop query="OldInfo">

<cfquery datasource="dsNew">
INSERT INTO products(
PROD_ID,
PROD_ModelNumber,
PROD_Name,
PROD_Description,
PROD_ImgURL,
PROD_NewInvQty,
PROD_AvailableAsOf,
PROD_Weight,
PROD_IsActive,
PROD_IsTaxable,
PROD_StatQtySold,
PROD_StatViewedCnt,
PROD_AddedOn,
PROD_AddedByAUSER_ID,
PROD_LastModifiedOn
)
VALUES (
#OldInfo.products_id#,
'#OldInfo.products_model#',
'#Replace(Replace(OldInfo.products_name, "\'", "'", "all"), "\", "", "all")#',
'#Replace(Replace(OldInfo.products_description, "\'", "'", "all"), "\", "", "all")#',
'#OldInfo.products_image#',
#OldInfo.products_quantity#,
<cfif Trim(OldInfo.products_date_available) NEQ "">
#CreateODBCDate(OldInfo.products_date_available)#,
<cfelse>
#CreateODBCDate(Now())#,
</cfif>
#OldInfo.products_weight#,
#OldInfo.products_status#,
#OldInfo.products_tax_class_id#,
#OldInfo.products_ordered#,
<cfif Trim(OldInfo.products_viewed) NEQ "">
#OldInfo.products_viewed#,
<cfelse>
0,
</cfif>
#CreateODBCDateTime(OldInfo.products_date_added)#,
1,
<cfif Trim(OldInfo.products_last_modified) NEQ "">
#CreateODBCDateTime(OldInfo.products_last_modified)#
<cfelse>
NULL
</cfif>
)
</cfquery>

</cfloop>

<!--- Load the old info --->
<cfquery name="OldInfo" datasource="dsOld" maxrows="#GetRecsCount#">
SELECT
products.*,
products_description.products_description,
products_description.products_featured,
products_description.products_name,
products_description.products_viewed
FROM
products
LEFT JOIN products_description ON
(products.products_id = products_description.products_id)
WHERE
products.products_id > #StartRecID#
ORDER BY
products.products_id
</cfquery>

</cfloop>

</cffunction>

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND ,
Apr 02, 2009 Apr 02, 2009

Copy link to clipboard

Copied

Well nothing is obvious in the function, accept this only shows one
iteration of 1000 records, so what is the logic to iterate over this
code 200 times?

I suspect that is the likely source of the problem. A var scoped
variable is local to the current instance of a function. It is not
going to be reused from call to call of the function, as far as I know.
But the garbage collection is not going to run while in the middle of
a process intensive request where this function is being run 200 times.

I suspect the solution may be as simple as reasigning the OldInfo to an
empty string at the end of the function though, but I am not sure.

Also there is some strangeness I did not immediately understand in the
code. First, you run the "OldInfo" query twice in the function. Once
at the beginning and once at the end. Secondly, you truncate the new
table every time you iterate over this function. Wouldn't that lead to
just the last 1000 records existing in the new table?

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND ,
Apr 02, 2009 Apr 02, 2009

Copy link to clipboard

Copied

On top of what Ian sid, you should <cfqueryparam> your dynamic values so
your not having to recompile a new query with every single insert you're
doing. This would hurt the DB rather CF though, I would have thought.
It's just poor form and a bit of a performance hit, whichever way one looks
at it, though. There's a chance CF is caching something on its side of
things for each query though.

You should also row-limit your query on the DB end as well as the CF end of
things. Using maxrows just tells CF to "hang-up" after it gets the rows it
needs... the DB will still be getting all the rest of them. Again... this
hits the DB more than it hits CF, but it will slow everything down.

You might be able to improve the performance of your query by filtering the
products table before doing the join:

FROM (
select top 1000 *
from products
where product_id > <cfqueryparam value="#startRedId#"
cfsqltype="cf_sql_integer">
order by product_id
) PRODUCTS

LEFT JOIN [etc]

None of this will make any difference vis-a-vis CFC vs CFM, but it's not
going to hurt ;-)

Is there no way you can do this DB->DB, rather than putting CF in the
middle of things?

If nothing else, I'd try to use a bulk loader rather than individual
queries, if nothing else.

What DB system is it?

--
Adam

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND ,
Apr 02, 2009 Apr 02, 2009

Copy link to clipboard

Copied

On Thu, 2 Apr 2009 22:43:13 +0100, Adam Cameron wrote:

> If nothing else, I'd try to use a bulk loader rather than individual
> queries, if nothing else.

BTW, I'm particularly proud of that sentence, btw.

--
Adam

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Guest
Apr 02, 2009 Apr 02, 2009

Copy link to clipboard

Copied

Actually, the code as shown has a loop in it to keep grabbing 1000 records (see the line <cfloop condition="OldInfo.RecordCount GT 0">), so the cffunction only gets called once.

I may try just assigning OldInfo to an empty string to see if that helps, but I certainly wouldn't expect that to be required. I suppose the real difference between the cfc-based code and the cfm-based code is one uses a local var and the other doesn't/can't, but I'm not clear on why that would make a difference.

Regarding the query optimization, while I could certainly improve the performance of what I'm doing, this tool is only going to be used twice, or maybe three times max, so there isn't much point in spending the time to optimize it.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND ,
Apr 02, 2009 Apr 02, 2009

Copy link to clipboard

Copied

quote:

Originally posted by: rchinoy
Regarding the query optimization, while I could certainly improve the performance of what I'm doing, this tool is only going to be used twice, or maybe three times max, so there isn't much point in spending the time to optimize it.

No point making it a cfc either.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND ,
Apr 03, 2009 Apr 03, 2009

Copy link to clipboard

Copied

> this tool is only going to be used twice, or
> maybe three times max, so there isn't much point in spending the time to
> optimize it.

> No point making it a cfc either.

Very true.

--
Adam

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Guest
Apr 03, 2009 Apr 03, 2009

Copy link to clipboard

Copied

Actually, there is a good reason to put the code in a cfc: there are almost 100 tables to migrate, so the code gets rather long and cumbersome to wade through if it isn't broken up. I could certainly use include files to do the same thing, but ultimately my question really has nothing to do with this particular tool. I would really just like to understand why the out of memory issue exists when the code is in a CFC method versus when it is in a CFM. This issue has come up enough times now that I'd like to determine if there is a workaround. Due to the deadline I'm on, I haven't tried simply assigning the query object to an empty string at the end of each iteration yet, but I will try to do so soon so I can report on the result.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND ,
Apr 03, 2009 Apr 03, 2009

Copy link to clipboard

Copied

LATEST
> Actually, there is a good reason to put the code in a cfc: there are almost 100
> tables to migrate, so the code gets rather long and cumbersome to wade through
> if it isn't broken up.

Fair enough. Youd didn't really tell us that part before, though.

So you have a calling template which calls an equivalent to this function
for 100-odd tables? Yikes. I'll reiterate my question whether this can't
possibly be done DB to DB or via a bulk insert.


> thing, but ultimately my question really has nothing to do with this particular
> tool. I would really just like to understand why the out of memory issue exists
> when the code is in a CFC method versus when it is in a CFM. This issue has

Well it kinda is about this particular tool, because it's this code that's
causing the problems (or something about the calling code, or something
like that). Obviously there's some idiosyncasy of it which gives you
problems when it's in a CFC method as opposed to mainline code, but I would
not immediately say it's a generic problem with CFCs vs CFMs. Just yet.

Is this the first table that's being migrated? Or is it buried somewhere
within the other 100-odd? Is it always this one, or can it happen on any
of them?

Can you inspect the memory usage between each table's function, and see if
there's anything unexpected going on (other than it running out of memory,
which is - in itself - unexpected!).

Do you have 100-odd methods (one for each table) in the CFC, or lots of
individual CFCs?

I guess you could call a GC in between each table's processing, and see if
you can keep check on memory usage that way. Or maybe check memory levels
within that loop you've got, and if it starts flaring up, doa GC then.
It's not advised to call GCs "by hand", but I have found it's helped some
times.

Have you sued CF's server monitor or something like FusionReactor to check
if anything unexpected is consuming RAM?

I think you should try adding <cfqueryparam> tags to your queries: CF might
be cachinng some info about them, and that could be leaking memory or
something. Maybe try <cfobjectcache action = "clear"> between each method
callor something.

I wonder if there's something about the fact you're replacing oldInfo with
a new query within a loop that has a condition based on oldInfo's record
count. Possibly this is forcing CF to maintain the previous oldInfo
queries in the background, so you're not actually overwriting the previous
one with the next one when you requery, you're actually just getting a new
pointer assigned. Do me a favour, and do this:

<cfset iRecCount = OldInfo.RecordCount>
<cfloop condition="iRecCount GT 0">
[...]
<!--- Load the old info --->
<cfquery name="OldInfo" datasource="dsOld" maxrows="#GetRecsCount#">
[...]
</cfquery>
<cfset iRecCount = OldInfo.RecordCount>
</cfloop>

This is a slightly far-fetched notion, but it doesn't hurt to try: it's a
pretty easy change.


> come up enough times now that I'd like to determine if there is a workaround.
> Due to the deadline I'm on, I haven't tried simply assigning the query object
> to an empty string at the end of each iteration yet, but I will try to do so
> soon so I can report on the result.

It might pay to apply a test rig to this notion first:

0) on a dev machine without any other activity
1) grab the heap usage
2) load a big query in
3) grab the heap usage (should losely reflect the fact you've just stuck a
chunk of data in RAM)
4) set the query variable to be an empty string
5) grab the heap usage. Check to see if the memory is reclaimed straight
away.

You might need to do a GC between 4+5 for it to take effect. I speculate
that the RAM won't get freed up straight away, and you might be consuing it
faster than the JVM can clear it.

--
Adam

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Resources
Documentation