ColdFusion Memory Issue: sdk-ScheduledExecutor

Engaged, Nov 15, 2024

A quick post in case anyone else ever runs into this memory issue.

 

Our CF servers seemed to have a "memory leak": the memory used by the CF service (on Windows) would gradually grow until it was maxing out the entire server.

 

Long-story-short, we found (with the folks over at xByte cloud hosting) in a thread dump that we had tens of thousands of hung (waiting) threads with the name: sdk-ScheduledExecutor

 

Turns out, that thread is created by the AWS SDK for Java, which we do not use directly. Not surprisingly, though, ColdFusion does use a version of this SDK for its cloud services functionality, and we use that extensively when interacting with S3. At first we thought there could be a bug in CF or in the SDK if it's spawning all these threads. But, when poring over Adobe's docs, we found the following little tidbit on this page:

[Screenshot of the note in Adobe's documentation; image not preserved]

 

Oh! The object returned by getCloudService() must be kept in a shared scope? In true Adobe fashion, none of the examples they give on that page actually show this requirement in practice!

 

Anyway, our devs had missed that entirely. All our functions that interacted with S3 started out by creating the service object with getCloudService().

 

Turns out, as far as we can tell, every time that function is called, CF fires up an 'sdk-ScheduledExecutor' thread. The thread does the work you give it, then sits in a waiting state for another job that, in our case, would never come. So, after a while, we'd accumulate thousands!
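
To make the mechanism above concrete, here is a minimal Java sketch of the general pattern, using only the JDK. It is an analogy, not ColdFusion's or the AWS SDK's actual code: the class name, the `demo-ScheduledExecutor` thread name, and `handleRequest()` are all made up for illustration. Each call builds its own scheduler, runs one task, and never shuts the scheduler down, so its worker thread stays alive waiting for work.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not ColdFusion or AWS SDK code.
class PerCallSchedulerLeak {

    // Stands in for "create the service object at the top of every function".
    static void handleRequest() {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1, task -> {
            Thread t = new Thread(task, "demo-ScheduledExecutor");
            t.setDaemon(true); // only so this demo JVM can exit; the thread still lingers
            return t;
        });
        try {
            // Do one piece of work and wait for it to finish.
            scheduler.schedule(() -> { }, 0, TimeUnit.MILLISECONDS).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        // No scheduler.shutdown() here: the worker keeps waiting for a job that never comes.
    }

    // Counts live threads whose name starts with the given prefix.
    static long liveThreads(String namePrefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && t.getName().startsWith(namePrefix))
                .count();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            handleRequest();
        }
        System.out.println("Leftover threads: " + liveThreads("demo-ScheduledExecutor"));
    }
}
```

The count grows by one per call because nothing ever tells the scheduler it is no longer needed, which matches the behaviour we saw, at least as far as we can tell.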

 

Solution: make sure 'getCloudService()' is called in a shared scope and reuse that object!
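
As a rough illustration of why reuse helps, here is the same idea in plain Java with the scheduler created once and shared by every caller. Again, this is only a sketch under assumptions: the static field plays the role of a shared scope, and every name in it (`SharedSchedulerReuse`, `shared-demo-ScheduledExecutor`, `handleRequest()`) is hypothetical rather than anything from ColdFusion or the AWS SDK.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not ColdFusion or AWS SDK code.
class SharedSchedulerReuse {

    // Created once and reused, much like an object kept in a shared scope.
    private static final ScheduledThreadPoolExecutor SHARED =
            new ScheduledThreadPoolExecutor(1, task -> {
                Thread t = new Thread(task, "shared-demo-ScheduledExecutor");
                t.setDaemon(true); // only so this demo JVM can exit
                return t;
            });

    // Every request borrows the shared scheduler instead of building its own.
    static void handleRequest() {
        try {
            SHARED.schedule(() -> { }, 0, TimeUnit.MILLISECONDS).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts live threads whose name starts with the given prefix.
    static long liveThreads(String namePrefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && t.getName().startsWith(namePrefix))
                .count();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest();
        }
        System.out.println("Threads in use: " + liveThreads("shared-demo-ScheduledExecutor"));
    }
}
```

However many requests come in, the single shared scheduler keeps serving them from the same worker thread.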

 

Going forward (and maybe I've missed something), it would be helpful to have a way to close, or kill, this service object. As it stands, once you fire it up, it runs forever?
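
For comparison, this is roughly what an explicit "close" looks like in plain Java: the object that owns the scheduler shuts it down when the caller is finished. Whether ColdFusion exposes anything like this for the object returned by getCloudService() is exactly the open question, so treat this purely as a sketch of the pattern. The class `ClosableClientSketch`, its methods, and the thread name are hypothetical.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not ColdFusion or AWS SDK code.
class ClosableClientSketch implements AutoCloseable {

    private volatile Thread worker; // remembered so close() can wait for it to exit

    private final ScheduledThreadPoolExecutor scheduler =
            new ScheduledThreadPoolExecutor(1, task -> {
                Thread t = new Thread(task, "closable-demo-ScheduledExecutor");
                t.setDaemon(true); // only so this demo JVM can exit
                worker = t;
                return t;
            });

    void doWork() {
        try {
            scheduler.schedule(() -> { }, 0, TimeUnit.MILLISECONDS).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public void close() {
        scheduler.shutdown(); // stop accepting work and let the idle worker exit
        Thread t = worker;
        if (t != null) {
            try {
                t.join(5000); // wait (up to 5 seconds) for the worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Counts live threads whose name starts with the given prefix.
    static long liveThreads(String namePrefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && t.getName().startsWith(namePrefix))
                .count();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            try (ClosableClientSketch client = new ClosableClientSketch()) {
                client.doWork();
            } // close() runs here, so no worker is left behind
        }
        System.out.println("Leftover threads: " + liveThreads("closable-demo-ScheduledExecutor"));
    }
}
```

With a close step available, creating the object per request would be wasteful but not a leak. Without one, reusing a single shared object is the safer route.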

 

Cheers!

 

Community Expert, Nov 15, 2024

@sdsinc_pmascari , thanks for sharing your findings. Your advice is quite instructive.

 

To answer your question, since the object that results from the getCloudService() call is stored in application-scope, it will not run forever. It will abide by the applicationTimeout value. That means, it will no longer be alive when the application times out.

Engaged, Nov 15, 2024

Interesting point about the object timing out when it is in the application scope. Yes, of course I would expect this to be true.

 

But, perhaps I'm misunderstanding something, so allow me to postulate....

 

What we had been doing was putting the object created by getCloudService() into the local variable scope. Our thinking was that the object would only be alive when called upon during the initial page load. But it spawned a thread that never died, even though the original object no longer existed when the page processing completed.

 

Which makes me wonder... When the application times out, does CF actually go kill those threads, or does it fire up some new ones when the application re-initializes? Thus, leaving the original threads continuing to run?

 

This makes me wonder if loading this object to a server scope might make more sense to prevent the buildup of unused threads? Or, is that thinking fraught with peril?

Community Expert, Nov 15, 2024

Interesting stuff, indeed. Thanks for sharing, Paul. 

 

1) First, as for your asking about the considerations relative to application timeouts, we should note that they might rarely or even never happen: the default is 2 days, and even if one lowers that (for an app or all), the duration is of course not "since the app was created" but rather "since the app was last used". And as some apps get traffic all the time, even if only from bots or monitoring calls, those might never time out.

 

But sure, the server scope would seem a fine choice (assuming there are no app-specific characteristics to the object instance saved there). 
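
The point in 1) about timeouts being measured from last use can be sketched in a few lines of Java. This is not ColdFusion's implementation, just a toy model of an idle timeout with hypothetical names (`IdleTimeoutSketch`, `touch`, `isExpired`), assuming the 2-day default mentioned above.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical toy model of an idle timeout; not ColdFusion's implementation.
class IdleTimeoutSketch {

    private final Duration timeout;
    private Instant lastUsed;

    IdleTimeoutSketch(Duration timeout, Instant createdAt) {
        this.timeout = timeout;
        this.lastUsed = createdAt;
    }

    // Any request, even from a bot or a monitoring probe, resets the clock.
    void touch(Instant now) {
        lastUsed = now;
    }

    // Expired only if a full timeout period has passed since the last use.
    boolean isExpired(Instant now) {
        return Duration.between(lastUsed, now).compareTo(timeout) >= 0;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2024-11-15T00:00:00Z");
        IdleTimeoutSketch app = new IdleTimeoutSketch(Duration.ofDays(2), start);

        // One hit per day for a month: the app never reaches the 2-day idle limit.
        Instant now = start;
        for (int day = 1; day <= 30; day++) {
            now = start.plus(Duration.ofDays(day));
            System.out.println("Day " + day + " expired? " + app.isExpired(now));
            app.touch(now);
        }
        // Only after two quiet days does it finally time out.
        System.out.println("After 2 idle days expired? " + app.isExpired(now.plus(Duration.ofDays(2))));
    }
}
```

So anything whose cleanup depends on the application timing out may effectively never be cleaned up on a busy server.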

 

2) And it will surely be interesting to see if time may show there to be more (to all this you've found) than meets the eye.

 

3) Also, did you confirm there was indeed a great reduction in how far memory now falls when you force a gc? That would confirm this was the cause of a seeming memory leak.

 

I get that reducing the high thread count is alone compelling, of course.


/Charlie (troubleshooter, carehart.org)

Engaged, Nov 15, 2024

Garbage collection did almost nothing to help when the server memory was maxed.  Admittedly, I am not an expert in this area, but a forced gc gave us no relief.

We are now 48+ hours after putting all getCloudService() objects into shared scopes and memory levels are back to "normal" across the board and stable.

 

In FusionReactor, I can see just a handful of threads handling our S3 actions. Previously, even trying to load the Thread Visualizer would bring the server to its knees due to the number of threads. One thread dump showed 65,000+ threads!

Community Expert, Nov 15, 2024

Paul, to clarify: I was not at all proposing that doing a GC at the time of the high memory would have "given relief". What I was asking is how you find heap use to be now (with "all being well"). So thanks for clarifying that 'memory levels are back to "normal"'.

I only mentioned doing a GC because you might have looked and found the heap "still seeming high" (now, with all well), but if you were to do a GC (click the button on the FR "system metrics" page, for example) and the heap were then to drop substantially, it would show that the JVM was just being lazy about doing garbage collection itself (and so such "seeming high" heap use was just a temporary thing at that time).
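
For anyone curious what that before-and-after comparison amounts to, here is a minimal Java sketch using the JDK's standard memory MXBean. It only measures the JVM it runs in, so for a live ColdFusion server the FR button is the practical route; this is just to show the idea. The class and method names are made up, and note that `System.gc()` is only a request that the JVM may ignore (for example when started with `-XX:+DisableExplicitGC`).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper for illustration; measures only the JVM it runs in.
class HeapAfterGc {

    // Heap currently in use, in bytes, as reported by the JVM.
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    // Returns { used before the GC request, used after the GC request }.
    static long[] usedBeforeAndAfterGc() {
        long before = usedHeapBytes();
        System.gc(); // a request, not a guarantee
        long after = usedHeapBytes();
        return new long[] { before, after };
    }

    public static void main(String[] args) {
        long[] used = usedBeforeAndAfterGc();
        System.out.printf("Used heap before GC: %d MB%n", used[0] / (1024 * 1024));
        System.out.printf("Used heap after GC:  %d MB%n", used[1] / (1024 * 1024));
    }
}
```

A large drop suggests the heap was mostly collectable garbage. Little or no drop suggests something is still holding on to that memory.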

 

Finally, it was clear from your previous reply that the number of threads had been high and now was not. Again, I was trying to connect all this to your original concern of high heap use. And now that you've said it's down, that would suggest that indeed there was more to the high thread count than merely "being so many": instead, it would seem that some aspect of those threads "remaining alive" was also causing some aspect of the heap to "remain in use".

 

What matters is that your code change has solved both problems. That's great, and it seems this can be a valuable lesson learned for everyone. Again, thanks.


/Charlie (troubleshooter, carehart.org)

Community Expert, Nov 16, 2024

Quoting @sdsinc_pmascari:

"... allow me to postulate....

What we had been doing was putting the object created by getCloudService() into the local variable scope. Our thinking was that the object would only be alive when called upon during the initial page load. But it spawned a thread that never died, even though the original object no longer existed when the page processing completed.

Which makes me wonder... When the application times out, does CF actually go kill those threads, or does it fire up some new ones when the application re-initializes? Thus, leaving the original threads continuing to run?"

 

What I think is that each object created in local scope spawns one or more threads. However, though the object is in local scope, the threads so created may live for the duration of the application. That is, I think the threads will only cease to exist when the application times out or is restarted, whichever occurs first.

 

You can confirm or refute this yourself. For example, by examining your application's threads using a tool such as FusionReactor, ColdFusion Performance Monitoring Toolset or VisualVM.
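
If none of those tools is at hand, a rough census of threads by name and state can also be taken from inside the JVM with standard JDK calls. The sketch below is only an illustration with hypothetical names (`ThreadCensus`, `family`, `census`), and it reports on the JVM it runs in, so it would have to be executed inside the ColdFusion JVM to be of any use there. CFML can create Java objects, so that may be feasible, but I have not tried it.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; reports on the JVM it runs in.
class ThreadCensus {

    // Strips trailing digits and separators so numbered threads group together,
    // e.g. "worker-1" and "worker-2" both become "worker".
    static String family(String threadName) {
        return threadName.replaceAll("[-_#\\s\\d]+$", "");
    }

    // Counts live threads per "name family [STATE]".
    static Map<String, Integer> census() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            String key = family(t.getName()) + " [" + t.getState() + "]";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        census().forEach((key, count) -> System.out.println(count + "  " + key));
    }
}
```

One name family with a count in the thousands, all in a waiting state, would be consistent with the hypothesis above.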

 

Anyway, there are good design reasons why:

  1.  such a thread should live as long as the application: optimal reuse.
  2.  the thread should not survive from one application to the next: in keeping with the Creator principle (the original application being the creator), and to avoid coupling between applications.

 

I can imagine why such a thread is in WAITING state. Namely, because the thread is a worker on stand-by. As such, it is poised to go into action when needed. So, the application does actually need those threads to be alive. The issue is: not that many. 

 

Hence the recommendation to store the object in a shared scope, such as application. The threads will then be spawned just once for the entire duration of the application (rather than thousands of times as in the case of local-scoped objects).

 

I imagine creating the object as being analogous to bringing forth and opening a can of worms. You may throw away the can afterwards (garbage collection), but the worms will still be around. So, if your application must do this, the most efficient way will be to do it once, using application scope.

 

When the object is local, as is currently the case in your application, a new object is created each and every time the CFM page or component is launched. Hence new threads are spawned each and every time. In other words, a new can of worms is produced and opened each time. My hypothesis is that that is how your application ended up with thousands and thousands of waiting threads.

 

Quoting @sdsinc_pmascari:

"This makes me wonder if loading this object to a server scope might make more sense to prevent the buildup of unused threads? Or, is that thinking fraught with peril?"

Yes, loading the object in server scope is fraught with peril. That leads to a design where objects could be propagated to applications that are unaware of the objects or don't need them. It would also lead to an increase in coupling between applications and could trigger race conditions. 

 

To prevent a buildup of unused threads, load the objects in application scope instead. I hope the above arguments are convincing for this choice.

Community Expert, Nov 16, 2024

I should like to share two reports of an Amazon S3 issue similar to yours. They occur in environments completely different from ColdFusion:

https://github.com/aws/aws-sdk-java-v2/issues/3746 

https://github.com/aws/aws-sdk-java-v2/issues/4991 

 

They strengthen my hypothesis that:

  1.  what you have found is an Amazon AWS S3 issue, rather than a ColdFusion one.
  2.  storing the cloud-service object in application scope will drastically alleviate the issue, possibly even resolve it completely.

 

Now, on to a related subject: the time-to-live of an object in a bucket. You can configure how Amazon S3 manages such an object during its lifetime. The documentation "ColdFusion and Amazon S3" shows you how to configure the Lifecycle Rules that define how Amazon S3 manages objects during their lifetime. 

Engaged, Nov 18, 2024

Thank you both for your comments. They have been very helpful.

 

We will keep all this in mind as we move forward. 

Community Expert, Nov 18, 2024

My pleasure, @sdsinc_pmascari

Community Expert, Nov 18, 2024

I hope you'll please update us on any subsequent findings or conclusions you come to, so that others finding this thread can make sense of all that was being shared. 


/Charlie (troubleshooter, carehart.org)

Engaged, Nov 18, 2024 (marked as the correct answer)

Will do, Charlie.

 

For all those who come here looking for help with a ColdFusion memory issue, the solution for us, in this case, was to be sure to only put getCloudService() objects into shared scopes.  Do not create that object 'as needed' in a local scope.
