
ColdFusion 10 intermittent "service not available"

Explorer, Jun 14, 2012

Have been programming over 30 years and CF in its various forms since 1998, but have to say this upgrade is NOT straightforward and after 3 very frustrating days I've finally reverted to CF9.

CF10 Standard upgrade from CF9. Running on dedicated IIS 7.5 on WIN 2008 R2 Server. All 64bit.

Upgrade installation worked correctly. RDS was selected as was the Upgrade all IIS sites.

Initially CF10 failed to start and checking IIS showed the connectors had NOT been installed. CF10 Administrator just showed raw machine code.

Used the CF10 Web Server Configuration tool on "ALL Sites" - no difference.

Removed "All sites" and then installed the connectors against individual sites. This worked - but then I noticed each site was running the CF9 connectors!

Removed each. Then ran the CF9 Web Server Configuration tool to remove all CF9 connectors. Confirmed they had been removed in IIS, then ran CF10 Web Server Configuration tool to add the connectors and all worked well.

This morning I rebooted the server to confirm CF had "installed" correctly. 3 of my 5 sites took 10 minutes before they could be accessed. Then after 30 minutes I'd get a simple white screen with "service unavailable".

Can't afford to have our sites and clients' sites intermittent, so I have reverted to the CF9 connectors and taken CF10 offline.

Hopefully someone else has had similar experience and can shed some light on this issue.

Thanks

Peter

New Here, May 03, 2013

I am always nervous to call it a win because I don't want to jinx it. I've been running stable since the 15th of April (over 2 weeks). I had our IT guys install the JRE. Currently running Java 7 Update 17 (64-bit) 7.0.170. I can't be sure it's "fixed" but it's there. I think CF installs its own, but they suggest always installing it by itself. Seems to have worked thus far.

Community Expert, May 08, 2013

Matt, do you know the problem really is the connector hanging? And even if so, that could be more a symptom than the root cause. I can think of all sorts of other problems that could happen when a site scan happens that could bottleneck CF, and then therefore appear to be “the connector hanging”. I help people with this sort of problem several times a week. It’s not something we could possibly do on the forums or via email.

If you may be interested in pursuing an engagement with me to try to understand and resolve the problem, I’m fairly confident that we might find something in less than an hour, and I do offer a satisfaction guarantee, so that if you find none of the time to be valuable, you don’t need to pay for it. If you may be interested, check out the consulting page at carehart.org, which also offers my email address, twitter account, and phone number if you may want to discuss things further. Not a sales pitch, just a possible solution to what sounds like a severe problem for you. Hope it’s helpful.

/charlie


/Charlie (troubleshooter, carehart.org)

New Here, May 09, 2013

Hi, Charlie. I believe you are right. I've found the default redirector properties will not work well for a multi-site (same connector) install. The default connection pool size was 250 and the max reuse connections was also 250. So I think one of the 2 sites on the server was using all of the connections, which would eventually leave none available for the other site. I have edited the workers.properties file for the connector as shown below. Using the formula (connection_pool_size / # of sites = max_reuse_connections) seems to be working for us.

worker.cfusion.max_reuse_connections=250
worker.cfusion.connection_pool_size=500
worker.cfusion.connection_pool_timeout=60

And the next site I add to the server I will update the above properties as shown below.

worker.cfusion.max_reuse_connections=250
worker.cfusion.connection_pool_size=750
worker.cfusion.connection_pool_timeout=60
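A minimal sketch of the arithmetic behind those two sets of values, in Python. It only restates the formula from this post (connection_pool_size = max_reuse_connections x number of sites). The function name `connector_pool_settings` is hypothetical, and nothing here is an Adobe-recommended setting.

```python
def connector_pool_settings(site_count, max_reuse_connections=250, pool_timeout=60):
    """Return workers.properties lines for one connector shared by `site_count` sites.

    Assumption (from the post above): each site should be able to hold
    `max_reuse_connections` connections, so the pool is sized as
    max_reuse_connections * site_count.
    """
    if site_count < 1:
        raise ValueError("site_count must be at least 1")
    pool_size = max_reuse_connections * site_count
    return [
        f"worker.cfusion.max_reuse_connections={max_reuse_connections}",
        f"worker.cfusion.connection_pool_size={pool_size}",
        f"worker.cfusion.connection_pool_timeout={pool_timeout}",
    ]


if __name__ == "__main__":
    # Two sites reproduces the first set of values, three sites the second.
    for sites in (2, 3):
        print(f"# {sites} sites")
        print("\n".join(connector_pool_settings(sites)))
```

Whether this sizing actually helps depends on the workload; it is the rule of thumb described here, not a documented guarantee.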

I am excited about the offer for some help, Charlie. We have one developer who says he has met you before. From time to time we have some performance tuning questions that you could probably help us with. If I can ever get some budget for that type of thing I'd love to have you take a look and make some suggestions. Until then I'll have to muddle through this. Thanks so much for your input though, it means a lot coming from someone with your experience. -Matt

Community Expert, May 09, 2013

Thanks for that, Matt. So it’s interesting to hear what you found. Again, my experience has been that such problems are usually solved by means other than these workers.properties file tweaks. But I will note that Adobe documented some of them in a blog entry here:

http://blogs.coldfusion.com/post.cfm/tuning-coldfusion-10-iis-connector-configuration

That may help some reading this to learn more about the idea. I have frankly felt that there was still not quite enough info in that blog entry to really help people fully understand how to apply the info shared (and some of the comments reflect that), but you may want to share thoughts there (and here, of course) as you confirm or learn more in your exploration.

/charlie


/Charlie (troubleshooter, carehart.org)

New Here, May 09, 2013

I want to second that. As you can see from my comments in this thread I have done extensive messing around with workers.properties, server.xml, etc. Nothing really helped. Reinstalling Java seems to have done the trick, at least for now. I am approaching 1 month without a crash and really nothing of note in the Event Viewer in Windows. I don't know what part of the Java environment actually "fixed" our issue, but it appears that way nonetheless. Is it the connector... can't say, but I really feel Adobe dropped the ball with CF10 in some way. It's old enough now that bugs, even non-bugs due to setup/config, should be addressed in an environment setup as common as Win 2008/IIS7.5/CF10.

New Here, May 16, 2013

Just curious if you have tried installing the official Oracle Java JDK/JRE and using that instead of the bundled Java that came with the CF installer. That totally fixed our issue. 1 month and counting.

New Here, May 16, 2013

Hi Lee, I have not tried that, but it would be next on my list. So far things have run smoothly since adjusting the connection pool settings. -Matt

Community Expert, May 20, 2013

Lee, can you clarify the version you changed to? To be clear, the JVM built into CF was indeed an “official oracle JDK JRE”, albeit not the latest. It’s 1.6.0_29, as installed by Adobe.

So are you saying you updated to a later 1.6? Or do you mean you applied Update 8, and then installed a 1.7 JDK? And if so, which version of that?

And for the sake of others reading this and thinking that that change alone may help something, are you confirming that you changed NOTHING ELSE? Not even any JVM arguments or the GC algorithm? Would be interesting to hear if just an upgrade of the JVM alone had some beneficial impact.

I would note one last possibility: it could be that when you made the change, something ELSE was affected—not so much by the change in the version of the JVM but just in the change of the location of the JVM. By that I mean, there are things that CF uses within the location of the JVM (like the keystore) which would change in the installation of a new JVM (to use the keystore in the new JVM’s location).

I realize you won’t care to “go back” now, since things seem to be working. But I would note that if someone else may be considering this, and they did find that an “updated JVM” did somehow “fix” things, it could be interesting (as a technical curiosity) to see if things would also be “fixed” if you installed 1.6.0_29 also (the same version that CF10 uses by default) and changed CF to point to that.

If things were indeed still “fixed”, then it would indicate that just changing the location was the solution. It would then be really interesting to know then what about the old location caused a problem. Could be a permissions issue. Could be some files it was reading/writing that are not brought forward. Could be anything, really. Just something to consider.

I appreciate, Lee, that you want mainly just to suggest people consider a JVM update if things are amiss. It could well help. But if you can clarify the version, knowing that could help even more.

/charlie


/Charlie (troubleshooter, carehart.org)

New Here, May 22, 2013

So when looking at our development box (the machine we didn't upgrade Java on) there is no listing of Java installed, so it's using whatever CF10 installed. On our production machine I see Java 7 Update 17 (64-bit) Version 7.0.170 and Java SDK 7 Update 17 (64-bit) Version 1.7.0.170.

When I said official I guess I meant to say latest official. I have no idea how or why that fixed our issue. As stated in my previous posts I've edited the worker properties, server.xml, and even tried messing with the jvm.config file. None of this fixed the issue. We still had the random app pool failings.

Obviously we didn't install Java in the ColdFusion directory, we put it in C:\Program Files\Java\jdk1.7.0_17\jre.

I don't see how the location could be an issue. I mean CF runs fine. If permissions were an issue, wouldn't it not run at all? Note, I wouldn't mind testing your theory on our development box, but our development box has never crashed as it gets little to no use, so it would be impossible to see if it works. Our IT people suggested maybe the install got corrupted somehow. I'm leaning towards the install process of Java on CF10 being less than perfect in some instances for whatever reason, and because Java is the underlying architecture of everything, who knows. All I know is that installing the latest version of Java worked... so I am happy, and if it were my personal machine or even for my business I'd love to tinker, but because I work for a higher education institution and it's not my personal box, I have no desire to potentially make a bad situation worse with my job in the balance. I wish I could for the sake of... science?

Adobe Employee, May 28, 2013

Hi All,

Please send in your contact details via personal message, if you are still facing the issue.

Regards,

Anit Kumar Panda

New Here, Jun 05, 2013

Sad to report: it happened again yesterday, after almost a month and a half.

Community Expert, Jun 11, 2013

Sorry to hear you’re challenged again, though good to see you went several weeks without problems. It can happen.

So please do clarify, though, what “happened”? For one thing, the subject of the thread is “service not available”, and that can display when for any of many reasons CF is not responding. What have you seen in the CF logs (not application.log, but ColdFusion-out.log)? Does it show CF going down? Coming up? Anything else in the log happening around and before the time of such a restart?

For another, you mentioned in your previous note an issue of “random app pool failings”, so it may be helpful to know if that is happening here (or not). Anything in the Windows event logs? As noted before, some such app pool failings were because of the need to rebuild the app pool after applying certain CF10 hotfixes, but I suspect you did that some time ago. Still the event log may give some additional insight.

Finally, there’s always the possibility that CF is “not responding” not because it crashed or has been brought down, but simply because it’s hung (still up, but just not responding). And for that situation, there may be no errors in any logs, but you may find with any of many tools that CF’s request threads are tied up with some long-running requests—and that could be just a random problem based on load or code, not an inherent deficiency in the stability of CF itself. Among the tools that can help are the new CF10 metrics.log, which tracks how many requests are running and have run. Then there’s the old school CFSTAT command-line tool. And of course the CF Server Monitor for those on CF Enterprise, and FusionReactor (and its useful new FREC logs) which can be used by someone even running CF Standard.

Hope some of that helps you or future readers.

/charlie



/Charlie (troubleshooter, carehart.org)

New Here, Jun 11, 2013

So, what always happens is the same. From the end-user perspective, I get hanging for a few minutes while it's in the process of crashing. Pages don't resolve. Finally when IIS stops my pool, I get 503 errors.

Nothing in ColdFusion-out is strange. Everything looks normal before the restart.

As stated in my previous discussion on this thread, the event viewer clearly shows the warnings being issued around crash time of the pool doing two things:

1. terminating unexpectedly and then

2. suffering a fatal communication error with the Windows Process Activation Service

Finally after it hits the number of allowed errors in the rapid error succession in IIS it errors and the pool is disabled.
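The "number of allowed errors" behaviour described here matches IIS rapid-fail protection, whose default settings are 5 failures within a 5-minute interval. A rough Python sketch of that sliding-window idea follows. The function name `pool_disabled_at` is hypothetical, and treating the pool as disabled on exactly the Nth failure inside the window is an assumption about the threshold, not a statement of how IIS counts internally.

```python
from collections import deque
from datetime import datetime, timedelta


def pool_disabled_at(crash_times, max_failures=5, window=timedelta(minutes=5)):
    """Return the crash time at which the pool would be disabled, or None.

    Assumption: the pool is disabled as soon as `max_failures` crashes
    fall within any `window`-long period.
    """
    recent = deque()
    for crashed in sorted(crash_times):
        recent.append(crashed)
        # Drop crashes that have aged out of the sliding window.
        while crashed - recent[0] > window:
            recent.popleft()
        if len(recent) >= max_failures:
            return crashed
    return None


if __name__ == "__main__":
    start = datetime(2013, 6, 4, 10, 0, 0)
    burst = [start + timedelta(seconds=30 * i) for i in range(5)]
    print(pool_disabled_at(burst))
```

Comparing the "terminated unexpectedly" timestamps from the event viewer against a window like this shows whether the 503s begin at the point the failure limit is reached.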

After some research, our IT department believes there is a memory leak in the isapi_redirect.dll file. I've read nothing of the sort, but that is their response. Since we turned on more detailed logging, my isapi log looks like this. Most of the day I see normal entries like this:

ajp_send_request::jk_ajp_common.c (1658): (cfusion) all endpoints are disconnected, detected by connect check (1), cping (0), send (0)

However right around crash time I get a lot of these over and over again until it crashes:

init_jk::jk_isapi_plugin.c (2779): Initializing shm:(null) errno=-1. Load balancing workers will not function properly.

init_jk::jk_isapi_plugin.c (2813): Jakarta/ISAPI/isapi_redirector/1.2.32 () initialized

init_jk::jk_isapi_plugin.c (2634): Starting Jakarta/ISAPI/isapi_redirector/1.2.32 ()

I don't know what it could be other than the connector, as those are the only logs that show what amount to errors. While I cannot vouch for all of the code on our server as I didn't write it all, I can say this NEVER happened before CF10. So I just cannot blame this on bad code or ColdFusion hanging because of malformed code. It just doesn't make sense. This only happened after the switch to CF10.

Community Expert, Jun 11, 2013

Even so, please do look at the CF10 metrics.log, and specifically at the number of requests running (and completed) around the time of these failures, to see if perhaps the problem may be that CF requests are hanging up.

I appreciate that the problems seem to point to IIS/the app pool and specifically the connector, and I do realize that that’s new for CF10.

But I would still (if I were you) want to make sure that there’s no evidence of requests hanging up. It could be that the connector/app pool issue is more the symptom rather than the cause. If you found there to be no interesting evidence in the metrics log, to indicate any issue with the state of CF request processing, then I would understand you then feeling it must be the connector.

That said, it could be that yours is one of the edge cases where some tomcat connector tweaking may be warranted, as discussed in the blog entry pointed to earlier in the thread:

http://blogs.coldfusion.com/post.cfm/tuning-coldfusion-10-iis-connector-configuration

But I would understand if you may look at that and wonder what really to do with the info offered. I don’t feel it’s as clear as it could be, but let us know as you learn/try more.

/charlie


/Charlie (troubleshooter, carehart.org)

Guest, Jun 15, 2013

After a lot of work checking into this... The problem is related to shared memory errors. The problem was definitely fixed in the public version of the Jakarta connector, but not in the CF version. The CF version is based on the version of the connector that has the bug. The primary reason some people see it and some don't has to do with multiple sites and application pools. If you have a server running only one web site and only CF pages, then you will probably not see a problem. If you have multiple application pools, for instance running some ASP.NET apps on the same server, then you will likely run into the problem of CF crashing. It also seems more prone to crash when accessing applications in different application pools. Adobe is really the only one that can fix this because they have yet to make their connector code publicly available. The fix in the publicly available version was easy to incorporate and I don't understand why this is taking a while to fix.

It is easy to isolate this to a connector problem, because you can easily fix it by just recycling IIS. You will also find that CF is fully functioning if you enable the internal port and access CF without going through IIS when the problem occurs.
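A minimal Python sketch of that isolation check: request the same page once through IIS and once directly on CF's internal web server, then compare. The function names `http_status` and `diagnose` are hypothetical, the URLs in the usage note are placeholders, and the internal port must be whatever was enabled on your server (8500 is a common default, but that is an assumption here).

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for `url`, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def diagnose(iis_url, internal_url, fetch=http_status):
    """Compare the IIS path with the direct path and describe the result."""
    via_iis = fetch(iis_url)
    direct = fetch(internal_url)
    if direct == 200 and via_iis != 200:
        return "connector or IIS suspected: CF answers directly but not through IIS"
    if direct != 200:
        return "CF itself is not answering on the internal port"
    return "both paths respond"
```

For example, `diagnose("http://www.example.com/test.cfm", "http://localhost:8500/test.cfm")` run during an outage would point at the connector if only the first URL fails. The `fetch` parameter exists only so the comparison logic can be exercised without a live server.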

Reference Articles:

https://issues.apache.org/bugzilla/show_bug.cgi?id=47678

http://tomcat.10.n6.nabble.com/IIS-7-0-Worker-process-crashes-on-App-Pool-recycling-since-ISAPI-redi...

Community Expert, Jun 15, 2013

That’s very interesting, and the first I’d heard of it, so good sleuthing if indeed it turns out to be the issue. You state things pretty confidently, but this is your first note in this thread (or in the forums). So do you have any more to share on why you feel so strongly that “the problem is related to shared memory errors”? I mean, I get that the two articles you point out do refer to it, but I just mean what leads you to declare that it’s the solution to the problems raised by Lee and/or others here?

But assuming it is indeed the solution, I’ll share some more for other readers. The second article refers to a fix in a 1.2.34 build. I can confirm that CF is using 1.2.32, even after the updates (so that a rebuild of the connector creates that latest-dated isapi_redirect.dll, of Nov 2012). This is reported in C:\ColdFusion10\config\wsconfig\1\isapi_redirect.log.

It would certainly be interesting to hear from anyone on the CF team about the prospects related to this observation about 1.2.34.

(Oddly, the first link you shared points to a bug report that refers to no later than 1.2.31, so I’m kind of wondering if that’s really the right URL you mean to share.)

Thanks for sharing what you observed, rd33.

/charlie


/Charlie (troubleshooter, carehart.org)

Guest, Jun 16, 2013

Although I don't post often... mainly because I solve problems on my own... I am a power user. I am posting on this because a lot of users are stuck, and only Adobe can fix the problem. We are no longer pursuing CF10 at our organization, and like others have rolled back. Within my department we have over 250 servers... somewhere around 25 CF servers and over 20 Tomcat servers using the Jakarta connector. We see millions of page views a day. I have been using the Jakarta connector for over 8 years. I am very familiar with how it works, tuning it, and troubleshooting it. Interestingly enough, we also tried to roll out version 1.2.32 of the connector for Tomcat a while back (non-CF server) and ran into a lot of errors and random crashes. Our solution was to upgrade to 1.2.35 and the connector stopped crashing.

What leads me very directly to the shared memory bug is the shared memory errors in the Jakarta log every time the connector crashes. I can say with confidence that the users on this thread will find this error in their log if they go looking. Below is the error line referring to the shared memory error. SHM = shared memory.

[Mon Mar 11 11:29:54.060 2013] [3960:4388] [error] init_jk::jk_isapi_plugin.c (2779): Initializing shm:(null) errno=-1. Load balancing workers will not function properly.
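For anyone who wants to go looking, here is a small Python sketch that scans connector log lines for that error. It assumes every line follows the same layout as the sample above (timestamp, process:thread ids, level, message). The names `LINE_RE`, `SHM_MARKER` and `find_shm_errors` are hypothetical, and the timestamp parsing assumes English day and month names.

```python
import re
from datetime import datetime

# Matches lines shaped like the sample above:
# [Mon Mar 11 11:29:54.060 2013] [3960:4388] [error] message text
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<pid>\d+):(?P<tid>\d+)\] \[(?P<level>\w+)\] (?P<msg>.*)$"
)
SHM_MARKER = "Initializing shm:(null)"


def find_shm_errors(lines):
    """Return (timestamp, process id) pairs for each shared memory error line."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or SHM_MARKER not in match.group("msg"):
            continue
        stamp = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S.%f %Y")
        hits.append((stamp, int(match.group("pid"))))
    return hits
```

Passing an open file handle for the connector's isapi_redirect.log to `find_shm_errors` gives a list of times that can be lined up against the app pool failures in the Windows event log.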

Oddly enough some articles refer to this being fixed in .31 while others say that it is not. You will notice that one of the posts in the second article refers to it being fixed in 1.2.33. I believe a lot of the discrepancies have to do with whether or not users had multiple application pools configured on their test servers. Due to the nature of what actually causes the crash, some servers will see it and some won't. Also, if no pages are accessed in other app pools you may not see the problem even if the server is configured exactly the same. I believe this to be the reason for the discrepancies in the reporting of what fixes it. Think back to the posts where someone had it crashing every time it was indexed by Google. Search engines are likely to cause the crash because they will hit every page on the server.

Guide, Jun 16, 2013

Wow rd33 very informative. You should post more often. Cheers, Carl.

Community Expert, Jun 16, 2013

Great. Thanks for the clarification. And that truly was all I was seeking, a clarification. No offense or slight intended. I’ve just seen too many posts where someone proposes “this is the answer” to something, when it may be a solution to something but not necessarily the issue in question. (And they may or may not be experienced in CF server Admin. That alone does not give credence to many recommendations, in my experience. Still, thanks for pointing out your experience.)

In this case, yes indeed, if people see reference to that “shm” message in their logs, then it would surely seem this is the problem and a solution. But if they did not, then it would not be. And I’m glad we’ve made that more clear now.

Of course, for those with that problem and needing that solution, it’s now up to Adobe to implement it. As you noted earlier, with the custom build of Tomcat and the connector, we cannot implement this ourselves…well, we could, but then we would not be running a supported connector, and there could be features that Adobe included which would now be removed.

Let’s hope that someone from Adobe might respond.

/charlie

PS And as Carl noted, it’s great to see your experience shared here, and hope you may consider contributing more. More specifically, there is a CF Admin forum (http://forums.adobe.com/community/coldfusion/coldfusion_administration ) where the focus is solely on helping with managing CF servers, and both Carl and I (and others) are active there helping people solve problems and hopefully helping others who read along.

Indeed, if “trying to keep up with a forum” sounds daunting, a great way to keep up (whether to help or just to learn) is to use the “mail notifications” feature (shown on the right once you sign in), which causes each forum thread to be sent by email. The CF Admin forum is fairly low-volume, so pretty easy to keep up with that way.


/Charlie (troubleshooter, carehart.org)

New Here, Jun 17, 2013

Maybe Anit Kumar could chime in. I spoke with him on the phone and he seemed to have a good understanding of the bug in question as it related to Adobe/Tomcat and the process that is going on behind the scenes. He gave me a bug number, 355158, that he says he filed, but I couldn't actually find a record of it on the bug website. However, he stated it is something Adobe is actively working on getting fixed, but it's not THEIR issue per se, it is a Tomcat issue.

More specifically, regarding the setup of the systems with errors: he was spot on when he asked if I had applications set up as "children", if you will, of sites in IIS. In our case the only pool that crashes is the pool that handles the site that has 5 other ASP.NET applications underneath it. I would bet anyone experiencing the issue has a CF pool/site with one or more non-CF applications/pools underneath it in IIS. Again, it comes down to waiting for the Tomcat fix to be pushed to CF. If you contact Anit Kumar directly, he said he would keep you informed when the fix gets distributed.

New Here, Jun 19, 2013

I can also confirm this situation that causes the app pool to crash. We have a site running several CF apps under the same app pool for a good 6 months and have never had an issue with the app pool crashing. If I attempt to create an additional app pool on the same site even for another CF app, it will crash almost immediately once the site using the new app pool is accessed. So it seems, at least in my case anyways, it works fine as long as only one app pool is configured per website. I get the same "Load balancing workers will not function properly" error in the redirect.log if more than one app pool is used.

Some interesting things I have discovered over the past few days that prevent the app pool from immediately crashing upon page visit (I don't know about long term, as I have only been testing on our dev box): first, if the additional app pool being created is for an ASP app, enabling 32-bit applications in Advanced Settings for the ASP app pool seems not to cause the CF app pool to crash. Secondly, if the identity of the CF app pool is changed to a higher privileged account such as Local System, it does not seem to crash either. For security purposes, though, this is not acceptable, just something interesting that I noticed.

New Here, Jun 20, 2013

I was also able to find that bug that you reported, it doesn't have the same ID that you were given though.

https://bugbase.adobe.com/index.cfm?event=bug&id=3530880

Guest, Feb 04, 2014

rd33 wrote:

Although I don't post often, mainly because I solve problems on my own, I am a power user. I am posting on this because a lot of users are stuck, and only Adobe can fix the problem. We are no longer pursuing CF10 at our organization and, like others, have rolled back. Within my department we have over 250 servers, including somewhere around 25 CF servers and over 20 Tomcat servers using the Jakarta connector. We see millions of page views a day. I have been using the Jakarta connector for over 8 years. I am very familiar with how it works, tuning it, and troubleshooting it. Interestingly enough, we also tried to roll out version 1.2.32 of the connector for Tomcat a while back (on a non-CF server) and ran into a lot of errors and random crashes. Our solution was to upgrade to 1.2.35, and the connector stopped crashing.

What leads me very directly to the shared memory bug is the shared memory errors in the Jakarta log every time the connector crashes. I can say with confidence that the users on this thread will find this error in their log if they go looking. Below is the error line referring to the shared memory error (SHM = shared memory).

[Mon Mar 11 11:29:54.060 2013] [3960:4388] [error] init_jk::jk_isapi_plugin.c (2779): Initializing shm:(null) errno=-1. Load balancing workers will not function properly.

Oddly enough, some articles refer to this being fixed in 1.2.31 while others say that it is not. You will notice that one of the posts in the second article refers to it being fixed in 1.2.33. I believe a lot of the discrepancies have to do with whether or not users had multiple application pools configured on their test servers. Due to the nature of what actually causes the crash, some servers will see it and some won't. Also, if no pages are accessed in other app pools, you may not see the problem even if the server is configured exactly the same. I believe this to be the reason for the discrepancies in the reporting of what fixes it. Think back to the posts where someone had it crashing every time the site was indexed by Google. Search engines are likely to cause the crash because they will hit every page on the server.

I'm inclined to agree with rd33 on this. I've enabled DEBUG logging for the isapi_redirect and read through the mountain of logs it creates. I see shared memory handler (shm) debug comments popping up in the log right before it crashes, and then the IIS app pool won't stay in memory but instead crashes every time it tries to load the ColdFusion portion of the website. At the same time, websites that don't share resources and run pure CF or pure .NET continue to operate on the same server. I think the issue is that because the isapi_connector is loaded in the website, it gets called for every request, even ones you might not expect. Watching the debug logs, I see it being called for every code file in the DotNetNuke website, because it is loaded in memory at the root of that website as an ISAPI Filter, even though I've gone into the Handler Mappings and removed all references to it from there. Only the application folders with their own app pool have handler mappings that point to the ColdFusion files. So the root website is calling the isapi_connector and the added app pool is calling it as well, which requires a shared memory environment to sync data between them, even though in my scenario I'm not doing anything like that. This functionality is typically used in load balancers or web gardens, which is why we get the load balancer error.
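For anyone who wants to check their own log for the shm error quoted above without reading it all by hand, here is a minimal Python sketch. It assumes every entry follows the same layout as the line rd33 posted ([timestamp] [pid:tid] [level] function::file (line): message). The function names find_shm_errors and count_by_pid are made up for this example, and the log path is whatever your connector is configured to write to.

```python
import re
import sys
from collections import Counter

# Layout assumed from the single log line quoted in this thread:
# [timestamp] [pid:tid] [level] function::file (line): message
LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"\[(?P<pid>\d+):(?P<tid>\d+)\]\s+"
    r"\[(?P<level>\w+)\]\s+"
    r"(?P<source>\S+)\s+\((?P<line>\d+)\):\s+"
    r"(?P<message>.*)$"
)


def find_shm_errors(lines):
    """Return parsed entries that are [error] level and mention shm."""
    hits = []
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # skip lines that don't fit the assumed layout
        if match.group("level") == "error" and "shm" in match.group("message").lower():
            hits.append(match.groupdict())
    return hits


def count_by_pid(hits):
    """Count shm errors per process id (one IIS worker process per pid)."""
    return Counter(hit["pid"] for hit in hits)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python this_script.py <path to the isapi_redirect log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        found = find_shm_errors(log_file)
    for hit in found:
        print(hit["timestamp"], hit["pid"], hit["message"])
    print(dict(count_by_pid(found)))
```

Grouping by pid is only a convenience: if the errors come from several different worker processes, that would be consistent with more than one app pool loading the connector, but it doesn't prove it.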

This is the Tomcat Connector bug I believe this relates to:

https://issues.apache.org/bugzilla/show_bug.cgi?id=52659

An interesting comment from the end of the bug, referring to both IIS process cycling and the use of the isapi_redirect itself:

Fixed with the 1.2.34 by ensuring that ...

1. We have a process mutex for serializing initialization

2. Parameters are updated from shared memory if shared memory is explicitly updated

Note however that recycle and isapi_redirector are conceptually wrong.

isapi_redirector is a proxy, has its own connection pool and connection pool management. You should disable IIS worker recycle in production to have optimal performance, because the entire recycle concept is meant to handle wrongly written .NET applications having memory leaks. This is not the case with isapi_redirect.

Next you should also limit the number of isapi_redirect instances per physical IIS instance and preferably have just one. You can use uriworkermap.properties to handle mapping for multiple vhosts.

I'm testing running my .NET & CF in the same app pool to see if it resolves the issue for me. If not, I'll try the whole-server configuration option instead of configuring on a per-website basis. Neither is a solution I can use on our production servers because of the restrictions on the network they reside on, but it will at least help narrow down the issue.

Guide ,
Jul 15, 2013

@Lee, has the Tomcat ISAPI connector update provided in CF10 Update 11 helped resolve the "shm [error]" in the ISAPI log?

Please note that just applying Update 11 does not by itself modify the ISAPI connector. One would need to remove and re-add the connector using WSCONFIG, or extract the ISAPI DLL from the JAR file and replace the existing one.

Regards, Carl.
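For the extract-from-JAR route, a JAR is just a ZIP archive, so the DLL can be pulled out with any ZIP tool. As a rough Python sketch: the thread doesn't say which JAR holds the DLL or where it sits inside, so the JAR path and destination folder are left as parameters, and the default name isapi_redirect.dll is an assumption about what the entry is called. The function name extract_matching is made up for this example.

```python
import zipfile
from pathlib import Path


def extract_matching(jar_path, dest_dir, suffix="isapi_redirect.dll"):
    """Copy every JAR entry whose name ends with `suffix` into dest_dir.

    jar_path and dest_dir are supplied by the caller; the default suffix
    is an assumed DLL name, not something confirmed in this thread.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue
            if not info.filename.lower().endswith(suffix.lower()):
                continue
            # Flatten the entry path into the file name so that copies from
            # different folders (e.g. 32-bit and 64-bit) don't overwrite
            # each other.
            target = dest / info.filename.replace("/", "_")
            target.write_bytes(jar.read(info))
            written.append(target)
    return written
```

This only gets the file out of the archive. Which copy to use and where to put it for your connector isn't covered here, so back up the existing DLL before replacing anything.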

Community Expert ,
Jul 16, 2013

Just one little tweak to Carl’s helpful suggestion:

While update 11 does indeed require us to reconfigure the IIS connector, there’s some good news: one need not necessarily do a full reinstall (remove/add) of the connector.

If you have done update 5 before doing update 11, and you had already reinstalled the web connector, you can get by this time (after 11) with just doing an update, which you can do from the command line (a little less work in IIS than doing a full remove/add).

For more, see the recent blog entry from Adobe:

http://blogs.coldfusion.com/post.cfm/coldfusion-10-does-the-connector-need-to-be-re-installed-for-update-11

For now it just says that if you have applied update 5, you need only update the connector, but the implication is that if you did update 5 (or a later one) AND have since done the remove/add, then you need only do the update. I’ll send an email to the author to ask her to clarify that a bit more.

/charlie


/Charlie (troubleshooter, carehart.org)
