ColdFusion 10 intermittent "service not available"

Explorer, Jun 14, 2012

Have been programming over 30 years and CF in its various forms since 1998, but have to say this upgrade is NOT straightforward and after 3 very frustrating days I've finally reverted to CF9.

CF10 Standard upgrade from CF9. Running on dedicated IIS 7.5 on WIN 2008 R2 Server. All 64bit.

Upgrade installation worked correctly. RDS was selected as was the Upgrade all IIS sites.

Initially CF10 failed to start and checking IIS showed the connectors had NOT been installed. CF10 Administrator just showed raw machine code.

Used the CF10 Web Server Configuration tool on "ALL Sites" - no difference.

Removed "All sites" and then installed the connectors against individual sites. This worked - but then I noticed each site was running the CF9 connectors!

Removed each. Then ran the CF9 Web Server Configuration tool to remove all CF9 connectors. Confirmed they had been removed in IIS, then ran CF10 Web Server Configuration tool to add the connectors and all worked well.

This morning I rebooted the server to confirm CF had "installed" correctly. 3 of my 5 sites took 10 minutes before they could be accessed. Then after 30 minutes I'd get a simple white screen with "service unavailable".

We can't afford to have our sites and our clients' sites available only intermittently, so I have reverted to the CF9 connectors and taken CF10 offline.

Hopefully someone else has had similar experience and can shed some light on this issue.

Thanks

Peter

Views

44.7K

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
101 Replies

Guest, Jan 28, 2014

I believe we're having the same issue as well.

I just built a fresh Hyper-V Windows 2008 R2 IIS 7.5 with ColdFusion 10 patch 13 and the isapi connectors are all freshly built as the newest version (1.2.32 11/2/2013) and we run a connector per website and not All Websites. I have it running on the latest Oracle Java JRE 1.7.0_51 as well, all items 64bit. I currently have 2 users accessing this system and the IIS app pool associated with the ColdFusion portion keeps shutting down randomly. It did it 3 times today but yesterday not at all. This is a development system so the load is very small. I've also tried the connector tuning values above to no avail. I've also run through the CF10 hardening guide on this system.

The main portion of the website runs DotNetNuke in .NET4 and then CF10 is running as a virtual folder within the website on its own app pool. There is another website on this server that runs CF and is not crashing, however it does not have multiple app pools or .NET running in it.

I have another server where I have the majority of our existing development websites still which has a mixed CF9 & CF10 environment and we've never seen this issue on that system.  After reading through this more though I believe that's because none of our websites currently evaluating CF10 on that server have mixed app pools and only run CF10 and no .NET in them.

To bring the CF portion back online is as easy as restarting the app pool or more drastically doing an iisreset.  I've been testing out having a scheduled task run the IIS command to start the app pool to try to catch it if it fails and bring it back online.  If the app pool is running the command does nothing so having this run frequently does no harm.

%windir%\system32\inetsrv\appcmd.exe start apppool /apppool.name:"MyAppPoolName"
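A scheduled task like the one described could call a small wrapper instead of the raw command. The following is a minimal Python sketch and not part of the original setup: the function names are hypothetical, and only the `appcmd.exe start apppool /apppool.name:` syntax is taken from the command above.

```python
import os
import subprocess


def build_start_command(pool_name, windir=None):
    """Return the appcmd argument list that starts one IIS app pool."""
    # %windir% is normally C:\Windows; fall back to that if it is unset.
    windir = windir or os.environ.get("windir", r"C:\Windows")
    appcmd = windir + r"\system32\inetsrv\appcmd.exe"
    return [appcmd, "start", "apppool", "/apppool.name:" + pool_name]


def start_app_pool(pool_name):
    """Run appcmd and return its exit code (Windows with IIS only)."""
    # Per the post, issuing the start command against a running pool is harmless.
    completed = subprocess.run(build_start_command(pool_name), check=False)
    return completed.returncode
```

Calling `start_app_pool("MyAppPoolName")` every few minutes from Task Scheduler would mirror the approach described above.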

I'm also going to experiment with the AppPool Rapid-Fail Protection settings to try changing the minutes & maximum failures to find a happy medium of protection and stability.

From reading the bug report https://bugbase.adobe.com/index.cfm?event=bug&id=3530880 it sounds like there's a fix coming. 

  •    Asha K S 

    7:37:04 AM GMT+00:00 Jan 25, 2014

    @TheCycoONE it is not part of any CF 10 released patch yet.

Is there any ETA on this or a way of getting early test access?

Community Expert, Jan 28, 2014

Hey Leith, can you drop me an email directly at charlie@carehart.org? I have some thoughts/suggestions but don’t want to belabor the forum with lots of details. Perhaps we can narrow it down together to one thing, which we could report back here if it works.

BTW, I tried to email you via the address listed for you here in the forums (shown to those who click your profile and are logged in; I won’t repeat it here, where it would be viewable by those not logged in), but it failed, so I have no choice but to reach out this way. (Well, there is the private messaging feature in the forums, but I suspect that would fail to get to you if your profile address is wrong.)

/charlie


/Charlie (troubleshooter, carehart.org)

Guest, Jan 28, 2014

I emailed you directly. Odd that it told you the email failed, as it's my active address and we show no issues with the system currently.

Community Expert, Jan 28, 2014

Actually, it’s off by a letter. Emailing you a reply with that and more.

/charlie


/Charlie (troubleshooter, carehart.org)

Community Expert, Jan 28, 2014

Doh, never mind. It was my copy/paste of it that missed a letter. Again, email to you directly with more, to come.

/charlie


/Charlie (troubleshooter, carehart.org)

Guide, May 16, 2013

Hi folks, I am talking on the subject of CF10, Tomcat and connectors next week. It would be nice to have you along to share some thoughts. I will mention Java where appropriate; however, Java is another talk, which I have done before.

Here is the URL:

www.meetup.com/coldfusionmeetup/events/117885652/

Regards, Carl.

Guest, Aug 06, 2014

I received the message "service not available Jakarta/ISAPI/isapi_redirector/1.2.37". I looked up some articles but didn't find anything.

I noticed that the coldfusion.exe service wasn't started, so I tried starting it, but it wouldn't start.

I noticed that there were other ColdFusion services started, so I used "End Process" on all of those, then started the ColdFusion service, and it started.

I can't really explain why, but it worked.

Hope this helps.

Adobe Employee, Aug 07, 2014

Perhaps your site needs connector tuning as a first measure of troubleshooting. Please refer to "ColdFusion 11 IIS Connector Tuning" on the Adobe ColdFusion Blog.

Regards,

Anit Kumar

Guest, Dec 18, 2014

Anit Kumar Panda wrote:

Perhaps your site needs Connector Tuning as first measure of troubleshooting. Please refer to ColdFusion 11 IIS Connector Tuning — Adobe ColdFusion Blog

Regards,

Anit Kumar

There was an open bug on this item that was just recently closed with CF11 Update 3.  I haven't tested this particular bug yet but I will be soon as we were holding off moving to CF10 or CF11 because this was a show stopper for our environment.

New Here, Dec 22, 2014

We are also in the same boat. We have a site that just will not run long without 502 and 503 errors. I've tuned the connector every which way and nothing fixes it. Tomcat gets hung up and just won't recover. It never utilizes all of its allocated memory. Basically, we can't find anything stable to run our CFML sites on. I guess you can buy 11 licenses and downgrade to CF9.

Adobe Employee, Dec 22, 2014

Can you provide more details about your environment?

Regards,

Anit Kumar

Participant, Jan 13, 2016

Curious if anyone actually found a resolution to this issue. We still experience app pool crashes on CF 10 Update 18. We have tuned our connector and followed all kinds of advice on this issue. When the application pool fails, simply stopping it and starting it in IIS is all we need to do to be back up and running, but it is very unpredictable as to when it will happen. As most of these threads have died, I'm guessing that there is a solution out there for this issue?

New Here, Jan 13, 2016

We used to have these crashes, but after making the changes specified in the blog (pool size and timeout both in connector and in server.xml) I haven't seen this happening. We are on CF10 update 16.

Sumit Verma

Partner / Vice President | ten24, LLC

office: 877.886.5806 x 103 | mobile: 617.290.8214

www.ten24web.com | www.linkedin.com/in/sverma | twitter: blogonria

Participant, Jan 13, 2016

Hi Sumit,

Thanks for responding! I'm wondering if this issue might have been reintroduced in update 17. We had this a while back, and it seemed to go away and then come back again; I'm wondering if it came back when we went to update 17. Out of curiosity, what values are you using in your workers.properties and server.xml?

We had the following in workers.properties:

worker.list=cfusion
worker.cfusion.type=ajp13
worker.cfusion.host=localhost
worker.cfusion.port=8012
worker.cfusion.max_reuse_connections=900
worker.cfusion.connection_pool_size=900
worker.cfusion.connection_pool_timeout=60

and in server.xml:

<Connector protocol="AJP/1.3" port="8012" redirectPort="8445" maxThreads="900" connectionTimeout="60000" tomcatAuthentication="false"></Connector>

After checking our isapi_redirect.log in hopes of finding why our app pool fails, it seemed that we were never close to hitting 900 connections, so I just reduced the values to this:

worker.list=cfusion
worker.cfusion.type=ajp13
worker.cfusion.host=localhost
worker.cfusion.port=8012
worker.cfusion.connection_pool_minsize=200
worker.cfusion.max_reuse_connections=500
worker.cfusion.connection_pool_size=500
worker.cfusion.connection_pool_timeout=60

<Connector protocol="AJP/1.3" port="8012" redirectPort="8445" maxThreads="500" connectionTimeout="60000" tomcatAuthentication="false"></Connector>



Could it maybe be the value of our connection_pool_timeout that needs to be adjusted?
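In both versions of the settings above, maxThreads matches connection_pool_size, and connectionTimeout (milliseconds) matches connection_pool_timeout (seconds). Assuming that pairing is intentional, a minimal Python sketch of a consistency check could look like this. The function names are hypothetical, and the rule being checked is inferred from the values in this post rather than taken from documentation.

```python
import xml.etree.ElementTree as ET


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return props


def check_pairing(workers_text, connector_xml, worker="cfusion"):
    """Return a list of mismatches between the two files (empty if none)."""
    props = parse_properties(workers_text)
    connector = ET.fromstring(connector_xml)
    problems = []
    pool_size = int(props["worker.%s.connection_pool_size" % worker])
    timeout_s = int(props["worker.%s.connection_pool_timeout" % worker])
    if int(connector.get("maxThreads")) != pool_size:
        problems.append("maxThreads != connection_pool_size")
    # connection_pool_timeout is in seconds, connectionTimeout in milliseconds.
    if int(connector.get("connectionTimeout")) != timeout_s * 1000:
        problems.append("connectionTimeout != connection_pool_timeout * 1000")
    return problems
```

Fed the reduced values above (500 and 60 against 500 and 60000), the check returns an empty list.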

New Here, Jan 13, 2016

The only difference is that I'm not using min size. Here is what I have on a 2-site setup:

worker.cfusion.connection_pool_size=600
worker.cfusion.max_reuse_connections=200
worker.cfusion.connection_pool_timeout=60

Sumit Verma

Partner / Vice President | ten24, LLC

office: 877.886.5806 x 103 | mobile: 617.290.8214

www.ten24web.com | www.linkedin.com/in/sverma | twitter: blogonria

Participant, Jan 13, 2016

Thanks Sumit

Community Expert, Jan 14, 2016

John, did you notice Sumit mentioned having 2 sites? You have not indicated how many sites you have, or how many connectors (and if more than one, how many sites are connected to each connector). Each of those is vital to both the troubleshooting and tuning of your problem. Let us know, and then I could elaborate.

/charlie


/Charlie (troubleshooter, carehart.org)

Participant, Jan 14, 2016

Hi Charlie,

Thanks for responding. We have one connector that has three sites connected to it. We have other sites running in IIS, but they are either PHP sites or simply sites that redirect URLs to other sites. All of the sites outside of the three ColdFusion sites have had the jakarta and cfide virtual directories removed. The other sites do still have a tomcat ISAPI Filter defined, as it is inherited from the parent. In total we have 20 sites in IIS: 3 ColdFusion sites, 2 PHP sites, and 15 sites that are just domain redirects to the 5 actual sites.

When the application pool fails, the other application pools and sites remain running just fine. The site with the failed application pool stops responding, and requests to the site just spin indefinitely. I have Rapid-Fail Protection disabled on the site with issues, but when left on, it just stops the application pool and requests receive 503 errors. When I stop and start the application pool, everything goes back to normal and the site continues to run properly. When the application pool has failed, I see lines in my httperr file that all have client_reset.

Thanks for any thoughts you might have on the issue.

Community Expert, Jan 14, 2016

Hi, John. So first, if you have one connector and 3 sites, then you should not have the max_reuse = the connection_pool_size. You should split it to 1/3 the pool size.

The max_reuse effectively dedicates the number of connections per site. With both values at 900, you could have either of 2 of the 3 sites take up all 900 connections, leaving one “starved”. Setting it to 300, each is guaranteed to have a fair share of connections.

(And really, I’d not worry about which is “busier” and try to wonder if “one should get more than the other”. That’s not nearly as important, in my experience, as simply preventing any one site from being starved of threads.)

Second, the sites that “are not using CF” but DO still have the isapi filter MAY contribute to problems (and so you may want to add them to the divisor above), though they should not since they would not be sending CFM file requests to CF—and they’d not work if tried, because of the lack of the Jakarta VD.

Let us know how it goes if you change that max_reuse to 300.
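The split suggested above is plain integer division. As a minimal Python sketch (the function name is hypothetical, and the even split is the rule of thumb from this post, not a documented formula):

```python
def max_reuse_per_site(connection_pool_size, cf_sites):
    """Split the connector's pool evenly across the sites sharing it."""
    if cf_sites < 1:
        raise ValueError("cf_sites must be at least 1")
    # Integer division: any remainder stays unallocated rather than oversubscribed.
    return connection_pool_size // cf_sites
```

For a pool of 900 shared by 3 sites this gives 300. With the reduced pool of 500 mentioned earlier in the thread, the same split would give 166.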

/charlie


/Charlie (troubleshooter, carehart.org)

Participant, Jan 14, 2016

Thanks Charlie. I'll configure this change and report back if/when the application fails in the future. Appreciate that you took the time to offer insight.

-John

Participant, Jan 21, 2016

We have had our application pool fail two more times since my last post. When it fails, requests will first start to stack up in Fusion Reactor, but the app pool still shows as running in IIS. Stopping the app pool, restarting the site in IIS, and then starting the app pool will return the site to behaving properly. When it failed last night, all of the requests listed in the Fusion Reactor Protection Alert were from the same IP address, and when researched, the IP address turned out to be a known spam bot address. Even if I have Fusion Reactor attempt to kill these requests after a certain time limit, the application pool still no longer responds and has to be restarted.

A few new pieces of information.

1. Many of the hung requests in Fusion Reactor showed the following at the top of their stack traces:

java.net.SocketInputStream.socketRead0(SocketInputStream.java:???)[Native Method]
java.net.SocketInputStream.read(SocketInputStream.java:150)
java.net.SocketInputStream.read(SocketInputStream.java:121)
sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
sun.security.ssl.InputRecord.read(InputRecord.java:503)
sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:954)
- locked <0x7737c017> (a java.lang.Object)
sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:911)
sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
- locked <0x692b112f> (a sun.security.ssl.AppInputStream)
java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <0x12fecd04> (a java.io.BufferedInputStream)
sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:703)
sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1534)
- locked <0x5d639a1e> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1439)
- locked <0x5d639a1e> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
- locked <0x149ddedd> (a sun.net.www.protocol.https.HttpsURLConnectionImpl)
java.net.URL.openStream(URL.java:1038)
coldfusion.image.ImageReader.readJPEGImage(ImageReader.java:187)
coldfusion.image.ImageReader.readImage(ImageReader.java:69)
coldfusion.image.Image.<init>(Image.java:270)
coldfusion.tagext.io.ImageTag.doStartTag(ImageTag.java:371)
coldfusion.runtime.CfJspPage._emptyTcfTag(CfJspPage.java:2795)
cfmuraMeta2ecfc1790351436$funcMAKETWITTERMETA.runFunction(D:\webroot\test.org\plugins\MuraMeta\eventHandlers\muraMeta.cfc:96)

Line 96 of muraMeta.cfc is a call to cfimage to read an image. I have removed this as it is not absolutely necessary, but I wonder whether this overhead on the rendering of every request could be an issue, or whether there may be a problem in the specific version of the JVM we are running, 1.8.0_25.

2. I'm seeing these lines in my isapi_redirect.log file:

[Thu Jan 21 07:32:41.746 2016] [8144:8612] [error] HttpExtensionProc::jk_isapi_plugin.c (2781): failed to init service for request.

[Thu Jan 21 07:33:03.127 2016] [8144:8612] [error] HttpExtensionProc::jk_isapi_plugin.c (2781): failed to init service for request.

[Thu Jan 21 07:33:24.582 2016] [8144:8652] [error] HttpExtensionProc::jk_isapi_plugin.c (2781): failed to init service for request.

This really seems like a connector problem as I never need to restart IIS when the problem occurs, just stopping and starting the application pool resolves the issue. The site can go for a week without this happening or only a day or two. Any other thoughts on this matter are truly appreciated.
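Since the failures are intermittent, it may help to tally the error lines rather than read the log by eye. Below is a minimal Python sketch that assumes every line follows the bracketed layout of the three isapi_redirect.log lines quoted above (timestamp, pid:tid, level, message). The regular expression and function name are illustrative and not part of the connector.

```python
import re
from collections import Counter

# Layout inferred from the sample lines: [timestamp] [pid:tid] [level] message
LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \[(?P<pid>\d+):(?P<tid>\d+)\] "
    r"\[(?P<level>\w+)\] (?P<message>.*)$"
)


def count_errors(log_text):
    """Count [error] messages, grouped by message text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "error":
            counts[match.group("message")] += 1
    return counts
```

Comparing the counts before and after each app pool restart could show whether "failed to init service for request" only appears around the failures.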

Community Expert, Jan 21, 2016

So John, let’s split this into two topics.

1) To be clear, the hung up requests would NOT be caused by any issue with the connector. (I don’t know if you’re assuming that, but I can assure you it is not.) IF they get “through” the connector and are running in CF, the connector’s job is done and it’s out of the picture. (As for whether piled up requests could lead to a failure in the connector, that’s a different possibility. More in a moment.)

1a) And as for the hung up requests, if you mean that the stack trace was done when a request was running long, and it showed that cfimage running, then that does seem to explain the hung up requests. (The proof would be if you kept stack tracing them, and they remained on that line.)

1b) As for the cfimage hanging up, do you notice that it refers to httpclient? That tells you that the file that CFIMAGE was told to read was either a URL or some other remote thing (like an AWS S3 bucket, which CF can read like a file system). THAT is likely what was hung up. CF was thus the victim.

1c) And sure, if it was some automated agent generating many requests causing that, it would only exacerbate (though not necessarily cause) the problem. Consider that the remote site being called may have throttled your requests, thus hanging up the cfimage read. That could happen also if a bunch of real users made a sudden batch of requests, but again it’s more likely to be an issue when it’s some automated agent (spider, bot, hacker, monitoring tool, security scanning tool, etc.)

2) So now, as for the crash of the app pool, well, it’s POSSIBLE that a pile up of CF requests could back up to IIS somehow and cause a failure, but I’ve never seen it myself (and I help people troubleshoot CF server problems all day each day).

2a) Instead, my first question (if I were in your shoes) would be whether your web server connector has been updated. You say you are on CF10 update 18. I’ll assume you know that it’s not enough to just do the update, but (as the update notes say) you must rebuild the web server connector. Many miss that, and often have very old connectors.

So what is the date of the isapi_redirect.dll in all your folders under coldfusion10\config\wsconfig? If it’s not from Sep 2015 (what it should be for CF10 update 18), then whoever did the update did not update the connector. And that could be your problem—especially if it’s even older, like 2013 or 2012, when the connectors were quite buggy initially.
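Checking that date in every numbered folder by hand is easy to get wrong when there are several connectors. As a minimal Python sketch (the function names are hypothetical, and the layout of one isapi_redirect.dll per subfolder of wsconfig is taken from the description above):

```python
import os
from datetime import datetime


def connector_dll_dates(wsconfig_dir, dll_name="isapi_redirect.dll"):
    """Map each connector folder to the modified time of its DLL."""
    dates = {}
    for entry in sorted(os.listdir(wsconfig_dir)):
        dll_path = os.path.join(wsconfig_dir, entry, dll_name)
        if os.path.isfile(dll_path):
            dates[entry] = datetime.fromtimestamp(os.path.getmtime(dll_path))
    return dates


def stale_connectors(wsconfig_dir, expected_not_before):
    """Return folders whose DLL is older than the expected date."""
    return [
        folder
        for folder, modified in connector_dll_dates(wsconfig_dir).items()
        if modified < expected_not_before
    ]
```

For CF10 update 18, one would pass the path to coldfusion10\config\wsconfig and a cutoff such as `datetime(2015, 9, 1)`. Any folder returned would be a candidate for a connector rebuild.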

Doing a rebuild/reconfiguration of the connector can be a 5 minute job if it goes well (not even needing a restart of CF, but of the web server). If it goes poorly, you could be down while trying to resolve things. If you wanted help doing it, that’s indeed the kind of thing I do often help people with on a consulting basis. More (Rates, approach, satisfaction guarantee) at the consulting page of carehart.org.

Otherwise, hope the above is helpful.

/charlie


/Charlie (troubleshooter, carehart.org)

Participant, Jan 21, 2016

Hi Charlie,

I really appreciate the response! You helped us out a few years back with some performance issues that were resolved with tuning the jvm and I really found your help to be a life saver!

1) I totally agree that the hung requests in Fusion Reactor are not caused by the connector. When the issue starts, we check Fusion Reactor, running directly through its port, and that is where we see the stacked requests. Once we restart the app pool, those requests time out and new requests come in and process correctly. So, the hung requests are a sign that the app pool is about to fail. Once it fails, no further requests make it to ColdFusion. Once this starts happening, requests are logged in the httperr.log with the value Client_Reset. For example:

2016-01-20 21:58:06 12.29.201.82 58340 10.0.1.176 80 HTTP/1.1 GET /default/assets/Image/blog/Nov+2012/8145030183_2fee2cc187_n.jpg - 2 Client_Reset test.org

1a) I will have to try and see if I can monitor the stack trace on the hung requests the next time it happens. I would not be surprised if they eventually finish, but that does not recover the app pool. Typically, when this happens, we just want to get the site up again, but we need to get to the bottom of this, so it is a good suggestion.

1c) Agreed. Typically when I see requests back up in Fusion Reactor, it is an automated spider or bot. A build-up of requests does not guarantee that the app pool will fail; it is just that when the app pool does fail, I always get Protection alerts for a backup of running requests that immediately precede the failure.

2a) Our connector is definitely updated. We know about this and have tried updating it several times. The isapi_redirect.dll is dated 9/10/2015 11:37 AM and shows as version 1.2.41.0. This is the version detail in the isapi_redirect.log file when the site starts up:

[Fri Dec 04 18:00:32.790 2015] [860:6804] [info] jk_log_version::jk_connector_version.h (21): Connector Version: 295188

[Fri Dec 04 18:00:32.796 2015] [860:6804] [info] init_jk::jk_isapi_plugin.c (3157): Starting Tomcat/ISAPI/isapi_redirector/1.2.41

[Fri Dec 04 18:00:32.801 2015] [860:6804] [info] init_jk::jk_isapi_plugin.c (3355): Tomcat/ISAPI/isapi_redirector/1.2.41 initialized

I'm going to take a few more tries at this, but may contact you directly if we can't get to the bottom of this. It would be great if Adobe would reach out and help their paying customers in these forums!
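To see how the Client_Reset entries in httperr.log cluster, the lines can be split into fields and counted. The following is a minimal Python sketch that assumes each entry has exactly the thirteen space-delimited fields seen in the sample line quoted in point 1. The field names are labels chosen to match that sample and should be checked against the #Fields header of the actual log.

```python
from collections import Counter

# Field order assumed from the sample entry; verify against the log's #Fields line.
FIELDS = [
    "date", "time", "c_ip", "c_port", "s_ip", "s_port", "version",
    "method", "uri", "status", "site_id", "reason", "queue_name",
]


def parse_httperr_line(line):
    """Split one space-delimited HTTPERR entry into named fields."""
    parts = line.split()
    if line.startswith("#") or len(parts) != len(FIELDS):
        return None  # header line, or a layout this sketch does not handle
    return dict(zip(FIELDS, parts))


def count_reasons(lines):
    """Count entries per (queue_name, reason) pair."""
    counts = Counter()
    for line in lines:
        entry = parse_httperr_line(line)
        if entry:
            counts[(entry["queue_name"], entry["reason"])] += 1
    return counts
```

Grouping by client IP instead of queue name would be a one-line change and might help confirm whether a single bot accounts for most of the resets.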

Community Expert, Jan 22, 2016

Thanks for the kind regards.

As for your finding that recycling the app pool helps (“Once we restart the app pool, those requests timeout and new requests come in and process correctly”), that would tell me something very interesting:

Recall how I noted that your stack traces showed the hanging requests waiting on a CFIMAGE read action, which was resulting in an http request?

I am willing to bet now that you will find that the source of the cfimage was a URL on your own server (whether for a CF page or perhaps even a static image, PDF, or whatever).

And I’d be willing to bet that THOSE requests (the ones your CFIMAGEs were waiting on) were then stacked up in IIS. And so yes, recycling the app pool(s) cleared THOSE, and that then allowed the hanging requests in CF (the CFIMAGE waiting on those) to now end (or timeout, if you have times set).
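The pattern described here, a request thread blocked on an HTTP read that never completes, is not specific to ColdFusion. As a language-neutral illustration (Python rather than CFML, with a hypothetical function name), this sketch shows an outbound fetch bounded by an explicit timeout so the caller gets control back instead of waiting indefinitely. It makes no claim about what options cfimage itself offers.

```python
import http.client
from urllib.parse import urlsplit


def fetch_bytes(url, timeout_seconds=5.0):
    """Fetch a URL, giving up if connect or read stalls past the timeout."""
    parts = urlsplit(url)
    conn_class = (http.client.HTTPSConnection if parts.scheme == "https"
                  else http.client.HTTPConnection)
    conn = conn_class(parts.hostname, parts.port, timeout=timeout_seconds)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("GET", target)
        return conn.getresponse().read()
    except OSError:
        # Covers socket timeouts and refused or reset connections alike.
        return None
    finally:
        conn.close()
```

A caller that receives `None` can skip the image and finish rendering the page, which keeps one slow upstream from tying up every request thread.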

I’d really now suspect you may have an issue with the web server connector tuning. (It’s great to hear that you see the dll is updated. And I’m assuming you checked ALL your numbered folders under coldfusion10\config\wsconfig). So now the question is had you considered or done any tuning of those, tweaking the workers.properties file settings in each of those numbered folders?

For instance, do you at least have the connection_pool_timeout set? Many find that just setting that to 60 is a huge help. (To be clear, this has NOTHING TO DO WITH how long pages in CF run. It’s how long idle connections in the connector remain living.) Sadly, it’s not set by default, and so there is no timeout and they can live forever (idle), and eventually the connector runs out of threads, though due to other matters related to the settings for connection_pool_size (also not set but which defaults to 250) and max_reuse_connections (which is set initially to 250).
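A quick way to see which of these three settings a given workers.properties leaves unset is to scan it for the keys. This is a minimal Python sketch: the function name is hypothetical, the worker name cfusion is taken from the files quoted earlier in the thread, and the defaults it implies are the ones stated in this post.

```python
TUNING_KEYS = (
    "connection_pool_timeout",
    "connection_pool_size",
    "max_reuse_connections",
)


def missing_tuning_keys(workers_text, worker="cfusion"):
    """Return the tuning keys that the file does not set for this worker."""
    prefix = "worker.%s." % worker
    present = set()
    for line in workers_text.splitlines():
        if "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        # Commented-out lines start with '#', so they never match the prefix.
        if key.startswith(prefix):
            present.add(key[len(prefix):])
    return [key for key in TUNING_KEYS if key not in present]
```

Running it against the file in each numbered wsconfig folder would show at a glance which connectors still rely on the defaults.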

If you have more than one site, and whether you have one connector or more, then you need to do some tuning, otherwise (for a variety of reasons), any one site could grab all the threads leaving none for other sites. And again, this could absolutely be exacerbated by a spider or bot crawling one or more sites on your server, leading to this starvation in one or more of the sites.

Fortunately, Adobe has a blog post on this. There was one for CF10 in 2012, and another in 2014 for CF11. I know you’re on 10, but read the one for CF11 as it has much more info, and the concepts and defaults are exactly the same. And you’ll read there that there are also two corresponding changes needed in your CF instance’s server.xml (coldfusion10\cfusion\runtime\conf, or replacing cfusion with your instancename if on Enterprise with multiple instances).

I will admit that the blog post leaves some things to be desired, and you’ll see that both the CF10 and CF11 posts have many comments (a few dozen each) from folks (including myself) who tried to draw out more details and clarifications from Adobe. For now, it’s about all we have. See

http://blogs.coldfusion.com/post.cfm/coldfusion-11-iis-connector-tuning

And then note that they subsequently came out with a post to talk about adding a measure of monitoring for the connector, in the form of the Tomcat “Status Worker” which one can easily enable (no restart of CF required). I blogged about that with a little more intro and pointing to the Adobe post, here:

http://www.carehart.org/blog/client/index.cfm/2015/8/3/more_on_tomcat_status_worker

Finally, for those perhaps reading along and new to the whole issue (alluded to in the last two notes) about the need to update your web server connector, I had also blogged about that here:

http://www.carehart.org/blog/client/index.cfm/2013/9/13/why_you_must_update_cf10_webserver_connector

Hope that’s helpful.

/charlie


/Charlie (troubleshooter, carehart.org)

Participant, Jan 22, 2016

Hi Charlie,

I really appreciate the time you take to answer these questions, not only for my own sake but for the rest of the community as well. We have made multiple attempts at tuning the connector over the past few years and have read through all the comments on the CF10 and CF11 connector tuning posts on the Adobe ColdFusion blog. We do have the connection_pool_timeout set to 60 in workers.properties and connectionTimeout="60000" in server.xml. When I look at the isapi_redirect.log, I rarely ever see it report that we have used more than 30 connections, so I wonder if we are really saturating our connections, or if the connector somehow just kills the application pool?

I like your theory on the cfimage part, and yes, they were requests to our own site. That call was being made on every page render across our entire site by a plugin for our CMS, and it was not really needed, so I have removed it, and it will be interesting to see what that change alone does for our situation.

One last thing I noticed last night is that we had bots repeatedly hitting one of the error pages from one of our FW/1-based plugins. The actual request was to a sample app built into the FW/1 plugin and not the actual live plugin, so I will remove the sample app, but these requests would continue to churn until Fusion Reactor eventually killed them. When I looked at the stack trace, I noticed paths to invalid directories for cfdump. For example: cfdump2ecfm1093827541$funcRENDEROUTPUT.runFunction(C:\work\cf10_updates\cfusion\wwwroot\WEB-INF\cftags\dump.cfm:681)

We don't have anything CF-related on our C: drive, but searching Google for this path returns all kinds of results, from stack traces on actual sites to Adobe ColdFusion bug tracker pages.

Seems really odd, should the stacktraces contain paths like this, or is something else going on?
