• Global community
    • Language:
      • Deutsch
      • English
      • Español
      • Français
      • Português
  • 日本語コミュニティ
    Dedicated community for Japanese speakers
  • 한국 커뮤니티
    Dedicated community for Korean speakers
Exit
0

CF10 database connection BLOCKED?? - JDBCPool.checkOut/checkIn

Contributor ,
Nov 30, 2013 Nov 30, 2013

Copy link to clipboard

Copied

I am continuing to troublshoot some 503 issues with my new server.  I recently enabled the Dump Snapshot feature inside Server Monitor and whenever I have an Unresponsive Server alert and I am seeing tons of these in the snapshot.  Anyone know what this means exactly??

"ajp-bio-8012-exec-384" prio=5 tid=1261 BLOCKED

     at coldfusion.server.j2ee.sql.pool.JDBCPool.checkOut(JDBCPool.java:371)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.requestConnection(JDBCPool.java:825)

     at coldfusion.server.j2ee.sql.pool.JDBCManager.requestConnection(JDBCManager.java:125)

     at coldfusion.server.j2ee.sql.JRunDataSource.getConnection(JRunDataSource.java:135)

     at coldfusion.sql.CFDataSource.getConnection(CFDataSource.java:42)

     at coldfusion.sql.DataSrcImpl.getCachedConnection(DataSrcImpl.java:156)

     at coldfusion.sql.DataSrcImpl.getConnection(DataSrcImpl.java:110)

     at coldfusion.sql.SqlImpl.execute(SqlImpl.java:363)

     at coldfusion.tagext.sql.QueryTag.executeQuery(QueryTag.java:1059)

     at coldfusion.tagext.sql.QueryTag.doEndTag(QueryTag.java:688)

     at cfpageInitQrygetPortalInfo2ecfm582265125.runPage(E:\pathToMyPage.cfm:24)

ETC...

And something like this during another similar event:

"ajp-bio-8012-exec-127" prio=5 tid=325 BLOCKED

     at coldfusion.server.j2ee.pool.ObjectPool.checkIn(ObjectPool.java:263)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.returnConnection(JDBCPool.java:851)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.connectionClosed(JDBCPool.java:324)

     at coldfusion.server.j2ee.sql.JRunConnection.sendCloseEvent(JRunConnection.java:345)

     at coldfusion.server.j2ee.sql.JRunConnectionHandle.close(JRunConnectionHandle.java:72)

     at coldfusion.sql.DataSrcImpl.clear(DataSrcImpl.java:182)

     at coldfusion.sql.DataSrcImpl.clearSqlProxy(DataSrcImpl.java:25)

     at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:26)

     at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)

     at coldfusion.CfmServlet.service(CfmServlet.java:219)

     at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)

     at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)

     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)

     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)

     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)

     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)

     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)

     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)

     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:414)

     at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:204)

     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)

     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)

     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

     at java.lang.Thread.run(Thread.java:744)

The issues I'm having sound very similar to what's going on here and here but so far I have reached no conclusions.  WTH is going on?

I have a brand new 64-bit Win2008/IIS7 machine (well, VMware if it makes any diff) running CF10 with the latest Update 12 patch, etc.  I even updated the JRE to the latest version (1.7.0_45 at the time of this writing).  I have done the whole Tomcat connector thing, modfied it several times with higher pool sizes, timeouts, etc., reinstalled the connector serveral times (the last being when I updated to CF10 Update 12).  The JVM is perfectly configured.  I've tried some of Carl's tweaks to no avail.  Anit from Adobe Support has been helping me, too, and has done remote desktop sessions and seen my server and says all the settings look good, so he is sort of at a loss too except to analyze my logs whenever I send them, but I am not getting any concrete data that I think will help yet, as I am very familiar with what to look for in general.  I've been doing CF for 16 years and never encountered such a stubborn issue.

TOPICS
Database access

Views

8.9K

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines

correct answers 1 Correct answer

Contributor , Jan 28, 2014 Jan 28, 2014

Hi, all.  My problem was finally solved with changes to our network wiring configuration.  Here is the write-up from my IT team explaining exactly what they did:

------------

We performed a few upgrades simultaneously which may have contributed to the current stability. Though the Oracle upgrade and the kernel upgrade / patch may have been factors, it's our opinion that the network port was the primary culprit.

The way that the network was previously configured, each server gained network access t

...

Votes

Translate

Translate
Contributor ,
Dec 02, 2013 Dec 02, 2013

Copy link to clipboard

Copied

Monday morning bump.  Please help if anyone knows.  This is driving me crazy and locking up my production server several times per day, but mostly at evening/night, which is strange but perhaps telling.

One thing I've been thinking:  maybe it has to do with the max number of cached queries?  I have set the limit of cached queries to 20,000, and my Server Monitor tells me it is always at the maximum level after running for a day or two.  Could it be that the server hangs while it is trying to release the older cached queries in order to insert new queries into the cache?  Is 20,000 too much on a 64-bit machine with 8GM of RAM?  Memory and CPU definitely do not seem to be an issue on the surface.

Or could it be an issue with a CFLOCK somewhere?  If so, I would expect to see errors in the log related to CFLOCK but I do not, and the one or two pages I have with CFLOCK in them run great and by themselves do not seem to cause any problem.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Enthusiast ,
Dec 02, 2013 Dec 02, 2013

Copy link to clipboard

Copied

This looks like a deadlock somewhere inside CF. If you have CF enterprise

you can trigger a snapshot in the Server Monitor and that might give you

more info or if you have CF standard you can start CF from the command line

("coldfusion start console"), attach JVisualm to that ColdFusion intance

and get a thread dump. That thread dump will contain more info about locks

that are held by each thread.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Contributor ,
Dec 02, 2013 Dec 02, 2013

Copy link to clipboard

Copied

Hi, Mack.  Yes, this is Enterprise. The info above is from the snapshot, which is generated automatically when I get an Unresponsive Server alert from CF and attached to my email.  I only pasted what I thought was the relevant part.  I don't see anything in the rest of the file about "locks" specifically.  What should I be looking for??

Here is what it says at the very top of the snapshot, then underneath are all the specific threads running, and their stack trace dumps.   They are all basic requests to the index.cfm page of multiple sites, probably because there is some spider or robot scanning the site, but nothing we can't handle load-wise.  I have used JMeter and done stress testing to confirm this.

Snapshot generated on : Sun Dec 01 20:46:01 EST 2013  Caused by : Unresponsive Server Alert
Profiling : Disabled
Memory Tracking : Disabled

8 or more threads are busy for more than 60 seconds.

*Total Running requests - 85

*Total Queued requests - 4

Template Running requests - 85

Template Queued requests - 4

Flash Remoting Running requests - 0

Flash Remoting Queued requests - 0

CFC method Running requests - 0

CFC method Queued requests - 0

Web Service Running requests - 0

Web Service Queued requests - 0

JVM Memory Stats

Free Memory : 507323936 bytes

Used Memory : 787667424 bytes

Throttle Stats [Throttle Queue Size : Throttle Memory Used]


0  : 0.0 megabytes

Query Cache Stats -

[Hit Ratio : Size : Count]

   0.0 : 0 bytes : 19927

Cached Queries -

[Query Name : DSN : Size : TIme Executed : Time Taken : Line No : Function Name : Template Path]

DB Pool Stats -

[DSN :Open Conn Count : Total Conn Count : Max Con Count : Avg Open Conn Count : Avg Total Conn Count]

   DSN1 : 0 : 1 : Unlimited : 0 : 6

   DSN2 : 0 : 0 : Unlimited : 0 : 0

   DSN3 : 2 : 3 : Unlimited : 2 : 3

   DSN4 : 0 : 1 : Unlimited : 0 : 18

   DSN5 : 0 : 0 : Unlimited : 0 : 0

   DSN6 : 0 : 1 : Unlimited : 0 : 1

   DSN7 : 0 : 0 : Unlimited : 0 : 0

   DSN8 : 37 : 37 : Unlimited : 31 : 43

   DSN9 : 0 : 0 : Unlimited : 0 : 0

   DSN10 : 0 : 1 : Unlimited : 0 : 0

   DSN11 : 0 : 0 : Unlimited : 0 : 0

   DSN12 : 0 : 1 : Unlimited : 0 : 0

   DSN13 : 0 : 1 : Unlimited : 0 : 1

   DSN14 : 0 : 2 : Unlimited : 0 : 1

Then a list of the 85 threads/pages (in this case they are all index.cfm for one or more of our sites--we only have a few).

Then a list of the actual FULL thread dump.  Here is more from the same file:

Java stack trace -

Full thread dump Java 1.7.0_45

"ajp-bio-8012-exec-482" prio=5 tid=1092 TIMED_WAITING

     at sun.misc.Unsafe.park(Native Method)

     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)

     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)

     at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)

     at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:86)

     at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:32)

     at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)

     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

     at java.lang.Thread.run(Thread.java:744)

"ajp-bio-8012-exec-523" prio=5 tid=1139 BLOCKED

     at coldfusion.server.j2ee.sql.pool.JDBCPool.checkOut(JDBCPool.java:371)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.requestConnection(JDBCPool.java:825)

     at coldfusion.server.j2ee.sql.pool.JDBCManager.requestConnection(JDBCManager.java:125)

     at coldfusion.server.j2ee.sql.JRunDataSource.getConnection(JRunDataSource.java:135)

     at coldfusion.sql.CFDataSource.getConnection(CFDataSource.java:42)

     at coldfusion.sql.DataSrcImpl.getCachedConnection(DataSrcImpl.java:156)

     at coldfusion.sql.DataSrcImpl.getConnection(DataSrcImpl.java:110)

     at coldfusion.sql.SqlImpl.execute(SqlImpl.java:363)

     at coldfusion.tagext.sql.QueryTag.executeQuery(QueryTag.java:1059)

     at coldfusion.tagext.sql.QueryTag.doEndTag(QueryTag.java:688)

     at cfpageInitQrygetPortalInfo2ecfm582265125.runPage(E:\Inetpub\mySite1\Queries\pageInitQrygetPortalInfo.cfm:24)

     at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:244)

     at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:444)

     at coldfusion.runtime.CfJspPage._emptyTcfTag(CfJspPage.java:2799)

     at cfpageInit2ecfm1741799151.runPage(E:\Inetpub\mySite1\Includes\pageInit.cfm:70)

     at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:244)

     at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:444)

     at coldfusion.runtime.CfJspPage._emptyTcfTag(CfJspPage.java:2799)

     at cfpage2ecfm2105009432.runPage(E:\Inetpub\mySite1\page.cfm:21)

     at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:244)

     at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:444)

     at coldfusion.filter.CfincludeFilter.invoke(CfincludeFilter.java:65)

     at coldfusion.filter.IpFilter.invoke(IpFilter.java:64)

     at coldfusion.filter.ApplicationFilter.invoke(ApplicationFilter.java:449)

     at coldfusion.filter.RequestMonitorFilter.invoke(RequestMonitorFilter.java:48)

     at coldfusion.filter.MonitoringFilter.invoke(MonitoringFilter.java:40)

     at coldfusion.filter.PathFilter.invoke(PathFilter.java:112)

     at coldfusion.filter.ExceptionFilter.invoke(ExceptionFilter.java:94)

     at coldfusion.filter.ClientScopePersistenceFilter.invoke(ClientScopePersistenceFilter.java:28)

     at coldfusion.filter.BrowserFilter.invoke(BrowserFilter.java:38)

     at coldfusion.filter.NoCacheFilter.invoke(NoCacheFilter.java:58)

     at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:38)

     at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)

     at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)

     at coldfusion.CfmServlet.service(CfmServlet.java:219)

     at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)

     at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)

     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)

     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)

     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)

     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)

     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)

     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)

     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:414)

     at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:204)

     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)

     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)

     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

     at java.lang.Thread.run(Thread.java:744)

"ajp-bio-8012-exec-548" prio=5 tid=1173 BLOCKED

     at coldfusion.server.j2ee.sql.pool.JDBCPool.checkOut(JDBCPool.java:371)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.requestConnection(JDBCPool.java:825)

     at coldfusion.server.j2ee.sql.pool.JDBCManager.requestConnection(JDBCManager.java:125)

     at coldfusion.server.j2ee.sql.JRunDataSource.getConnection(JRunDataSource.java:135)

     at coldfusion.sql.CFDataSource.getConnection(CFDataSource.java:42)

     at coldfusion.sql.DataSrcImpl.getCachedConnection(DataSrcImpl.java:156)

     at coldfusion.sql.DataSrcImpl.getConnection(DataSrcImpl.java:110)

     at coldfusion.sql.SqlImpl.execute(SqlImpl.java:363)

     at coldfusion.tagext.sql.QueryTag.executeQuery(QueryTag.java:1059)

     at coldfusion.tagext.sql.QueryTag.doEndTag(QueryTag.java:688)

     at cfpageInitQrygetPortalInfo2ecfm582265125.runPage(E:\Inetpub\mySite1\Queries\pageInitQrygetPortalInfo.cfm:24)

     at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:244)

     at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:444)

     at coldfusion.runtime.CfJspPage._emptyTcfTag(CfJspPage.java:2799)

     at cfpageInit2ecfm1741799151.runPage(E:\Inetpub\mySite1\Includes\pageInit.cfm:70)

     at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:244)

     at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:444)

     at coldfusion.runtime.CfJspPage._emptyTcfTag(CfJspPage.java:2799)

     at cfpage2ecfm2105009432.runPage(E:\Inetpub\mySite1\page.cfm:21)

     at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:244)

     at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:444)

     at coldfusion.filter.CfincludeFilter.invoke(CfincludeFilter.java:65)

     at coldfusion.filter.IpFilter.invoke(IpFilter.java:64)

     at coldfusion.filter.ApplicationFilter.invoke(ApplicationFilter.java:449)

     at coldfusion.filter.RequestMonitorFilter.invoke(RequestMonitorFilter.java:48)

     at coldfusion.filter.MonitoringFilter.invoke(MonitoringFilter.java:40)

     at coldfusion.filter.PathFilter.invoke(PathFilter.java:112)

     at coldfusion.filter.ExceptionFilter.invoke(ExceptionFilter.java:94)

     at coldfusion.filter.ClientScopePersistenceFilter.invoke(ClientScopePersistenceFilter.java:28)

     at coldfusion.filter.BrowserFilter.invoke(BrowserFilter.java:38)

     at coldfusion.filter.NoCacheFilter.invoke(NoCacheFilter.java:58)

     at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:38)

     at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)

     at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)

     at coldfusion.filter.RequestThrottleFilter.invoke(RequestThrottleFilter.java:151)

     at coldfusion.CfmServlet.service(CfmServlet.java:219)

     at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)

     at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)

     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)

     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)

     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)

     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)

     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)

     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)

     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:414)

     at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:204)

     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)

     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)

     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

     at java.lang.Thread.run(Thread.java:744)

"Finalizer" prio=8 tid=3 WAITING

     at java.lang.Object.wait(Native Method)

     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)

     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)

     at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

"ajp-bio-8012-exec-594" prio=5 tid=1232 TIMED_WAITING

     at java.lang.Object.wait(Native Method)

     at java.lang.Object.wait(Object.java:461)

     at edu.emory.mathcs.backport.java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:301)

     at edu.emory.mathcs.backport.java.util.concurrent.helpers.WaitQueue$WaitNode.doTimedWait(WaitQueue.java:79)

     at edu.emory.mathcs.backport.java.util.concurrent.Semaphore$FairSync.attempt(Semaphore.java:346)

     at edu.emory.mathcs.backport.java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:571)

     at coldfusion.CfmServlet.lock(CfmServlet.java:299)

     at coldfusion.CfmServlet.service(CfmServlet.java:218)

     at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)

     at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)

     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)

     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)

     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)

     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)

     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)

     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)

     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:414)

     at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:204)

     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)

     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)

     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

     at java.lang.Thread.run(Thread.java:744)

"ajp-bio-8012-exec-584" prio=5 tid=1219 BLOCKED

     at coldfusion.server.j2ee.pool.ObjectPool.checkIn(ObjectPool.java:263)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.returnConnection(JDBCPool.java:851)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.connectionClosed(JDBCPool.java:324)

     at coldfusion.server.j2ee.sql.JRunConnection.sendCloseEvent(JRunConnection.java:345)

     at coldfusion.server.j2ee.sql.JRunConnectionHandle.close(JRunConnectionHandle.java:72)

     at coldfusion.sql.DataSrcImpl.clear(DataSrcImpl.java:182)

     at coldfusion.sql.DataSrcImpl.clearSqlProxy(DataSrcImpl.java:25)

     at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:26)

     at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)

     at coldfusion.CfmServlet.service(CfmServlet.java:219)

     at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)

     at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)

     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)

     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)

     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)

     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)

     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)

     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)

     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:414)

     at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:204)

     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)

     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)

     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

     at java.lang.Thread.run(Thread.java:744)

"ajp-bio-8012-exec-514" prio=5 tid=1129 BLOCKED

     at coldfusion.server.j2ee.sql.pool.JDBCPool.checkOut(JDBCPool.java:371)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.requestConnection(JDBCPool.java:825)

     at coldfusion.server.j2ee.sql.pool.JDBCManager.requestConnection(JDBCManager.java:125)

     at coldfusion.server.j2ee.sql.JRunDataSource.getConnection(JRunDataSource.java:135)

     at coldfusion.sql.CFDataSource.getConnection(CFDataSource.java:42)

     at coldfusion.sql.DataSrcImpl.getCachedConnection(DataSrcImpl.java:156)

     at coldfusion.sql.DataSrcImpl.getConnection(DataSrcImpl.java:110)

     at coldfusion.sql.SqlImpl.execute(SqlImpl.java:363)

     at coldfusion.tagext.sql.QueryTag.executeQuery(QueryTag.java:1059)

     at coldfusion.tagext.sql.QueryTag.doEndTag(QueryTag.java:688)

     at cfheader2ecfm1349916215.runPage(E:\Inetpub\mySite2\header.cfm:69)

     at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:244)

     at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:444)

     at coldfusion.runtime.CfJspPage._emptyTcfTag(CfJspPage.java:2799)

     at cfindex2ecfm1520480767.runPage(E:\Inetpub\mySite2\index.cfm:3)

     at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:244)

     at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:444)

     at coldfusion.filter.CfincludeFilter.invoke(CfincludeFilter.java:65)

     at coldfusion.filter.IpFilter.invoke(IpFilter.java:64)

     at coldfusion.filter.ApplicationFilter.invoke(ApplicationFilter.java:428)

     at coldfusion.filter.RequestMonitorFilter.invoke(RequestMonitorFilter.java:48)

     at coldfusion.filter.MonitoringFilter.invoke(MonitoringFilter.java:40)

     at coldfusion.filter.PathFilter.invoke(PathFilter.java:112)

     at coldfusion.filter.ExceptionFilter.invoke(ExceptionFilter.java:94)

     at coldfusion.filter.ClientScopePersistenceFilter.invoke(ClientScopePersistenceFilter.java:28)

     at coldfusion.filter.BrowserFilter.invoke(BrowserFilter.java:38)

     at coldfusion.filter.NoCacheFilter.invoke(NoCacheFilter.java:58)

     at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:38)

     at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)

     at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)

     at coldfusion.CfmServlet.service(CfmServlet.java:219)

     at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)

     at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)

     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)

     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)

     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)

     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)

     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)

     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)

     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:414)

     at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:204)

     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)

     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)

     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

     at java.lang.Thread.run(Thread.java:744)

"scheduler-0" prio=5 tid=21 RUNNABLE

     at java.lang.Thread.getStackTrace(Thread.java:1588)

     at coldfusion.monitor.stack.J5StackTrace.getAllStackTraces(J5StackTrace.java:39)

     at coldfusion.monitor.active.Snapshot.getJavaStackTrace(Snapshot.java:606)

     at coldfusion.monitor.active.Snapshot.getSnapshotData(Snapshot.java:165)

     at coldfusion.monitor.active.Snapshot.dumpSnapshot(Snapshot.java:595)

     at coldfusion.monitor.alert.Alert.dumpSnapshot(Alert.java:599)

     at coldfusion.monitor.alert.HungServerAlert$HungServerAlertTask.run(HungServerAlert.java:233)

     at coldfusion.monitor.MonitoringServiceImpl$RunnableWrapper.run(MonitoringServiceImpl.java:1570)

     at coldfusion.scheduling.ThreadPool.run(ThreadPool.java:211)

     at coldfusion.scheduling.WorkerThread.run(WorkerThread.java:71)

"ajp-bio-8012-exec-550" prio=5 tid=1175 BLOCKED

     at coldfusion.server.j2ee.pool.ObjectPool.checkIn(ObjectPool.java:263)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.returnConnection(JDBCPool.java:851)

     at coldfusion.server.j2ee.sql.pool.JDBCPool.connectionClosed(JDBCPool.java:324)

     at coldfusion.server.j2ee.sql.JRunConnection.sendCloseEvent(JRunConnection.java:345)

     at coldfusion.server.j2ee.sql.JRunConnectionHandle.close(JRunConnectionHandle.java:72)

     at coldfusion.sql.DataSrcImpl.clear(DataSrcImpl.java:182)

     at coldfusion.sql.DataSrcImpl.clearSqlProxy(DataSrcImpl.java:25)

     at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:26)

     at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)

     at coldfusion.CfmServlet.service(CfmServlet.java:219)

     at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)

     at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)

     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)

     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)

     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)

     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)

     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)

     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)

     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)

     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)

     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:414)

     at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:204)

     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)

     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298)

     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

     at java.lang.Thread.run(Thread.java:744)

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Enthusiast ,
Dec 04, 2013 Dec 04, 2013

Copy link to clipboard

Copied

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Contributor ,
Dec 04, 2013 Dec 04, 2013

Copy link to clipboard

Copied

Mack, it looks like your post did not go through, but I may have finally had a breakthrough on this issue of mine.  Tentatively, I am going to say the problem was my Max Pooled Statements was not high enough for some of our datasources.

All of mine were set to 100, which is what they have been at FOREVER, so although the CF10 documentation says "300" is now the default, I think I simply copied over "100" from my old installations every time we upgraded CF, without even thinking obviously.

Meanwhile, our use of cfqueryparam has gone up over the years, and I never knew use of cfqueryparam had anything to do with Max Pooled Statements. Now for some reason that still doesn't make a lot of sense to me (although Anit tried explaining to me on the phone), when your pool runs out, all threads become BLOCKED.  They do not wait or queue up like you might expect; they just stop, and this apparently can and does lock up the whole damn CF server, which will then start giving you happy joy 503 errors.  Even better is that none of this information get logged in any sensible way to help trace back the problem such as "you have x number of cfqueryparam statements, which exceeds your pool size" or something to that effect.

So last night after I upped my pools from 100 to 300 I did not have any problem with my server for a whole 12 hours (gasp!).  Happy me, right?  Wrong.  It all came crashing down in the middle of the work day today  with a different super joy error:

     [Macromedia][Oracle JDBC Driver][Oracle]ORA-01000: maximum open cursors exceeded.

I knew intrinsically this had something to do with the changes I made to the Max Pooled Statements, so now the issue became our Oracle database itself, which I do not manage.  So I asked our DBA and he said "open cursors" was set to exactly 300 on his end, so no wonder we ran out of cursors when I upped it on my end.  And I upped it for several of our datasources, not just the "big" one, because I had upped it before but ONLY for the one I thought was the problem, not the others we maintain, and I still had the damn problem so I ruled out the whole Max Pooled Statements thing at one point and moved on.  Turns out it was a different DSN that needed to be increased, or a combination of them all...I'm still not 100% sure.

So now Oracle has been tuned to allow 2000 open cursors, and I have increased all my datasources to have a much larger Max Pool Size because I actually counted how many cfqueryparams we were using and, yeah, there are a lot.  To make things more confusing, they are not on the main production site, they are on the intranet, which gets very little load (except at night when our Google Appliance scans the crap out of it) so go figure.  Most of the 503 / server lock-up events were at night, but once in a while it was during the day.  Fun times troublshooting that.

Moral of the story if you are having similar issues:  count up your useage of unique cfquery tags that use the cfqueryparam (and/or cfstoredproc) and make sure your Max Pooled Statements are at least equal to that number, THEN make sure your database is configured to allow at least that many open cursors for however many datasources you have configued.  If you have 3 datasources, each with 300 max pooled statements, then your DB should allow at least 900 cursors to be safe.  I am speaking for Oracle, I'm not sure how the other DBs handle it although I can't image they are that different.

Here are some great articles that go on to explain: http://jehiah.cz/a/maximum-open-cursors-exceeded and http://headsplodeblog.blogspot.com/2010/06/coldfusion-and-ora-01000-too-many-open.html

I will only be convinced that I've finally found the probelm once I have days, if not weeks, of flawless uptime, so we will see, and I will report back if errors go away for good.  If they don't, I will feel like an idiot posting all this information!  I am grateful for Adobe's help and everyone else on these forums who have been listening to and trying to help me solve this problems, which I think may have been the root of my other problems here and here.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Adobe Employee ,
Dec 04, 2013 Dec 04, 2013

Copy link to clipboard

Copied

Hi David,

Let us monitor the server for next couple of days. The Max Pool set, seems to be appropriate at this stage.

Regards,

Anit Kumar

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Contributor ,
Dec 10, 2013 Dec 10, 2013

Copy link to clipboard

Copied

Unfortunately, we continue to have this problem, and the insane thing is that is happens at a very predictable time almost every night, but I cannot trace the source.  The hang lasts for approximately 6-8 minutes.

I have turned Max Pooled Statements off by setting to ZERO in case that was the problem, but we still encounter the locked threads and thus timed out queries across the board, even for web sites connected to totally separate datasources running under different username accounts (but overall same database server).  We are on Oracle. Failed Request Tracking shows a couple of odd yet perhaps telling things way down the list of loaded modules and responses:

One of the responses says “GENERAL_FLUSH_RESPONSE_END” messages: ErrorCode="An operation was attempted on a nonexistent network connection.  (0x800704cd)" or "The specified network name is no longer available."

All of these clues are making me think ColdFusion (or the driver) is losing communication temporarily.  Why else would ALL sites fail at the same time, and why would my dev and staging servers also be affected?  ColdFusion still works, any ColdFusion page with no database connection comes right up.  Traffic spikes have been ruled out, as I have stress tested the crap out of our site and can never reproduce the hang-up.  I also looked anywhere we had CFLOCK but that did not seem to be an issue.  There are NO scheduled tasks running.  Internal scans only run at the timeframe in question on Sunday nights, so does not explain why I see problems almost every night.  On the nights there is no problem I can't explain.

I will try to enable logging database calls in CFAdmin next time to see what it says there, but that log becomes HUGE in a very short period of time.

Any other ideas are welcome.  I am trying to eliminate any possibility, no matter how small.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Contributor ,
Dec 16, 2013 Dec 16, 2013

Copy link to clipboard

Copied

We continue to have this problem.  Database threads get locked up every single night and all sites go down.  Our DBA notices connection will be very few then ramp up from, say, 2 to 245 right around the timeframe we get the ColdFusion hang.  Also, he notices all these connections remain OPEN.  What is going on?  Also, is 245 open connections a lot?  I've stress tested our system with hunderds of concurrent connections and I can never duplicate the issue.

I changed my database to uncheck "Maintain Connections" and still we get this problem.  I have 9 datasources and I only made the change to one, the biggest one where the BLOCKED theads are happening, should I change them all?   I just can't figure out where this coming from.  Other things we've tried: changing the NIC driver, disabled IPv4 and TCP checksum offloading (Wireshark pointed us in this direction but still hasn't made any difference), logs really aren't giving us any data.  Traffic on the site seems to be almost nothing sometimes when these events occur.

Other thoughts I've had:  could the JDBC driver be the problem?  We are running Oracle 11.2.0.1 (separate server, same network).  My web boxes are all in VMware on a host with 40 other machines.  IT dept says they cannot find anything running around the times of the database connection issue.  I've checked the Windows 2008 "tasks" and can't find anything there.  We have no ColdFusion tasks running.

Oddly, I've even seen this issue on my staging and devlopment machines.  I've received a "slow server" alert when there weren't even any requests running!!  Zero!  Does this mean the skimmers and other native timers/whatever inside CF can't find a live connection?  Just weird, weird, weird.

The problem is always at night, so of course that's a clue.  But when I look through IIS raw logs nothing jumps out.  If something was hitting my server, I would expect to see a ton of logs pointing to a source at those exact times, but there is nothing obvious.  What else could be causing all the database connections to be used and held open if not traffic?

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Contributor ,
Dec 17, 2013 Dec 17, 2013

Copy link to clipboard

Copied

I did another test last night with the following:

  • turn off every web site on my IIS server except the "big" one that we care most about when it goes down every night.
  • I also changed my robots.txt to exclude ALL crawlers and agents
  • enabled the Windows firewall to block all known internal security scanners
  • I added Query Validation to my datasources (the one big site has several areas that connect to different datasources so I made sure they were all the same)
  • I even removed all extraneous classes from the CFadmin JVM section in case there was something funky going on in there
  • I shut down my development and staging machines to make sure there would be no interferrence from them (not that I can imagine what that would be, but they do connect to same datasources as production).
  • Rebooted the production machine just before the usual hangs occur, making sure none of the other sites came back up.  So bascially the server was fresh with nothing else running but the one site.

Like clockwork, the site goes down around 9:00pm.  Same errors as above with blocked threads.  Wireshark is running.  IIS Worker Proceeses show what looks like a lot of requests just sitting open for a LONG time.  Despite my CFAdmin having the general timeout set to 60 seconds, none of those requests are getting killed.  When I read about that it says CF will not kill threads so long as it's waiting for a database reply or something to that effect.  Make sense, and confirms what I fear, which is that the communication channel between our Oracle server and my web server is going down, even though both machines are "up" and logs don't indicate any strange activity.  Traffic is minimal around this time of night, too. 

I decided to run Jconsole while all this was going on and here is some interesting data from inside (not really sure what it all means):

Name: Worker-Threads-2600

State: RUNNABLE

Total blocked: 145  Total waited: 145

Stack trace:

  1. java.net.SocketInputStream.socketRead0(Native Method)
  2. java.net.SocketInputStream.read(SocketInputStream.java:152)
  3. java.net.SocketInputStream.read(SocketInputStream.java:122)
  4. java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
  5. java.io.BufferedInputStream.read(BufferedInputStream.java:254)

- locked java.io.BufferedInputStream@413ecf22 (emphasis mine, what does this mean?)

  1. java.io.DataInputStream.readByte(DataInputStream.java:265)
  2. sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:214)
  3. sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
  4. com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
  5. javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
  6. javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1029)
  7. sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
  8. sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  9. java.lang.reflect.Method.invoke(Method.java:606)
  10. sun.tools.jconsole.ProxyClient$SnapshotInvocationHandler.invoke(ProxyClient.java:1000)
  11. com.sun.proxy.$Proxy0.invoke(Unknown Source)
  12. com.sun.jmx.mbeanserver.MXBeanProxy$InvokeHandler.invoke(MXBeanProxy.java:150)
  13. com.sun.jmx.mbeanserver.MXBeanProxy.invoke(MXBeanProxy.java:167)
  14. javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:252)
  15. com.sun.proxy.$Proxy2.dumpAllThreads(Unknown Source)
  16. sun.tools.jconsole.ThreadTab$2.run(ThreadTab.java:339)
  17. sun.tools.jconsole.Worker.run(Worker.java:55)

And another random thread I picked from inside the Jconsole:

Name: RMI TCP Connection(2)-xxx.xxx.xxx.xxx (IP of our server)

State: TIMED_WAITING on com.sun.jmx.remote.internal.ArrayNotificationBuffer@40173294

Total blocked: 68  Total waited: 76

Stack trace:

  1. java.lang.Object.wait(Native Method)
  2. com.sun.jmx.remote.internal.ArrayNotificationBuffer.fetchNotifications(ArrayNotificationBuffer.java:449)
  3. com.sun.jmx.remote.internal.ArrayNotificationBuffer$ShareBuffer.fetchNotifications(ArrayNotificationBuffer.java:227)
  4. com.sun.jmx.remote.internal.ServerNotifForwarder.fetchNotifs(ServerNotifForwarder.java:275)
  5. javax.management.remote.rmi.RMIConnectionImpl$3.run(RMIConnectionImpl.java:1289)
  6. javax.management.remote.rmi.RMIConnectionImpl$3.run(RMIConnectionImpl.java:1287)
  7. javax.management.remote.rmi.RMIConnectionImpl.fetchNotifications(RMIConnectionImpl.java:1293)
  8. sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  9. sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  10. sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  11. java.lang.reflect.Method.invoke(Method.java:606)
  12. sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
  13. sun.rmi.transport.Transport$1.run(Transport.java:177)
  14. sun.rmi.transport.Transport$1.run(Transport.java:174)
  15. java.security.AccessController.doPrivileged(Native Method)
  16. sun.rmi.transport.Transport.serviceCall(Transport.java:173)
  17. sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
  18. sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
  19. sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
  20. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  21. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  22. java.lang.Thread.run(Thread.java:744)

The ColdFusion application log shows this for all the CFQUERY during the hang:  Timed out trying to establish connection.

Also around the same time as the hang I am seeing this in the HTTPERR log in IIS next to my web site's application pool name:  Timer_MinBytesPerSecond

  • This error means: The connection expired because the client was not receiving a response at a reasonable speed. The response send rate was slower than the default of 240 bytes/sec.  In other words, the network is not sending data at a reasonable speed.  This is probably why database connections are not going through either.  Important clue!

Finally, I notice in my IIS raw logs around this time that there is very little traffic, but the traffic that's there is getting a combination of 500 and 200 response codes.  So why are some requests getting through within the same 1-2 second timeframe, and the others are not?  The ones that get 500 look the same as the ones that get 200.  Why are these repsonse codes mixed together????

My ONLY take on all this is that there is a problem in the network or VMware architecure causing a communincation failure between Oracle and our Web servers.  By the way, the Oracle machine is a physical box.  Our web servers are in VM land with about 40 other instances I was told.

I've given the Wireshark dumps to the network people.  We'll see.  Nothing really jumped out at me but I'm not an expert at interpretting that data.


Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Contributor ,
Dec 19, 2013 Dec 19, 2013

Copy link to clipboard

Copied

Looks like we're dealing with a network configuration issue.  It makes sense, because all the other data is crazy, and finally the whole IT team stayed online late and monitored the whole crash in-processes, and looks like it might have to do with Oracle backups and the network share drive.  Something may be locking up the I/O process on the database and preventing communication with all services, so ColdFusion just sits and hangs, connections keep getting opened and never closed (even with Maintain Connections off) so that's the story for now.  I am just happy it does not appear to be a bug in ColdFusion or Tomcat, but will keep this thread going until its conclusion just in case this helps anyone else who find themselves in this spot!  What a mess!  The hours involved in troublshooting this have been astronomical.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Contributor ,
Jan 28, 2014 Jan 28, 2014

Copy link to clipboard

Copied

Hi, all.  My problem was finally solved with changes to our network wiring configuration.  Here is the write-up from my IT team explaining exactly what they did:

------------

We performed a few upgrades simultaneously which may have contributed to the current stability. Though the Oracle upgrade and the kernel upgrade / patch may have been factors, it's our opinion that the network port was the primary culprit.

The way that the network was previously configured, each server gained network access through a different switch, which in turn were connected through other devices. The database server stores the databases on the storage cluster, mapped via NFS.  By moving the database server network port to the same switch that the storage cluster users, we eliminated extra hops and hardware.   Speeds for the database server increased and we stopped seeing timeouts occur. 

Before change:

web server -> Switch1 -> Switch2 -> Switch3 -> database server -> Switch3 -> Switch4 -> Switch5 -> storage -> Switch5 -> Switch4 -> Switch3 -> database server -> Switch3 -> Switch2 -> Switch1 -> web server

After change:

web server -> Switch1 -> Switch2 -> Switch3 -> database server -> Switch3 -> storage -> Switch3 -> database server -> Switch3 -> Switch2 -> Switch1 -> web server

------------

I guess the moral of the story here is check your network hops if anyone is experiencing these kinds of ColdFusion/datdabase timeouts and have no idea where they are coming from.

P.S.  Big props to Anit for helping me troublshoot every other possibility before I handed it off to the IT team with the conclusion "It's not ColdFusion!".  Adobe Support was EXCELLENT.

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert ,
Apr 25, 2019 Apr 25, 2019

Copy link to clipboard

Copied

LATEST

Dcwebguy, I know this is a very old post, but it still comes up often in searches for cf jdbc connection pool issues. So first, thanks for that final note you'd shared in 2014.

Still, if you may see this, could I trouble you to help readers with just a bit more?

First, did you find these extraneous hops reflected in a tracert or similar (and readily accessible) tool?

Just wondering if this scenario would be easily spotted by folks (perhaps in the same boat) but who may not have access to advanced network monitoring tools.

Second and finally, I wonder if you could elaborate on what seems the key point:

"By moving the database server network port to the same switch that the storage cluster users, we eliminated extra hops and hardware"

Do you really mean you somehow moved only the db server "port" to the same switch as the storage cluster, or was it rather the db server itself (not just its "port")?

Thanks for any thoughts, or from anyone more versed in the problem and solution.


/Charlie (troubleshooter, carehart.org)

Votes

Translate

Translate

Report

Report
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Resources
Documentation