Skip to main content
Participant
March 21, 2014
Question

App instances crashes when under heavy load

  • March 21, 2014
  • 2 replies
  • 855 views

Hello Adobe Support,

My problem is that many of our AMS servers have app instance crashes when under heavy load. We are using the biggest possible instance at Amazon AWS so we cant just add bigger instances.

 

My hope is that you can help us in the right direction on solving this issue based on the log files snippets I have included below.

 

Another weird thing I see is in the admin console where the bandwidth graphs shows in/out bandwidth dropping to 0 for a couple of seconds. This only happens during peak time.

Here are some logs from the time of an instance crash:

 

/var/log/messages:

Mar 20 14:01:51 ip-10-230-57-112 Adaptor[25827]: Listener started ( _defaultRoot_ ) : 127.0.0.1:19350 (rtmfp-core)/v4

Mar 20 14:01:51 ip-10-230-57-112 Adaptor[25827]: Listener started ( _defaultRoot_ ) : 10.230.57.112:19350 (rtmfp-core)/v4

Mar 20 14:01:51 ip-10-230-57-112 Adaptor[25827]: Public rtmfp-core addresses for listener _defaultRoot_ are: 54.216.15.146:19350

Mar 20 14:02:25 ip-10-230-57-112 Adaptor[26142]: Listener started ( _defaultRoot_ ) : 127.0.0.1:19350 (rtmfp-core)/v4

Mar 20 14:02:25 ip-10-230-57-112 Adaptor[26142]: Listener started ( _defaultRoot_ ) : 10.230.57.112:19350 (rtmfp-core)/v4

Mar 20 14:02:25 ip-10-230-57-112 Adaptor[26142]: Public rtmfp-core addresses for listener _defaultRoot_ are: 54.216.15.146:19350

Mar 20 14:02:46 ip-10-230-57-112 kernel: amscore[25947]: segfault at 0000000000000000 rip 0000000000000000 rsp 00002aaaad682f88 error 14

Mar 20 14:03:00 ip-10-230-57-112 Adaptor[26450]: Listener started ( _defaultRoot_ ) : 127.0.0.1:19350 (rtmfp-core)/v4

Mar 20 14:03:00 ip-10-230-57-112 Adaptor[26450]: Listener started ( _defaultRoot_ ) : 10.230.57.112:19350 (rtmfp-core)/v4

Mar 20 14:03:00 ip-10-230-57-112 Adaptor[26450]: Public rtmfp-core addresses for listener _defaultRoot_ are: 54.216.15.146:19350

Core Logs:

#Date: 2014-03-20

#Fields: date   time    x-pid   x-status        x-ctx   x-comment

2014-03-20      14:01:51        25827   (i)2581237      Starting admin app on core (25827).     -

2014-03-20      14:01:51        25827   (i)2581238      Core (25827) connecting to admin.       -

2014-03-20      14:01:51        25827   (i)2581231      Core (25827) connected to admin.        -

2014-03-20      14:01:51        25827   (i)2581234      Core (25827) connection to admin accepted.      -

2014-03-20      14:01:51        25827   (i)2631174      Listener started ( _defaultRoot_ ) : 127.0.0.1:19350 (rtmfp-core)/v4    -

2014-03-20      14:01:51        25827   (i)2631174      Listener started ( _defaultRoot_ ) : 10.230.57.112:19350 (rtmfp-core)/v4        -

2014-03-20      14:01:51        25827   (i)2631509      Public rtmfp-core addresses for listener _defaultRoot_ are: 54.216.15.146:19350 -

2014-03-20      14:01:51        25827   (i)2581246      Core (25827) sending register cmd to edge.      -

#Date: 2014-03-20

#Fields: date   time    x-pid   x-status        x-ctx   x-comment

2014-03-20      14:01:51        23351   (i)2581247      Core (23351) disconnected from edge.    -

2014-03-20      14:01:51        23351   (i)2581233      Core (23351) disconnecting from admin: 200.     -

#Date: 2014-03-20

#Fields: date   time    x-pid   x-status        x-ctx   x-comment

2014-03-20      14:02:25        26142   (i)2581237      Starting admin app on core (26142).     -

2014-03-20      14:02:25        26142   (i)2581238      Core (26142) connecting to admin.       -

2014-03-20      14:02:25        26142   (i)2581231      Core (26142) connected to admin.        -

2014-03-20      14:02:25        26142   (i)2581234      Core (26142) connection to admin accepted.      -

2014-03-20      14:02:25        26142   (i)2631174      Listener started ( _defaultRoot_ ) : 127.0.0.1:19350 (rtmfp-core)/v4    -

2014-03-20      14:02:25        26142   (i)2631174      Listener started ( _defaultRoot_ ) : 10.230.57.112:19350 (rtmfp-core)/v4        -

2014-03-20      14:02:25        26142   (i)2631509      Public rtmfp-core addresses for listener _defaultRoot_ are: 54.216.15.146:19350 -

2014-03-20      14:02:25        26142   (i)2581246      Core (26142) sending register cmd to edge.      -

2014-03-20      14:02:04        23351   (e)2581279      Assert failed in tincan/util/TCAsyncIO.cpp line 1948    -

2014-03-20      14:02:26        23351   (e)2581279      Assert failed in tincan/server/TCMsgHandler.cpp line 695        -

2014-03-20      14:02:26        23351   (e)2581279      Assert failed in tincan/server/TCMsgHandler.cpp line 695        -

Master Logs:

2014-03-20      14:01:51        22725   (w)2581257      Core (23351) _defaultRoot_:_defaultVHost_:::_1 is not responding and is being restarted!        -

2014-03-20      14:01:51        22725   (i)2581221      Core (25827) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app  -inst  -tag "_1" -console  -conf "/opt/adobe/ams/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_1".   -

2014-03-20      14:02:25        22725   (w)2581257      Core (25827) _defaultRoot_:_defaultVHost_:::_1 is not responding and is being restarted!        -

2014-03-20      14:02:25        22725   (i)2581221      Core (26142) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app  -inst  -tag "_1" -console  -conf "/opt/adobe/ams/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_1".   -

2014-03-20      14:02:49        22725   (i)2581223      Core (25827) is no longer active.       -

2014-03-20      14:03:00        22725   (w)2581257      Core (26142) _defaultRoot_:_defaultVHost_:::_1 is not responding and is being restarted!        -

2014-03-20      14:03:00        22725   (i)2581221      Core (26450) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app  -inst  -tag "_1" -console  -conf "/opt/adobe/ams/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_1".   -

2014-03-20      14:06:23        22725   (w)2581260      Core (23351) did not shutdown as expected. Killing core now.    -

2014-03-20      14:06:23        22725   (w)2581434      Master executing fast shutdown of core ID (23351).      -

2014-03-20      14:06:23        22725   (w)2581260      Core (26142) did not shutdown as expected. Killing core now.    -

2014-03-20      14:06:23        22725   (w)2581434      Master executing fast shutdown of core ID (26142).      -

2014-03-20      14:06:28        22725   (e)2581395      Failed to signal process condition: errno(22).  -

2014-03-20      14:06:28        22725   (e)2581395      Failed to signal process condition: errno(22).  -

2014-03-20      14:06:38        22725   (e)2581395      Failed to signal process condition: errno(22).  -

2014-03-20      14:06:38        22725   (e)2581395      Failed to signal process condition: errno(22).  -

2014-03-20      14:06:39        22725   (i)2581223      Core (23351) is no longer active.       -

2014-03-20      14:06:39        22725   (i)2581223      Core (26142) is no longer active.       -

Edge Logs:

2014-03-20      14:01:51        22835   (i)2701054      Virtual host Disconnecting unresponsive core LATEST/fr_FR_ComServ (23351) after 30 seconds is not available.    -

2014-03-20      14:01:51        22835   (i)2581250      Edge disconnected from core (23351).    -

2014-03-20      14:01:51        22835   (i)2581537      Resending StartCore message to master (vhost: _defaultVHost_, app: , inst: , tag: _1).  -

2014-03-20      14:01:51        22835   (i)2581252      Registering core (25827).       -

2014-03-20      14:02:25        22835   (i)2701054      Virtual host Disconnecting unresponsive core LATEST/fr_FR_ComServ (25827) after 30 seconds is not available.    -

2014-03-20      14:02:25        22835   (i)2581250      Edge disconnected from core (25827).    -

2014-03-20      14:02:25        22835   (i)2581537      Resending StartCore message to master (vhost: _defaultVHost_, app: , inst: , tag: _1).  -

2014-03-20      14:02:25        22835   (i)2581252      Registering core (26142).       -

2014-03-20      14:03:00        22835   (i)2701054      Virtual host Disconnecting unresponsive core LATEST/fr_FR_ComServ (26142) after 30 seconds is not available.    -

2014-03-20      14:03:00        22835   (i)2581250      Edge disconnected from core (26142).    -

2014-03-20      14:03:00        22835   (i)2581537      Resending StartCore message to master (vhost: _defaultVHost_, app: , inst: , tag: _1).  -

2014-03-20      14:03:00        22835   (i)2581252      Registering core (26450).       -

This topic has been closed for replies.

2 replies

Adobe Employee
April 7, 2014

Try increasing the number for amscores that get launched. See Scope and Distribute tags in Application.xml

PenojaAuthor
Participant
April 4, 2014

Adobe, you have the worst support in the world. We spend millions a year on your products and nobody from your support reaches out to us when we have major issues.

We are going to move to an alternative ASAP.