Skip to main content
Participating Frequently
April 18, 2011
Question

VOD - application goes down after a while (only some instances)

  • April 18, 2011
  • 3 replies
  • 2900 views
Please download the attached file to view this post
    This topic has been closed for replies.

    3 replies

    Nikhil_Kalyan
    Participating Frequently
    April 21, 2011

    Hi,

    Thanks for the details and the logs. It looks like the core process is crashing too often than expected!

    Can you help by providing some more details, which can confirm any workaround to get past the crashes for you or to have this issue investigated and fixed.

    I would like to see your configuration files, including application.xml in the vod folder, if any, if there are any significant changes that you made. Did you make any changes to the default configuration except for the MaxAppIdleTime that you already mentioned ?

    What files are you trying to stream ? I see the system memory running at very high levels (the logs say so too), which is taking almost you total RAM till the crash happens when it is released.. I would like to know the stream codecs,bitrate and the size of the file etc. Are all the subscribers trying to play a single file ?

    DId you change your FLV Cache size ? Recommended settings for this is 1/4th of system's RAM. So you can have it set to 2 GB (the setting can be done in fms.ini file). Let us know if this has improved your system's memory performance.

    You should also check the application logs (they will be under the logs folder\_default_Root_\_default_Vhost_\app_name\ folder ).

    Last, it would be very good to have the crash dumps if you can share.. Please let us know.

    Thank you !

    JuNiOx_BRAuthor
    Participating Frequently
    April 21, 2011

    Hello Nikhil

    thank you for helping.

    The current configuration file's content are on the first topic: Application.xml, Server.xml and Vhost.xml.

    I changed everything related to limits (either memory or time idle).

    Regarding the MaxAppIdleTime, it's till set to "-1" in order to disable it.

    /opt/adobe/fms/conf/_defaultRoot_/_defaultVHost_# grep -i idle * | grep -v \<\!\-\-

    Application.xml:        <MaxAppIdleTime>-1</MaxAppIdleTime>

    Application.xml:                        <IdlePostInterval>512</IdlePostInterval>

    Application.xml:                        <IdleAckInterval>512</IdleAckInterval>

    FLV Cache size is currently set to "4000", I'll try changing it to "2000" as you recommended:

    /opt/adobe/fms/conf/_defaultRoot_/_defaultVHost_# grep SERVER.FLVCACHE_MAXSIZE /opt/adobe/fms/conf/fms.ini

    SERVER.FLVCACHE_MAXSIZE=4000

    The application logs dont show details related to the problem, only "traces" I set during the code. For example, this instance went down some minutes ago but the logs show the function which was called to play the next file from a playlist. (I mean, even with the instance down, no error is logged here)

    /opt/adobe/fms/logs/_defaultVHost_/videostream/bj_station# cat application.00.log

    --- function nextStream(xml) --- STARTED ---    -

    2011-04-21      16:16:03        29664   (s)2641173                      playing = 1/12: 2011-04-11_00_03_21_bj.wmv      -

    2011-04-21      16:16:03        29664   (s)2641173                      clientObj info.code NetStream.Play.Start        -

    2011-04-21      16:16:03        29664   (s)2641173      NetStream.onPlayStatus called   -

    2011-04-21      16:16:03        29664   (s)2641173              level:  status  -

    2011-04-21      16:16:03        29664   (s)2641173              code:   NetStream.Play.Start    -

    2011-04-21      16:16:03        29664   (s)2641173              description:    Started playing 2011-04-11_00_03_21_bj.wmv.      -

    2011-04-21      16:16:03        29664   (s)2641173              details:        2011-04-11_00_03_21_bj.wmv        -

    2011-04-21      16:16:03        29664   (s)2641173                      clientObj info.code NetStream.Data.Start        -

    2011-04-21      16:16:03        29664   (s)2641173      NetStream.onPlayStatus called   -

    2011-04-21      16:16:03        29664   (s)2641173              code:   NetStream.Data.Start    -

    /opt/adobe/fms/logs/_defaultVHost_/videostream/bj_station# date

    Thu Apr 21 17:57:14 CEST 2011

    Each instance for this VOD application are supposed to work like a TV Channel, so that once a video finish streamining, the next video from the "tv channel" playlist start playing. Those videos are converted to flv.

    I´m basically using ffmpeg to convert them:

    -acodec libfaac -ab 128k -vcodec libx264 -vpre slow -b 300k -threads 0

    and there are videos from 3mb till 100mb being streamed.

    sorry to ask, but how can I get the core dump? that might be the key to figure out why the cores goes down.

    Thank you very much!!

    Nikhil_Kalyan
    Participating Frequently
    April 29, 2011

    Sorry for the delay in responding to this post. Somehow I got it missed.

    One of the logs mentioned :

    2011-04-21      16:16:03        29664   (s)2641173              description:    Started playing 2011-04-11_00_03_21_bj.wmv.      -


    Are you trying to play an wmv file ? It is not supported by FMS/Flash Player. Can you make sure that these files are not causing any problems.

    I am not so good at linux stuff but to track processes or for debugging , some commands like gdb, tail, strace/ptrace , valgrind might help. Give them a try.

    I somehow feel the files made by ffmpeg has issues while playback. Can you send me some files you transcoded using ffmpeg and which for sure are causing the core to crash so that i can reproduce the issue in house and probably find a fix ?


    Thank you  !

    Nikhil_Kalyan
    Participating Frequently
    April 20, 2011

    Yes, there are two settings that you can configure/disable that are related to application unload :

    MaxAppIdleTime in application.xml as you have found. Default value for this is 1200 sec (20 min) and you can either disable (-1) or have a value below its max value.

    AutoCloseIdleConnections in server.xml : this setting helps you configure to close all those idle connections to FMS that are not having any activity. By default it is disabled and hence no connections are closed automatically.

    Still, in any case, the core failures are still not expected.

    Do let us know if you are still hitting the core failures. Thank you !

    JuNiOx_BRAuthor
    Participating Frequently
    April 20, 2011

    Hello guys,

    thanks for replying.

    All intances spent 24h playing/streaming as expected but today some of them went down.

    Any idea why this core is being deactivated?

    --------------------------------------------------------------------------------------------------------------------------------------------

    master.log

    2011-04-20      16:30:54        16109   (i)2581172      System memory load (90) is now below the maximum threshold.     -

    2011-04-20      16:30:55        16109   (w)2581171      System memory load (91) is high.        -

    2011-04-20      18:36:18        16109   (i)2581226      Edge (16129) is no longer active.       -

    2011-04-20      18:36:18        16109   (w)2581255      Edge (16129) _defaultRoot__edge1 experienced 1 failure!      -

    2011-04-20      18:36:18        16109   (i)2581224      Edge (28483) started, arguments : -edgeports ":1935,80" -coreports "localhost:19350" -conf "/opt/adobe/fms/conf/Server.xml" -adaptor "_defaultRoot_" -name "_defaultRoot__edge1".        -

    2011-04-20      18:53:25        16109   (i)2581223      Core (16132) is no longer active.       -

    2011-04-20      18:53:25        16109   (i)2581221      Core (28621) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app "registry" -inst "registry" -tag  -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:registry:registry:".-

    2011-04-20      18:57:25        16109   (i)2581223      Core (28621) is no longer active.       -

    2011-04-20      18:57:25        16109   (i)2581221      Core (28814) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app "registry" -inst "registry" -tag  -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:registry:registry:".-

    2011-04-20      19:00:15        16109   (i)2581172      System memory load (61) is now below the maximum threshold.     -

    2011-04-20      19:00:28        16109   (i)2581223      Core (16752) is no longer active.       -

    2011-04-20      19:00:28        16109   (w)2581256      Core (16752) _defaultRoot_:_defaultVHost_:::_1 experienced 1 failure!        -

    2011-04-20      19:00:28        16109   (i)2581221      Core (29004) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app  -inst  -tag "_1" -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_1".     

    --------------------------------------------------------------------------------------------------------------------------------------------

    core.log

    2011-04-20      16:30:56        16132   (w)2581171      System memory load (91) is high.        -

    2011-04-20      18:36:16        16752   (i)2581247      Core (16752) disconnected from edge.    -

    2011-04-20      18:36:16        16548   (i)2581247      Core (16548) disconnected from edge.    -

    2011-04-20      18:36:16        16962   (i)2581247      Core (16962) disconnected from edge.    -

    2011-04-20      18:36:16        16132   (i)2581247      Core (16132) disconnected from edge.    -

    2011-04-20      18:36:21        16962   (i)2581246      Core (16962) sending register cmd to edge.      -

    2011-04-20      18:36:21        16752   (i)2581246      Core (16752) sending register cmd to edge.      -

    2011-04-20      18:36:21        16548   (i)2581246      Core (16548) sending register cmd to edge.      -

    2011-04-20      18:36:22        16132   (i)2581246      Core (16132) sending register cmd to edge.      -

    #Date: 2011-04-20

    #Fields: date   time    x-pid   x-status        x-ctx   x-comment

    2011-04-20      18:53:27        28621   (i)2581246      Core (28621) sending register cmd to edge.      -

    2011-04-20      18:53:30        28621   (w)2581171      System memory load (99) is high.        -

    #Date: 2011-04-20

    #Fields: date   time    x-pid   x-status        x-ctx   x-comment

    2011-04-20      18:57:27        28814   (i)2581246      Core (28814) sending register cmd to edge.      -

    2011-04-20      18:57:29        28814   (w)2581171      System memory load (99) is high.        -

    2011-04-20      19:00:16        16962   (i)2581172      System memory load (61) is now below the maximum threshold.     -

    2011-04-20      19:00:15        28814   (i)2581172      System memory load (66) is now below the maximum threshold.     -

    2011-04-20      19:00:16        16548   (i)2581172      System memory load (61) is now below the maximum threshold.     -

    #Date: 2011-04-20

    #Fields: date   time    x-pid   x-status        x-ctx   x-comment

    2011-04-20      19:00:29        29004   (i)2581237      Starting admin app on core (29004).     -

    2011-04-20      19:00:29        29004   (i)2581238      Core (29004) connecting to admin.       -

    2011-04-20      19:00:29        29004   (i)2581231      Core (29004) connected to admin.        -

    2011-04-20      19:00:29        29004   (i)2581234      Core (29004) connection to admin accepted.      -

    2011-04-20      19:00:29        29004   (i)2581246      Core (29004) sending register cmd to edge.      -

    --------------------------------------------------------------------------------------------------------------------------------------------
    # free -m
                 total       used       free     shared    buffers     cached
    Mem:          7993       6076       1917          0          4         34
    -/+ buffers/cache:       6037       1956
    Swap:         3147       1558       1589
    thank you!

    JuNiOx_BRAuthor
    Participating Frequently
    April 21, 2011

    once again, new instances just went down (3 of them)

    /opt/adobe/fms/logs# cat master.00.log

    #Version: 1.0

    #Start-Date: 2011-04-21 03:20:08

    #Software: Adobe Flash Media Server 3.5.5 r2004

    #Date: 2011-04-21

    #Fields: date   time    x-pid   x-status        x-ctx   x-comment

    2011-04-21      03:20:05        16109   (i)2581221      Core (29664) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app  -inst  -tag "_1" -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_1".       -

    2011-04-21      07:44:42        16109   (i)2581223      Core (16962) is no longer active.       -

    2011-04-21      07:44:42        16109   (w)2581256      Core (16962) _defaultRoot_:_defaultVHost_:::_0 experienced 1 failure!     -

    /opt/adobe/fms/logs# cat edge.00.log

    #Version: 1.0

    #Start-Date: 2011-04-21 03:20:06

    #Software: Adobe Flash Media Server 3.5.5 r2004

    #Date: 2011-04-21

    #Fields: date   time    x-pid   x-status        x-ctx   x-comment

    2011-04-21      03:20:06        28483   (i)2581252      Registering core (29664).       -

    2011-04-21      07:44:41        28483   (i)2581250      Edge disconnected from core (16962).    -

    /opt/adobe/fms/logs# cat admin.00.log
    #Version: 1.0
    #Start-Date: 2011-04-21 03:20:07
    #Software: Adobe Flash Media Server 3.5.5 r2004
    #Date: 2011-04-21
    #Fields: date   time    x-pid   x-status        x-ctx   x-comment
    2011-04-21      03:20:06        16464   (i)2581241      Connection to admin received.   -
    2011-04-21      03:20:06        16464   (i)2581243      Connection from core 29664 received.    -
    2011-04-21      03:20:06        16464   (i)2581244      Connection from core 29664 accepted.    -
    2011-04-21      07:44:41        16464   (i)2581242      Core (16962) disconnected : _defaultRoot_:_defaultVHost_:::_0.  -
    /opt/adobe/fms/logs# grep -v H.264 core.00.log | grep -v AAC
    #Version: 1.0
    #Start-Date: 2011-04-21 03:20:06
    #Software: Adobe Flash Media Server 3.5.5 r2004
    #Date: 2011-04-21
    #Fields: date   time    x-pid   x-status        x-ctx   x-comment
    2011-04-21      03:20:06        29664   (i)2581237      Starting admin app on core (29664).     -
    2011-04-21      03:20:06        29664   (i)2581238      Core (29664) connecting to admin.       -
    2011-04-21      03:20:06        29664   (i)2581231      Core (29664) connected to admin.        -
    2011-04-21      03:20:06        29664   (i)2581246      Core (29664) sending register cmd to edge.      -
    2011-04-21      03:20:06        29664   (i)2581234      Core (29664) connection to admin accepted.      -

    /opt/adobe/fms/logs# date

    Thu Apr 21 07:49:57 CEST 2011

    /opt/adobe/fms/logs# free -m

                 total       used       free     shared    buffers     cached

    Mem:          7993       4253       3740          0         93        259

    -/+ buffers/cache:       3900       4093

    Swap:         3147        407       2740

    /opt/adobe/fms/logs# ps -ef | grep core | grep -v grep

    fms      16548 16109  2 Apr19 ?        01:03:57 /opt/adobe/fms/fmscore -adaptor _defaultRoot_ -vhost _defaultVHost_ -app -inst -tag _2 -conf /opt/adobe/fms/conf/Server.xml -name _defaultRoot_:_defaultVHost_:::_2

    fms      28483 16109  0 Apr20 ?        00:00:22 /opt/adobe/fms/fmsedge -edgeports :1935,80 -coreports localhost:19350 -conf /opt/adobe/fms/conf/Server.xml -adaptor _defaultRoot_ -name _defaultRoot__edge1

    fms      28814 16109  0 Apr20 ?        00:00:08 /opt/adobe/fms/fmscore -adaptor _defaultRoot_ -vhost _defaultVHost_ -app registry -inst registry -tag -conf /opt/adobe/fms/conf/Server.xml -name _defaultRoot_:_defaultVHost_:registry:registry:

    fms      29664 16109  0 03:20 ?        00:01:57 /opt/adobe/fms/fmscore -adaptor _defaultRoot_ -vhost _defaultVHost_ -app -inst -tag _1 -conf

    /opt/adobe/fms/conf/Server.xml -name _defaultRoot_:_defaultVHost_:::_1

    /opt/adobe/fms/logs# ps -ef | grep core | grep -v grep | wc -l

    4

    any idea why this is happening with the cores?

    Participating Frequently
    April 19, 2011

    Did you check core, master or edge logs - do they say anythign which can provide some hints?

    JuNiOx_BRAuthor
    Participating Frequently
    April 19, 2011

    Hello there, tks for replying.

    I found this, but cant identify the reason

    # cat master.01.log
    #Version: 1.0
    #Start-Date: 2011-04-18 00:49:08
    #Software: Adobe Flash Media Server 3.5.5 r2004
    #Date: 2011-04-18
    #Fields: date   time    x-pid   x-status        x-ctx   x-comment
    2011-04-18      00:49:08        10177   (i)2581221      Core (11199) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app  -inst  -tag "_1" -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_1". -
    2011-04-18      03:43:54        10177   (i)2581223      Core (10615) is no longer active.       -
    2011-04-18      03:43:54        10177   (w)2581256      Core (10615) _defaultRoot_:_defaultVHost_:::_2 experienced 1 failure!        -
    2011-04-18      04:01:20        10177   (i)2581221      Core (11667) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app  -inst  -tag "_2" -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_2". -
    2011-04-18      04:56:44        10177   (i)2581223      Core (11667) is no longer active.       -
    2011-04-18      08:42:39        10177   (i)2581223      Core (11199) is no longer active.       -
    2011-04-18      08:42:39        10177   (w)2581256      Core (11199) _defaultRoot_:_defaultVHost_:::_1 experienced 1 failure!        -
    2011-04-18      15:47:13        10177   (i)2581221      Core (12293) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app  -inst  -tag "_2" -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_2". -
    2011-04-18      21:29:33        10177   (i)2581221      Core (12653) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app  -inst  -tag "_1" -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_1". -
    2011-04-18      21:59:56        10177   (i)2581223      Core (12653) is no longer active.       -

    Just read on the internet about <MaxAppIdleTime> on Application.xml that "2147483" is the maximum value its accepts, and in order to disable it I had set to "-1".

    I'm just testing it now.

    thank you

    Participating Frequently
    April 19, 2011

    So looks like core is crashing for some reason - do let me know hoe your new tests goes.