Problem with FMIS 4 and streaming of live events
We have a problem on our platform and it's driving us nuts... no seriously... NUTS.
We have triple checked every possible component from a hardware level up to a software configuration level.
The problem: our platform consists of 2 origin servers with 6 edges talking to them (really beefy hardware). Once we inject a live stream into our two origins, we can successfully get the stream out via the edges and play it in our player. Once we hit around 2200 concurrent connections, the FMIS servers drop every connection that is busy with a stream. The only thing we can see in the logs is tons of disconnects with status code 103, which according to the online documentation means "Client disconnected due to server shutdown (or application unloaded)".
We simulated the scenario with the FMS load simulator utility and got the same result: errors start appearing and all connections are dropped around the 2200 mark.
The machines are Dell blades with dual quad-core Xeon CPUs and around 50 GB of RAM per server. The edges are all on 10 Gb/s Ethernet interfaces as well.
We managed to generate a nice big fat core dump on one of the origins, and the only thing visible from inspecting the core dump and the logs is the following:
| 2011-10-05 | 15:44:10 | 22353 (e)2641112 | JavaScript runtime is out of memory; server shutting down instance (Adaptor: _defaultRoot_, VHost: _defaultVHost_, App: livestreamcast_origin/_definst_). Check the JavaScript runtime size for this application in the configuration file. |
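For anyone who wants to check their own logs for the same message, here is a rough Python sketch that pulls out these entries and reports which application instance ran out of script memory. It assumes the pipe-delimited layout quoted above. The function name `find_js_oom` and the field names are illustrative labels only, and treating the number before `(e)` as a process ID is an assumption.

```python
import re

# Entry header in the shape quoted above: | date | time | number (level)code |
# Assumption: the number before "(e)" is the process ID.
HEADER = re.compile(
    r"\|\s*(\d{4}-\d{2}-\d{2})\s*\|\s*(\d{2}:\d{2}:\d{2})\s*\|"
    r"\s*(\d+)\s*\((\w)\)(\d+)\s*\|"
)
APP = re.compile(r"App:\s*([^)\s]+)")
OOM = "JavaScript runtime is out of memory"


def find_js_oom(text):
    """Return one dict per log entry that reports a JavaScript out-of-memory shutdown."""
    # Flatten whitespace so entries wrapped over several lines still match.
    flat = " ".join(text.split())
    headers = list(HEADER.finditer(flat))
    hits = []
    for i, h in enumerate(headers):
        # The entry body runs from this header to the start of the next one.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(flat)
        body = flat[h.end():end]
        if OOM not in body:
            continue
        app = APP.search(body)
        hits.append({
            "date": h.group(1),
            "time": h.group(2),
            "pid": int(h.group(3)),  # assumed meaning, see above
            "app": app.group(1) if app else None,
        })
    return hits


# The entry from this post, used as sample input.
SAMPLE = (
    "| 2011-10-05 | 15:44:10 | 22353 (e)2641112 | JavaScript runtime is out of "
    "memory; server shutting down instance (Adaptor: _defaultRoot_, VHost: "
    "_defaultVHost_, App: livestreamcast_origin/_definst_). Check the JavaScript "
    "runtime size for this application in the configuration file. |"
)

if __name__ == "__main__":
    for hit in find_js_oom(SAMPLE):
        print(hit["date"], hit["time"], hit["app"])
```

Counting these hits per application, and comparing their timestamps with the bursts of 103 disconnects, should show whether every mass disconnect coincides with an instance being shut down for script memory.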
And from the core dump:
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fff9ddfc000
Core was generated by `/opt/adobe/fms/fmscore -adaptor _defaultRoot_ -vhost _defaultVHost_ -app -inst'.
Program terminated with signal 11, Segmentation fault.
#0 0x00002aaaab19ab22 in js_MarkGCThing () from /opt/adobe/fms/modules/scriptengines/libasc.so
(gdb) bt
#0 0x00002aaaab19ab22 in js_MarkGCThing () from /opt/adobe/fms/modules/scriptengines/libasc.so
#1 0x00002aaaab196b63 in ?? () from /opt/adobe/fms/modules/scriptengines/libasc.so
#2 0x00002aaaab1b316f in js_Mark () from /opt/adobe/fms/modules/scriptengines/libasc.so
#3 0x00002aaaab19a673 in ?? () from /opt/adobe/fms/modules/scriptengines/libasc.so
#4 0x00002aaaab19a6f7 in ?? () from /opt/adobe/fms/modules/scriptengines/libasc.so
#5 0x00002aaaab19ab3d in js_MarkGCThing () from /opt/adobe/fms/modules/scriptengines/libasc.so
#6 0x00002aaaab19abbe in ?? () from /opt/adobe/fms/modules/scriptengines/libasc.so
#7 0x00002aaaab185bbe in JS_DHashTableEnumerate () from /opt/adobe/fms/modules/scriptengines/libasc.so
#8 0x00002aaaab19b39d in js_GC () from /opt/adobe/fms/modules/scriptengines/libasc.so
#9 0x00002aaaab17e6d7 in js_DestroyContext () from /opt/adobe/fms/modules/scriptengines/libasc.so
#10 0x00002aaaab176bf4 in JS_DestroyContext () from /opt/adobe/fms/modules/scriptengines/libasc.so
#11 0x00002aaaab14f5e3 in ?? () from /opt/adobe/fms/modules/scriptengines/libasc.so
#12 0x00002aaaab14fabd in JScriptVMImpl::resetContext() () from /opt/adobe/fms/modules/scriptengines/libasc.so
#13 0x00002aaaab1527b4 in JScriptVMImpl::postProcessCbk(unsigned int, bool, int) ()
from /opt/adobe/fms/modules/scriptengines/libasc.so
#14 0x00002aaaab1035c7 in boost::detail::function::void_function_obj_invoker3<boost::_bi::bind_t<void, boost::_mfi::mf3<void, IJScriptVM, unsigned int, bool, int>, boost::_bi::list4<boost::_bi::value<IJScriptVM*>, boost::arg<1>, boost::arg<2>, boost::arg<3> > >, void, unsigned int, bool, int>::invoke(boost::detail::function::function_buffer&, unsigned int, bool, int) ()
from /opt/adobe/fms/modules/scriptengines/libasc.so
#15 0x00002aaaab0fddf6 in boost::function3<void, unsigned int, bool, int>::operator()(unsigned int, bool, int) const ()
from /opt/adobe/fms/modules/scriptengines/libasc.so
#16 0x00002aaaab0fbd9d in fms::script::AscRequestQ::run() () from /opt/adobe/fms/modules/scriptengines/libasc.so
#17 0x00002aaaab0fd0eb in boost::detail::function::function_obj_invoker0<boost::_bi::bind_t<bool, boost::_mfi::mf0<bool, fms::script::AscRequestQ>, boost::_bi::list1<boost::_bi::value<fms::script::IntrusivePtr<fms::script::AscRequestQ> > > >, bool>::invoke(boost::detail::function::function_buffer&) () from /opt/adobe/fms/modules/scriptengines/libasc.so
#18 0x00000000009c7327 in boost::function0<bool>::operator()() const ()
#19 0x00000000009c7529 in fms::script::QueueRequest::run() ()
#20 0x00000000008b868a in TCThreadPool::launchThreadRun(void*) ()
#21 0x00000000008b8bd6 in TCThreadPool::__ThreadStaticPoolEntry(void*) ()
#22 0x00000000008ba496 in launchThreadRun(void*) ()
#23 0x00000000008bb44f in __TCThreadEntry(void*) ()
#24 0x000000390ca0673d in start_thread () from /lib64/libpthread.so.0
#25 0x000000390bed44bd in clone () from /lib64/libc.so.6
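Reading the backtrace, the segfault itself is in frame #0, js_MarkGCThing(), inside the JavaScript engine's garbage collector (js_GC), which is running because JScriptVMImpl::resetContext() is destroying the script context. The start_thread() and clone(2) frames at the bottom are only where the worker thread was started, not where it crashed. So the crash looks tied to the same "JavaScript runtime is out of memory" condition reported in the log.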
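Since gdb prints the innermost frame first, frame #0 is the one that was executing when the signal arrived, and the highest-numbered frame is where the thread began. A small Python sketch to double-check that reading of a pasted backtrace follows. The helper names `parse_frames`, `faulting_frame` and `thread_entry` are made up for illustration, and the sample is only a subset of the frames above.

```python
import re

# Matches gdb frame lines such as:
#   #0 0x00002aaaab19ab22 in js_MarkGCThing () from /path/to/lib.so
FRAME = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(.+?)\s*\(")


def parse_frames(backtrace):
    """Return (frame_number, function_name) pairs from gdb 'bt' output."""
    frames = []
    for line in backtrace.splitlines():
        m = FRAME.match(line.strip())
        if m:  # continuation lines ("from /opt/...") do not start with '#'
            frames.append((int(m.group(1)), m.group(2)))
    return frames


def faulting_frame(frames):
    """Innermost frame: where execution was when the signal was raised."""
    return min(frames)[1]


def thread_entry(frames):
    """Outermost frame: where the thread started, not where it crashed."""
    return max(frames)[1]


# A subset of the frames from the backtrace in this post.
SAMPLE_BT = """\
#0 0x00002aaaab19ab22 in js_MarkGCThing () from /opt/adobe/fms/modules/scriptengines/libasc.so
#8 0x00002aaaab19b39d in js_GC () from /opt/adobe/fms/modules/scriptengines/libasc.so
#10 0x00002aaaab176bf4 in JS_DestroyContext () from /opt/adobe/fms/modules/scriptengines/libasc.so
#12 0x00002aaaab14fabd in JScriptVMImpl::resetContext() () from /opt/adobe/fms/modules/scriptengines/libasc.so
#24 0x000000390ca0673d in start_thread () from /lib64/libpthread.so.0
#25 0x000000390bed44bd in clone () from /lib64/libc.so.6
"""

if __name__ == "__main__":
    frames = parse_frames(SAMPLE_BT)
    print("faulted in:", faulting_frame(frames))
    print("thread started at:", thread_entry(frames))
```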
I am really hoping there is someone out there who can point us in the right direction on how to pinpoint why our platform cannot cope with a pathetic 2200 connections before the FMIS daemon drops all connected streams.
There has to be someone out there who has run into this or a similar problem... HELP !!!!
Any feedback / ideas would be greatly appreciated.
