April 17, 2008
Question

FMS Interactive 3.0 Linux Crashing

FMSI 3.0 crashing with *many* (2000+/sec) messages in /var/log/messages

Config:
FMS Interactive 3.0 on CentOS 5
Kernel 2.6.18-53.1.14.el5PAE #1 SMP i686 athlon i386 GNU/Linux
Processors: Quad Dual-Core AMD Opteron(tm) Processor 8214
8 GB RAM, 8 GB Swap enabled

Problem:

FMS stops accepting connections and throws the following error into /var/log/messages at a rate of 2,000+ messages per second:

Server[10924]: Failed to wait for process condition: errno(43).
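
For anyone reading along: on Linux, errno 43 is EIDRM ("Identifier removed"), which is normally returned when a System V IPC object (semaphore set, message queue or shared memory segment) is removed while something is still waiting on it. Whether FMS actually uses System V IPC for its "process condition" is an assumption, not something confirmed in this thread. A minimal sketch for looking up an errno value on the machine in question (the helper name `describe_errno` is made up for illustration):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    # errno.errorcode maps numbers to names such as "ENOENT"; values are
    # platform-specific, so run this on the affected server itself.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = describe_errno(43)
    print(f"errno 43 = {name}: {message}")
```

If the result is EIDRM, checking `ipcs` output before and after a failure might show whether an IPC object is disappearing.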

/var/log/messages very quickly becomes huge -- this morning it hit 15 GB and filled my /var partition.

The only way I have found to recover is to stop FMS and syslogd, remove /var/log/messages, and then start syslogd and FMS again. The server then runs for a seemingly random amount of time before this happens again. I have not been able to correlate the problem with any specific activity or streams.
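
To pin down when the flood starts, and so what to correlate it with, it may help to tally the error per second. A rough sketch, assuming the classic syslog line layout where the first 15 characters are the timestamp (`Mmm dd hh:mm:ss`); the hostname `fms1` and the timestamps in the sample lines are made up:

```python
from collections import Counter

PATTERN = "Failed to wait for process condition"


def messages_per_second(lines, pattern=PATTERN):
    """Count matching lines per syslog timestamp (one-second resolution)."""
    counts = Counter()
    for line in lines:
        if pattern in line:
            # Assumes a traditional syslog prefix such as "Apr 17 08:15:01".
            counts[line[:15]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; only the message text comes from the thread.
    sample = [
        "Apr 17 08:15:01 fms1 Server[10924]: Failed to wait for process condition: errno(43).",
        "Apr 17 08:15:01 fms1 Server[10924]: Failed to wait for process condition: errno(43).",
        "Apr 17 08:15:02 fms1 Server[10924]: Failed to wait for process condition: errno(43).",
    ]
    for second, count in sorted(messages_per_second(sample).items()):
        print(second, count)
```

The function accepts any iterable of lines, so an open file handle on /var/log/messages can be passed directly and is read line by line rather than loaded into memory.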

Any ideas will be appreciated.

JP

    5 replies

    May 9, 2008
    Yes, the 32-bit libs are installed with no problem, and ldd on the binaries confirms the linkage. nspr was mainly the issue for Gentoo Linux users.

    I only have 4 GB, so I haven't tried increasing memory any higher. Previous versions of FMS worked great.
    May 5, 2008
    I have been trying to get version 3.0.1 to work on amd64 and right now fmscore seems to be crashing (segv).

    I am getting the following at console:

    May 3 21:09:15 [Service] Server starting...
    May 3 21:09:15 [Service] Server started (/usr/local/fms/conf/Server.xml).
    May 3 21:09:15 [Adaptor] Listener started ( _defaultRoot__edge1 ) : localhost:19350/v4
    May 3 21:09:15 [Adaptor] Listener started ( _defaultRoot__edge1 ) : 1935/v4
    May 3 21:09:15 [kernel] fmscore[1018]: segfault at e6554004 rip ee9d704f rsp e28f81fc error 7
    May 3 21:09:15 [kernel] grsec: From 192.168.2.2: signal 11 sent to /usr/local/fms/fmscore[fmscore:1018] uid/euid:44/44 gid/egid:44/44, parent /usr/local/fms/fmsmaster[fmsmaster:875] uid/euid:0/0 gid/egid:0/0
    May 3 21:09:20 [kernel] fmscore[1047]: segfault at dc3c1004 rip e304104f rsp d6ef51fc error 7
    May 3 21:09:20 [kernel] grsec: From 192.168.2.2: signal 11 sent to /usr/local/fms/fmscore[fmscore:1047] uid/euid:44/44 gid/egid:44/44, parent /usr/local/fms/fmsmaster[fmsmaster:896] uid/euid:0/0 gid/egid:0/0
    May 3 21:09:25 [kernel] fmscore[1075]: segfault at db9f5004 rip e6fa804f rsp dacfe1fc error 6
    May 3 21:09:30 [kernel] fmscore[1110]: segfault at d8ef3004 rip e44af04f rsp d81fe1fc error 6
    May 3 21:09:30 [kernel] grsec: From 192.168.2.2: signal 11 sent to /usr/local/fms/fmscore[fmscore:1110] uid/euid:44/44 gid/egid:44/44, parent /usr/local/fms/fmsmaster[fmsmaster:896] uid/euid:0/0 gid/egid:0/0
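
The "error 7" and "error 6" values in the kernel lines above are x86 page-fault error codes, which are bit masks: bit 0 distinguishes a protection violation from a not-present page, bit 1 marks a write, bit 2 marks user mode, and bit 4 marks an instruction fetch. A small sketch for decoding them (the function names are invented for illustration, and the regular expression is only an assumption fitted to the lines quoted here):

```python
import re

SEGFAULT = re.compile(
    r"(\w+)\[(\d+)\]: segfault at ([0-9a-f]+) rip ([0-9a-f]+) "
    r"rsp ([0-9a-f]+) error ([0-9a-f]+)"
)


def decode_page_fault(error_code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if error_code & 1 else "page not present",
        "access": "write" if error_code & 2 else "read",
        "mode": "user" if error_code & 4 else "kernel",
        "instruction_fetch": bool(error_code & 16),
    }


def parse_segfault(line):
    """Extract process, pid, fault address and decoded error from a kernel line."""
    m = SEGFAULT.search(line)
    if m is None:
        return None
    return {
        "process": m.group(1),
        "pid": int(m.group(2)),
        "address": int(m.group(3), 16),
        # The kernel prints the error code in hexadecimal.
        "fault": decode_page_fault(int(m.group(6), 16)),
    }


if __name__ == "__main__":
    line = ("May 3 21:09:15 [kernel] fmscore[1018]: segfault at e6554004 "
            "rip ee9d704f rsp e28f81fc error 7")
    print(parse_segfault(line))
```

Read this way, error 7 is a user-mode write that hit a protection violation, and error 6 is a user-mode write to a page that was not mapped.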


    and this in fms logs:


    2008-05-03 21:09:15 875 (i)2571111 Server started (/usr/local/fms/conf/Server.xml). -
    2008-05-03 21:09:20 875 (i)2581223 Core (897) is no longer active. -
    2008-05-03 21:09:20 875 (i)2581221 Core (1020) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app "registry" -inst "registry" -tag -console -conf "/usr/local/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:registry:registry:". -
    2008-05-03 21:09:25 875 (i)2581223 Core (1020) is no longer active. -
    2008-05-03 21:09:25 875 (i)2581221 Core (1048) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app "registry" -inst "registry" -tag -console -conf "/usr/local/fms/conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:registry:registry:". -
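
The FMS log entries above suggest each core lives for about five seconds before the master replaces it. A rough sketch for measuring that from a longer log, assuming every relevant line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and contains `Core (<pid>) started` or `Core (<pid>) is no longer active` as in the excerpt (the function name is made up, and the sample lines have their arguments trimmed):

```python
import re
from datetime import datetime

EVENT = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"Core \((\d+)\) (started|is no longer active)"
)


def core_lifetimes(lines):
    """Return {core pid: seconds between 'started' and 'is no longer active'}."""
    started = {}
    lifetimes = {}
    for line in lines:
        m = EVENT.search(line.strip())
        if m is None:
            continue
        when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        pid = int(m.group(2))
        if m.group(3) == "started":
            started[pid] = when
        elif pid in started:
            # Cores whose start was not logged in this excerpt are skipped.
            lifetimes[pid] = (when - started.pop(pid)).total_seconds()
    return lifetimes


if __name__ == "__main__":
    sample = [
        "2008-05-03 21:09:20 875 (i)2581223 Core (897) is no longer active. -",
        "2008-05-03 21:09:20 875 (i)2581221 Core (1020) started, arguments : ... -",
        "2008-05-03 21:09:25 875 (i)2581223 Core (1020) is no longer active. -",
        "2008-05-03 21:09:25 875 (i)2581221 Core (1048) started, arguments : ... -",
    ]
    print(core_lifetimes(sample))
```

A steady lifetime of a few seconds would point at a crash during startup rather than something triggered by client activity.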


    At first I thought this might be due to grsec or PaX. I haven't been able to solve it yet and wanted to see if you had any thoughts. I ran an strace on fmscore and got to this point:

    mmap2(0xde5a5000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 21, 0x12) = 0xffffffffde5a5000
    close(21) = 0
    getdents64(20, /* 0 entries */, 4096) = 0
    close(20) = 0
    flock(4, LOCK_EX) = 0
    flock(4, LOCK_UN) = 0
    mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffffffffda8f9000
    mprotect(0xda8f9000, 4096, PROT_NONE) = 0
    clone(child_stack=0xdb0f94b4, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0xdb0f9bd8, tls=0xdb0f9bd8, child_tidptr=0xfc58ab50) = 1773
    +++ killed by SIGSEGV +++
    May 5, 2008
    Are you running a 64-bit kernel? If so, I don't think it is going to work at all.

    May 7, 2008
    Yeah. But it does work with version 3.0.0 and it also worked fine with version 2. LOL.
    May 2, 2008
    I think the load must have been pretty hefty
    May 1, 2008
    What kind of load was on the server at the time? If it is crashing with "many" messages, it seems to be under quite a bit of load. I am quite interested in this issue (it has never been seen before), so any more information you can provide would be helpful, e.g. CPU usage, memory usage, number of connections, etc.

    fyi: http://www.adobe.com/support/flashmediaserver/downloads_updaters.html
    May 2, 2008
    Load average is between 2 and 3 when it crashes (normally runs around 0.5). Memory usage is just about 8 GB, but that's mostly cache and buffers.
    April 30, 2008
    3.0.1 is out, perhaps try upgrading? I am still having problems getting fmscore to not coredump when run on Gentoo Linux.
    April 30, 2008
    Thanks for the response. Where can we download 3.0.1? You'd think maybe Adobe would put a link to that in some easy-to-find location.

    Sure would be nice if this version runs better. I'm a little sick of being an alpha tester for these guys.

    We run ONE piece of software that is not open source, and it is the one thing that gives us the most trouble.

    JP