Inspiring
June 10, 2009
Question

Bundled Apache stops receiving requests

Ok, I think I'm getting closer to the source of my problem.

Let me outline my setup first (the private IPs are examples only). I have assigned two IP addresses to one network adapter as follows:

CentOS 5.3 server running Apache 2.2.3 on eth0 with IP 192.168.0.78 <- working fine absolutely no problems at all

FMS 3.5.2 running Apache bundled with FMS on eth0:2 with IP 192.168.0.79

In fms.ini I have the following config:

ADAPTOR.HOSTPORT = 192.168.0.79:1935,80

HTTPPROXY.HOST = 192.168.0.79:8134

At (apparently) random times, the FMS Apache stops serving requests, but does not crash, and for some reason blocks requests to RTMP at the same time. Nothing appears in the log files which makes this even more tricky.

I've run a test by shutting down apache as follows:

./httpd -f ./conf/httpd.conf -d "/opt/adobe/fms/Apache2.2" -k stop

and I expected RTMP to stop working as well; however, it continued serving RTMP, which has led me to believe that something else is stalling the serving of RTMP requests. This results in total downtime for all applications, so it is pretty critical!

Any help would be massively appreciated,

Thanks,

Paul

    Participating Frequently
    September 6, 2009

    Hi,

    It's been a long time since the first request, and I don't see any response from the Adobe people.

    Is anyone reading this? Has anything been done to solve it? Is anybody out there?

    Alon

    aSiNeAuthor
    Inspiring
    September 7, 2009

    It has indeed been a long time - Adobe are trying to address this problem, but as yet no fix has been found.

    If you are experiencing the same problem, can you post the details here so we can see whether the issues are similar?

    Participating Frequently
    September 7, 2009

    See my post: http://forums.adobe.com/message/2215588#2215588

    Are you in touch with someone from Adobe? Do they read this forum?

    Sharfi Alon

    Computing Center

    Weizmann Institute of Science

    Tel: 972-89342764

    Fax: 972-89343041

    E-mail: alon.sharfi@weizmann.ac.il

    June 10, 2009

    This is an odd one. Can you give a little more information?

    First, RTMP is a connected streaming protocol, not a request-response protocol like HTTP, so trying to talk about it in HTTP-like terms can get confusing. It's probably my fault for not understanding you, so please bear with me.

    1. When you say that FMS "blocks requests to RTMP at the same time," do you mean:
        1. Already-connected and already-streaming clients stop streaming?
        2. Existing streams work, but commands like play, pause, etc. on existing connections stop working?
        3. Existing connections work fine, but new connections aren't accepted?
        4. Something different?
    2. If you use RTMPT instead of RTMP, is it the same, or different?
    3. Does it matter whether you connect to port 1935 or port 80 with RTMP?
    4. If you make an HTTP connection directly to port 8134, does it work? (assuming your firewall allows this--if not, try using links from a local console)
    5. Does everything recover like magic at some point, or does it stay broken until you restart the FMS service?
        1. If it does recover, how long does it take?
    6. Are you anywhere near running out of swap space when this happens?
    7. Is any process anywhere near 3GB of virtual memory size when this happens?

          Also, this part confuses me:

          > I've run a test by shutting down apache as follows:

          >

          > ./httpd -f ./conf/httpd.conf -d "/opt/adobe/fms/Apache2.2" -k stop

          >

          > and I expected RTMP to stop working as well, however it continued serving up RTMP,

          Why did you expect RTMP to stop working when you did this?

          I know that this isn't explained in great detail, but FMS is acting as a proxy for Apache, not the other way around. So, if Apache isn't working, the worst-case scenario is that HTTP requests (except for RTMPT tunneling and a few other special cases) hang, time out, or return errors; RTMP won't be affected at all. But if FMS isn't working, nothing will work.

          To get a little lower level: The process fmsedge does all of the listening for everyone else. For HTTP, it proxies to Apache. For RTMPT, it proxies to fmscore (which does the actual streaming) and wraps it up in an HTTP tunnel. For RTMP, it hands the connection off to fmscore and gets out of the way.

          So, if fmsedge were to hang for some reason, what you'd see is:

          • Existing RTMP connections work--streams keep streaming, commands keep working, etc.
          • Existing RTMPT connections freeze up.
          • Existing HTTP connections freeze up.
          • New connections of any kind time out.

          If fmsedge were to crash, it would get restarted. During the time it takes to restart, you'd see the same symptoms as with a hung fmsedge (except that connections might be rejected, error out, or time out faster), and then everything would recover.

          > Nothing appears in the log files which makes this even more tricky.

          Are you sure there's nothing weird in the master and edge logs? I know people normally only look at the core logs (and the access and application logs), but if the edge is the problem, those won't show anything.
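One way to skim the master and edge logs quickly is to hide the informational entries and see what is left. The sketch below assumes that the `(i)` prefix in the fourth column marks informational entries and that anything with a different prefix is worth reading; that reading of the log format is a guess based on the excerpts in this thread. The function name and the log paths in the comment are hypothetical.

```shell
#!/bin/sh
# Print log entries whose fourth whitespace-separated field does not start
# with "(i)". Lines starting with "#" and short lines are skipped.
# Assumption: "(i)" means "informational"; other prefixes are not confirmed here.
non_info_entries() {
    awk 'NF >= 4 && $1 !~ /^#/ && $4 !~ /^\(i\)/' "$@"
}

# Example (hypothetical paths under the install directory):
#   non_info_entries /opt/adobe/fms/logs/master.00.log /opt/adobe/fms/logs/edge.00.log
```

If this prints nothing, the logs contain only informational entries, which is still worth knowing before digging further.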

          aSiNeAuthor
          Inspiring
          June 11, 2009

          Hi, and much appreciate the response.

          1. As it appears to be happening at random, I can't say exactly *when* the server stops serving requests. When the server does stop serving requests, you cannot connect to RTMP instances, and the FMS HTTP server doesn't serve anything either (in fact it never times out; you just get a blank screen), so both appear to be hung, and don't get restarted.

          Looking through the master logs, this is what they contain (I've snipped it down, but there are quite a few of these):

          2009-06-10     18:10:18     2540     (i)2581221     Core (5229) started, arguments : -adaptor "_defaultRoot_" -vhost  -app  -inst  -tag  -console  -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_::::".     -
          2009-06-10     18:50:41     2540     (i)2581223     Core (5229) is no longer active.     -
          2009-06-10     22:27:36     2540     (i)2581221     Core (25380) started, arguments : -adaptor "_defaultRoot_" -vhost  -app  -inst  -tag  -console  -conf "/opt/adobe/fms/conf/Server.xml" -name "_defaultRoot_::::".     -
          2009-06-10     22:52:46     2540     (i)2581223     Core (25380) is no longer active.     -

          And this is from core log:

          2009-06-10     18:50:19     2563     (i)2581250     Edge disconnected from core (5229).     -
          2009-06-10     22:27:36     2563     (i)2581252     Registering core (25380).     -
          2009-06-10     22:52:37     2563     (i)2581250     Edge disconnected from core (25380).     -

          To me, nothing looks like it's crashing, but then I might just be reading it wrong.

          2. Haven't tried, but will do when it stops working again

          3. No

          4. Haven't tried, but will do when it stops working again

          5. No, stays broken until service is restarted

          6. Not as far as I know; however, is there a minimum swap space required? The server has 8GB RAM installed.

          7. No

          As for restarting the httpd service, I was mistaken. My troubleshooting took me in that direction as this problem only arose *after* I had reinstalled FMS with Apache (previously I had opted not to install it and everything ran fine). So I assumed that because Apache was crashing, this was affecting FMS somehow. However, FMS is definitely not crashing - this is what is confusing me the most. Even after the streams are inaccessible, I can still connect to the server using the admin console!

          Thanks,

          Paul

          June 11, 2009

          aSiNe wrote:

          > Hi, and much appreciate the response.
          >
          > 1. As it appears to be happening at random, I can't say exactly *when* the server stops serving requests. When the server does stop serving requests, you cannot connect to RTMP instances, and the FMS HTTP server doesn't serve anything either (in fact it never times out; you just get a blank screen), so both appear to be hung, and don't get restarted.

          The edge going down--even if the cores and apache were running--would cause exactly that. There's nothing wrong with apache, so it doesn't get restarted; there's nothing wrong with the cores, so they don't get restarted (although eventually they will quit because all of their apps are unused--which you may be seeing in the logs); the only problem is that they're not getting connections proxied or migrated to them because there's no (working) edge to give them any.

          The most useful thing would be to verify that this is what's happening. If you can leave a client connected for a long time, ideally playing some stream on repeating loop, then the next time FMS stops accepting new connections, you can see if the existing client is still playing. If so, that means the edge is down but the core is still running; if not, that means something else is wrong.

          Here are some more things to try next time it happens:

          • nc localhost 80, type a few lines of random garbage, and see whether it fails to connect, hangs and ignores everything you type, disconnects you (immediately, after the first line, or otherwise), or sends back random garbage of its own.
          • ps -C fmsedge, to see if the edge is still running.
          • killall -6 fmsedge, which should kill the (apparently-hung) edge and get a core dump that someone on the FMS team can look at. (You may have to put "ulimit -c unlimited" in the server script, or otherwise change your settings.)
          Also, after killing the edge, see if it comes back up automatically. (There should also be stuff in the logs about that happening.)
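The non-destructive checks above could be bundled into one snapshot to run the moment the problem shows up. This is only a sketch: it assumes a Linux host with the procps version of `ps` and an `nc` that supports `-z` (connect without sending data) and `-w` (timeout), and the function names are made up. The ports are the ones mentioned in this thread.

```shell
#!/bin/sh
# Snapshot of edge health, to run the next time new connections stop working.
# Assumes procps `ps` and an `nc` that supports -z and -w.

# Succeeds if at least one process with the given command name is running.
proc_running() {
    ps -C "$1" >/dev/null 2>&1
}

# Succeeds if a TCP connection to host:port can be opened within 5 seconds.
port_open() {
    nc -z -w 5 "$1" "$2" >/dev/null 2>&1
}

# Print the time, whether fmsedge is running, and which ports accept connections.
edge_snapshot() {
    host=$1
    date
    if proc_running fmsedge; then
        echo "fmsedge: running"
    else
        echo "fmsedge: not running"
    fi
    for port in 1935 80 8134 1111; do
        if port_open "$host" "$port"; then
            echo "port $port: accepts connections"
        else
            echo "port $port: no connection"
        fi
    done
}

# Example: edge_snapshot 192.168.0.79
```

Note that a port that accepts a TCP connection is not proof of a healthy edge: a hung process can leave the socket listening, which is why the interactive nc test (typing garbage and watching what comes back) is still worth doing by hand.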

          > To me, nothing looks like it's crashing, but then I might just be reading it wrong.

          No, that looks accurate. The only thing you're seeing is cores quitting normally, either because of inactivity or because their edges have gone down or stopped responding. (Unfortunately, there's no way to distinguish between these cases from the core log.)

          > 6. Not as far as I know; however, is there a minimum swap space required? The server has 8GB RAM installed.

          There's no specific minimum; you just have to have "enough" that you don't run out. On one extreme, I have FMS running on a VM with 512MB total RAM+swap; it can serve one or two clients without any problems. On the other extreme, someone reported using 128GB of RAM+swap to run 100 core processes doing who knows what.

          With default settings and a moderate load, I think 8GB and minimal swap should be more than enough. But if you want to make sure, run "cat /proc/meminfo" (or your favorite tool for the same purpose) and see how much MemFree + SwapFree is. If it never gets anywhere near 0, you don't have to worry.
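As a rough sketch of that check, the function below adds up the MemFree and SwapFree lines (both in kB) from a meminfo-style file. The function name is made up, and the optional file argument is only there so it can be tried on a saved copy.

```shell
#!/bin/sh
# Sum MemFree and SwapFree (both reported in kB) from a meminfo-style file.
# The file defaults to /proc/meminfo; pass another path to use a saved copy.
free_plus_swap_kb() {
    awk '/^(MemFree|SwapFree):/ { total += $2 } END { print total + 0 }' "${1:-/proc/meminfo}"
}

# Print the current value when run on a Linux host.
if [ -r /proc/meminfo ]; then
    echo "MemFree + SwapFree: $(free_plus_swap_kb) kB"
fi
```

Running it from cron every few minutes and appending the output to a file would show whether the number heads toward 0 before the next hang.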

          > As for restarting the httpd service, I was mistaken. My troubleshooting took me in that direction as this problem only arose *after* I had reinstalled FMS with Apache (previously I had opted not to install it and everything ran fine). So I assumed that because Apache was crashing, this was affecting FMS somehow. However, FMS is definitely not crashing - this is what is confusing me the most. Even after the streams are inaccessible, I can still connect to the server using the admin console!

          Well, the admin is handled by another process (fmsadmin), which runs almost completely independently of the rest of FMS (so it can be used to detect problems, start and stop apps, etc.). So, I wouldn't expect that to stop working.

          Unfortunately, the admin is mainly there to administer the core processes (which do all the work) and the master (which is used to control the whole set of processes); the edge is supposed to be a lightweight and simple proxy that doesn't need much administration, so the admin server barely talks to it. That's fine, except, of course, when the edge isn't working, as seems to be happening for you.

          I realize this is confusing. Let me try to break it down:

          • RTMP on 1935 or 80: Connects to the edge, which then finds a core process to hand the connection to and then gets out of the way.
          • RTMPT on 80: Connects to the edge, which then finds a core process and tunnels the connection for you.
          • HTTP on 80: Connects to the edge, which then finds a webserver and proxies the connection for you.
          • Admin on 1111: Connects to the admin, which does everything internally.

          So, a hung edge would mean no new RTMP connections, but existing ones work fine; RTMPT and HTTP are hosed; admin is unaffected.
          Anyway, thanks for gathering all of this information; it'll be very helpful in tracking down the bug and fixing it.
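The breakdown above can be turned into a small lookup, if that helps when comparing notes between incidents. This is only a restatement of the reasoning in this thread, not an official diagnostic; the function name and the ok/fail vocabulary are made up. The arguments are, in order: new RTMP connections, existing RTMP connections, HTTP on port 80, and the admin console.

```shell
#!/bin/sh
# Map observed symptoms to the likely culprit, following the breakdown above.
# Usage: classify_symptoms NEW_RTMP EXISTING_RTMP HTTP ADMIN   (each ok|fail)
classify_symptoms() {
    case "$1,$2,$3,$4" in
        ok,ok,ok,ok)      echo "no fault observed" ;;
        fail,ok,fail,ok)  echo "edge hung or down; cores still running" ;;
        ok,ok,fail,*)     echo "edge fine; suspect Apache behind the proxy" ;;
        *)                echo "something else is wrong" ;;
    esac
}

# Example: classify_symptoms fail ok fail ok
```

The second argument is the one that needs the long-running looping client mentioned earlier; without it, a hung edge and a wider failure look the same from the outside.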