Inspiring
November 6, 2014
Question

Apache POST flex2gateway never closes or times out, reaches max child processes


We have been trying to pass an external PCI scan and noticed some server lockups after starting a scan. We are scanning a couple hundred IP addresses, which all resolve to the same servers. The scans actively probe the box for vulnerabilities, and one of the things they probe is Flash Remoting. When we look at the Apache /server-status page, it shows a large number of long-running flex2gateway requests. For instance:

22-44466  0/3817/3817  W  4.07  163840  0  0.0  57.76  57.76  x.x.x.101  WebNode2.ambassador.int  POST /flex2gateway/http HTTP/1.1

As you can see, this POST request has been running for 163840 seconds, or nearly two days.  Since it seems these POST requests never complete, even though the client has long since disconnected, they simply stack up until the server's max number of child processes has been reached, effectively killing our webserver.

When I restart the clustered ColdFusion instances one at a time, these POST requests do not die off.

If I stop both clustered CF instances, the requests complete (or get killed).

If I reload or restart Apache, the requests are gone as well.

strace gives me nothing useful:

[root@WebNode1 ~]# strace -p 34025

Process 34025 attached - interrupt to quit

read(185,

pstack gives a little more, but nothing that looks obvious to me:

[root@WebNode1 ~]# pstack -p 34025     

Usage: pstack <process-id>

[root@WebNode1 ~]# pstack 34025  

#0  0x00007fdd40444740 in __read_nocancel () from /lib64/libpthread.so.0

#1  0x00007fdd33efe2e6 in jk_tcp_socket_recvfull () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

#2  0x00007fdd33f1b68d in ajp_connection_tcp_get_message () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

#3  0x00007fdd33f1ceea in ajp_get_reply () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

#4  0x00007fdd33f20308 in ajp_service () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

#5  0x00007fdd33ef8f5d in jk_handler () from /opt/coldfusion10/config/wsconfig/1/mod_jk.so

#6  0x00007fdd41b92cd0 in ap_run_handler ()

#7  0x00007fdd41b9658e in ap_invoke_handler ()

#8  0x00007fdd41ba1c50 in ap_process_request ()

#9  0x00007fdd41b9eac8 in ?? ()

#10 0x00007fdd41b9a7d8 in ap_run_process_connection ()

#11 0x00007fdd41ba6ad7 in ?? ()

#12 0x00007fdd41ba6dea in ?? ()

#13 0x00007fdd41ba7a6c in ap_mpm_run ()

#14 0x00007fdd41b7e9b0 in main ()

I don't know what that tells us exactly, but I'm leaning toward a hang between Apache and Tomcat.

Any suggestions on where or how to troubleshoot this issue?


    3 replies

    Participant
    December 1, 2014

    Make sure you have the following in one of your config files:

    # enable Flex Gateway

    <IfModule jk_module>

        JkMount /*.cfm ajp13

        JkMount /*.cfc ajp13

        JkMount /*.do ajp13

        JkMount /*.jsp ajp13

        JkMount /*.cfchart ajp13

        JkMount /*.cfres ajp13

        JkMount /*.cfm/* ajp13

        JkMount /*.cfml/* ajp13

        JkMountCopy all

    </IfModule>

    If you add this to the end of the mod_jk.conf file, just be careful when updating your connector in the future, because it may remove the lines. These commands are required to get the flex2gateway working in CF10. Without these lines, we've seen the exact same behavior you're describing.

    Hope this helps!

    GuitsBoy (Author)
    Inspiring
    December 1, 2014

    Thanks for the response.  Where exactly did you need to add this block of code?  I tried adding it to the end of the mod_jk.conf file, as well as adding it to the default virtual host block in the httpd.conf files.  Neither seems to have helped when testing.  Thanks.

    Participant
    December 1, 2014

    We have it in our mod_jk.conf file, but be careful when updating the connector because it may remove the code.

    Make sure you've restarted Apache/ColdFusion after adding the lines as well.

    You might want to return your uriworkermap.properties to its original version.

    Here's the thread where I originally found the entries that needed to be added:

    Re: Coldfusion 10 + Apache + Flex2gateway + Debian/Linux

    Maybe you can find more info from someone in that post.

    GuitsBoy (Author)
    Inspiring
    November 10, 2014

    On a test server, I have removed the wildcard from the uriworkermap.properties file, so it now only matches "/flex2gateway" and "/flex2gateway/".  Unfortunately I'm still seeing the occasional hung apache worker. 


    Does anyone have any leads on this issue?  I don't mind doing the research, I've just exhausted the limits of my Google Fu.


    Apache Server Status for 10.10.10.205

    Server Version: Apache/2.2.15 (Unix) DAV/2 PHP/5.3.3 mod_ssl/2.2.15 OpenSSL/1.0.1e-fips mod_wsgi/3.2 Python/2.6.6 mod_jk/1.2.32 mod_perl/2.0.4 Perl/v5.10.1
    Server Built: Oct 16 2014 14:48:21

    Current Time: Monday, 10-Nov-2014 16:49:22 EST
    Restart Time: Monday, 10-Nov-2014 15:25:16 EST
    Parent Server Generation: 0
    Server uptime: 1 hour 24 minutes 6 seconds
    Total accesses: 5313 - Total Traffic: 98.4 MB
    CPU Usage: u3.97 s1.26 cu0 cs0 - .104% CPU load
    1.05 requests/sec - 20.0 kB/second - 19.0 kB/request
    15 requests currently being processed, 11 idle workers
    WWWWWWW_W_W_W__W__W__WW_W_...................................... ................................................................ ................................................................ ................................................................ 

    Scoreboard Key:
    "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
    "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
    "C" Closing connection, "L" Logging, "G" Gracefully finishing,
    "I" Idle cleanup of worker, "." Open slot with no current process

    Srv   PID    Acc        M  CPU   SS    Req  Conn  Child  Slot   Client       VHost           Request
    0-0   8727   0/12/12    W  0.03  4572  0    0.0   0.05   0.05   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
    1-0   8728   0/11/11    W  0.03  4358  0    0.0   0.18   0.18   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
    2-0   8729   0/38/38    W  0.04  3910  0    0.0   1.11   1.11   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
    3-0   8730   0/27/27    W  0.03  4064  0    0.0   0.79   0.79   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
    4-0   8731   0/16/16    W  0.03  4354  0    0.0   0.12   0.12   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
    5-0   8732   0/7/7      W  0.02  4564  0    0.0   0.02   0.02   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
    6-0   8733   0/8/8      W  0.02  4673  0    0.0   0.01   0.01   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
    7-0   8734   0/386/386  _  0.37  4     0    0.0   6.49   6.49   10.10.2.212  www.company.qc  GET /marketingpages/images/login_over.jpg HTTP/1.1
    8-0   9422   0/10/10    W  0.02  4564  0    0.0   0.04   0.04   10.10.2.201  qc.company.int  POST /flex2gateway HTTP/1.1
    9-0   10112  0/393/393  _  0.37  6     0    0.0   14.59  14.59  10.10.2.212  www.company.qc  GET /marketingpages/images/box_onesource.jpg HTTP/1.1
    10-0  10468  0/321/321  W  0.32  846   0    0.0   4.42   4.42   10.10.2.212  qc.company.int  POST /flex2gateway HTTP/1.1
    11-0  10470  0/398/398  _  0.38  6     0    0.0   12.80  12.80  10.10.2.212  www.company.qc  GET /marketingpages/images/home_eco.jpg HTTP/1.1
    12-0  10471  0/340/340  W  0.32  837   0    0.0   4.99   4.99   10.10.2.212  qc.company.int  POST /flex2gateway/ HTTP/1.1
    13-0  10544  0/404/404  _  0.34  6     0    0.0   5.21   5.21   10.10.2.212  www.company.qc  GET /marketingpages/images/box_top.jpg HTTP/1.1
    14-0  10592  0/353/353  _  0.40  6     12   0.0   14.10  14.10  10.10.2.212  www.company.qc  GET /?login HTTP/1.1
    15-0  10648  0/296/296  W  0.31  800   0    0.0   3.82   3.82   10.10.2.212  qc.company.int  POST /flex2gateway/ HTTP/1.1
    16-0  12382  0/339/339  _  0.33  6     0    0.0   2.85   2.85   10.10.2.212  www.company.qc  GET /marketingpages/images/logo_sourceone.jpg HTTP/1.1
    17-0  12387  0/336/336  _  0.34  6     0    0.0   5.06   5.06   10.10.2.212  www.company.qc  GET /marketingpages/images/logo_onesource.jpg HTTP/1.1
    18-0  12388  0/265/265  W  0.25  839   0    0.0   2.87   2.87   10.10.2.212  qc.company.int  POST /flex2gateway/ HTTP/1.1
    19-0  12389  0/323/323  _  0.31  0     0    0.0   4.82   4.82   10.10.2.212  www.company.qc  GET /marketingpages/lib/dimming.js HTTP/1.1
    20-0  12390  0/336/336  _  0.31  4     0    0.0   5.24   5.24   10.10.2.212  www.company.qc  GET /marketingpages/lib/superfish.js HTTP/1.1
    21-0  12391  0/289/289  W  0.27  805   0    0.0   2.49   2.49   10.10.2.212  qc.company.int  POST /flex2gateway/ HTTP/1.1
    22-0  12392  0/281/281  W  0.27  831   0    0.0   3.17   3.17   10.10.2.212  qc.company.int  POST /flex2gateway HTTP/1.1
    23-0  14750  0/41/41    _  0.04  6     0    0.0   0.92   0.92   10.10.2.212  www.company.qc  GET /marketingpages/images/close.jpg HTTP/1.1
    24-0  14751  0/43/43    W  0.04  0     0    0.0   1.21   1.21   10.10.2.36   qc.company.int  GET /server-status HTTP/1.1
    25-0  14752  0/40/40    _  0.04  6     0    0.0   0.96   0.96   10.10.2.212  www.company.qc  GET /marketingpages/images/box_sourceone.jpg HTTP/1.1
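
A quick way to pick the stuck workers out of a status page like the one above is to filter on the M and SS columns. The following is a minimal Python sketch, not something from the original thread: `stuck_workers`, the row dictionaries, and the 300-second threshold are hypothetical, and the sample values assume the status rows above split as SS=4572, Req=0 for PID 8727 (and likewise for the other rows).

```python
from collections import Counter


def stuck_workers(rows, min_seconds=300):
    """Return rows in state 'W' whose SS value is at least min_seconds."""
    return [r for r in rows if r["M"] == "W" and r["SS"] >= min_seconds]


# Four sample rows taken from the status output above (assumed field split).
rows = [
    {"Srv": "0-0", "PID": 8727, "M": "W", "SS": 4572,
     "Request": "POST /flex2gateway HTTP/1.1"},
    {"Srv": "7-0", "PID": 8734, "M": "_", "SS": 4,
     "Request": "GET /marketingpages/images/login_over.jpg HTTP/1.1"},
    {"Srv": "12-0", "PID": 10471, "M": "W", "SS": 837,
     "Request": "POST /flex2gateway/ HTTP/1.1"},
    {"Srv": "24-0", "PID": 14751, "M": "W", "SS": 0,
     "Request": "GET /server-status HTTP/1.1"},
]

stuck = stuck_workers(rows)

# Group the stuck workers by request line to see which paths are hanging.
by_request = Counter(r["Request"] for r in stuck)
for request, count in by_request.items():
    print(count, request)
```

The /server-status request itself is also in state W but has SS=0, so the threshold keeps it out of the result; only the two flex2gateway POSTs are reported.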
    GuitsBoy (Author)
    Inspiring
    November 6, 2014

    I removed clustering by editing the uriworkermap.properties file and pointing /flex2gateway and /flex2gateway/* to a single instance, and then ran the PCI scan again.  It still seems to hang.  I'm surprised there are no other complaints about this out there on the interwebs.  I can't be the only one.

    GuitsBoy (Author)
    Inspiring
    November 7, 2014

    OK, I did a little more testing from a Linux CLI using curl, and I find that if I POST to /flex2gateway/<any string>, it hangs indefinitely.  A normal GET request results in a 404, but a POST hangs.  What's more, posting to just /flex2gateway/ seems to behave normally (some kind of binary data connection).  It's only when I put something in the path after /flex2gateway/ that it hangs.  It behaves the same if I hit one instance specifically, as opposed to going through the cluster, so that eliminates Apache as the problem.  I also see a hang when posting to /flex-internal/ and /flex-internal/<some string>.

    Any clue as to why this might act this way?
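
The curl test described above can be sketched as a small probe with a client-side timeout, so that a hung path shows up as a timeout instead of blocking the shell. This is a minimal Python sketch under assumptions, not the command the author ran: `probe_post`, the empty request body, and the 5-second default are hypothetical choices, and the host and port you pass are whatever your own test server uses.

```python
import http.client
import socket


def probe_post(host, port, path, timeout=5.0):
    """POST an empty body to path.

    Returns the HTTP status code, or None if no reply arrives
    within `timeout` seconds (the symptom described in the post).
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("POST", path, body=b"")
        return conn.getresponse().status
    except socket.timeout:
        # The server accepted the request but never answered.
        return None
    finally:
        conn.close()
```

If the behavior described in the post holds, a call such as `probe_post(host, 80, "/flex2gateway/anything")` would return None once the timeout expires, while a path that answers returns its status code. Connection errors (for example a refused connection) are deliberately left to propagate, so they are not mistaken for a hang.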