Question
intermittent connection problems
Hi,
We have two FMS servers running version 2.0.4 r79 on Linux, behind a load balancer that distributes client connections between them.
We have had two problems with them.
1. Failed connections -
About every 2 days, one of the servers stops responding to connection requests. It accepts the socket connection but never completes the protocol handshake.
For comparison, this is a typical response from the FMS server when a connection succeeds:
00000000 02 00 00 00 00 00 04 05 00 00 00 00 00 13 12 d0 |................|
00000010 02 00 00 00 00 00 05 06 00 00 00 00 00 13 12 d0 |................|
00000020 02 02 00 00 00 00 00 0e 04 00 00 00 00 00 08 00 |................|
00000030 00 00 00 00 00 00 01 0a 55 d8 c0 02 00 00 00 00 |........U.......|
00000040 00 06 04 00 00 00 00 00 00 00 00 00 00 03 00 00 |................|
00000050 00 00 00 73 14 00 00 00 00 02 00 07 5f 72 65 73 |...s........_res|
00000060 75 6c 74 00 3f f0 00 00 00 00 00 00 05 03 00 05 |ult.?...........|
00000070 6c 65 76 65 6c 02 00 06 73 74 61 74 75 73 00 04 |level...status..|
00000080 63 6f 64 65 02 00 1d 4e 65 74 43 6f 6e 6e 65 63 |code...NetConnec|
00000090 74 69 6f 6e 2e 43 6f 6e 6e 65 63 74 2e 53 75 63 |tion.Connect.Suc|
000000a0 63 65 73 73 00 0b 64 65 73 63 72 69 70 74 69 6f |cess..descriptio|
000000b0 6e 02 00 15 43 6f 6e 6e 65 63 74 69 6f 6e 20 73 |n...Connection s|
000000c0 75 63 63 65 65 64 65 64 2e 00 00 09 03 00 00 00 |ucceeded........|
000000d0 00 00 1b 14 00 00 00 00 02 00 05 73 65 74 49 64 |...........setId|
000000e0 00 00 00 00 00 00 00 00 00 05 00 41 03 57 20 00 |...........A.W .|
000000f0 00 00 00 |...|
When the error occurs, however, the server sends nothing to the client and the connection simply hangs. There is no rejection or failure message.
The logs do not mention any problem.
Restarting the FMS daemon usually resolves the problem when it occurs.
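To catch the hang as soon as it starts, each server could be probed directly, bypassing the load balancer. The following is only a minimal sketch, not something taken from FMS itself: it assumes the servers listen for plain RTMP on the default port 1935 and answer the standard handshake (a 0x03 version byte followed by 1536 bytes) with a reply that begins with the same version byte. The function name and the result labels are made up for illustration.

```python
import os
import socket

RTMP_VERSION = 0x03      # version byte sent as C0 in a plain RTMP handshake
HANDSHAKE_SIZE = 1536    # size of the C1 block that follows C0


def probe_rtmp_handshake(host, port=1935, timeout=5.0):
    """Classify one connection attempt (hypothetical helper).

    Returns "ok" if the server starts its handshake reply, "hung" if the
    socket is accepted but nothing comes back before the timeout,
    "closed" if the server drops the connection, "unexpected" if the first
    byte is not the RTMP version, and "unreachable" if the TCP connection
    itself cannot be established.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"
    with sock:
        sock.settimeout(timeout)
        # C1: 4-byte timestamp (zero), 4 zero bytes, then random filler.
        c1 = b"\x00" * 8 + os.urandom(HANDSHAKE_SIZE - 8)
        try:
            sock.sendall(bytes([RTMP_VERSION]) + c1)
            first = sock.recv(1)   # S0 should arrive almost immediately
        except socket.timeout:
            return "hung"
        except OSError:
            return "closed"
    if not first:
        return "closed"
    return "ok" if first[0] == RTMP_VERSION else "unexpected"
```

Run from cron against each backend address, a "hung" result would match the symptom described above (socket accepted, no handshake) and could trigger an alert or a daemon restart before clients notice.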
2. Connection count for license enforcement -
Each of our servers has three 150-connection licenses, for a maximum of 450 concurrent connections per server. Even though we rarely have more than 10 concurrent clients on each server, every few days the accumulated connection count reaches the license limit and further connections are rejected. According to the application and access logs, all connections are terminated cleanly, but there still seems to be a resource leak of some kind.
Restarting the service does not solve this problem, but rebooting the server does.
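To put numbers on the claim that every connection is closed, the access log can be tallied and compared with the count the server enforces. This is a rough sketch under stated assumptions: that the access log is in W3C extended format with a "#Fields:" header line, that rows are tab-separated, and that the "x-event" column carries the values "connect" and "disconnect". The function name is hypothetical, and the column layout should be checked against the actual log before trusting the result.

```python
LICENSE_LIMIT = 3 * 150   # three 150-connection licenses per server = 450


def open_sessions(log_lines):
    """Return connects minus disconnects seen in a W3C-style access log.

    Assumes a "#Fields:" header naming an "x-event" column and
    tab-separated data rows (an assumption about the log layout).
    """
    fields = []
    balance = 0
    for raw in log_lines:
        line = raw.rstrip("\r\n")
        if line.startswith("#Fields:"):
            # Remember the column order announced by the header.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or "x-event" not in fields:
            continue
        values = line.split("\t")
        idx = fields.index("x-event")
        if idx >= len(values):
            continue   # truncated row, skip it
        if values[idx] == "connect":
            balance += 1
        elif values[idx] == "disconnect":
            balance -= 1
    return balance
```

If this balance stays near the real client count (around 10) while the server still rejects connections at the 450 limit, the leak is in whatever counter the license check uses rather than in sessions the application knows about.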
Any help would be greatly appreciated.
Udi.
