CF2016 cluster members won't start

New Here, Mar 01, 2018


I have a two-member cluster. If member A is running, member B will not start; if member B is running, member A will not start. The member that fails to start eventually times out (as a Windows service) with this error:

Mar 01, 2018 9:22:16 AM org.apache.catalina.ha.tcp.SimpleTcpCluster send
SEVERE: Unable to send message through cluster sender.
org.apache.catalina.tribes.ChannelException: Operation has timed out(3000 ms.).; Faulty members:tcp://{xxx.xxx.xxx.xxx}:4005;
    at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:102)
    at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:47)
    at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:57)
    at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:78)
    at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:91)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:78)
    at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:92)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:78)
    at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:237)
    at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:190)
    at org.apache.catalina.ha.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:684)
    at org.apache.catalina.ha.session.DeltaManager.sendSessions(DeltaManager.java:1442)
    at org.apache.catalina.ha.session.DeltaManager.handleGET_ALL_SESSIONS(DeltaManager.java:1359)
    at org.apache.catalina.ha.session.DeltaManager.messageReceived(DeltaManager.java:1171)
    at org.apache.catalina.ha.session.DeltaManager.messageDataReceived(DeltaManager.java:929)
    at org.apache.catalina.ha.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:77)
    at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:783)
    at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:764)
    at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:300)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:83)
    at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.messageReceived(TcpFailureDetector.java:116)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:83)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:83)
    at org.apache.catalina.tribes.group.ChannelCoordinator.messageReceived(ChannelCoordinator.java:276)
    at org.apache.catalina.tribes.transport.ReceiverBase.messageDataReceived(ReceiverBase.java:244)
    at org.apache.catalina.tribes.transport.nio.NioReplicationTask.drainChannel(NioReplicationTask.java:213)
    at org.apache.catalina.tribes.transport.nio.NioReplicationTask.run(NioReplicationTask.java:101)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Any ideas what the issue could be?

Correct answer

Community Expert, Mar 01, 2018


This sure sounds like a port conflict of some sort. As for the port 4005 mentioned in the error, I think you'll find that it is the tcpListenPort, specified in the server.xml file within the element:

<Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"

Do you see that in the server.xml for each instance? Is it set to the same value in the server.xml on both instances?

If so, are the instances on the same machine? If they are, what happens if you change that to a different port for each instance? Just give it a shot in the one that is not running now and therefore won't start. Does it then start?
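Before editing server.xml, it may help to confirm that the port really is held by another process on the same machine. The following is only a rough sketch: `ReceiverPortCheck` and `isPortFree` are made-up names, not part of ColdFusion or Tomcat, and the default of 4005 is simply the port from the error above. Note also that the stack trace comes from the `org.apache.catalina.tribes` classes, and in the Tomcat documentation for that style of clustering the Receiver's listen port is set with the `port` attribute, so the setting may appear under that name rather than tcpListenPort.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper for illustration; not part of ColdFusion or Tomcat.
class ReceiverPortCheck {

    // Returns true if a listening socket can be opened on the port right now,
    // meaning no other process on this machine is currently listening on it.
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Usually a java.net.BindException ("address already in use").
            return false;
        }
    }

    public static void main(String[] args) {
        // 4005 is the port named in the error; pass a different one as the first argument.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 4005;
        System.out.println("Port " + port + (isPortFree(port) ? " is free" : " is already in use"));
    }
}
```

Run it on the cluster machine while the working member is up. If the port is reported as already in use, that would be consistent with both instances trying to listen on the same port.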

If so, the next question would of course be what impact this might have. I have not been able to find good enough docs (for CF or Tomcat) to explain that. But the pragmatic question would seem to be simply a) whether the instances now both come up and b) whether failover and replication work for you.
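For point a), one quick check is whether each member's receiver port accepts a TCP connection once both services are started. This is a minimal sketch under assumptions: `MemberReachability` and `canConnect` are hypothetical names, the hosts and ports are whatever your own server.xml files define, and the 3000 ms timeout just mirrors the value in the error message. It only shows that something is listening on the port; it says nothing about whether session replication works.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration; not part of ColdFusion or Tomcat.
class MemberReachability {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Arguments are host/port pairs, one pair per cluster member.
        if (args.length == 0 || args.length % 2 != 0) {
            System.out.println("Usage: java MemberReachability.java <host> <port> [<host> <port> ...]");
            return;
        }
        for (int i = 0; i < args.length; i += 2) {
            String host = args[i];
            int port = Integer.parseInt(args[i + 1]);
            boolean up = canConnect(host, port, 3000);
            System.out.println(host + ":" + port + (up ? " is accepting connections" : " is not reachable"));
        }
    }
}
```

Point b) still needs a real test, for example logging in through one member, stopping it, and checking that the session survives on the other.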

Either way, do let us know. 🙂


/Charlie (troubleshooter, carehart.org)

New Here, Mar 01, 2018


It appears that there was a port conflict between the two members. Thanks, Charlie!
