I'm running 2 Windows 2016 boxes with CF 2018 Ent installed.
We have a dedicated CF instance specifically to run as a websocket server.
It's set up to proxy via IIS.
We estimate that we could have 5,000 - 10,000 concurrent connections to it since our users could be subscribed to multiple channels.
After a fresh restart of CF and IIS, clients can connect to the websocket channel instantly and get a near instant success publish from the channel so you know you're successfully subscribed and updates will start coming in.
After things have been working well for a random amount of time (minutes to days), clients still connect instantly but no longer get the success publish until things are restarted again. This means they aren't truly subscribed to the channel anymore and no updates come through.
The instance resources look fine: good memory usage, healthy garbage collection, and low CPU usage.
We've also been playing with the connection pool numbers.
server.xml max threads = 5000
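When comparing connection pool numbers across boxes, it can help to list what each connector in server.xml actually declares rather than relying on memory. Here is a minimal sketch in Python, assuming the usual Tomcat layout of `<Connector>` elements with a `maxThreads` attribute. The sample XML, its ports, and the function name `connector_settings` are made up for illustration, and this only covers attributes in server.xml, not whatever settings the websocket proxy itself uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real server.xml will have different ports and attributes.
SAMPLE_SERVER_XML = """
<Server port="8007" shutdown="SHUTDOWN">
  <Service name="Catalina">
    <Connector port="8500" protocol="HTTP/1.1" maxThreads="5000" connectionTimeout="20000"/>
    <Connector port="8018" protocol="AJP/1.3"/>
  </Service>
</Server>
"""


def connector_settings(xml_text):
    """Return one dict per <Connector> with its port, protocol, and maxThreads."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for conn in root.iter("Connector"):
        rows.append({
            "port": conn.get("port"),
            "protocol": conn.get("protocol"),
            # None means the attribute is absent, so the connector uses its default.
            "maxThreads": conn.get("maxThreads"),
        })
    return rows


for row in connector_settings(SAMPLE_SERVER_XML):
    print(row)
```

To run it against a real file, read the file's text and pass it to `connector_settings` in place of the sample string.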
We've been knocking our heads against the wall with this for some time now and are hoping to get some help.
Matt, while I wish I could propose some single tweak that would help, there are just too many variables. What I would say is that with a combination of better monitoring of things, as well as close assessment of those various configuration settings (to make sure there's not an issue that's unclear from what you have shared), it SHOULD be possible both to understand what's causing the failing updates/channel communication, and then to identify what setting needs to be tweaked (whether in the connector and its config, the proxy and its config, CF, IIS, the JVM, or perhaps even something else).
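To make "better monitoring" a bit more concrete, here is a minimal sketch (in Python, purely as an illustration) of a probe that waits for the success publish after subscribing and reports how long it took, so you can record the moment the acknowledgements stop arriving instead of hearing about it from users. The function name `wait_for_subscribe_ack`, the `recv` and `is_ack` callables, and the message shapes in the demo are all assumptions, not anything CF provides. You would wire `recv` to whatever websocket client you use.

```python
import time


def wait_for_subscribe_ack(recv, is_ack, timeout_s=5.0):
    """Poll recv() until is_ack(message) is true or timeout_s elapses.

    recv   -- callable returning the next message, or None if nothing has arrived
    is_ack -- callable deciding whether a message is the success publish
    Returns (ok, elapsed_seconds).
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout_s:
        msg = recv()
        if msg is None:
            time.sleep(0.05)  # nothing yet; avoid a busy loop
            continue
        if is_ack(msg):
            return True, time.monotonic() - start
    return False, time.monotonic() - start


# Demo with a fake message feed (the message shapes are hypothetical).
fake_feed = iter([None, {"type": "welcome"}, {"type": "subscribed"}])
ok, elapsed = wait_for_subscribe_ack(
    recv=lambda: next(fake_feed, None),
    is_ack=lambda m: m.get("type") == "subscribed",
    timeout_s=1.0,
)
print("subscribed" if ok else "no acknowledgement", round(elapsed, 2))
```

Run on a schedule and logged with a timestamp, a probe like this would show whether the failure starts abruptly or the acknowledgement time creeps up first, which can be lined up against the CF, IIS, and connector logs.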
If you're at all interested in a helping hand to assess all that, see my carehart.org/consulting page. I hate to drop that as the only solution I can offer, but for now it is. Perhaps someone else will have another suggestion if you prefer to wait for that. But if you want it solved, either we will solve it or you won't pay for any of my time you don't find valuable.
We're having the same issue. It requires a restart to work for a period of time (mostly daily), and we don't know where the setting is to fix this restart issue. Please share if you found a solution for this.
We ended up creating a new CF instance that is used as a dedicated single WS server.
We also transitioned a lot of it to pubnub.
No real solution was ever found. Thanks Adobe.
To be clear, no, you should not need to restart anything. There's always a reason for that and almost always a better solution. And Tuan, you don't clarify what it is that you are restarting. Do you mean CF? The web server? The box they are running on?
And Matt, you never responded to my offer of direct help (which was offered even potentially at no charge). While your workaround may have seemed easier--and I'm glad you're doing well with that alternative--I just want to say again that such problems should be solvable.
And I'll say the same to you, Tuan, with the same offer of direct help if interested, especially if that workaround may not work as well for you.
It's not clear in your respective cases what or where the problem may be. But to Matt's last comment, Adobe often gets the blame for issues which may not at all be of their making. Again, there are a lot of variables in such things.
Let's find the problem and fix it, if we can. As the saying goes, it's better to light one candle than to curse the darkness.
(Or again, perhaps this may catch someone's eye and they'll hop in with the perfect solution. If I had it, I'd offer it. Or perhaps they will ask the perfect question or questions to drive you to the solution here in the forums alone. Again, I do when I can, but in this case I sense it's just not that simple a problem. And since no one else has chimed in during the couple of months since Matt first wrote, that would seem to confirm my suspicion.)
I am aware that your post was from 2021, but can you tell us what patch of 2018 you're on?