Reported by: gregwilkins | Owned by: gregwilkins
From mailing list:
I've come across several issues with load balancers and cometd:
1) A load balancer (or some other network failure) may hide the
failure of a connect request. The result is the connection to
the browser stays open, but no response is ever sent. The full
browser timeout (300s) then has to elapse before the connect fails.
I think we need to implement the timeout method in the binds we
do for connect (open tunnel).
But in order to set that timeout to a reasonable value, we need
to know what that value is! Currently this is private config
on the server.
I'd like to update the spec to allow the server to include
a timeout value in its advice field, so the client can use
that (plus an expected network delay) to set a timeout.
Thus the advice from the server could potentially look like:
and the client could set a timeout a little bigger than 180000
but I'd also like individual clients to be able to specify shorter
timeouts! Thus I'd like a client to be able to say:
and then it will set the bind timeoutSeconds to 30s
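
A minimal sketch of how a client might derive the bind timeout from these values. The function name is hypothetical and not part of dojox.cometd, and it assumes the advised timeout, connectTimeout and expectedNetworkDelay are all in milliseconds:

```javascript
// Hypothetical helper, not part of dojox.cometd.
// Assumption: all values are in milliseconds.
function bindTimeoutMs(advice, connectTimeout, expectedNetworkDelay) {
  // Timeout advised by the server, if any.
  var advised = (advice && typeof advice.timeout === "number") ? advice.timeout : null;
  // A client-side connectTimeout only counts if it is a positive number.
  var hasClient = typeof connectTimeout === "number" && connectTimeout > 0;

  var wait;
  if (advised !== null && hasClient) {
    wait = Math.min(advised, connectTimeout); // the shorter of the two wins
  } else if (advised !== null) {
    wait = advised;
  } else if (hasClient) {
    wait = connectTimeout;
  } else {
    return null; // nothing known: caller falls back to the browser timeout
  }
  // Allow for the expected network delay on top of the long-poll wait.
  return wait + expectedNetworkDelay;
}
```

With an advised timeout of 180000 and no client setting, the result is 180000 plus the network delay. With a client connectTimeout of 30000 the shorter value is used instead.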
However, the client will need to advise the server of this,
so I'd like to modify the spec so that if a client has a connectTimeout
shorter than the advised timeout from the server, it can advise
the server of that by sending advice in a connect request:
The expected network delay is controlled by dojox.cometd.expectedNetworkDelay.
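
As a rough illustration of the client-to-server advice described above, the sketch below builds a connect message and attaches advice only when the client's connectTimeout is shorter than the server's advised timeout. The function name is hypothetical, and the exact shape of the advice object (`{ timeout: ... }`) is an assumption, since the spec change is only proposed here:

```javascript
// Hypothetical helper, not part of dojox.cometd.
// Assumption: timeouts are in milliseconds and advice looks like { timeout: n }.
function buildConnectMessage(clientId, serverAdvice, connectTimeout) {
  var message = {
    channel: "/meta/connect",
    clientId: clientId,
    connectionType: "long-polling"
  };
  var advised = serverAdvice ? serverAdvice.timeout : undefined;
  // Only advise the server when the client wants a shorter timeout
  // than the one the server advised.
  if (typeof connectTimeout === "number" && connectTimeout > 0 &&
      typeof advised === "number" && connectTimeout < advised) {
    message.advice = { timeout: connectTimeout };
  }
  return message;
}
```

A client that is happy with the server's value sends an ordinary connect message with no advice field.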
I've got this implemented, but wanted to check that everybody thinks this
makes sense before I update the spec and check the code in.
2) With a stupid load balancer, it is possible for a connect to go
to one host and a subscribe/publish to go to another. The problem is that
the subscribe/publish will receive an error like 402::unknown client,
but we do nothing with this and everything just hangs as the connect
is never woken up.
We need to decide how to handle bind errors, bind timeouts and
error returns when sending messages.
At the very least, we should cancel the current connection and
try to reconnect with backoff.
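
One possible shape for "cancel and reconnect with backoff", as a sketch only. Every name here (createReconnector, backoffIncrement, maxBackoff, cancelConnect, reconnect, schedule) is hypothetical, and the linear backoff is just one choice:

```javascript
// Hypothetical sketch: none of these names come from dojox.cometd.
function createReconnector(options) {
  var failures = 0; // consecutive failures since the last success

  return {
    // Call on a bind error, a bind timeout, or an error reply
    // such as "402::unknown client".
    onFailure: function () {
      failures += 1;
      options.cancelConnect(); // abort the outstanding connect request
      // Linear backoff, capped at maxBackoff (both in milliseconds).
      var delay = Math.min(failures * options.backoffIncrement, options.maxBackoff);
      options.schedule(options.reconnect, delay);
      return delay;
    },
    // Call after a successful connect so the next failure starts small again.
    onSuccess: function () {
      failures = 0;
    }
  };
}
```

In a browser, `schedule` could simply be `function (fn, ms) { setTimeout(fn, ms); }`. Passing it in keeps the backoff logic testable without real timers.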
We could also provide the failed messages in a local event so
that a client can decide if they should be resent after a renegotiated