
Re: [xmlblaster] Cluster peers

It sounds like a good idea to base the scenario on the proven OSPF
routing protocol.


Michael Lum wrote:
IMO, we would have to look at this as a 'dynamic routing' scenario. If we explicitly pin slaves to masters, then when one of them dies we have the potential for message loss. In other words, if bilbo is only connected to saroman, and saroman dies, messages published to bilbo will be queued until saroman is back up, UNLESS you can reconfigure bilbo on the fly to re-route messages via frodo. It's similar to OSPF routing, where links have a certain cost and you can 'prefer' links via a metric value.

In a HA scenario, if saroman died, fail-safe subscribers would reconnect via the VIP to sauron to resubscribe to their topics (or regain their session if session mirroring is implemented). Next, if frodo died, publishers would now be publishing to bilbo since the load balancer will re-route when the health-check fails. However, if bilbo is pinned to saroman, messages will simply re-queue. Instead, bilbo should have two route costs, such as 'saroman 10' and 'sauron 100', meaning that bilbo prefers saroman for cluster delivery, but will route to sauron if the link is down. This way, messages don't simply queue (and possibly overflow) on bilbo. Also, if the hardware for saroman was unrecoverable and would take 3 days to replace, one could run on the degraded cluster without need for a reconfig, because the route costs would take care of that for you, until you could purchase new hardware.

Finally, adding a new host 'gandalf' would be pretty simple because you simply need to express your routing costs for its message domains.
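
To make the route-cost idea concrete, here is a minimal sketch in Python. It is not xmlBlaster configuration or API; the table layout, the function name pick_route and the cost 50 for 'gandalf' are made up for illustration. Only bilbo's costs ('saroman 10', 'sauron 100') come from the description above.

```python
from typing import Dict, Optional, Set

# Hypothetical route table: node -> {peer: cost}; the lowest cost is preferred.
# Only bilbo's entries are taken from the thread.
ROUTES: Dict[str, Dict[str, int]] = {
    "bilbo": {"saroman": 10, "sauron": 100},
}


def pick_route(node: str, down: Set[str],
               routes: Dict[str, Dict[str, int]] = ROUTES) -> Optional[str]:
    """Return the cheapest peer of `node` that is not in `down`.

    Returns None when no peer is reachable, in which case the caller
    would have to queue the message locally.
    """
    reachable = {peer: cost
                 for peer, cost in routes.get(node, {}).items()
                 if peer not in down}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


if __name__ == "__main__":
    print(pick_route("bilbo", set()))          # normal operation
    print(pick_route("bilbo", {"saroman"}))    # saroman is down
    # Adding a new host only needs a new cost entry (50 is an assumed value).
    ROUTES["bilbo"]["gandalf"] = 50
    print(pick_route("bilbo", {"saroman"}))
```

With saroman down, bilbo falls back to sauron without any reconfiguration; once 'gandalf' is listed with a cost between the two, it becomes the preferred fallback.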

On 03/02/05 00:10, Marcel Ruff wrote:

Hi Michael,

I have put your drawing online so others can view
it as well (it got bounced by the mailing list):


Probably the 'crossing arrows' between

  frodo - saroman


  bilbo - sauron

are too complicated. In such a case the messages would need to convey
the information that they have already reached the other master
via the direct connections frodo->sauron and
bilbo->saroman, and so don't need to be mirrored anymore.
Further, it is questionable whether bilbo or frodo should
have any knowledge of the backend master cluster setup.
This cluster setup could change to 3 or more mirrored master nodes,
or sauron could go down for maintenance while another mirror
node 'gandalf' pops up.
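
To illustrate the bookkeeping the crossing arrows would require, here is a minimal Python sketch. The Message class, its delivered_to field and the mirror_targets function are hypothetical and not part of xmlBlaster. Each message carries the set of masters it has already reached, and a master mirrors only to the ones still missing.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Current mirrored masters as named in this thread.
MASTERS: List[str] = ["sauron", "saroman"]


@dataclass
class Message:
    topic: str
    payload: str
    # Hypothetical header: masters that already hold this message.
    delivered_to: Set[str] = field(default_factory=set)


def mirror_targets(msg: Message, receiving_master: str,
                   masters: List[str] = MASTERS) -> List[str]:
    """Record the receiving master and return the masters still to be mirrored to."""
    msg.delivered_to.add(receiving_master)
    return [m for m in masters if m not in msg.delivered_to]


if __name__ == "__main__":
    # Message that arrived only at sauron: it still has to be mirrored.
    print(mirror_targets(Message("news", "hello"), "sauron"))
    # Message that a slave already sent to saroman directly: nothing left to do.
    print(mirror_targets(Message("news", "hello", {"saroman"}), "sauron"))
```

Note that in this sketch the sender has to fill in delivered_to correctly, which is exactly the knowledge of the master setup that bilbo and frodo would need.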

Conceptually it might be wiser to mirror only at the
master level, sauron<-->saroman. What do you think?