Notes on VisiBroker Naming Service implicit object clusters and stale object references

VisiBroker Naming Service (VBNS) is the CORBA naming service provided by Borland VisiBroker. It supports implicit object clustering, which lets you bind multiple object references under one name. This is not CORBA compliant; however, it allows easy, transparent clustering of CORBA objects without the hassle of going through ClusterManager explicitly.

Several things to note for implicit object cluster mode:
(1) If the property “vbroker.naming.propBindOn” is set to 1, the implicit clustering feature is turned on. By default it is 0.

(2) If the property “vbroker.naming.smrr.pruneStaleRef” is set to 1, a stale object reference that was previously bound to a cluster with the Smart Round Robin criterion will be removed from the bindings when the naming service discovers it.
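
As a quick reference, the two settings above could be written down as a properties fragment like the one below. This is only a sketch: the property names and values come from the notes above, but how the properties are handed to the naming service (a properties file, command-line options, and so on) depends on your installation and is not covered here.

```
# Turn on implicit clustering (default is 0)
vbroker.naming.propBindOn=1

# Remove stale references from Smart Round Robin clusters when discovered
vbroker.naming.smrr.pruneStaleRef=1
```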

The “pruneStaleRef” behavior sometimes causes trouble, though: the Naming Service in implicit clustering mode might erroneously mark an active object reference as stale because a transient network problem made the object unreachable for some time.

For example, the following scenario happened:
- The naming service and the CORBA server are running on the same box.
- Between the CORBA server box and the CORBA client there is a firewall.
- The CORBA client looks up object A from VBNS and successfully gets a reference.
- The CORBA client tries to connect to the server, but is blocked by the firewall and can’t talk to it, so it reports back to the naming service with a “MarkSuspect” message marking the object as stale.
- The naming service then considers object reference A stale and removes (unbinds) it from the cluster.
- Now other CORBA clients can no longer get a reference to object A, although the server is still actively running.
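
The scenario above can be played through with a small toy model. This is not VisiBroker code and does not use any VisiBroker API: `ToyCluster`, `mark_suspect`, and `client_call` are hypothetical names, the round robin is a plain one, and the model assumes the naming service trusts a client's suspect report without checking it. It only shows why one blocked client can hide a healthy server from everyone else.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy model of one name bound to several object references."""
    # Stands in for vbroker.naming.smrr.pruneStaleRef (assumption: 1 -> True).
    prune_stale_ref: bool = True
    refs: list = field(default_factory=list)
    _next: int = 0

    def bind(self, ref):
        self.refs.append(ref)

    def resolve(self):
        """Return the next bound reference in round-robin order."""
        if not self.refs:
            raise LookupError("no object reference bound under this name")
        ref = self.refs[self._next % len(self.refs)]
        self._next += 1
        return ref

    def mark_suspect(self, ref):
        """A client reports that it could not reach `ref`."""
        if self.prune_stale_ref and ref in self.refs:
            self.refs.remove(ref)


def client_call(cluster, can_reach_server):
    """Look up a reference, then try to use it; report it if unreachable."""
    ref = cluster.resolve()
    if not can_reach_server:
        cluster.mark_suspect(ref)
        return None
    return ref


cluster = ToyCluster()
cluster.bind("A")  # the server registers object A

# Client behind the firewall: lookup works, the call to the server does not.
blocked_result = client_call(cluster, can_reach_server=False)

# A second client with a working network path now finds nothing to resolve.
try:
    client_call(cluster, can_reach_server=True)
    healthy_client_ok = True
except LookupError:
    healthy_client_ok = False
```

With `prune_stale_ref=False` in the same toy model, the report from the blocked client leaves the binding in place and the second client still gets reference A.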

The problem above is caused by the firewall, which should not block traffic between the CORBA client and the CORBA server, but the root cause is not easy to identify. A similar problem can also occur because of transient network issues. The article at presents an out-of-band solution for this, but it does not address the problem well, since the code will bind multiple references of the same object in the cluster.
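
One way to think about the duplicate-binding drawback: an out-of-band re-registration would need to check whether the reference is already bound before binding it again. The sketch below shows only that idea in toy form. `rebind_if_missing` is a hypothetical helper, a plain list stands in for the cluster's bindings, and strings stand in for object references; how to enumerate the members of an implicit cluster in VBNS is not covered here, and real object references would need an equivalence check rather than `in`.

```python
def rebind_if_missing(bound_refs, ref):
    """Re-add `ref` to the toy cluster only if it is not already bound.

    Returns True when a binding was added, False when it was already there.
    """
    if ref in bound_refs:
        return False
    bound_refs.append(ref)
    return True


bindings = ["A"]

# Server re-registers while its reference is still bound: nothing changes.
added_while_bound = rebind_if_missing(bindings, "A")

# The reference gets pruned by mistake, then the server re-registers again.
bindings.remove("A")
added_after_prune = rebind_if_missing(bindings, "A")
```

Run periodically by the server, a check like this would restore a wrongly pruned reference without piling up duplicates of it.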


About: mmpower

Software Architect & Soccer Fan 黑超白袜 = IT 民工 + 摇滚大叔
