Friday, November 16, 2012

Persisting discovery responses with TCPPING

I've added a nifty little feature to JGroups which helps people who use TCPPING but can't list all of the cluster nodes in the static list.

So far I've always said that anyone who needs dynamic discovery should use a dynamic discovery protocol such as PING / MPING (require IP multicasting), TCPGOSSIP (requires an external GossipRouter process), FILE_PING (requires a shared file system), S3_PING / AS_PING / SWIFT_PING / RACKSPACE_PING (require running in a cloud) or JDBC_PING (requires a database).

I've also always said that TCPPING is for static clusters, i.e. clusters where the membership is finite and always known beforehand.

However, there are cases where it makes sense to add a little dynamicity to TCPPING, and this is what PDC (Persistent Discovery Cache) does.

PDC is a new protocol that should be placed somewhere between the transport and the discovery protocol, e.g.

    <TCP />

    <PDC cache_dir="/tmp/jgroups"  />

    <TCPPING timeout="2000" num_initial_members="20"
            port_range="0" return_entire_cache="true"
            use_disk_cache="true" />

Here, PDC is placed above TCP and below TCPPING. Note that we need to set use_disk_cache to true in the discovery protocol for it to use the persistent cache.

What PDC does is actually very simple: it intercepts discovery responses and persists them to disk. Whenever a discovery request is received, it also intercepts that request and adds its own results from disk to the response set.
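
To make the idea concrete, here is a minimal sketch of the persist-and-augment behaviour in plain Java. It is not the actual PDC code: the class DiscoveryCacheSketch, its method names and the one-file-per-member ".node" layout are all assumptions made up for illustration, and the real on-disk format may well differ.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; this is NOT the JGroups PDC implementation.
class DiscoveryCacheSketch {
    private static final String SUFFIX = ".node"; // assumed file layout: one file per member
    private final Path cacheDir;

    DiscoveryCacheSketch(Path cacheDir) {
        try {
            this.cacheDir = Files.createDirectories(cacheDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A discovery response passes through: remember the member on disk.
    void persist(String logicalName, String physicalAddress) {
        try {
            Files.write(cacheDir.resolve(logicalName + SUFFIX),
                        physicalAddress.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A discovery request passes through: load every member stored on disk.
    Map<String, String> readAll() {
        Map<String, String> members = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(cacheDir, "*" + SUFFIX)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                name = name.substring(0, name.length() - SUFFIX.length());
                members.put(name, new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return members;
    }

    // Add the cached members to the live response set; live answers win on conflict.
    Map<String, String> augment(Map<String, String> liveResponses) {
        Map<String, String> merged = readAll();
        merged.putAll(liveResponses);
        return merged;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("pdc-sketch");
        DiscoveryCacheSketch cache = new DiscoveryCacheSketch(dir);
        cache.persist("A", "192.168.1.5:7800"); // made-up addresses
        cache.persist("B", "192.168.1.6:7800");
        // A fresh instance (think: a restarted node) still sees both entries.
        System.out.println(new DiscoveryCacheSketch(dir).readAll());
    }
}
```

The point of the sketch is only that the cache survives a restart: a new instance pointed at the same directory gets back everything that was persisted earlier.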

Let's take a look at a use case (with TCPPING) that PDC solves:
  • The membership is {A,B,C}
  • TCPPING.initial_hosts="A"
  • A is started, the cluster is A|0={A}
  • B is started, the cluster is A|1={A,B}
  • C is started, the cluster is A|2={A,B,C}
  • A is killed, the cluster is B|3={B,C}
  • C leaves, the cluster is B|4={B}
  • C joins again
    • Without PDC, it doesn't get a response from A (which is the only node listed in TCPPING.initial_hosts), and forms a singleton cluster C|0={C}
    • With PDC, C finds A and B in its persistent cache and sends a discovery request to both of them. Only B replies, and therefore the new view is B|5={B,C}

The directory in which PDC stores its information is configured with PDC.cache_dir. If multiple cluster nodes are running on the same physical box, they can share that directory.
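
The last step of the use case above (C joining again after A was killed) can be modelled in a few lines of Java. This is a toy model and not JGroups code: the class RejoinScenarioSketch and its methods are invented for illustration, and it simplifies things by assuming that the existing cluster consists of exactly the nodes that respond.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the rejoin scenario; names are illustrative, not JGroups API.
class RejoinScenarioSketch {

    // Nodes the joiner pings: the static initial_hosts plus whatever its disk cache remembers.
    static Set<String> targets(Set<String> initialHosts, Set<String> diskCache) {
        Set<String> all = new TreeSet<>(initialHosts);
        all.addAll(diskCache);
        return all;
    }

    // Only nodes that are actually running answer a discovery request.
    static Set<String> responders(Set<String> targets, Set<String> alive) {
        Set<String> result = new TreeSet<>(targets);
        result.retainAll(alive);
        return result;
    }

    // No responders: the joiner ends up alone. Otherwise it joins the responders
    // (simplification: the existing cluster is assumed to be exactly the responders).
    static Set<String> membershipAfterJoin(String joiner, Set<String> responders) {
        Set<String> members = new TreeSet<>(responders);
        members.add(joiner);
        return members;
    }

    public static void main(String[] args) {
        Set<String> initialHosts = Collections.singleton("A");        // TCPPING.initial_hosts="A"
        Set<String> alive = Collections.singleton("B");               // A was killed, C has left
        Set<String> cacheOfC = new TreeSet<>(Arrays.asList("A", "B")); // what C persisted earlier

        Set<String> withoutPdc =
            responders(targets(initialHosts, Collections.<String>emptySet()), alive);
        Set<String> withPdc = responders(targets(initialHosts, cacheOfC), alive);

        System.out.println("without PDC: " + membershipAfterJoin("C", withoutPdc)); // [C]
        System.out.println("with PDC:    " + membershipAfterJoin("C", withPdc));    // [B, C]
    }
}
```

Without the cache, the only target is A, which is down, so C ends up as a singleton. With the cache, B is among the targets and answers, so C joins B.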

Feedback is appreciated on the mailing list!