Brussels / 1 & 2 February 2020


SWIM - Protocol to Build a Cluster

SWIM gossip protocol, its implementation, and improvements

SWIM - is a relatively new protocol to discover and monitor cluster nodes, to disseminate events and data between them. The protocol is extremely lightweight, decentralised, and its speed and load per node do not depend on cluster size.

The protocol solves several tasks at once. First - build and keep up to date topology of a cluster without explicit configuration. The task is quite intricate because:

  • new just started nodes know nothing about others, and they should somehow discover them;
  • already working nodes can fail, and it should be detected so as to change a master, or evict an unrecoverable node from the cluster, or restart it.

According to the protocol, cluster nodes broadcast packets and send p2p ping requests. Broadcast helps to discover new nodes, p2p pings help to detect failure of a known node.

A second task - events dissemination in a cluster. Event is a node failure; UUID change; IP address update; new node appearance - anything that affects cluster state. Sometimes users define their own event types. When a node learns about an event, it needs to disseminate the event to other nodes. SWIM protocol describes an algorithm how to detect and disseminate events, and gives the following guarantees:

  • it takes a constant time to learn about an event on at least one node in the cluster;
  • it takes logarithmic from cluster size time to disseminate that event to each node of the cluster.

In the talk I tell about how SWIM works, how and with which essential improvements it was implemented, how to use SWIM, and what are the practical performance results.

Implementation is a part of Tarantool DBMS. Tarantool is the biggest Russian Open-Source DBMS. Tarantool currently goes toward better scalability, improvements in horizontal scaling, in cluster-wide calculations, and better cluster management. In scope of that roadmap SWIM protocol implementation was recently released.


Photo of Vladislav Shpilevoy Vladislav Shpilevoy