Cluster and Distributed Consensus
Cluster Computing is a form of distributed computing where each node set to perform the same task. The nodes usually located in the same local area network, each of them hosted on separated virtual machine or container. The input task can be distributed to the target node by load balancer or leader node. If leader node is required then cluster should use Distributed Consensus Algorithm to select exactly one leader or re-elect it if leader failed. Additionally, consensus-enabled cluster can be used to organize fault-tolerant set of microservices where only one service (leader) can be active and perform specific operations while other are in standby mode. If active node is failed then one of the standby nodes becomes active.
.NEXT cluster programming model provides the following features in addition to the core model:
- Messaging
- Replication
- Consensus
The programming model at higher level of abstraction is represented by interfaces:
- IClusterMember represents individual node in the cluster
- ICluster represents entire cluster. This is an entry point to work with cluster using .NEXT library.
- IExpandableCluster optional interface that extends
ICluster
and represents dynamically configurable cluster where the nodes can be added or removed on-the-fly. If actual implementation doesn't support this interface then cluster can be configured only statically - it is required to shutdown entire cluster if you want to add or remove nodes - IMessageBus optional interface provides message-based communication between nodes in point-to-point manner
- IReplicationCluster optional interface represents a cluster where its state can be replicated across nodes to ensure consistency between them. Replication functionality based on audit trail. By default, design of replication infrastructure supports Weak Consistency.
Thereby, core model consists of two interfaces: ICluster
and IClusterMember
. Other interfaces are extensions of the core model.
Messaging
Messaging feature allows to organize point-to-point communication between nodes where individual node is able to send the message to any other node. The discrete unit of communication is represented by IMessage interface which is transport- and protocol-agnostic. The actual implementation should provide protocol-specific serialization and deserialization of such messages.
There are two types of messages:
- Request-Reply message is similar to RPC call when caller should wait for the response. The response payload is represented by
IMessage
- One Way (or signal) message doesn't have response. It can be delivered in two ways: 1.1. With confirmation, when sender waiting for acknowledge from receiver side. As a result, it is possible to ensure that message is processed by receiver. 1.1. Without confirmation, when sender doesn't wait for acknowledge. Such kind of delivery is not reliable but very performant.
The message can be transferred to the particular member using ISubscriber interface which is the extension of IClusterMember
interface.
Usually, you don't to implement IMessage
interface directly due to existence of ready-to-use realizations:
- BinaryMessage for raw binary content
- StreamMessage for message which payload is represented by Stream. It it suitable for large payload when it is stored on the disk
- TextMessage for textual content
Distributed Consensus
Consensus Algorithm allows to achieve overall reliability in the presence of faulty nodes. The most commonly used consensus algorithms are:
The consensus algorithm allows to choose exactly one leader node in the cluster.
.NEXT library provides protocol-agnostic implementation of Raft algorithm that can be adopted for any real network protocol. You can reuse this implementation which is located in DotNext.Net.Cluster.Consensus.Raft namespace. If you want to know more about Raft then use the following links:
- The Raft Consensus Algorithm
- The Secret Lives of Data
- In Search of an Understandable Consensus Algorithm
- Dissertation
Replication
Replication allows to share information between nodes to ensure consistency between them. Usually, consensus algorithm covers replication process. In .NEXT library, replication functionality relies on the fact that each cluster node has its own persistent audit trail (or transaction log). However, the only default implementation of it is in-memory log which is suitable in siutations when your distributed application requires distributed consensus only and don't have distributed state that should be synchronized across cluster. If you need reliable replication then provide your own implementation of IAuditTrail interface or use PersistentState class.
IReplicationCluster interface indicates that the specific cluster implementation supports state replication across cluster nodes. It exposed access to the audit trail used to track local changes and commits on other cluster nodes.
Implementations
- .NEXT Raft Suite is a fully-featured implementation of Raft algorithm and related infrastructure.