Abstract
Replicating important services on multiple processors in a distributed architecture is a common technique for constructing dependable computing systems. The authors describe a communication substrate, called Consul, that facilitates the development of such systems by providing a collection of fundamental abstractions for constructing fault-tolerant programs based on replicated processing. These abstractions include a multicast service, a membership service, and a recovery service. Consul is unique in two respects. First, its services are implemented using a collection of algorithms that exploit the partial (or causal) ordering of messages exchanged in the system. Such algorithms are generally more efficient than those that depend on a total ordering of events. Second, its underlying architecture is configurable, thereby allowing a system to be structured according to the needs of the application. The paper sketches Consul's architecture, presents the algorithms used by its protocols, and reports on the performance of an implementation using the x-kernel.