QuePaxa: Escaping the Tyranny of Timeouts in Consensus
Pasindu Tennage, Cristina Basescu, Eleftherios Kokoris-Kogias,
Ewa Syta, Philipp Jovanovic, Vero Estrada-GaliƱanes, and Bryan Ford
29th ACM Symposium on Operating Systems Principles
October 23-26, 2023
Abstract:
Leader-based consensus algorithms are fast and efficient under normal
conditions, but lack robustness to adverse conditions due to their reliance
on timeouts for liveness. We present QuePaxa, the first protocol offering
state-of-the-art normal-case efficiency without depending on timeouts. QuePaxa
uses a novel randomized asynchronous consensus core to tolerate adverse
conditions such as denial-of-service (DoS) attacks, while a one-round-trip fast
path preserves the normal-case efficiency of Multi-Paxos or Raft. By allowing
simultaneous proposers without destructive interference, and using short
hedging delays instead of conservative timeouts to limit redundant effort,
QuePaxa permits rapid recovery after leader failure without risking costly view
changes due to false timeouts. By treating leader choice and hedging delay as
a multi-armed-bandit optimization, QuePaxa achieves responsiveness to
prevalent conditions, and can choose the best leader even if the current one
has not failed. Experiments with a prototype confirm that QuePaxa achieves
normal-case LAN and WAN performance of 600k and 250k cmd/sec in throughput,
respectively, comparable to Multi-Paxos. Under conditions such as DoS
attacks, misconfigurations, or slow leaders that severely impact existing
protocols, we find that QuePaxa remains live with median latency under 380ms in
WAN experiments.
Full paper:
PDF
ACM