DOI: 10.1145/3361525.3361536

Generalized Consensus for Practical Fault Tolerance

Published: 09 December 2019

Abstract

Despite extensive research on Byzantine Fault Tolerant (BFT) systems, the overheads associated with such solutions preclude widespread adoption. Past efforts such as the Cross Fault Tolerance (XFT) model address this problem by making the weaker assumption that a majority of nodes are correct and communicate synchronously. Although XPaxos of Liu et al., which applies the XFT model, achieves performance similar to that of Paxos, it does not scale with the number of faults. Its reliance on a single leader also introduces considerable downtime when failures occur. We present Elpis, the first multi-leader XFT consensus protocol. By adopting the Generalized Consensus specification, we devise a multi-leader protocol that exploits the commutativity inherent in the commands ordered by the system. During periods of synchrony, Elpis maps accessed objects to non-faulty replicas; these replicas then order all commands that access those objects. The experimental evaluation confirms the effectiveness of this approach: Elpis achieves up to 2x speedup over XPaxos and up to 3.5x speedup over state-of-the-art Byzantine fault-tolerant consensus protocols.
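
The two ideas named in the abstract, commutativity of commands and a mapping from objects to replicas, can be illustrated with a minimal sketch. It is not the Elpis protocol: the names `Command`, `commute`, `route`, and `owner_of` are hypothetical, commutativity is approximated as "the commands access no common object", and fault handling and ownership changes are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    name: str
    objects: frozenset  # identifiers of the objects the command accesses


def commute(a: Command, b: Command) -> bool:
    """Assumption: two commands commute if they access no object in common."""
    return a.objects.isdisjoint(b.objects)


def route(commands, owner_of):
    """Group commands by the single replica that owns all objects they access.

    owner_of is a hypothetical map from object identifier to replica identifier.
    Commands whose objects are spread over several owners are returned
    separately; this sketch does not say how such commands are resolved.
    """
    per_owner = {}
    multi_owner = []
    for cmd in commands:
        owners = {owner_of[obj] for obj in cmd.objects}
        if len(owners) == 1:
            per_owner.setdefault(owners.pop(), []).append(cmd)
        else:
            multi_owner.append(cmd)
    return per_owner, multi_owner


owner_of = {"x": "r1", "y": "r2"}
put_x = Command("put_x", frozenset({"x"}))
put_y = Command("put_y", frozenset({"y"}))
swap_xy = Command("swap_xy", frozenset({"x", "y"}))

per_owner, multi_owner = route([put_x, put_y, swap_xy], owner_of)
```

In this sketch `put_x` and `put_y` commute and are routed to different replicas, so neither needs to be ordered against the other, while `swap_xy` touches objects with two different owners and is set aside.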

References

[1]
2012. AWS Service Event in the US-East Region: October 22, 2012. (2012). https://aws.amazon.com/message/680342/
[2]
2018. cockroach: CockroachDB - the open source, cloud-native SQL database. https://github.com/cockroachdb/cockroach original-date: 2014-02-06T00:18:47Z.
[3]
2018. etcd: Distributed reliable key-value store for the most critical data of a distributed system. https://github.com/coreos/etcd original-date: 2013-07-06T21:57:21Z.
[4]
2019. Google App Engine: 02 January 2019. (2019). https://status.cloud.google.com/incident/appengine/19001
[5]
2019. Google Compute Engine: November 05, 2018. (2019). https://status.cloud.google.com/incident/compute/18012
[6]
2019. Home | YugaByte DB. https://www.yugabyte.com/. Accessed: 2019-09-04.
[7]
Balaji Arun, Sebastiano Peluso, Roberto Palmieri, Giuliano Losa, and Binoy Ravindran. 2017. Speeding up Consensus by Chasing Fast Decisions. arXiv:1704.03319 [cs] (April 2017). http://arxiv.org/abs/1704.03319
[8]
Pierre-Louis Aublin, Rachid Guerraoui, Nikola Knežević, Vivien Quéma, and Marko Vukolić. 2015. The Next 700 BFT Protocols. ACM Trans. Comput. Syst. 32, 4 (Jan. 2015), 12:1--12:45. https://doi.org/10.1145/2658994
[9]
Bela Ban. 2002. JGroups, a toolkit for reliable multicast communication. (2002).
[10]
Alysson Neves Bessani and Marcel Santos. 2011. BFT-SMaRt: High-performance Byzantine-fault-tolerant state machine replication.
[11]
Eric A Brewer. 2000. Towards robust distributed systems. In PODC, Vol. 7.
[12]
Mike Burrows. 2006. The Chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th symposium on Operating systems design and implementation. USENIX Association, 335--350.
[13]
Brad Calder, Ju Wang, Aaron Ogus, Niranjan Nilakantan, Arild Skjolsvold, Sam McKelvie, Yikang Xu, Shashwat Srivastav, Jiesheng Wu, Huseyin Simitci, et al. 2011. Windows Azure Storage: a highly available cloud storage service with strong consistency. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles. ACM, 143--157.
[14]
Apache Cassandra. 2014. Apache Cassandra. Website. Available online at http://planetcassandra.org/what-is-apache-cassandra (2014), 13.
[15]
Miguel Castro and Barbara Liskov. 1999. Practical Byzantine fault tolerance. In OSDI, Vol. 99. 173--186.
[16]
James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford. 2013. Spanner: Google's Globally Distributed Database. ACM Trans. Comput. Syst. 31, 3 (Aug. 2013), 8:1--8:22. https://doi.org/10.1145/2491245
[17]
Patrick Hunt, Mahadev Konar, Flavio P Junqueira, and Benjamin Reed. 2010. ZooKeeper: Wait-free coordination for Internet-scale systems. (2010), 14.
[18]
Ramakrishna Kotla, Lorenzo Alvisi, Mike Dahlin, Allen Clement, and Edmund Wong. 2007. Zyzzyva: speculative byzantine fault tolerance. In ACM SIGOPS Operating Systems Review, Vol. 41. ACM, 45--58.
[19]
Leslie Lamport. 2001. Paxos made simple. ACM Sigact News 32, 4 (2001), 18--25.
[20]
Leslie Lamport. 2002. Specifying systems: the TLA+ language and tools for hardware and software engineers. Addison-Wesley Longman Publishing Co., Inc.
[21]
Leslie Lamport. 2005. Generalized Consensus and Paxos. (2005), 63.
[22]
Leslie Lamport. 2006. Fast Paxos. Distributed Computing 19 (Oct. 2006). https://www.microsoft.com/en-us/research/publication/fast-paxos/
[23]
Leslie Lamport, Robert Shostak, and Marshall Pease. 1982. The Byzantine Generals Problem. ACM Transactions on Programming Languages and Systems 4, 3 (July 1982), 382--401. https://doi.org/10.1145/357172.357176
[24]
Shengyun Liu, Paolo Viotti, Christian Cachin, Vivien Quéma, and Marko Vukolić. 2016. XFT: Practical Fault Tolerance beyond Crashes. In OSDI. 485--500.
[25]
J-P Martin and Lorenzo Alvisi. 2006. Fast byzantine consensus. IEEE Transactions on Dependable and Secure Computing 3, 3 (2006), 202--215.
[26]
Iulian Moraru, David G. Andersen, and Michael Kaminsky. 2013. There is more consensus in Egalitarian parliaments. ACM Press, 358--372. https://doi.org/10.1145/2517349.2517350
[27]
Diego Ongaro and John K. Ousterhout. 2014. In search of an understandable consensus algorithm. In USENIX Annual Technical Conference. 305--319.
[28]
S. Peluso, A. Turcu, R. Palmieri, G. Losa, and B. Ravindran. 2016. Making Fast Consensus Generally Faster. In 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). 156--167. https://doi.org/10.1109/DSN.2016.23
[29]
Fred B Schneider. 1993. Replication management using the state-machine approach, Distributed systems. (1993).
[30]
Swaminathan Sivasubramanian. 2012. Amazon dynamoDB: a seamlessly scalable non-relational database service. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. ACM, 729--730.
[31]
Mohammad Reza Khalifeh Soltanian and Iraj Sadegh Amiri. 2016. Chapter 1 - Introduction. In Theoretical and Experimental Methods for Defending Against DDOS Attacks, Mohammad Reza Khalifeh Soltanian and Iraj Sadegh Amiri (Eds.). Syngress, 1--5. https://doi.org/10.1016/B978-0-12-805391-1.00001-8

Published In

Middleware '19: Proceedings of the 20th International Middleware Conference
December 2019
342 pages
ISBN:9781450370097
DOI:10.1145/3361525
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Blockchain
  2. Byzantine Fault Tolerance
  3. Collision Recovery
  4. Consensus
  5. Generalized Consensus

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 203 of 948 submissions, 21%
