skip to main content
article
Free access

The Totem single-ring ordering and membership protocol

Published: 01 November 1995 Publication History

Abstract

Fault-tolerant distributed systems are becoming more important, but in existing systems, maintaining the consistency of replicated data is quite expensive. The Totem single-ring protocol supports consistent concurrent operations by placing a total order on broadcast messages. This total order is derived from the sequence number in a token that circulates around a logical ring imposed on a set of processors in a broadcast domain. The protocol handles reconfiguration of the system when processors fail and restart or when the network partitions and remerges. Extended virtual synchrony ensures that processors deliver messages and configuration changes to the application in a consistent, systemwide total order. An effective flow control mechanism enables the Totem single-ring protocol to achieve message-ordering rates significantly higher than the best prior total-ordering protocols.

References

[1]
AMiR, Y, DOLEV, D., KRAMER, S., AND MALKI, D. 1992a. Membership algorithms in broadcast domains In Proceedings of the 6th Internattonal Workshop on Distributed Algorithms (Haifa, Israel). Spnnger-Verlag, Berlin, 292-312.
[2]
AMm, Y, DOLEV, D., KRAIvlER, S., AND MALKI, D. 1992b. Transis: A commumcation subsystem for high avafiabfilty. In Proceedings of the IEEE 22nd Annual Internatmnal Svmposiiim on Fault-Tolerant Computing (Boston, Mass). IEEE, New York, 76-84
[3]
A~nR, Y, MOSER, L E., MELLIAR-SMITH, P. M., AGARWAL, D. A., AND CIARFELLA, P. 1993 Fast message ordering and membership using a logical token-passing ring. In Proceedings of the IEEE 13th International Conference on Distributed Computing Systems (Pittsburgh, Pa) IEEE, New York, 551-560.
[4]
A~nn, Y, MOSER, L. E, MELLIAR-SMITH, P. M., AGARWAL, D. A, A~'4O CIAnFELLA, P. 1994. The Totem single-ring ordering and membership protocol. Tech Rep. 94-19, Dept of Electrical and Computer Engnneermg, Umv. of California, Santa Barbara, Calif. Aug.
[5]
BIRMAN K. P AND VAN RENESSE, R. 1994 Reliable Distributed Computing w~th the Isis Toolkit. IEEE Computer Somety Press, Los Alamltos, Calif
[6]
BOXMA, O. J., LEVY J., AND WESTRATE, J.A. 1990. Optimization of polling systems. In Performance '90, Proceedings of the 14th IFIP WG 7.3 International Symposium on Computer Perforrnarice Modelhng, Measurement and Evaluation (Edinburg, U.K ). North-Holland, Amsterdam, 349 361.
[7]
CHANG, J. M. AND Mg24EMCHUK, N F. 1984. Reliable broadcast protocols ACM Trans Comput Syst. 2, 3 (Aug.), 251 273.
[8]
FISCHER, M. J., LYNCH, N. A., AND PATERSON, M S. 1985. Impossibility of distributed consensus with one faulty process J. ACM 32, 2 (Apr), 374-382
[9]
K~SHOEK, M F. ANn TANENnAU~, A. S. 1991. Group communicatmn m the Amoeba distnbuted operating system. In Proceedings of the IEEE lltb International Conference on Distributed Computing Systems (Arlington, Tax ). IEEE, New York, 882 891.
[10]
LAMPORT, L 1978. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (July), 558-565.
[11]
MELLIAR-SMITH, P. M., MOSER, L. E., AND AGARWAL, D.A. 1991. Ring-based ordering protocols. In Proceedings of the IEE International Conference on Information Engineering (Singapore) IEE, Stevenage, Harts, U.K, 882 891
[12]
MELLIAn-SMITn, P. M., MOSE~, L. E., ANi~ A(n~AWALA, V. 1990. Broadcast protocols for d~stributed sytems. IEEE Trans. Parallel D*strib. Syst. 1, 1 (Jan.), 17-25.
[13]
MISHRA, S., PETERSON, L. L., AND SCHLICHTING, R.D. 1991. A membership protocol based on partml order. In Procee&ngs of the 2rid IFIP WG 10.4 Internatzonal Working Conference on Dependable Computing for Cr~tzcal Appl~catmns (Tucson, Ariz.). Springer-Verlag, Wien, Austria, 309-331.
[14]
MosE~, L. E ANn MELLIAR-SMITH, P.M. 1994. Probablhstm bounds on message delivery for the Totem single-rang protocol. In Proceedings of the IEEE 15th Real-T~me Systems Symposium (San Juan, Puerto Rico). IEEE, New York, 238-248.
[15]
MOSEn, L. E., AMIn, Y., MnLLIAn-SMITn, P M., AND AOARWAL, D.A. 1994a. Extended virtual synchrony. In Proceedings of the IEEE 14th International Conference on Dzstrlbuted Computing Systems (Posnan, Poland). IEEE, New York, 56-65.
[16]
MOSER, L. E., MELL~AR-SMn'H, P. M., AND AGRAWALA, V. 1994b. Processor membership in asynchronous distributed systems IEEE Trans Parallel Dlstrzb. Syst. 5, 5 (May), 459-473.
[17]
PETERSON, L. L, BUCHHOLZ, N. C., AND 8CHLICHTING, R.D. 1989. Preaerving and using context information in interprocess communication. ACM Trans Comput. Syst 7, 3 (Aug.), 217-246.
[18]
RAJAGOPALAN, B. AND McKINLEY, P. K. 1989. A token-based protocol for reliable, ordered multicast communication In Proceedings of the IEEE 8th Symposium on Reliable Distributed Systems (Seattle, Wash ). IEEE, New York, 84-93.
[19]
VAN RENESSE, R., HICKEY, T. M., AND BIRMAN, K. P 1994. Design and performance of Horus: A lightwmght group communications system Tech Rep. 94-1442, Dept. of Computer Science, Cornell Univ., Ithaca, N.Y. Aug.

Cited By

View all

Recommendations

Reviews

Valentin Cristea

Totem is a single-ring protocol for high-performance, fault-tolerant distributed systems that must continue to operate despite network partitioning and re-merging and despite processor failure and restart. Totem provides totally ordered message delivery with good performance using a logical token-passing ring imposed on a broadcast domain. After an introductory section, the authors present related work and highlight the differences of the Totem protocol. Significant literature on the subject is analyzed. Section 3 is dedicated to the distributed system model used in the Totem protocol design. Several terms related to protocol functioning are defined. The objective of Totem is to provide the application with reliable message delivery and membership services. These services are described in section 4 of the paper. Section 5 is devoted to the total ordering protocol with the assumptions that the token is never lost; processor failures do not occur; and the network does not become partitioned. In section 6, the conditions are relaxed, and the protocol to handle token loss, processor failure and restart, and network partitioning and re-merging is presented. The protocol is described using a finite-state machine model. Data structures used, as well as pseudocode for the work performed by processors during different states of the model, are also given. Sections 7 and 8 present the recovery protocol that maintains extended virtual synchrony during recovery after a failure, and the flow control mechanism that avoids message loss due to buffer overflow. Section 9 addresses implementation and performance. Future work is mentioned at the end of the paper. The paper is well structured, but the presentation is not uniform, some aspects being described in great detail, while others are quickly summarized. The important works on the subject are included as references.

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Computer Systems
ACM Transactions on Computer Systems  Volume 13, Issue 4
Nov. 1995
88 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/210223
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 November 1995
Published in TOCS Volume 13, Issue 4

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. flow control
  2. membership
  3. reliable delivery
  4. token passing
  5. total ordering
  6. virtual synchrony

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)424
  • Downloads (Last 6 weeks)32
Reflects downloads up to 06 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media