Management of a remote backup copy for disaster recovery

Published: 01 May 1991

Abstract

A remote backup database system tracks the state of a primary system, taking over transaction processing when disaster hits the primary site. The primary and backup sites are physically isolated so that failures at one site are unlikely to propagate to the other. For correctness, the execution schedule at the backup must be equivalent to that at the primary. When the primary and backup sites contain a single processor, it is easy to achieve this property. However, this is harder to do when each site contains multiple processors and sites are connected via multiple communication lines. We present an efficient transaction processing mechanism for multiprocessor systems that guarantees this and other important properties. We also present a database initialization algorithm that copies the database to a backup site while transactions are being processed.

Reviews

Peter John Trueman

Two multicomputer systems are linked by multiple communication paths, and each has a partitioned relational database. The primary system processes the database transactions, and the other acts as a hot standby that will take over the processing in the event of the primary system failing. This paper describes the decentralized algorithm used to ensure that the backup system's database is up to date and consistent. The algorithm is intended for use in applications that have a lot of transactions and need a quick response time. So, rather than use an expensive two-phase commit protocol to ensure that a transaction atomically updates both systems, the transaction simply commits and then propagates to the backup. This means that when the primary fails, some committed transactions may be lost and others may have to be discarded to maintain consistency; for example, if the transaction that created a bank account is lost, updating that account would be wrong. This risk is deemed to be economically acceptable given the performance requirement; if the consequences of a lost transaction are high, however, it is possible to use an atomic commit over both systems. After some good background and introductory sections, the paper defines what it means for the backup system to be consistent, and then describes how the backup system is initialized and updated, and argues that this process results in a consistent backup. Although the paper is lengthy, it is not too long; it is well structured and clear. The reader who wants just as much information as can be easily remembered can read the first ten pages, leaving the rest for the serious student. The paper is well worth reading.
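
The bank-account example in the review can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm: it assumes an omniscient view of the primary's commit order and of each transaction's read and write sets, which a real backup would not have for lost transactions. The names `Txn` and `take_over` are hypothetical. It shows which transactions a consistent backup may keep when some committed transactions never arrived.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    """Hypothetical record of one transaction committed at the primary."""
    tid: int                                     # position in primary commit order
    writes: dict = field(default_factory=dict)   # key -> new value
    reads: frozenset = frozenset()               # keys read but not written


def take_over(commit_order, received):
    """Replay the primary's history at the backup after a disaster.

    commit_order: list of Txn in the order they committed at the primary.
    received: set of tids whose log records reached the backup.
    A transaction is discarded if it was lost, or if it touches a key
    written by an earlier discarded transaction (it depends on lost work).
    """
    db, tainted = {}, set()
    applied, discarded = [], []
    for t in commit_order:
        touched = set(t.reads) | set(t.writes)
        if t.tid not in received or touched & tainted:
            tainted |= set(t.writes)   # later dependents must go too
            discarded.append(t.tid)
        else:
            db.update(t.writes)
            applied.append(t.tid)
    return db, applied, discarded


history = [
    Txn(1, {"acct:42": 0}),                           # creates the account
    Txn(2, {"acct:42": 100}, frozenset({"acct:42"})), # deposits into it
    Txn(3, {"acct:7": 50}),                           # unrelated update
]
# Transaction 1 committed at the primary but never reached the backup.
print(take_over(history, received={2, 3}))
# ({'acct:7': 50}, [3], [1, 2])
```

Transaction 2 arrived intact but is still discarded, because applying a deposit to an account whose creation was lost would leave the backup in a state the primary never passed through.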

Published In

ACM Transactions on Database Systems, Volume 16, Issue 2
June 1991
160 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/114325
Editor: Gio Wiederhold

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. database initialization
  2. hot spare
  3. hot standby
  4. remote backup

Qualifiers

  • Article
