skip to main content
research-article

Bigtable: A Distributed Storage System for Structured Data

Published: 01 June 2008 Publication History

Abstract

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this article, we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.

References

[1]
Abadi, D. J., Madden, S. R., and Ferreira, M. C. 2006. Integrating compression and execution in column-oriented database systems. Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York.
[2]
Ailamaki, A., DeWitt, D. J., Hill, M. D., and Skounakis, M. 2001. Weaving relations for cache performance. The VLDB J. 169--180.
[3]
Banga, G., Druschel, P., and Mogul, J. C. 1999. Resource containers: A new facility for resource management in server systems. In Proceedings of the 3rd Symposium on Operating Systems Design and Implementation. 45--58.
[4]
Baru, C. K., Fecteau, G., Goyal, A., Hsiao, H., Jhingran, A., Padmanabhan, S., Copeland, G. P., and Wilson, W. G. 1995. DB2 parallel edition. IBM Syst. J. 34, 2, 292--322.
[5]
Bavier, A., Bowman, M., Chun, B., Culler, D., Karlin, S., Peterson, L., Roscoe, T., Spalink, T., and Wawrzoniak, M. 2004. Operating system support for planetary-scale network services. In Proceedings of the 1st Symposium on Networked Systems Design and Implementation. 253--266.
[6]
Bentley, J. L. and McIlroy, M. D. 1999. Data compression using long common strings. In Data Compression Conference. 287--295.
[7]
Bloom, B. H. 1970. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13, 7, 422--426.
[8]
Burrows, M. 2006. The Chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation. 335--350.
[9]
Chandra, T., Griesemer, R., and Redstone, J. 2007. Paxos made live --- An engineering perspective. In Proceedings of PODC.
[10]
Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., and Gruber, R. E. 2006. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation. 205--218.
[11]
Comer, D. 1979. Ubiquitous B-tree. Computing Surveys 11, 2 (June), 121--137.
[12]
Copeland, G. P., Alexander, W., Boughter, E. E., and Keller, T. W. 1988. Data placement in Bubba. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 99--108.
[13]
Dean, J. and Ghemawat, S. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation. 137--150.
[14]
DeWitt, D., Katz, R., Olken, F., Shapiro, L., Stonebraker, M., and Wood, D. 1984. Implementation techniques for main memory database systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 1--8.
[15]
DeWitt, D. J. and Gray, J. 1992. Parallel database systems: The future of high performance database systems. Commun. ACM 35, 6 (June), 85--98.
[16]
French, C. D. 1995. One size fits all database architectures do not work for DSS. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 449--450.
[17]
Gawlick, D. and Kinkade, D. 1985. Varieties of concurrency control in IMS/VS fast path. Datab. Eng. Bull. 8, 2, 3--10.
[18]
Ghemawat, S., Gobioff, H., and Leung, S.-T. 2003. The Google file system. In Proceedings of the 19th ACM Symposium on Operating Systems Principles. ACM, New York, 29--43.
[19]
Gray, J. 1978. Notes on database operating systems. In Operating Systems --- An Advanced Course. Lecture Notes in Computer Science, vol. 60. Springer-Verlag, ACM, New York.
[20]
Greer, R. 1999. Daytona and the fourth-generation language Cymbal. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 525--526.
[21]
Hagmann, R. 1987. Reimplementing the Cedar file system using logging and group commit. In Proceedings of the 11th Symposium on Operating Systems Principles. 155--162.
[22]
Hartman, J. H. and Ousterhout, J. K. 1993. The Zebra striped network file system. In Proceedings of the 14th Symposium on Operating Systems Principles. ACM, New York, 29--43.
[23]
kx.com. kx.com/products/database.php. Product page.
[24]
Lamport, L. 1998. The part-time parliament. ACM Trans. Comput. Syst. 16, 2, 133--169.
[25]
MacCormick, J., Murphy, N., Najork, M., Thekkath, C. A., and Zhou, L. 2004. Boxwood: Abstractions as the foundation for storage infrastructure. In Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation. 105--120.
[26]
McCarthy, J. 1960. Recursive functions of symbolic expressions and their computation by machine. Commun. ACM 3, 4 (Apr.), 184--195.
[27]
O'Neil, P., Cheng, E., Gawlick, D., and O'Neil, E. 1996. The log-structured merge-tree (LSM-tree). Acta Inf. 33, 4, 351--385.
[28]
oracle.com. www.oracle.com/technology/products/database/clustering/index.html. Product page.
[29]
Pike, R., Dorward, S., Griesemer, R., and Quinlan, S. 2005. Interpreting the data: Parallel analysis with Sawzall. Scientific Programming Journal 13, 4, 227--298.
[30]
Ratnasamy, S., Francis, P., Handley, M., Karp, R., and Shenker, S. 2001. A scalable content-addressable network. In Proceedings of SIGCOMM. ACM, New York, 161--172.
[31]
Rowstron, A. and Druschel, P. 2001. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In Proceedings of Middleware 2001. 329--350.
[32]
sensage.com. sensage.com/products-sensage.htm. Product page.
[33]
Stoica, I., Morris, R., Karger, D., Kaashoek, M. F., and Balakrishnan, H. 2001. Chord: A scalable peer-to-peer lookup service for Internet applications. In Proceedings of SIGCOMM. ACM, New York, 149--160.
[34]
Stonebraker, M. 1986. The case for shared nothing. Datab. Eng. Bull. 9, 1 (Mar.), 4--9.
[35]
Stonebraker, M., Abadi, D. J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S., O'Neil, E., O'Neil, P., Rasin, A., Tran, N., and Zdonik, S. 2005. C-Store: A column-oriented DBMS. In Proceedings of the 10th International Conference on Very Large Data Bases. ACM, New York, 553--564.
[36]
Stonebraker, M., Aoki, P. M., Devine, R., Litwin, W., and Olson, M. A. 1994. Mariposa: A new architecture for distributed data. In Proceedings of the 10th International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA, 54--65.
[37]
sybase.com. www.sybase.com/products/databaseservers/sybaseiq. Product page.
[38]
Zhao, B. Y., Kubiatowicz, J., and Joseph, A. D. 2001. Tapestry: An infrastructure for fault-tolerant wide-area location and routing. Tech. Rep. UCB/CSD-01-1141, CS Division, University of California, Berkeley. Apr.
[39]
Zukowski, M., Boncz, P. A., Nes, N., and Heman, S. 2005. MonetDB/X100 --- A DBMS in the CPU cache. IEEE Data Eng. Bull. 28, 2, 17--22.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Computer Systems
ACM Transactions on Computer Systems  Volume 26, Issue 2
June 2008
92 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/1365815
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 June 2008
Accepted: 01 April 2008
Revised: 01 April 2008
Received: 01 December 2006
Published in TOCS Volume 26, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tag

  1. Large-Scale Distributed Storage

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1,123
  • Downloads (Last 6 weeks)76
Reflects downloads up to 30 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media