research-article

Bigtable: A Distributed Storage System for Structured Data

Authors:

Sanjay Ghemawat,

Wilson C. Hsieh,

Deborah A. Wallach,

Tushar Chandra,

Robert E. GruberAuthors Info & Claims

ACM Transactions on Computer Systems (TOCS), Volume 26, Issue 2

Article No.: 4, Pages 1 - 26

https://doi.org/10.1145/1365815.1365816

Published: 01 June 2008 Publication History

Abstract

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this article, we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.

References

[1]

Abadi, D. J., Madden, S. R., and Ferreira, M. C. 2006. Integrating compression and execution in column-oriented database systems. Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York.

Digital Library

[2]

Ailamaki, A., DeWitt, D. J., Hill, M. D., and Skounakis, M. 2001. Weaving relations for cache performance. The VLDB J. 169--180.

Digital Library

[3]

Banga, G., Druschel, P., and Mogul, J. C. 1999. Resource containers: A new facility for resource management in server systems. In Proceedings of the 3rd Symposium on Operating Systems Design and Implementation. 45--58.

Digital Library

[4]

Baru, C. K., Fecteau, G., Goyal, A., Hsiao, H., Jhingran, A., Padmanabhan, S., Copeland, G. P., and Wilson, W. G. 1995. DB2 parallel edition. IBM Syst. J. 34, 2, 292--322.

Digital Library

[5]

Bavier, A., Bowman, M., Chun, B., Culler, D., Karlin, S., Peterson, L., Roscoe, T., Spalink, T., and Wawrzoniak, M. 2004. Operating system support for planetary-scale network services. In Proceedings of the 1st Symposium on Networked Systems Design and Implementation. 253--266.

Digital Library

[6]

Bentley, J. L. and McIlroy, M. D. 1999. Data compression using long common strings. In Data Compression Conference. 287--295.

Digital Library

[7]

Bloom, B. H. 1970. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13, 7, 422--426.

Digital Library

[8]

Burrows, M. 2006. The Chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation. 335--350.

Digital Library

[9]

Chandra, T., Griesemer, R., and Redstone, J. 2007. Paxos made live --- An engineering perspective. In Proceedings of PODC.

Digital Library

[10]

Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., and Gruber, R. E. 2006. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation. 205--218.

Digital Library

[11]

Comer, D. 1979. Ubiquitous B-tree. Computing Surveys 11, 2 (June), 121--137.

Digital Library

[12]

Copeland, G. P., Alexander, W., Boughter, E. E., and Keller, T. W. 1988. Data placement in Bubba. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 99--108.

Digital Library

[13]

Dean, J. and Ghemawat, S. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation. 137--150.

Digital Library

[14]

DeWitt, D., Katz, R., Olken, F., Shapiro, L., Stonebraker, M., and Wood, D. 1984. Implementation techniques for main memory database systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 1--8.

Digital Library

[15]

DeWitt, D. J. and Gray, J. 1992. Parallel database systems: The future of high performance database systems. Commun. ACM 35, 6 (June), 85--98.

Digital Library

[16]

French, C. D. 1995. One size fits all database architectures do not work for DSS. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 449--450.

Digital Library

[17]

Gawlick, D. and Kinkade, D. 1985. Varieties of concurrency control in IMS/VS fast path. Datab. Eng. Bull. 8, 2, 3--10.

[18]

Ghemawat, S., Gobioff, H., and Leung, S.-T. 2003. The Google file system. In Proceedings of the 19th ACM Symposium on Operating Systems Principles. ACM, New York, 29--43.

Digital Library

[19]

Gray, J. 1978. Notes on database operating systems. In Operating Systems --- An Advanced Course. Lecture Notes in Computer Science, vol. 60. Springer-Verlag, ACM, New York.

Digital Library

[20]

Greer, R. 1999. Daytona and the fourth-generation language Cymbal. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 525--526.

Digital Library

[21]

Hagmann, R. 1987. Reimplementing the Cedar file system using logging and group commit. In Proceedings of the 11th Symposium on Operating Systems Principles. 155--162.

Digital Library

[22]

Hartman, J. H. and Ousterhout, J. K. 1993. The Zebra striped network file system. In Proceedings of the 14th Symposium on Operating Systems Principles. ACM, New York, 29--43.

Digital Library

[23]

kx.com. kx.com/products/database.php. Product page.

[24]

Lamport, L. 1998. The part-time parliament. ACM Trans. Comput. Syst. 16, 2, 133--169.

Digital Library

[25]

MacCormick, J., Murphy, N., Najork, M., Thekkath, C. A., and Zhou, L. 2004. Boxwood: Abstractions as the foundation for storage infrastructure. In Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation. 105--120.

Digital Library

[26]

McCarthy, J. 1960. Recursive functions of symbolic expressions and their computation by machine. Commun. ACM 3, 4 (Apr.), 184--195.

Digital Library

[27]

O'Neil, P., Cheng, E., Gawlick, D., and O'Neil, E. 1996. The log-structured merge-tree (LSM-tree). Acta Inf. 33, 4, 351--385.

Digital Library

[28]

oracle.com. www.oracle.com/technology/products/database/clustering/index.html. Product page.

[29]

Pike, R., Dorward, S., Griesemer, R., and Quinlan, S. 2005. Interpreting the data: Parallel analysis with Sawzall. Scientific Programming Journal 13, 4, 227--298.

Digital Library

[30]

Ratnasamy, S., Francis, P., Handley, M., Karp, R., and Shenker, S. 2001. A scalable content-addressable network. In Proceedings of SIGCOMM. ACM, New York, 161--172.

Digital Library

[31]

Rowstron, A. and Druschel, P. 2001. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In Proceedings of Middleware 2001. 329--350.

Digital Library

[32]

sensage.com. sensage.com/products-sensage.htm. Product page.

[33]

Stoica, I., Morris, R., Karger, D., Kaashoek, M. F., and Balakrishnan, H. 2001. Chord: A scalable peer-to-peer lookup service for Internet applications. In Proceedings of SIGCOMM. ACM, New York, 149--160.

Digital Library

[34]

Stonebraker, M. 1986. The case for shared nothing. Datab. Eng. Bull. 9, 1 (Mar.), 4--9.

[35]

Stonebraker, M., Abadi, D. J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S., O'Neil, E., O'Neil, P., Rasin, A., Tran, N., and Zdonik, S. 2005. C-Store: A column-oriented DBMS. In Proceedings of the 10th International Conference on Very Large Data Bases. ACM, New York, 553--564.

Digital Library

[36]

Stonebraker, M., Aoki, P. M., Devine, R., Litwin, W., and Olson, M. A. 1994. Mariposa: A new architecture for distributed data. In Proceedings of the 10th International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA, 54--65.

Digital Library

[37]

sybase.com. www.sybase.com/products/databaseservers/sybaseiq. Product page.

[38]

Zhao, B. Y., Kubiatowicz, J., and Joseph, A. D. 2001. Tapestry: An infrastructure for fault-tolerant wide-area location and routing. Tech. Rep. UCB/CSD-01-1141, CS Division, University of California, Berkeley. Apr.

Digital Library

[39]

Zukowski, M., Boncz, P. A., Nes, N., and Heman, S. 2005. MonetDB/X100 --- A DBMS in the CPU cache. IEEE Data Eng. Bull. 28, 2, 17--22.

Cited By

Fu XPan LLiu S(2025)Caching or re-computing: Online cost optimization for running big data tasks in IaaS cloudsJournal of Network and Computer Applications10.1016/j.jnca.2024.104080235(104080)Online publication date: Mar-2025
https://doi.org/10.1016/j.jnca.2024.104080
Zou WLi WWang ZPei JLou TChen GJing WZomaya A(2025)Efficient distributed matrix for resolving computational intensity in remote sensingFuture Generation Computer Systems10.1016/j.future.2024.107644166(107644)Online publication date: May-2025
https://doi.org/10.1016/j.future.2024.107644
Lu JLv MLi PYuan ZXie P(2025)Dhcache: a dual-hash cache for optimizing the read performance in key-value storeThe Journal of Supercomputing10.1007/s11227-024-06828-w81:2Online publication date: 19-Jan-2025
https://doi.org/10.1007/s11227-024-06828-w
Show More Cited By

Index Terms

Bigtable: A Distributed Storage System for Structured Data
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
2. Information systems
  1. Data management systems
    1. Database management system engines
      1. Parallel and distributed DBMSs

Recommendations

Bigtable: a distributed storage system for structured data
OSDI '06: Proceedings of the 7th symposium on Operating systems design and implementation

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google ...
Bigtable: a distributed storage system for structured data
OSDI '06: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google ...
On Managing Very Large Sensor-Network Data Using Bigtable
CCGRID '12: Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)

Recent advances and innovations in smart sensor technologies, energy storage, data communications, and distributed computing paradigms are enabling technological breakthroughs in very large sensor networks. There is an emerging surge of next-generation ...

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Computer Systems

ACM Transactions on Computer Systems Volume 26, Issue 2

June 2008

92 pages

ISSN:0734-2071

EISSN:1557-7333

DOI:10.1145/1365815

Issue’s Table of Contents

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 June 2008

Accepted: 01 April 2008

Revised: 01 April 2008

Received: 01 December 2006

Published in TOCS Volume 26, Issue 2

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tag

Large-Scale Distributed Storage

Qualifiers

Research-article
Research
Refereed

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

2,165
Total Citations
View Citations
29,801
Total Downloads

Downloads (Last 12 months)1,123
Downloads (Last 6 weeks)76

Reflects downloads up to 30 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Fu XPan LLiu S(2025)Caching or re-computing: Online cost optimization for running big data tasks in IaaS cloudsJournal of Network and Computer Applications10.1016/j.jnca.2024.104080235(104080)Online publication date: Mar-2025
https://doi.org/10.1016/j.jnca.2024.104080
Zou WLi WWang ZPei JLou TChen GJing WZomaya A(2025)Efficient distributed matrix for resolving computational intensity in remote sensingFuture Generation Computer Systems10.1016/j.future.2024.107644166(107644)Online publication date: May-2025
https://doi.org/10.1016/j.future.2024.107644
Lu JLv MLi PYuan ZXie P(2025)Dhcache: a dual-hash cache for optimizing the read performance in key-value storeThe Journal of Supercomputing10.1007/s11227-024-06828-w81:2Online publication date: 19-Jan-2025
https://doi.org/10.1007/s11227-024-06828-w
Lin XPan YFeng WZhang HLin M(2025)RIOKV: reducing iterator overhead for efficient short-range query in LSM-tree-based key-value storesThe Journal of Supercomputing10.1007/s11227-024-06735-081:1Online publication date: 1-Jan-2025
https://dl.acm.org/doi/10.1007/s11227-024-06735-0
Pan YZhao JCai YZhang HLin M(2025)PMCKV: pipeline-based multi-compactions KV stores to improve the system performanceThe Journal of Supercomputing10.1007/s11227-024-06680-y81:1Online publication date: 1-Jan-2025
https://dl.acm.org/doi/10.1007/s11227-024-06680-y
Huang FPan YZhang HLin M(2025)CDNRocks: computable data nodes with RocksDB to improve the read performance of LSM-tree-based distributed key-value storage systemsThe Journal of Supercomputing10.1007/s11227-024-06526-781:1Online publication date: 1-Jan-2025
https://dl.acm.org/doi/10.1007/s11227-024-06526-7
Shin JZhou LWang JAref W(2025)An update-intensive LSM-based R-tree indexThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-024-00876-734:1Online publication date: 1-Jan-2025
https://dl.acm.org/doi/10.1007/s00778-024-00876-7
Yang X(2024)Improving the Relevance, Speed, and Computational Efficiency of Semantic Search through Database Indexing: A ReviewOptimization Algorithms - Classics and Recent Advances10.5772/intechopen.112232Online publication date: 10-Jul-2024
https://doi.org/10.5772/intechopen.112232
Maabed MDennouni NAridj M(2024)Optimizing Data Availability and Scalability with RP*-SD2DS Architecture for Distributed SystemsEngineering, Technology & Applied Science Research10.48084/etasr.817614:5(16178-16184)Online publication date: 9-Oct-2024
https://doi.org/10.48084/etasr.8176
Ashwini Shivarudra (2024)Optimizing Test Data Management Strategies in Banking Domain ProjectsJournal of Sustainable Solutions10.36676/j.sust.sol.v1.i4.371:4(87-100)Online publication date: 28-Oct-2024
https://doi.org/10.36676/j.sust.sol.v1.i4.37
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Issue’s Table of Contents