DOI: 10.1145/3035918.3058744
short-paper

OrpheusDB: A Lightweight Approach to Relational Dataset Versioning

Published: 09 May 2017

Abstract

We demonstrate OrpheusDB, a lightweight approach to versioning of relational datasets. OrpheusDB is built as a thin layer on top of standard relational databases, and therefore inherits many of their benefits while also compactly storing, tracking, and recreating dataset versions on demand. OrpheusDB also supports a range of querying modalities spanning both SQL and git-style version commands. Conference attendees will be able to interact with OrpheusDB via an interactive version browser interface. The demo will highlight underlying design decisions of OrpheusDB, and provide an understanding of how OrpheusDB translates versioning commands into commands understood by a database system that is unaware of the presence of versions. OrpheusDB has been developed as open-source software; code is available at http://orpheus-db.github.io.

        Published In

        SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
        May 2017
        1810 pages
        ISBN: 9781450341974
        DOI: 10.1145/3035918

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. data model
        2. optimization
        3. query translation
        4. version control system

        Qualifiers

        • Short-paper

        Conference

        SIGMOD/PODS'17

        Acceptance Rates

        Overall Acceptance Rate 785 of 4,003 submissions, 20%
