research-article

A variational database management system

Published: 22 November 2021

Abstract

Many problems require working with data that varies in its structure and content. Current approaches, such as schema evolution or data integration tools, are highly tailored to specific kinds of variation in databases. While these approaches work well in their roles, they do not address all kinds of variation and do not address the interaction of different kinds of variation in databases. In this paper, we define a framework for capturing variation as a generic and orthogonal concern in relational databases. We define variational schemas, variational databases, and variational queries for capturing variation in the structure, content, and information needs of relational databases, respectively. We define a type system that ensures variational queries are consistent with respect to a variational schema. Finally, we design and implement a variational database management system as an abstraction layer over a traditional relational database management system. Using previously developed use cases, we show the feasibility of our framework and demonstrate the performance of different approaches used in our system.

Supplementary Material

Auxiliary Presentation Video (splashws21gpcemain-p5-p-video.mp4)
This is a presentation video of our accepted paper in GPCE'21, "A Variational Database Management System". The main contributions are: We define a formal model of variational databases (VDBs), whose structure is given by a variational schema and whose content is given by variational tables. We define variational relational algebra (VRA) for querying a VDB, a static type system for ensuring that all variants of a query are compatible with the corresponding variants of the VDB, and identify properties of VRA. We implement a prototype of VDBMS as a layer on top of a traditional relational database management system (RDBMS) and evaluate this implementation on previously developed use cases.
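As a rough illustration of these ideas, the following minimal Python sketch models a variational schema as attributes annotated with presence conditions over features, selects one plain schema variant per configuration, and checks whether a projection query is compatible with every variant in which it applies. The encoding, the `employee` relation, the features `v1` and `v2`, and all function names are assumptions made for this example and do not come from the paper. The check enumerates configurations by brute force; it is not the static type system the paper defines.

```python
from itertools import product

# Hypothetical encoding: a presence condition is a predicate over a
# configuration, where a configuration maps feature names to booleans.
def feat(name):
    return lambda cfg: cfg[name]

TRUE = lambda cfg: True

# Hypothetical feature names for two schema variants.
FEATURES = ["v1", "v2"]

# Variational schema: relation -> (presence condition, {attribute: condition}).
V_SCHEMA = {
    "employee": (TRUE, {
        "id": TRUE,
        "name": feat("v1"),
        "first_name": feat("v2"),
        "last_name": feat("v2"),
    }),
}

def configure(v_schema, cfg):
    """Select the plain (non-variational) schema for one configuration."""
    plain = {}
    for rel, (rel_pc, attrs) in v_schema.items():
        if rel_pc(cfg):
            plain[rel] = sorted(a for a, pc in attrs.items() if pc(cfg))
    return plain

def all_configs(features):
    """Enumerate every assignment of booleans to the given features."""
    for bits in product([False, True], repeat=len(features)):
        yield dict(zip(features, bits))

def query_is_consistent(v_schema, rel, attrs, query_pc, features):
    """Brute-force check: in every configuration where the query is present,
    each projected attribute must exist in that schema variant."""
    for cfg in all_configs(features):
        if not query_pc(cfg):
            continue
        variant = configure(v_schema, cfg)
        if rel not in variant or any(a not in variant[rel] for a in attrs):
            return False
    return True
```

Under these assumptions, projecting `name` is consistent when the query is restricted to `v1`, but not when it is asked of all variants, because the variants without `v1` have no `name` attribute.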

Published In

GPCE 2021: Proceedings of the 20th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences
October 2021
209 pages
ISBN:9781450391122
DOI:10.1145/3486609
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. choice calculus
  2. relational databases
  3. software product lines
  4. type systems
  5. variation
  6. variational data

Qualifiers

  • Research-article

Conference

GPCE '21
GPCE '21: Concepts and Experiences
October 17 - 18, 2021
Chicago, IL, USA

Acceptance Rates

Overall Acceptance Rate 56 of 180 submissions, 31%

