Hybrid Methods for Reducing Database Schema Test Suites: Experimental Insights from Computational and Human Studies

Authors:

Abdullah Alsharif,

Gregory M. Kapfhammer,

Phil McMinn

AST '20: Proceedings of the IEEE/ACM 1st International Conference on Automation of Software Test

Pages 41 - 50

https://doi.org/10.1145/3387903.3389305

Published: 07 October 2020

Abstract

Given that a relational database is a critical component of many software applications, it is important to thoroughly test the integrity constraints of a database's schema, because they protect the data. Although automated test data generation techniques ameliorate the otherwise manual task of database schema testing, they often create test suites that contain many, sometimes redundant, tests. Since prior work presented a hybridized test suite reduction technique, called STICCER, that beneficially combined Greedy test suite reduction with a test merging method customized for database schemas, this paper experimentally evaluates a different hybridization. Motivated by prior results showing that test suite reduction with the Harrold-Gupta-Soffa (HGS) method can be more effective than Greedy at reducing database schema test suites, this paper evaluates an HGS-driven STICCER variant with both a computational and a human study. Using 34 database schemas and tests created by two test data generators, the results from the computational study reveal that, while STICCER is equally efficient and effective when combined with either Greedy or HGS, it is always better than the isolated use of either Greedy or HGS. Involving 27 participants, the human study shows that, when compared to test suites reduced by HGS, those reduced by a STICCER-HGS hybrid allow humans to inspect test cases faster, but not always more accurately.
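
As a rough illustration of the Greedy reduction that the abstract refers to, the following minimal sketch treats reduction as a set-cover problem: it repeatedly keeps the test that covers the most still-uncovered test requirements. It is not the authors' implementation and does not include STICCER's test merging or the HGS heuristic. The function name `greedy_reduce`, the test names `t1` to `t4`, and the requirement names `r1` to `r4` are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_reduce(coverage: Dict[str, Set[str]]) -> List[str]:
    """Return a subset of tests that covers every requirement.

    `coverage` maps a (hypothetical) test name to the set of test
    requirements it covers, e.g. requirements derived from a schema's
    integrity constraints.
    """
    # Every requirement covered by at least one test must stay covered.
    uncovered: Set[str] = set().union(*coverage.values())
    reduced: List[str] = []
    while uncovered:
        # Keep the test covering the most still-uncovered requirements;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        reduced.append(best)
        uncovered -= coverage[best]
    return reduced


# Hypothetical suite: four tests over four requirements.
suite = {
    "t1": {"r1", "r2"},
    "t2": {"r2", "r3"},
    "t3": {"r1", "r2", "r3"},
    "t4": {"r4"},
}
reduced_suite = greedy_reduce(suite)
print(reduced_suite)
```

In this toy suite, `t1` and `t2` are redundant because `t3` covers everything they cover, so the reduced suite keeps only `t3` and `t4` while preserving coverage of all four requirements.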

Index Terms
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis

Published In

AST '20: Proceedings of the IEEE/ACM 1st International Conference on Automation of Software Test

October 2020

122 pages

ISBN:9781450379571

DOI:10.1145/3387903

Copyright © 2020 ACM.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

AST '20

Sponsor:

SIGSOFT

AST '20: IEEE/ACM 1st International Conference on Automation of Software Test

October 7 - 8, 2020

Seoul, Republic of Korea
