research-article

Automatically translating bug reports into test cases for mobile apps

Authors: Mattia Fazzini, Martin Prammer, Marcelo d'Amorim, Alessandro Orso

ISSTA 2018: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis

Pages 141 - 152

https://doi.org/10.1145/3213846.3213869

Published: 12 July 2018

Abstract

When users experience a software failure, they have the option of submitting a bug report and providing information about the failure and how it happened. If the bug report contains enough information, developers can then try to recreate the issue and investigate it, so as to eliminate its causes. Unfortunately, the number of bug reports filed by users is typically large, and the tasks of analyzing bug reports and reproducing the issues described therein can be extremely time-consuming. To help make this process more efficient, in this paper we propose Yakusu, a technique that uses a combination of program analysis and natural language processing techniques to generate executable test cases from bug reports. We implemented Yakusu for Android apps and performed an empirical evaluation on a set of over 60 real bug reports for different real-world apps. Overall, our technique was successful in 59.7% of the cases; that is, for a majority of the bug reports, developers would not have to study the report to reproduce the issue described and could simply use the test cases automatically generated by Yakusu. Furthermore, in many of the remaining cases, Yakusu was unsuccessful due to limitations that can be addressed in future work.
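To make the idea of turning report text into an executable test concrete, the following toy sketch maps quoted reproduction steps to Espresso-style statements. It is not Yakusu's pipeline: a hand-written regular expression stands in for the natural language processing, no program analysis is performed, and the names `ACTION_VERBS`, `STEP_PATTERN`, `parse_steps`, and `to_espresso` are hypothetical. It also assumes that click targets can be found by their visible text and text fields by their hint, which will not hold for every app.

```python
import re
from typing import List, Tuple

# Hypothetical verb-to-action table; real reports use far more varied language.
ACTION_VERBS = {
    "click": "click",
    "tap": "click",
    "press": "click",
    "type": "type",
    "enter": "type",
}

# Matches lines such as: 1. Tap on "Settings"
# or: 2. Type "hello" in the "Title" field
STEP_PATTERN = re.compile(
    r'^\s*(?:\d+[.)]\s*)?(?P<verb>\w+)\s+(?:on\s+)?(?:the\s+)?"(?P<target>[^"]+)"'
    r'(?:\s+(?:in|into)\s+(?:the\s+)?"(?P<field>[^"]+)")?',
    re.IGNORECASE,
)


def parse_steps(report: str) -> List[Tuple[str, str, str]]:
    """Return (action, target, text) triples for lines that look like UI steps."""
    steps = []
    for line in report.splitlines():
        match = STEP_PATTERN.match(line)
        if not match:
            continue  # Not a recognizable step (e.g. the failure description).
        action = ACTION_VERBS.get(match.group("verb").lower())
        if action is None:
            continue  # Unknown verb: skip rather than guess.
        if action == "type":
            field = match.group("field")
            if field is None:
                continue  # Typed text without a named field is ambiguous.
            # The first quoted string is the text, the second is the field.
            steps.append((action, field, match.group("target")))
        else:
            steps.append((action, match.group("target"), ""))
    return steps


def to_espresso(steps: List[Tuple[str, str, str]]) -> List[str]:
    """Render each step as one Espresso-style Java statement (as a string)."""
    lines = []
    for action, target, text in steps:
        if action == "click":
            lines.append(f'onView(withText("{target}")).perform(click());')
        else:
            lines.append(f'onView(withHint("{target}")).perform(typeText("{text}"));')
    return lines


if __name__ == "__main__":
    report = '1. Tap on "Settings"\n2. Type "hello" in the "Title" field\n3. The app crashes'
    for statement in to_espresso(parse_steps(report)):
        print(statement)
```

Lines the pattern cannot interpret are skipped rather than guessed at, so a vague or incomplete report produces a shorter test, not an incorrect one.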

Index Terms

Automatically translating bug reports into test cases for mobile apps
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Published In

ISSTA 2018: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis

July 2018

379 pages

ISBN: 9781450356992

DOI: 10.1145/3213846

General Chair: Frank Tip, Northeastern University, USA

Program Chair: Eric Bodden, University of Paderborn, Germany / Fraunhofer IEM, Germany

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ISSTA '18

Sponsor:

SIGSOFT

ISSTA '18: International Symposium on Software Testing and Analysis

July 16 - 21, 2018

Amsterdam, Netherlands

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%
