DOI: 10.1145/3324884.3416567
research-article
Public Access

Seven reasons why: an in-depth study of the limitations of random test input generation for Android

Published: 27 January 2021

Abstract

Experience paper: Testing of mobile apps is time-consuming and requires a great deal of manual effort. For this reason, industry and academic researchers have proposed a number of test input generation techniques for automating app testing. Although useful, these techniques have weaknesses and limitations that often prevent them from achieving high coverage. We believe that one of the reasons for these limitations is that tool developers tend to focus mainly on improving the strategy the techniques employ to explore app behavior, whereas limited effort has been put into investigating other ways to improve the performance of these techniques. To address this problem, and get a better understanding of the limitations of input-generation techniques for mobile apps, we conducted an in-depth study of the limitations of Monkey, arguably the most widely used tool for automated testing of Android apps. Specifically, in our study, we manually analyzed Monkey's performance on a benchmark of 64 apps to identify the common limitations that prevent the tool from achieving better coverage results. We then assessed the coverage improvement that Monkey could achieve if these limitations were eliminated. In our analysis of the results, we also discuss whether other existing test input generation tools suffer from these common limitations and provide insights on how they could address them.



    Published In

    ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering
    December 2020
    1449 pages
    ISBN:9781450367684
    DOI:10.1145/3324884
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    In-Cooperation

    • IEEE CS

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 January 2021


    Author Tags

    1. Android UI testing
    2. empirical study
    3. test generation

    Qualifiers

    • Research-article

    Conference

    ASE '20

    Acceptance Rates

    Overall Acceptance Rate 82 of 337 submissions, 24%


    Cited By

    • (2024) Navigating Mobile Testing Evaluation: A Comprehensive Statistical Analysis of Android GUI Testing Metrics. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 944-956. DOI: 10.1145/3691620.3695476. Online publication date: 27-Oct-2024
    • (2024) General and Practical Property-based Testing for Android Apps. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 53-64. DOI: 10.1145/3691620.3694986. Online publication date: 27-Oct-2024
    • (2024) Deeply Reinforcing Android GUI Testing with Deep Reinforcement Learning. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. DOI: 10.1145/3597503.3623344. Online publication date: 20-May-2024
    • (2023) Automata-Based Trace Analysis for Aiding Diagnosing GUI Testing Tools for Android. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 592-604. DOI: 10.1145/3611643.3616361. Online publication date: 30-Nov-2023
    • (2023) GUI Testing for Android Applications: A Survey. 2023 7th International Conference on Computer, Software and Modeling (ICCSM), 6-10. DOI: 10.1109/ICCSM60247.2023.00010. Online publication date: 21-Jul-2023
    • (2023) Understanding the Reproducibility Issues of Monkey for GUI Testing. Dependable Software Engineering. Theories, Tools, and Applications, 132-151. DOI: 10.1007/978-981-99-8664-4_8. Online publication date: 27-Nov-2023
    • (2021) Benchmarking automated GUI testing for Android against real-world bugs. Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 119-130. DOI: 10.1145/3468264.3468620. Online publication date: 20-Aug-2021
