Research article. DOI: 10.1145/3597503.3623322

Learning-based Widget Matching for Migrating GUI Test Cases

Published: 06 February 2024

Abstract

GUI test case migration transfers GUI test cases from a source app to a target app, and its key step is widget matching. Researchers have recently proposed various approaches that formulate widget matching as a matching task. However, these approaches represent widgets with static word embeddings that ignore contextual information, and they rely on manually formulated matching functions, which limits them when handling complex matching relations in apps. To address these limitations, we propose TEMdroid (TEst Migration), the first learning-based widget matching approach for test case migration. Unlike existing approaches, TEMdroid uses BERT to capture contextual information and learns a matching model to match widgets. Additionally, to address the significant imbalance between positive and negative samples in apps, we design a two-stage training strategy: we first train a hard-negative sample miner to mine hard-negative samples, and then train a matching model using the positive samples and the mined hard-negative samples. Our evaluation on 34 apps shows that TEMdroid is effective in event matching (i.e., widget matching and target event synthesis) and test case migration. For event matching, TEMdroid's Top1 accuracy is 76%, an improvement of over 17% compared to baselines. For test case migration, TEMdroid's F1 score is 89%, a 7% improvement over the baseline approach.
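The two-stage idea in the abstract can be illustrated with a small sketch. This is not TEMdroid's implementation: the abstract describes a trained hard-negative sample miner and a BERT-based matching model, whereas the sketch below substitutes a toy token-overlap score for the miner so that it runs without any model. All names (`token_overlap`, `mine_hard_negatives`, `build_stage_two_set`) and the example widget texts are hypothetical.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source widget text, target widget text)


def token_overlap(a: str, b: str) -> float:
    """Toy stand-in for a learned miner: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def mine_hard_negatives(
    negatives: List[Pair], score: Callable[[str, str], float], k: int
) -> List[Pair]:
    """Keep the k non-matching pairs that the miner scores as most similar."""
    ranked = sorted(negatives, key=lambda p: score(p[0], p[1]), reverse=True)
    return ranked[:k]


def build_stage_two_set(
    positives: List[Pair],
    negatives: List[Pair],
    score: Callable[[str, str], float],
    negatives_per_positive: int = 1,
) -> List[Tuple[str, str, int]]:
    """Combine all positives (label 1) with mined hard negatives (label 0)."""
    k = negatives_per_positive * len(positives)
    hard = mine_hard_negatives(negatives, score, k)
    return [(s, t, 1) for s, t in positives] + [(s, t, 0) for s, t in hard]


# Hypothetical widget texts: one true match and three non-matching candidates.
positives = [("sign in", "log in")]
negatives = [
    ("sign in", "sign up"),
    ("sign in", "settings"),
    ("sign in", "share photo"),
]
training_set = build_stage_two_set(positives, negatives, token_overlap)
```

With these inputs, the only negative kept is the confusable pair ("sign in", "sign up"), so the second-stage training set stays balanced instead of being dominated by easy negatives such as ("sign in", "settings"). In a real pipeline the scoring function would be the trained miner rather than token overlap.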

Cited By

  • (2024) Synthesis-Based Enhancement for GUI Test Case Migration. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 869-881. DOI: 10.1145/3650212.3680327. Online publication date: 11-Sep-2024

Published In

ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
May 2024
2942 pages
ISBN: 9798400702174
DOI: 10.1145/3597503

In-Cooperation

  • Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. test migration
  2. GUI testing
  3. deep learning

Qualifiers

  • Research-article

Conference

ICSE '24

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
