DOI: 10.1145/3324884.3416583
research-article

Detecting and explaining self-admitted technical debts with attention-based neural networks

Published: 27 January 2021

Abstract

Self-Admitted Technical Debt (SATD) is a sub-type of technical debt: debt that developers intentionally introduce during software development. Although it yields short-term benefits, SATD often has to be paid back later at a higher cost, e.g., through newly introduced bugs or increased software complexity.
To cope with these issues, the community has proposed various machine learning-based approaches to detect SATDs. These approaches, however, either are not generic, usually requiring manual feature engineering, or do not provide adequate means of explaining their predictions. To that end, we propose a novel approach, HATD (Hybrid Attention-based method for self-admitted Technical Debt detection), which detects and explains SATDs using attention-based neural networks. Through extensive experiments on 445,365 comments from 20 projects, we show that HATD is effective in detecting SATDs on both in-the-lab and in-the-wild datasets, in both within-project and cross-project settings. HATD also outperforms state-of-the-art approaches in detecting and explaining SATDs.

    Published In

    ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering
    December 2020
    1449 pages
    ISBN:9781450367684
    DOI:10.1145/3324884
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    In-Cooperation

    • IEEE CS

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. SATD
    2. attention-based neural networks
    3. self-admitted technical debt
    4. word embedding

    Qualifiers

    • Research-article

    Funding Sources

    • NSFC

    Conference

    ASE '20

    Acceptance Rates

    Overall Acceptance Rate 82 of 337 submissions, 24%
