DOI: 10.1145/3457913.3457924
research-article

An Empirical Study of Architectural Changes in Code Commits

Published: 21 July 2021

Abstract

Fast-paced code delivery makes software architecture hard to maintain, because developers are rarely aware of the architectural impact of their code changes. To ease the burden on architects, we propose a lightweight framework that automatically identifies architectural changes from code commits. The framework does not rely on heavyweight architecture recovery techniques; it takes only a code commit as input. It can be integrated into prevalent continuous integration systems to monitor architectural changes, and it can be plugged into code review systems to help developers recognize the architectural changes they introduce. Using the framework, we conducted a large-scale empirical study of architectural changes in 368,847 commits from 16 Apache open-source projects. The study reveals several new findings regarding the frequency of commits that change the architecture, the common and risky intents behind those changes, and the correlation of architectural changes with the lines of code and the number of source files modified in a commit. These findings offer practical implications for software contributors and shed light on potential research directions in architecture maintenance.
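
To make the idea of a commit-only check concrete, the following is a minimal sketch and not the framework described in this paper. It assumes, purely for illustration, that a change in the set of Java packages a file imports is a crude proxy for an architectural change. The names `dependency_changes`, `is_candidate`, and `module_of`, and the choice of the first three package segments as a "module", are all hypothetical.

```python
import re
from typing import List, NamedTuple

# Matches an added ("+") or removed ("-") Java import line in a unified diff.
# Wildcard imports such as "import a.b.*;" are reduced to "a.b".
IMPORT_RE = re.compile(r"^([+-])\s*import\s+(?:static\s+)?([\w.]+?)(?:\.\*)?\s*;")
# Matches the "+++ b/<path>" header that names the post-commit file.
NEW_FILE_RE = re.compile(r"^\+\+\+ b/(.+)$")


class DependencyChange(NamedTuple):
    kind: str    # "added" or "removed"
    path: str    # file whose imports changed
    module: str  # imported module (assumed: leading package segments)


def module_of(qualified_name: str, depth: int = 3) -> str:
    """Assumption: the first `depth` package segments identify a module."""
    return ".".join(qualified_name.split(".")[:depth])


def dependency_changes(diff_text: str, depth: int = 3) -> List[DependencyChange]:
    """Collect import additions and removals from one commit's unified diff."""
    changes = []
    path = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            m = NEW_FILE_RE.match(line)
            # Deleted files ("+++ /dev/null") are skipped in this sketch.
            path = m.group(1) if m else None
            continue
        if path is None or not path.endswith(".java"):
            continue
        m = IMPORT_RE.match(line)
        if m:
            kind = "added" if m.group(1) == "+" else "removed"
            changes.append(DependencyChange(kind, path, module_of(m.group(2), depth)))
    return changes


def is_candidate(diff_text: str, depth: int = 3) -> bool:
    """Flag a commit when some file gains or loses a module-level dependency."""
    changes = dependency_changes(diff_text, depth)
    added = {(c.path, c.module) for c in changes if c.kind == "added"}
    removed = {(c.path, c.module) for c in changes if c.kind == "removed"}
    return added != removed
```

A commit that replaces one import with another from the same module is not flagged, while one that swaps `org.example.util` for `org.example.net` is. A check of this kind could run in a continuous integration job or a code review hook on the output of `git show <commit>`, although import-level differences will miss many changes that a real analysis would catch.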

Information

Published In

Internetware '20: Proceedings of the 12th Asia-Pacific Symposium on Internetware
November 2020
264 pages
ISBN: 9781450388191
DOI: 10.1145/3457913
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Empirical Study
  2. Software Architecture
  3. Software Quality

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

Internetware'20: 12th Asia-Pacific Symposium on Internetware
November 1 - 3, 2020
Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 55 of 111 submissions, 50%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 101
  • Downloads (Last 12 months): 23
  • Downloads (Last 6 weeks): 1

Reflects downloads up to 26 Dec 2024
