DOI: 10.1145/3368089.3409689
research-article

Interactive, effort-aware library version harmonization

Published: 08 November 2020

Abstract

Owing to a combination of heavy reliance on third-party libraries, flexible mechanisms for declaring dependencies, and a growing number of modules per project, different modules of a project directly depend on multiple versions of the same third-party library. Such library version inconsistencies could increase dependency maintenance cost, or even lead to dependency conflicts when modules are inter-dependent. Although automated build tools (e.g., Maven's enforcer plugin) provide partial support for detecting library version inconsistencies, they do not provide any support for harmonizing inconsistent library versions.
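To make the notion of a library version inconsistency concrete, the following is a minimal Python sketch, not LibHarmo's implementation. It assumes each module's POM declares its dependency versions literally and ignores properties, parent POMs, and `dependencyManagement`, which real Maven projects commonly use. The names `make_pom`, `direct_dependencies`, `find_inconsistencies`, and `EXAMPLE_MODULES` are hypothetical and introduced only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard XML namespace used by Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def make_pom(deps):
    """Build a tiny POM string for the example (hypothetical helper)."""
    body = "".join(
        f"<dependency><groupId>{g}</groupId><artifactId>{a}</artifactId>"
        f"<version>{v}</version></dependency>"
        for g, a, v in deps
    )
    return (
        '<project xmlns="http://maven.apache.org/POM/4.0.0">'
        f"<dependencies>{body}</dependencies></project>"
    )


def direct_dependencies(pom_xml):
    """Yield (groupId, artifactId, version) for dependencies declared directly in one POM."""
    root = ET.fromstring(pom_xml)
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        group = dep.findtext("m:groupId", namespaces=POM_NS)
        artifact = dep.findtext("m:artifactId", namespaces=POM_NS)
        version = dep.findtext("m:version", namespaces=POM_NS)
        # Assumption: only literal versions are handled; entries without one are skipped.
        if group and artifact and version:
            yield group, artifact, version


def find_inconsistencies(module_poms):
    """Map each library declared with more than one version to {version: [modules]}."""
    versions = defaultdict(lambda: defaultdict(set))
    for module, pom_xml in module_poms.items():
        for group, artifact, version in direct_dependencies(pom_xml):
            versions[(group, artifact)][version].add(module)
    return {
        lib: {v: sorted(mods) for v, mods in by_version.items()}
        for lib, by_version in versions.items()
        if len(by_version) > 1
    }


# Illustrative two-module project: both modules use Guava, but at different versions.
EXAMPLE_MODULES = {
    "core": make_pom([("com.google.guava", "guava", "28.0-jre"), ("junit", "junit", "4.12")]),
    "web": make_pom([("com.google.guava", "guava", "25.1-jre"), ("junit", "junit", "4.12")]),
}
```

Calling `find_inconsistencies(EXAMPLE_MODULES)` reports Guava as inconsistent (one version per module) and leaves JUnit out, since both modules declare the same version.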
We first conduct a survey with 131 Java developers from GitHub to gather first-hand information about the root causes, detection methods, reasons for fixing or not fixing, fixing strategies, fixing efforts, and tool expectations regarding library version inconsistencies. Then, based on the insights from our survey, we propose LibHarmo, an interactive, effort-aware library version harmonization technique that detects library version inconsistencies, interactively suggests a harmonized version with the least harmonization effort based on library API usage analysis, and refactors build configuration files.
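The idea of suggesting the version with the least harmonization effort can be illustrated with a small Python sketch. The effort model here is an assumption made for illustration, not the metric defined in the paper: the effort of moving a module to a candidate version is taken to be the number of library APIs the module uses that are absent from that version. All names and data (`harmonization_effort`, `rank_candidates`, `AVAILABLE_APIS`, and so on) are hypothetical.

```python
def harmonization_effort(candidate, module_versions, api_usage, available_apis):
    """Total assumed effort of moving every module to `candidate`.

    Assumption: effort per module = number of used APIs missing in the candidate version;
    modules already on the candidate version cost nothing.
    """
    effort = 0
    for module, current in module_versions.items():
        if current == candidate:
            continue
        missing = api_usage[module] - available_apis[candidate]
        effort += len(missing)
    return effort


def rank_candidates(module_versions, api_usage, available_apis):
    """Return (effort, version) pairs sorted from least to most assumed effort."""
    return sorted(
        (harmonization_effort(version, module_versions, api_usage, available_apis), version)
        for version in available_apis
    )


# Hypothetical library with two versions and the APIs each one offers.
AVAILABLE_APIS = {
    "1.0": {"A.f", "A.g", "B.h"},
    "2.0": {"A.f", "B.h", "B.k"},
}
# Version each module currently declares, and the library APIs each module calls.
MODULE_VERSIONS = {"core": "1.0", "cli": "1.0", "web": "2.0"}
API_USAGE = {
    "core": {"A.f", "A.g"},
    "cli": {"A.g"},
    "web": {"B.h"},
}

RANKING = rank_candidates(MODULE_VERSIONS, API_USAGE, AVAILABLE_APIS)
```

In this example, harmonizing on `1.0` costs nothing for `web`, which only calls `B.h`, while harmonizing on `2.0` would break the two modules that call `A.g`. A ranked list like `RANKING` is what a developer could inspect before choosing a version, which is where the interactive step would come in.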
LibHarmo is currently implemented for Java Maven projects. Our experimental study on 443 highly-starred Java Maven projects from GitHub shows that i) LibHarmo detected 621 library version inconsistencies in 152 (34.3%) projects with a false positive rate of 16.8%, while Maven's enforcer plugin detected only 219 of them; and ii) LibHarmo saved 87.5% of the harmonization effort. Further, 31 library version inconsistencies have been confirmed, and 17 of them have already been harmonized by developers.

Supplementary Material

Auxiliary Teaser Video (fse20main-p187-p-teaser.mp4)
As different modules of a project directly depend on multiple versions of the same third-party library, such library version inconsistencies could increase dependency maintenance cost, or even lead to dependency conflicts when modules are inter-dependent. We first conduct a survey with 131 Java developers from GitHub to gather first-hand information on library version inconsistencies. Then, based on the insights from our survey, we propose LibHarmo. Our experimental study on 443 highly-starred Java Maven projects shows that i) LibHarmo detected 621 library version inconsistencies in 152 (34.3%) projects with a false positive rate of 16.8%, while Maven's enforcer plugin detected only 219 of them; and ii) LibHarmo saved 87.5% of the harmonization effort. Further, 31 library version inconsistencies have been confirmed, and 17 of them have already been harmonized by developers.
Auxiliary Presentation Video (fse20main-p187-p-video.mp4)


Published In

ESEC/FSE 2020: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2020
1703 pages
ISBN:9781450370431
DOI:10.1145/3368089
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Library Version Harmonization
  2. Third-Party Libraries

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China

Conference

ESEC/FSE '20
Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%
