research-article

MARBLE: mining for boilerplate code to identify API usability problems

Authors:

Andrew Macvean,

Bogdan Vasilescu

ASE '19: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering

Pages 615 - 627

https://doi.org/10.1109/ASE.2019.00063

Published: 07 February 2020

Abstract

Designing usable APIs is critical to developers' productivity and software quality, but it is difficult. One challenge is anticipating API usability barriers and real-world usage, because automated approaches for mining usability data at scale are lacking. In this paper, we focus on one particular grievance that developers repeatedly express in online discussions about APIs: "boilerplate code." We investigate what properties make code count as boilerplate, the reasons for boilerplate, and how programmers can reduce the need for it. We then present MARBLE, a novel approach to automatically mine boilerplate code candidates from API client code repositories. MARBLE adapts existing techniques, including an API usage mining algorithm, an AST comparison algorithm, and a graph partitioning algorithm. We evaluate MARBLE with 13 Java APIs and show that our approach successfully identifies both already-known and new API-related boilerplate code instances.
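
To make the shape of such a mining pipeline concrete, the following is a minimal sketch, not MARBLE's implementation. It assumes each client snippet has already been reduced to the list of API methods it calls. It then substitutes two much simpler techniques for the ones named in the abstract: Jaccard similarity over call sets stands in for AST comparison, and connected components over a thresholded similarity graph stand in for graph partitioning. The function names, the `threshold` value, and the sample snippets are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Similarity of two call sets: |a & b| / |a | b| (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_snippets(snippets, threshold=0.6):
    """Group snippet indices whose API-call sets are similar.

    Stand-ins (assumptions, not the algorithms used by MARBLE):
    - Jaccard similarity replaces AST comparison.
    - Connected components (union-find) replace graph partitioning.
    """
    parent = list(range(len(snippets)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of snippets that is at least `threshold` similar.
    for i, j in combinations(range(len(snippets)), 2):
        if jaccard(snippets[i], snippets[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(snippets)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical client snippets, each reduced to the API methods it calls.
SNIPPETS = [
    ["DocumentBuilderFactory.newInstance", "newDocumentBuilder", "parse"],
    ["DocumentBuilderFactory.newInstance", "newDocumentBuilder", "parse",
     "getDocumentElement"],
    ["Files.readAllLines"],
    ["DocumentBuilderFactory.newInstance", "newDocumentBuilder", "parse"],
]

clusters = cluster_snippets(SNIPPETS)
# Groups that recur across several snippets are the boilerplate candidates.
candidates = [group for group in clusters if len(group) >= 2]
```

In this sketch, a group that spans several snippets is treated as a boilerplate candidate for manual review. A real pipeline would compare syntax trees rather than call sets, so this only illustrates how the stages fit together.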


Published In

ASE '19: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering

November 2019

1333 pages

ISBN: 9781728125084

General Chair:
Thomas Zimmermann, Microsoft Research

Program Chairs:
Julia Lawall, Inria/LIP6, France
Darko Marinov, University of Illinois at Urbana-Champaign

Sponsors

In-Cooperation

IEEE CS

Publisher

IEEE Press


Conference

ASE '19: 34th IEEE/ACM International Conference on Automated Software Engineering

November 10 - 15, 2019

San Diego, California

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%


Cited By

Nam D, Macvean A, Myers B, Vasilescu B (2024). Understanding Documentation Use Through Log Analysis: A Case Study of Four Cloud Services. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-17. Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642721

Mahmoud M, Walker R, Denzinger J (2024). API usage templates via structural generalization. Journal of Systems and Software, 210:C. Online publication date: 1-Apr-2024.
https://dl.acm.org/doi/10.1016/j.jss.2024.111974

Nam D, Myers B, Vasilescu B, Hellendoorn V, Grundy J, Pollock L, Penta M (2023). Improving API Knowledge Discovery with ML: A Case Study of Comparable API Methods. Proceedings of the 45th International Conference on Software Engineering, pp. 1890-1906. Online publication date: 14-May-2023.
https://dl.acm.org/doi/10.1109/ICSE48619.2023.00161

Nam D, Ray B, Kim S, Qu X, Chandra S, Chaudhuri S, Sutton C (2022). Predictive synthesis of API-centric code. Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming, pp. 40-49. Online publication date: 13-Jun-2022.
https://dl.acm.org/doi/10.1145/3520312.3534866

Yang Y, Xia X, Lo D, Bi T, Grundy J, Yang X (2022). Predictive Models in Software Engineering: Challenges and Opportunities. ACM Transactions on Software Engineering and Methodology, 31:3, pp. 1-72. Online publication date: 9-Apr-2022.
https://dl.acm.org/doi/10.1145/3503509

Xu F, Vasilescu B, Neubig G (2022). In-IDE Code Generation from Natural Language: Promise and Challenges. ACM Transactions on Software Engineering and Methodology, 31:2, pp. 1-47. Online publication date: 4-Mar-2022.
https://dl.acm.org/doi/10.1145/3487569
