DOI: 10.1109/ASE.2019.00063
research-article

MARBLE: mining for boilerplate code to identify API usability problems

Published: 07 February 2020

Abstract

Designing usable APIs is critical to developers' productivity and software quality, but it is hard to do well. One challenge is that API usability barriers and real-world usage are difficult to anticipate, because automated approaches for mining usability data at scale are lacking. In this paper, we focus on one particular grievance that developers repeatedly express in online discussions about APIs: "boilerplate code." We investigate what properties make code count as boilerplate, the reasons for boilerplate, and how programmers can reduce the need for it. We then present MARBLE, a novel approach to automatically mine boilerplate code candidates from API client code repositories. MARBLE adapts existing techniques, including an API usage mining algorithm, an AST comparison algorithm, and a graph partitioning algorithm. We evaluate MARBLE with 13 Java APIs, and show that our approach successfully identifies both already-known and new API-related boilerplate code instances.
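
To make the idea of comparing ASTs and then grouping similar client code concrete, here is a toy sketch. It is not MARBLE's implementation, and the abstract does not give the details of its algorithms. The sketch rests on several assumptions: it parses Python snippets with the standard `ast` module rather than Java, it substitutes a multiset Jaccard similarity over AST node types for a real AST comparison algorithm, and it substitutes connected components (via union-find) for a graph partitioning algorithm. The function names and the 0.8 threshold are hypothetical.

```python
import ast
from collections import Counter
from itertools import combinations


def node_type_counts(source):
    """Count AST node types in a snippet (a crude structural fingerprint)."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))


def similarity(a, b):
    """Multiset Jaccard similarity between two fingerprints, in [0, 1]."""
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 1.0


def group_similar(snippets, threshold=0.8):
    """Link snippets whose similarity meets the (assumed) threshold and
    return the connected components as sorted lists of snippet indices."""
    prints = [node_type_counts(s) for s in snippets]
    parent = list(range(len(snippets)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(snippets)), 2):
        if similarity(prints[i], prints[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(snippets)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two structurally identical "open, read, close" snippets and one unrelated one.
snippets = [
    "f = open('a.txt')\ntry:\n    data = f.read()\nfinally:\n    f.close()\n",
    "g = open('b.txt')\ntry:\n    text = g.read()\nfinally:\n    g.close()\n",
    "total = sum(x * x for x in range(10))\n",
]
print(group_similar(snippets))  # prints [[0, 1], [2]]
```

In this sketch, a group with many structurally identical members stands in for a boilerplate candidate: the first two snippets differ only in identifiers and literals, so they land in the same group.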


Published In

ASE '19: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering
November 2019
1333 pages
ISBN:9781728125084

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%
