DOI: 10.1145/2970276.2970363

Mining revision histories to detect cross-language clones without intermediates

Published: 25 August 2016

Abstract

To attract more users on different platforms, many projects release versions in multiple programming languages (e.g., Java and C#). Such projects typically contain many code snippets that implement similar functionality, i.e., cross-language clones. Programmers often need to track and modify cross-language clones consistently to keep the different language implementations functionally in sync. In the literature, researchers have proposed approaches to detect cross-language clones, mostly for languages that share a common intermediate language (such as the .NET language family), so that techniques for detecting single-language clones can be applied. As a result, those approaches cannot detect cross-language clones in the many projects that are not implemented in .NET languages. To overcome this limitation, we propose a novel approach, CLCMiner, that detects cross-language clones automatically without the need for an intermediate language. Our approach mines such clones from revision histories, which reflect how programmers maintain cross-language clones in practice. We have implemented a prototype tool and evaluated it on five open source projects that have versions in Java and C#. The results show that CLCMiner achieves high accuracy and point to promising future work.
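
The abstract characterizes the approach only at a high level: mine the revision histories of the two language implementations and relate changes whose diffs look alike. The sketch below is a loose, minimal illustration of that diff-comparison idea, not CLCMiner's actual algorithm; the commit window, the word-level tokenization of changed lines, the bag-of-words cosine similarity, and the 0.7 threshold are all assumptions made for this example, and the repository paths are hypothetical placeholders.

    import itertools
    import math
    import re
    import subprocess
    from collections import Counter


    def commit_diffs(repo_path, max_commits=200):
        """Return {commit hash: textual diff} for recent commits of a local git repo."""
        log = subprocess.run(
            ["git", "-C", repo_path, "log", "-p", f"-{max_commits}",
             "--no-merges", "--pretty=format:%H"],
            capture_output=True, text=True, check=True,
        ).stdout
        # With --pretty=format:%H each commit's full hash is printed on its own
        # line, followed by that commit's patch; split the log into (hash, patch) pairs.
        parts = re.split(r"(?m)^([0-9a-f]{40})$", log)
        return {parts[i]: parts[i + 1] for i in range(1, len(parts) - 1, 2)}


    def bag_of_words(diff_text):
        """Term frequencies of word tokens on the added/removed lines of a diff."""
        changed = [line[1:] for line in diff_text.splitlines()
                   if line.startswith(("+", "-"))
                   and not line.startswith(("+++", "---"))]
        return Counter(re.findall(r"[A-Za-z_]+", " ".join(changed).lower()))


    def cosine(a, b):
        """Cosine similarity between two term-frequency vectors (Counters)."""
        dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


    def candidate_clone_changes(java_repo, csharp_repo, threshold=0.7):
        """Yield (Java commit, C# commit, similarity) pairs whose diffs look alike."""
        java = {h: bag_of_words(d) for h, d in commit_diffs(java_repo).items()}
        cs = {h: bag_of_words(d) for h, d in commit_diffs(csharp_repo).items()}
        for (jh, jvec), (ch, cvec) in itertools.product(java.items(), cs.items()):
            sim = cosine(jvec, cvec)
            if sim >= threshold:
                yield jh, ch, sim


    if __name__ == "__main__":
        # Hypothetical local checkouts of the two language implementations,
        # e.g. a Java project and its C# port (such as Lucene and Lucene.NET).
        for jh, ch, sim in candidate_clone_changes("./repo-java", "./repo-csharp"):
            print(f"{jh[:10]}  ~  {ch[:10]}  similarity = {sim:.2f}")

In practice one would likely compare diffs at file or hunk granularity, normalize identifiers, and rank candidate pairs instead of applying a fixed cutoff; the sketch only conveys the general idea of relating co-evolving changes across two repositories.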

      Published In

      ASE '16: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering
      August 2016, 899 pages
      ISBN: 9781450338455
      DOI: 10.1145/2970276
      General Chair: David Lo
      Program Chairs: Sven Apel and Sarfraz Khurshid

      Publisher

      Association for Computing Machinery, New York, NY, United States

      Author Tags

      1. cross-language clone
      2. diff
      3. revision history

      Qualifiers

      • Short-paper

      Conference

      ASE '16

      Acceptance Rates

      Overall acceptance rate: 82 of 337 submissions (24%)
