DOI: 10.1145/1595696.1595727
research-article

API hyperlinking via structural overlap

Published: 24 August 2009

Abstract

This paper presents Altair, a tool that automatically generates API function cross-references. Altair emphasizes reliable structural measures and does not depend on specific client code. For a given query, it ranks related API functions by pairwise overlap, i.e., how they share state, and clusters tightly related functions into meaningful modules.
Experiments on several popular C software packages show three results: Altair recommends related API functions for a given query with markedly more precise and complete results than previous tools; it extracts modules from moderate-sized software (e.g., Apache, with 1000+ functions) at high precision and recall (e.g., both exceeding 70% for two modules in Apache); and the computation finishes within a few seconds.
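
The ranking idea in the abstract, scoring functions by how much state they share with a query function, can be illustrated with a minimal sketch. This is not Altair's actual measure: the abstract does not define the overlap formula, so the sketch assumes a plain Jaccard overlap over sets of accessed state, and the function names and access sets are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def overlap(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard overlap of two access sets (an assumed stand-in measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_related(
    query: str, accesses: Dict[str, FrozenSet[str]]
) -> List[Tuple[str, float]]:
    """Rank every other function by its overlap with the query, highest first."""
    target = accesses[query]
    scores = [
        (name, overlap(target, state))
        for name, state in accesses.items()
        if name != query
    ]
    # Sort by descending score; break ties by name for a stable order.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical access sets: function name -> state it reads or writes.
ACCESSES = {
    "stream_open": frozenset({"stream.fd", "stream.buf"}),
    "stream_read": frozenset({"stream.fd", "stream.buf", "stream.pos"}),
    "stream_close": frozenset({"stream.fd"}),
    "log_write": frozenset({"log.level"}),
}

ranking = rank_related("stream_open", ACCESSES)
```

Under these assumed access sets, a query for `stream_open` ranks `stream_read` first, then `stream_close`, with `log_write` scoring zero because it shares no state with the query.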

Published In

ESEC/FSE '09: Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
August 2009
408 pages
ISBN: 9781605580012
DOI: 10.1145/1595696

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. api recommendation
  2. module clustering
  3. overlap rank

Qualifiers

  • Research-article

Conference

ESEC/FSE09
ESEC/FSE09: Joint 12th European Software Engineering Conference
August 24 - 28, 2009
Amsterdam, The Netherlands

Acceptance Rates

ESEC/FSE '09 Paper Acceptance Rate 32 of 217 submissions, 15%;
Overall Acceptance Rate 112 of 543 submissions, 21%
