DOI: 10.5555/1298455.1298471

From uncertainty to belief: inferring the specification within

Published: 06 November 2006

Abstract

Automatic tools for finding software errors require a set of specifications before they can check code: if they do not know what to check, they cannot find bugs. This paper presents a novel framework based on factor graphs for automatically inferring specifications directly from programs. The key strength of the approach is that it can incorporate many disparate sources of evidence, allowing us to squeeze significantly more information from our observations than previously published techniques.
We illustrate the strengths of our approach by applying it to the problem of inferring which functions in C programs allocate and release resources. We evaluated its effectiveness on five codebases: SDL, OpenSSH, GIMP, and the OS kernels for Linux and Mac OS X (XNU). For each codebase, starting with zero initially provided annotations, we observed an inferred annotation accuracy of 80-90%, often with near-perfect accuracy for functions called as few as five times. Many of the inferred allocator and deallocator functions are ones whose implementation we lack and that are rarely called, in some cases with at most one or two callsites. Finally, with the inferred annotations we quickly found both missing and incorrect properties in a specification used by a commercial static bug-finding tool.
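
To make the idea of inferring allocator and deallocator annotations from factors more concrete, the following is a minimal sketch and not the paper's actual model. The abstract does not specify the factor definitions, so the function names, the traces, the factor values, and the prior below are all assumptions chosen for illustration. The sketch also computes marginals by brute-force enumeration over all labelings, which is only feasible for a handful of functions; it is a stand-in for whatever inference procedure the paper uses.

```python
from itertools import product

# Hypothetical traces: each is (producer, [consumers in call order]) for one
# value along one program path. All function names are invented.
TRACES = (
    [("my_open", ["my_close"])] * 3
    + [("my_open", ["my_print", "my_close"])] * 2
    + [("my_lookup", ["my_print"])] * 2
    + [("my_lookup", [])]
)

# Assumed factor values: a higher value means the behavior is more plausible.
GOOD, NEUTRAL, LEAK, ERROR = 0.9, 0.5, 0.2, 0.1
PRIOR_TRUE = 0.4  # assumed prior that a function allocates / releases


def trace_factor(producer_allocates, consumer_releases):
    """Score one trace; consumer_releases has one boolean per consumer call."""
    owned, released = producer_allocates, False
    for releases in consumer_releases:
        if not releases:
            continue
        if not owned or released:
            return ERROR  # released an unowned or already released value
        released = True
    if owned:
        return GOOD if released else LEAK
    return NEUTRAL  # never owned, never released


def infer_marginals(traces):
    """Return P(label is True) for every variable, by exact enumeration."""
    producers = sorted({p for p, _ in traces})
    consumers = sorted({c for _, cs in traces for c in cs})
    variables = [("alloc", f) for f in producers]
    variables += [("release", f) for f in consumers]
    totals = {v: 0.0 for v in variables}
    z = 0.0  # normalizing constant: sum of weights over all labelings
    for values in product([False, True], repeat=len(variables)):
        label = dict(zip(variables, values))
        weight = 1.0
        for value in values:  # one prior factor per variable
            weight *= PRIOR_TRUE if value else 1.0 - PRIOR_TRUE
        for producer, cs in traces:  # one behavior factor per trace
            weight *= trace_factor(
                label[("alloc", producer)],
                [label[("release", c)] for c in cs],
            )
        z += weight
        for v in variables:
            if label[v]:
                totals[v] += weight
    return {v: totals[v] / z for v in variables}


if __name__ == "__main__":
    for (role, name), prob in sorted(infer_marginals(TRACES).items()):
        print(f"P({name} has role {role}) = {prob:.2f}")
```

With these assumed inputs, the labeling in which `my_open` allocates and `my_close` releases explains the most traces without leaks or double releases, so those two marginals come out high while `my_lookup` and `my_print` stay low. The point of the sketch is only that several weak, independent pieces of evidence multiply into a joint score, which is the property the abstract attributes to factor graphs.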

Published In

OSDI '06: Proceedings of the 7th symposium on Operating systems design and implementation
November 2006
407 pages
ISBN: 1931971471

Publisher

USENIX Association

United States
