From uncertainty to belief: inferring the specification within

Authors:

Dawson Engler

OSDI '06: Proceedings of the 7th symposium on Operating systems design and implementation

Pages 161 - 176

Published: 06 November 2006

Abstract

Automatic tools for finding software errors require a set of specifications before they can check code: if they do not know what to check, they cannot find bugs. This paper presents a novel framework based on factor graphs for automatically inferring specifications directly from programs. The key strength of the approach is that it can incorporate many disparate sources of evidence, allowing us to squeeze significantly more information from our observations than previously published techniques.

We illustrate the strengths of our approach by applying it to the problem of inferring which functions in C programs allocate and release resources. We evaluated its effectiveness on five codebases: SDL, OpenSSH, GIMP, and the OS kernels for Linux and Mac OS X (XNU). For each codebase, starting with zero initially provided annotations, we observed an inferred annotation accuracy of 80--90%, often with near-perfect accuracy for functions called as few as five times. Many of the inferred allocator and deallocator functions are ones whose implementation we lack and that are rarely called, in some cases with at most one or two callsites. Finally, with the inferred annotations we quickly found both missing and incorrect properties in a specification used by a commercial static bug-finding tool.
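To make the idea of inferring allocator and deallocator roles from a factor graph more concrete, here is a minimal toy sketch in Python. It is not the paper's model: the function names (`my_open`, `my_close`, `get_len`), the observation format, and the factor weights are all assumptions made up for illustration, and brute-force enumeration stands in for whatever inference procedure the paper actually uses. Each function gets one role variable, each observed path contributes one factor, and the marginals are read off the normalized product of factors.

```python
from itertools import product

ROLES = ("alloc", "release", "other")

# Hypothetical call-site evidence: (function that produced a value,
# function that value was last passed to, or None if it was never passed on).
OBSERVATIONS = [
    ("my_open", "my_close"),
    ("my_open", "my_close"),
    ("my_open", "my_close"),
    ("my_open", None),
    ("get_len", None),
    ("get_len", None),
]


def path_factor(owned, released):
    """Illustrative, unnormalized compatibility score for one observed path.

    The weights are assumptions chosen for this sketch, not values from the paper.
    """
    if owned and released:
        return 0.9   # allocate then release: the expected pattern
    if owned:
        return 0.1   # allocated but never released: would be a leak
    if released:
        return 0.05  # releasing a value nobody allocated: would be a bug
    return 0.3       # no ownership involved: consistent but uninformative


def infer_roles(observations):
    """Return {function: {role: marginal}} by enumerating every role assignment."""
    funcs = sorted({f for obs in observations for f in obs if f is not None})
    totals = {f: dict.fromkeys(ROLES, 0.0) for f in funcs}
    z = 0.0  # normalizing constant: sum of scores over all assignments
    for roles in product(ROLES, repeat=len(funcs)):
        assignment = dict(zip(funcs, roles))
        score = 1.0
        for producer, consumer in observations:
            owned = assignment[producer] == "alloc"
            released = consumer is not None and assignment[consumer] == "release"
            score *= path_factor(owned, released)
        z += score
        for f in funcs:
            totals[f][assignment[f]] += score
    return {f: {r: s / z for r, s in totals[f].items()} for f in funcs}


marginals = infer_roles(OBSERVATIONS)
for func, dist in marginals.items():
    print(func, {role: round(p, 3) for role, p in dist.items()})
```

With these assumed weights, the repeated pairing of `my_open` with `my_close` pushes belief toward the allocator and deallocator roles respectively, while `get_len`, whose result is never handed to anything, is judged unlikely to be an allocator. The outcome depends directly on the chosen weights, and exhaustive enumeration grows exponentially with the number of functions, so this only illustrates the structure of the computation.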

Published In

OSDI '06: Proceedings of the 7th symposium on Operating systems design and implementation

November 2006

407 pages

ISBN: 1931971471

Program Chairs:
Brian Bershad (University of Washington)
Jeff Mogul (Hewlett-Packard Labs)

Sponsors

VMware
NSF: National Science Foundation
Google Inc.
Infosys
SIGOPS: ACM Special Interest Group on Operating Systems
Sun Microsystems
Intel
Microsoft Research
DoCoMo USA Labs
USENIX Association
HP
Ask.com
IBM

Publisher

USENIX Association

United States
