research-article

Fast and Precise Static Null Exception Analysis With Synergistic Preprocessing

Published: 01 November 2024

Abstract

Pointer operations are common in programs written in modern programming languages such as C/C++ and Java. Although widely used, pointer operations often suffer from bugs such as null pointer exceptions, which make software systems vulnerable and unstable. However, precisely verifying the absence of null pointer exceptions is notoriously slow, because a huge number of pointer-dereferencing operations must be inspected one by one with expensive techniques such as SMT solving. We observe that, among all pointer-dereferencing operations in a program, a large number can be proven safe by lightweight preprocessing, so costly techniques are not needed to verify their nullity. The impact of such lightweight preprocessing, however, has been little studied and is largely ignored by recent work. In this paper, we propose a new technique, BONA, which leverages the synergistic effects of two classic preprocessing analyses. These synergistic effects allow us to recognize many more safe pointer operations before the follow-up costly nullity verification, thus improving the scalability of the whole null exception analysis. We have implemented our synergistic preprocessing procedure in two state-of-the-art static analyzers, KLEE and Pinpoint. The evaluation results demonstrate that BONA itself is fast: it finishes in a few seconds on programs that KLEE and Pinpoint may need several minutes or even hours to analyze. Compared to the vanilla versions of KLEE and Pinpoint, BONA enables them to achieve speedups of up to 1.6x and 6.6x, respectively (1.2x and 3.8x on average), with less than 0.5% overhead. This speedup is significant: it allows KLEE and Pinpoint to check more pointer-dereferencing operations within a given time budget and, thus, to discover over a dozen previously unknown null pointer exceptions in open-source projects.
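The general idea of filtering dereferences before a costly check can be illustrated with a minimal sketch. The abstract does not name the two preprocessing analyses BONA combines, so the code below is not BONA's algorithm; it is a hypothetical single-pass "known non-null" analysis over a toy straight-line IR, and the statement forms and the `prefilter` function are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical toy IR (an assumption for this sketch, not from the paper).
# Each statement is a tuple:
#   ("addr", x)      x = &some_object   (x is definitely non-null)
#   ("null", x)      x = NULL
#   ("unknown", x)   x = value of unknown nullness (e.g. external input)
#   ("copy", x, y)   x = y
#   ("deref", x)     *x is read or written
Stmt = Tuple[str, ...]


def prefilter(stmts: List[Stmt]) -> Tuple[List[int], List[int]]:
    """Split dereference sites into (proven_safe, needs_costly_check).

    Walks straight-line code once, tracking the set of variables that are
    known to be non-null at the current program point.
    """
    nonnull = set()
    safe: List[int] = []
    pending: List[int] = []
    for i, stmt in enumerate(stmts):
        op = stmt[0]
        if op == "addr":
            nonnull.add(stmt[1])
        elif op in ("null", "unknown"):
            # The variable may now be null: forget any earlier fact.
            nonnull.discard(stmt[1])
        elif op == "copy":
            dst, src = stmt[1], stmt[2]
            if src in nonnull:
                nonnull.add(dst)
            else:
                nonnull.discard(dst)
        elif op == "deref":
            # Cheaply proven sites are skipped; the rest would be handed
            # to an expensive verifier (e.g. an SMT-based one).
            (safe if stmt[1] in nonnull else pending).append(i)
    return safe, pending


example: List[Stmt] = [
    ("addr", "p"),        # 0
    ("copy", "q", "p"),   # 1
    ("deref", "q"),       # 2: q is a copy of a non-null pointer
    ("unknown", "r"),     # 3
    ("deref", "r"),       # 4: nullness of r is unknown
    ("null", "p"),        # 5
    ("deref", "p"),       # 6: p was just set to NULL
    ("deref", "q"),       # 7: q is unaffected by the write to p
]

safe_sites, pending_sites = prefilter(example)
print("proven safe:", safe_sites)
print("needs costly check:", pending_sites)
```

In this toy setting only the sites in `pending_sites` would be passed on to the expensive verifier; a real preprocessing pass would also have to handle branches, loops, calls, and aliasing through memory, which this sketch deliberately omits.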

Published In

IEEE Transactions on Software Engineering, Volume 50, Issue 11
Nov. 2024
379 pages

Publisher

IEEE Press
