DOI: 10.5555/3314872.3314890

Generation of in-bounds inputs for arrays in memory-unsafe languages

Published: 16 February 2019

Abstract

This paper presents a technique to generate in-bounds inputs for arrays used in memory-unsafe programming languages, such as C and C++. We show that most memory indexing found in actual C programs follows patterns that are easy to analyze statically. Based on this observation, we show how symbolic range analysis can be used to establish contracts between the arguments of a function and the arrays used within that function. To demonstrate the effectiveness of our ideas, we use them to implement Griffin-TG, a tool to stress-test C programs whose source code might be only partially available. We show how Griffin-TG improves Aprof, a well-known algorithmic profiling tool, and we show how it lets us enrich Polybench with a large set of new inputs.
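
As a rough illustration of what a contract between a function's arguments and its arrays could look like, consider the sketch below. It is not taken from the paper and does not reproduce Griffin-TG's output; the names `sum_prefix`, `required_length`, and `make_in_bounds_input` are hypothetical. The loop index in `sum_prefix` ranges over the symbolic interval [0, n - 1], so the accesses stay in bounds whenever the array has at least `n` elements, and a generator can use that relation to size the array it allocates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Function under test. The index i takes values in [0, n - 1],
 * so every access v[i] is in bounds only if v has at least n elements. */
long sum_prefix(const int *v, int n) {
    assert(n <= 0 || v != NULL);
    long total = 0;
    for (int i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Hypothetical contract, written by hand from the symbolic range [0, n - 1]:
 * the minimum number of elements v must have for a given n. */
size_t required_length(int n) {
    return n > 0 ? (size_t)n : 0;
}

/* Hypothetical input generator: allocates an array that satisfies the
 * contract and fills it with deterministic values 0, 1, 2, ...
 * Returns NULL on allocation failure; the caller frees the result. */
int *make_in_bounds_input(int n) {
    size_t len = required_length(n);
    /* Allocate at least one element so a successful result is never NULL. */
    int *v = malloc((len > 0 ? len : 1) * sizeof *v);
    if (v == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        v[i] = (int)i;
    }
    return v;
}
```

A test driver would pick a value for `n`, call `make_in_bounds_input(n)`, pass the result to `sum_prefix`, and then free the array. Because the array length is derived from `n` rather than chosen independently, the driver can vary `n` freely without triggering out-of-bounds reads.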

Published In

CGO 2019: Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization
February 2019
286 pages
ISBN: 9781728114361

Publisher

IEEE Press

Author Tags

  1. Arrays
  2. Range Analysis
  3. Static Analysis
  4. Test

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 312 of 1,061 submissions, 29%
