skip to main content
10.1145/3445814.3446751acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
Article
Public Access

Language-parametric compiler validation with application to LLVM

Published: 17 April 2021 Publication History

Abstract

We propose a new design for a Translation Validation (TV) system geared towards practical use with modern optimizing compilers, such as LLVM. Unlike existing TV systems, which are custom-tailored for a particular sequence of transformations and a specific, common language for input and output programs, our design clearly separates the transformation-specific components from the rest of the system, and generalizes the transformation-independent components. Specifically, we present Keq, the first program equivalence checker that is parametric to the input and output language semantics and has no dependence on the transformation between the input and output programs. The Keq algorithm is based on a rigorous formalization, namely cut-bisimulation, and is proven correct. We have prototyped a TV system for the Instruction Selection pass of LLVM, being able to automatically prove equivalence for translations from LLVM IR to the MachineIR used in compiling to x86-64. This transformation uses different input and output languages, and as such has not been previously addressed by the state of the art. An experimental evaluation shows that Keq successfully proves correct the translation of over 90% of 4732 supported functions in GCC from SPEC 2006.

References

[1]
2015. Instruction Selection load narrowing miscompilation. https://bugs.llvm. org/show_bug.cgi?id= 4737.
[2]
2015. Instruction Selection WAW miscompilation. https://bugs.llvm.org/show_bug.cgi?id= 25154.
[3]
Mike Barnett, Bor-Yuh Evan Chang, Robert DeLine, Bart Jacobs, and K. Rustan M. Leino. 2006. Boogie: A Modular Reusable Verifier for Object-Oriented Programs. Springer Berlin Heidelberg, Berlin, Heidelberg, 364-387. https://doi.org/10.1007/ 11804192_17
[4]
Adam Chlipala. 2007. A Certified Type-preserving Compiler from Lambda Calculus to Assembly Language. In Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation (San Diego, California, USA) ( PLDI '07). ACM, New York, NY, USA, 54-65. https://doi.org/10.1145/1250734. 1250742
[5]
Ştefan Ciobâcă, Dorel Lucanu, Vlad Rusu, and Grigore Roşu. 2014. A LanguageIndependent Proof System for Mutual Program Equivalence. Springer International Publishing, Cham, 75-90. https://doi.org/10.1007/978-3-319-11737-9_6
[6]
Clang. 2020. Clang: C Language Family Frontend for LLVM. http://clang.llvm.org. Accessed: February 25, 2021.
[7]
Apple Corp. 2019. iOS App Distribution Guide. https://developer.apple.com/ library/etc/redirect/DTS/iOSAppDistGuide. Accessed: February 2019.
[8]
Sandeep Dasgupta, Sushant Dinesh, Deepan Venkatesh, Vikram S. Adve, and Christopher W. Fletcher. 2020. Scalable Validation of Binary Lifters. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (London, UK) ( PLDI 2020 ). Association for Computing Machinery, New York, NY, USA, 655-671. https://doi.org/10.1145/3385412.3385964
[9]
Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: an eficient SMT solver. In TACAS'08 (LNCS, Vol. 4963 ). 337-340.
[10]
Free Software Foundation. 2019. GNU Compiler Collection Internals. https: //gcc.gnu.org/onlinedocs/gccint/Passes.html. Accessed: August 2020.
[11]
GCC. 2020. GNU Compiler Collection. https://gcc.gnu.org. Accessed: February 25, 2021.
[12]
Jan Friso Groote, David N. Jansen, Jeroen J. A. Keiren, and Anton J. Wijs. 2017. An O(mlogn) Algorithm for Computing Stuttering Equivalence and Branching Bisimulation. ACM Trans. Comput. Logic 18, 2, Article 13 ( June 2017 ), 34 pages. https://doi.org/10.1145/3060140
[13]
Chris Hawblitzel, Shuvendu K. Lahiri, Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, and Scott Wadsworth. 2013. Will You Still Compile Me Tomorrow? Static Cross-version Compiler Validation. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (Saint Petersburg, Russia) (ESEC/FSE 2013). ACM, New York, NY, USA, 191-201. https://doi.org/10.1145/2491411.2491442
[14]
Jeremy Horwitz. 2018. Apple Watch apps instantly went 64-bit thanks to obscure Bitcode option. VentureBeat ( 2018 ).
[15]
Intel Corporation. 2016. Intel 64 and IA-32 Architectures Software Developer's Manual. Intel Corporation.
[16]
Julia. 2020. The Julia Language. http://julialang.org. Accessed: February 25, 2021.
[17]
Ramana Kumar, Magnus O. Myreen, Michael Norrish, and Scott Owens. 2014. CakeML: A Verified Implementation of ML. In Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (San Diego, California, USA) ( POPL '14). ACM, New York, NY, USA, 179-191. https: //doi.org/10.1145/2535838.2535841
[18]
Sudipta Kundu, Zachary Tatlock, and Sorin Lerner. 2009. Proving Optimizations Correct Using Parameterized Program Equivalence. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (Dublin, Ireland) ( PLDI '09). ACM, New York, NY, USA, 327-337. https://doi.org/ 10.1145/1542476.1542513
[19]
Chris Lattner and Vikram Adve. 2004. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04). Palo Alto, California.
[20]
Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-Kil Hur, Sanjoy Das, David Majnemer, John Regehr, and Nuno P. Lopes. 2017. Taming Undefined Behavior in LLVM. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (Barcelona, Spain) ( PLDI 2017 ). Association for Computing Machinery, New York, NY, USA, 633-647. https://doi.org/10. 1145/3062341.3062343
[21]
Xavier Leroy. 2009. Formal Verification of a Realistic Compiler. Commun. ACM 52, 7 ( July 2009 ), 107-115. https://doi.org/10.1145/1538788.1538814
[22]
LLVM. 2020. LLVM Language Reference Manual. http://llvm.org/docs/LangRef. html. Accessed: February 25, 2021.
[23]
LLVM. 2020. LLVM Target-Independent Code Generation: Instruction Selection. http://llvm.org/docs/CodeGenerator.html# instruction-selection-section. Accessed: February 25, 2021.
[24]
LLVM. 2020. LLVM Target-independent Code Generator. http://llvm.org/docs/ CodeGenerator.html# machine-code-representation. Accessed: February 25, 2021.
[25]
LLVM. 2020. TableGen. http://llvm.org/docs/TableGen/index.html. Accessed: February 25, 2021.
[26]
Nuno P. Lopes, David Menendez, Santosh Nagarakatte, and John Regehr. 2015. Provably Correct Peephole Optimizations with Alive. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (Portland, OR, USA) ( PLDI '15). ACM, New York, NY, USA, 22-32. https://doi. org/10.1145/2737924.2737965
[27]
Kedar S. Namjoshi. 1997. A simple characterization of stuttering bisimulation. Springer Berlin Heidelberg, Berlin, Heidelberg, 284-296. https://doi.org/10.1007/ BFb0058037
[28]
Kedar S. Namjoshi and Lenore D. Zuck. 2013. Witnessing Program Transformations. Springer Berlin Heidelberg, Berlin, Heidelberg, 304-323. https://doi.org/10.1007/ 978-3-642-38856-9_17
[29]
George C. Necula. 2000. Translation Validation for an Optimizing Compiler. In Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation (Vancouver, British Columbia, Canada) ( PLDI '00). ACM, New York, NY, USA, 83-94. https://doi.org/10.1145/349299.349314
[30]
Amir Pnueli, Michael Siegel, and Eli Singerman. 1998. Translation Validation. In Proceedings of the 4th International Conference on Tools and Algorithms for Construction and Analysis of Systems (TACAS '98). Springer-Verlag, London, UK, UK, 151-166. http://dl.acm.org/citation.cfm?id= 646482. 691453
[31]
Hartley Rogers, Jr. 1987. Theory of Recursive Functions and Efective Computability. MIT Press, Cambridge, MA, USA.
[32]
Grigore Roşu and Traian Florin Şerbănuţă. 2010. An Overview of the K Semantic Framework. Journal of Logic and Algebraic Programming 79, 6 ( 2010 ), 397-434. https://doi.org/10.1016/j.jlap. 2010. 03.012
[33]
Hanan Samet. 1975. Automatically Proving the Correctness of Translations Involving Optimized Code. Ph.D. Dissertation. Stanford, CA, USA. AAI7525601.
[34]
Davide Sangiorgi. 2011. Introduction to Bisimulation and Coinduction. Cambridge University Press, New York, NY, USA.
[35]
Thomas Arthur Leck Sewell, Magnus O. Myreen, and Gerwin Klein. 2013. Translation Validation for a Verified OS Kernel. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (Seattle, Washington, USA) ( PLDI '13). ACM, New York, NY, USA, 471-482. https://doi.org/10.1145/2491956.2462183
[36]
Rahul Sharma, Eric Schkufza, Berkeley Churchill, and Alex Aiken. 2013. Datadriven Equivalence Checking. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications (Indianapolis, Indiana, USA) ( OOPSLA '13). ACM, New York, NY, USA, 391-406. https://doi.org/10.1145/2509136.2509509
[37]
SPEC. 2020. SPEC CPU 2006 Benchmark. https://www.spec.org/cpu2006/. Accessed: February 25, 2021.
[38]
Chengnian Sun, Vu Le, Qirun Zhang, and Zhendong Su. 2016. Toward Understanding Compiler Bugs in GCC and LLVM. In Proceedings of the 25th International Symposium on Software Testing and Analysis (Saarbrücken, Germany) (ISSTA 2016). ACM, New York, NY, USA, 294-305. https://doi.org/10.1145/2931037. 2931074
[39]
Swift. 2020. Swift programming language. https://swift.org. Accessed: February 25, 2021.
[40]
Ross Tate, Michael Stepp, Zachary Tatlock, and Sorin Lerner. 2009. Equality Saturation: A New Approach to Optimization. In Proceedings of the 36th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (Savannah, GA, USA) ( POPL '09). ACM, New York, NY, USA, 264-276. https: //doi.org/10.1145/1480881.1480915
[41]
Jean-Baptiste Tristan, Paul Govereau, and Greg Morrisett. 2011. Evaluating Value-graph Translation Validation for LLVM. In Proceedings of the 32Nd ACM SIGPLAN Conference on Programming Language Design and Implementation (San Jose, California, USA) ( PLDI '11). ACM, New York, NY, USA, 295-305. https: //doi.org/10.1145/1993498.1993533
[42]
David Wheeler. 2015. To Bitcode, or Not to Bitcode? iovation ( 2015 ).
[43]
Jianzhou Zhao, Santosh Nagarakatte, Milo M.K. Martin, and Steve Zdancewic. 2013. Formal Verification of SSA-based Optimizations for LLVM. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (Seattle, Washington, USA) ( PLDI '13). ACM, New York, NY, USA, 175-186. https://doi.org/10.1145/2491956.2462164
[44]
Lenore Zuck, Amir Pnueli, Yi Fang, and Benjamin Goldberg. 2003. VOC: A methodology for the translation validation of optimizing compilers. Journal of Universal Computer Science 9 ( 2003 ), 2003.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
April 2021
1090 pages
ISBN:9781450383172
DOI:10.1145/3445814
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 April 2021

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. Compilers
  2. Program Equivalence
  3. Simulation
  4. Translation Validation

Qualifiers

  • Article

Funding Sources

Conference

ASPLOS '21
Sponsor:

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)171
  • Downloads (Last 6 weeks)34
Reflects downloads up to 30 Dec 2024

Other Metrics

Citations

Cited By

View all

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media