DOI: 10.1145/3497775.3503679
research-article

Formally verified superblock scheduling

Published: 11 January 2022

Abstract

On in-order processors, without dynamic instruction scheduling, program running times may be significantly reduced by compile-time instruction scheduling. We present here the first effective certified instruction scheduler that operates over superblocks (it may move instructions across branches), along with its performance evaluation. It is integrated within the CompCert C compiler, providing a complete machine-checked proof of semantic preservation from C to assembly.
Our optimizer composes several passes designed by translation validation: program transformations are proposed by untrusted oracles, which are then validated by certified and scalable checkers. Our main checker is an architecture-independent simulation-test over superblocks modulo register liveness, which relies on hash-consed symbolic execution.
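
The validation scheme described above can be illustrated with a small sketch. It is not the paper's Coq-verified checker: it is a toy Python model, and every name in it (`Term`, `mk`, `sym_exec`, `simulates`, the instruction tuples) is hypothetical. It assumes register-only, non-trapping operations with no memory accesses, and it applies no term rewriting. Both the original and the reordered superblock are executed symbolically, terms are hash-consed so that structural equality reduces to pointer equality, and the two blocks are compared only on the registers live at each side exit and at the end of the block.

```python
"""Toy translation-validation check for a reordered superblock (illustrative only)."""


class Term:
    """A symbolic value: an operator applied to sub-terms (or to a leaf name)."""
    __slots__ = ("op", "args")

    def __init__(self, op, args):
        self.op, self.args = op, args


_TABLE = {}  # hash-consing table: structural key -> unique Term


def mk(op, *args):
    """Return the unique Term for (op, args); equal terms are the same object."""
    # Sub-terms are already unique and kept alive by _TABLE, so id() identifies them.
    key = (op,) + tuple(a if isinstance(a, str) else id(a) for a in args)
    if key not in _TABLE:
        _TABLE[key] = Term(op, args)
    return _TABLE[key]


def read(state, reg):
    """Symbolic value of a register: its last assignment, else its initial value."""
    return state[reg] if reg in state else mk("reg", reg)


def sym_exec(block):
    """Symbolically execute a superblock.

    Instructions are ("op", dst, opname, [srcs]) or ("exit", cond_reg, target).
    Returns one (condition, target, state snapshot) per side exit, plus the final state.
    """
    state, exits = {}, []
    for ins in block:
        if ins[0] == "exit":
            _, cond, target = ins
            exits.append((read(state, cond), target, dict(state)))
        else:
            _, dst, opname, srcs = ins
            state[dst] = mk(opname, *[read(state, s) for s in srcs])
    return exits, state


def simulates(original, scheduled, live_at, live_out):
    """Accept the schedule if both blocks agree on live registers at every exit."""
    exits1, final1 = sym_exec(original)
    exits2, final2 = sym_exec(scheduled)
    if len(exits1) != len(exits2):
        return False

    def agree(s1, s2, live):
        # Pointer equality suffices because terms are hash-consed.
        return all(read(s1, r) is read(s2, r) for r in live)

    for (c1, t1, s1), (c2, t2, s2) in zip(exits1, exits2):
        if c1 is not c2 or t1 != t2 or not agree(s1, s2, live_at[t1]):
            return False
    return agree(final1, final2, live_out)


# Original superblock: the multiplication sits below the side exit.
original = [
    ("op", "r1", "add", ["r2", "r3"]),
    ("exit", "r4", "L"),
    ("op", "r5", "mul", ["r1", "r1"]),
    ("op", "r6", "add", ["r5", "r2"]),
]
# Candidate from an untrusted scheduler: the multiplication is hoisted above the exit.
scheduled = [
    ("op", "r1", "add", ["r2", "r3"]),
    ("op", "r5", "mul", ["r1", "r1"]),
    ("exit", "r4", "L"),
    ("op", "r6", "add", ["r5", "r2"]),
]
ok_when_r5_dead = simulates(original, scheduled, {"L": {"r1"}}, {"r1", "r5", "r6"})
ok_when_r5_live = simulates(original, scheduled, {"L": {"r1", "r5"}}, {"r1", "r5", "r6"})
```

In this sketch the hoisted schedule is accepted when `r5` is dead at the exit target `L` and rejected when `r5` is live there, which is the sense in which the comparison is "modulo register liveness". The scheduler itself never needs to be trusted: only the checker decides whether a proposed schedule is kept.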

Published In

CPP 2022: Proceedings of the 11th ACM SIGPLAN International Conference on Certified Programs and Proofs
January 2022
351 pages
ISBN:9781450391825
DOI:10.1145/3497775
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Instruction-level parallelism
  2. Symbolic execution
  3. Translation validation
  4. The Coq proof assistant

Qualifiers

  • Research-article

Funding Sources

  • LabEx PERSYVAL-Lab
  • IRT Nanoelec

Conference

CPP '22

Acceptance Rates

Overall Acceptance Rate 18 of 26 submissions, 69%
