skip to main content
10.1145/1669112.1669124acmconferencesArticle/Chapter ViewAbstractPublication PagesmicroConference Proceedingsconference-collections
research-article

Portable compiler optimisation across embedded programs and microarchitectures using machine learning

Published: 12 December 2009 Publication History

Abstract

Building an optimising compiler is a difficult and time consuming task which must be repeated for each generation of a microprocessor. As the underlying microarchitecture changes from one generation to the next, the compiler must be retuned to optimise specifically for that new system. It may take several releases of the compiler to effectively exploit a processor's performance potential, by which time a new generation has appeared and the process starts again.
We address this challenge by developing a portable optimising compiler. Our approach employs machine learning to automatically learn the best optimisations to apply for any new program on a new microarchitectural configuration. It achieves this by learning a model off-line which maps a microarchitecture description plus the hardware counters from a single run of the program to the best compiler optimisation passes. Our compiler gains 67% of the maximum speedup obtainable by an iterative compiler search using 1000 evaluations. We obtain, on average, a 1.16x speedup over the highest default optimisation level across an entire microarchitecture configuration space, achieving a 4.3x speedup in the best case. We demonstrate the robustness of this technique by applying it to an extended microarchitectural space where we achieve comparable performance.

References

[1]
F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin, M. F. P. O'Boyle, J. Thomson, M. Toussaint, and C. K. I. Williams. Using machine learning to focus iterative optimization. In CGO, 2006.
[2]
L. Almagor, K. D. Cooper, A. Grosul, T. J. Harvey, S. W. Reeves, D. Subramanian, L. Torczon, and T. Waterman. Finding effective compilation sequences. SIGPLAN Not., 39(7), 2004.
[3]
J. Cavazos, G. Fursin, F. Agakov, E. Bonilla, M. F. P. O'Boyle, and O. Temam. Rapidly selecting good compiler optimizations using performance counters. In CGO, 2007.
[4]
J. Cavazos and M. F. P. O'Boyle. Method-specific dynamic compilation using logistic regression. In OOPSLA, 2006.
[5]
G. Contreras, M. Martonosi, J. Peng, R. Ju, and G. Lueh. XTREM: a power simulator for the Intel XScale core. In LCTES, 2004.
[6]
K. D. Cooper, A. Grosul, T. J. Harvey, S. Reeves, D. Subramanian, L. Torczon, and T. Waterman. Acme: adaptive compilation made efficient. SIGPLAN Not., 40(7), 2005.
[7]
K. D. Cooper, P. J. Schielke, and D. Subramanian. Optimizing for reduced code space using genetic algorithms. In LCTES, 1999.
[8]
V. Desmet, S. Girbal, and O. Temam. Archexplorer.org: Joint compiler/hardware exploration for fair comparison of architectures. In INTERACT workshop at HPCA, 2009.
[9]
C. Dubach, J. Cavazos, B. Franke, G. Fursin, M. F. O'Boyle, and O. Temam. Fast compiler optimisation evaluation using code-feature based performance prediction. In CF, 2007.
[10]
C. Dubach, T. M. Jones, and M. F. O'Boyle. Exploring and predicting the architecture/optimising compiler co-design space. In CASES, 2008.
[11]
C. Dubach, T. M. Jones, and M. F. P. O'Boyle. Microarchitectural design space exploration using an architecture-centric approach. In MICRO, 2007.
[12]
S. Eyerman, L. Eeckhout, T. Karkhanis, and J. E. Smith. A performance counter architecture for computing accurate cpi components. In ASPLOS, 2006.
[13]
D. Fischer, J. Teich, R. Weper, U. Kastens, and M. Thies. Design space characterization for architecture/compiler co-exploration. In CASES, 2001.
[14]
G. Fursin, C. Miranda, O. Temam, M. Namolaru, E. Yom-Tov, A. Zaks, B. Mendelson, E. Bonilla, J. Thomson, H. Leather, C. Williams, and M. O. Boyle. MILEPOST GCC: machine learning based research compiler. In GCC Summit, 2008.
[15]
M. Guthaus, J. Ringenberg, D. Ernst, T. Austin, T. Mudge, and R. Brown. MiBench: A free, commercially representative embedded benchmark suite. In WWC, 2001.
[16]
M. Haneda, P. Knijnenburg, and H. Wijshoff. Automatic selection of compiler options using non-parametric inferential statistics. PACT, 2005.
[17]
K. Hoste and L. Eeckhout. Cole: compiler optimization level exploration. In CGO, 2008.
[18]
E. İpek, B. R. de Supinski, M. Schulz, and S. A. McKee. An approach to performance prediction for parallel applications. In Euro-Par, 2005.
[19]
E. İpek, S. A. McKee, R. Caruana, B. R. de Supinski, and M. Schulz. Efficiently exploring architectural design spaces via predictive modeling. In ASPLOS-XII, 2006.
[20]
P. J. Joseph, K. Vaswani, and M. J. Thazhuthaveetil. Construction and use of linear regression models for processor performance analysis. In HPCA-12, February 2006.
[21]
P. J. Joseph, K. Vaswani, and M. J. Thazhuthaveetil. A predictve performance model for superscalar processors. In MICRO-39, 2006.
[22]
T. S. Karkhanis and J. E. Smith. A first-order superscalar processor model. In ISCA, 2004.
[23]
S. Khan, P. Xekalakis, J. Cavazos, and M. Cintra. Using predictive modeling for cross-program design space exploration in multicore systems. In PACT, 2007.
[24]
P. Kulkarni, S. Hines, J. Hiser, D. Whalley, J. Davidson, and D. Jones. Fast searches for effective optimization phase sequences. In PLDI, 2004.
[25]
C. Lattner and V. Adve. LLVM: A compilation framework for lifelong program analysis&transformation. In CGO, 2004.
[26]
B. C. Lee and D. Brooks. Illustrative design space studies with microarchitectural regression models. In HPCA-13, 2007.
[27]
B. C. Lee and D. M. Brooks. Accurate and efficient regression modeling for microarchitectural performance and power prediction. In ASPLOS-XII, 2006.
[28]
B. C. Lee, D. M. Brooks, B. R. de Supinski, M. Schulz, K. Singh, and S. A. McKee. Methods of inference and learning for performance modeling of parallel applications. In PPoPP-12, 2007.
[29]
A. McGovern and J. E. B. Moss. Scheduling straight-line code using reinforcement learning and rollouts. In NIPS, 1998.
[30]
Z. Pan and R. Eigenmann. Fast and effective orchestration of compiler optimizations for automatic performance tuning. In CGO, 2006.
[31]
A. Phansalkar, A. Joshi, L. Eeckhout, and L. K. John. Measuring program similarity: Experiments with spec cpu benchmark suites. In ISPASS, 2005.
[32]
M. Puschel, J. Moura, J. Johnson, D. Padua, M. Veloso, B. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. Johnson, and N. Rizzolo. Spiral: Code generation for dsp transforms. Proceedings of the IEEE, 93(2):232--275, Feb. 2005.
[33]
M. Stephenson and S. Amarasinghe. Predicting unroll factors using supervised classification. In CGO, 2005.
[34]
M. Stephenson, S. Amarasinghe, M. Martin, and U. O'Reilly. Meta optimization: improving compiler heuristics with machine learning. In PLDI, 2003.
[35]
D. Tarjan, S. Thoziyoor, and N. P. Jouppi. Cacti 4.0. Technical Report HPL-2006-86, HP Laboratories Palo Alto, 2006.
[36]
S. Triantafyllis, M. Vachharajani, N. Vachharajani, and D. I. August. Compiler optimization-space exploration. In CGO, 2003.
[37]
Trimaran: An infrastructure for research in instruction-level parallelism. http://www.trimaran.org/, 2000.
[38]
K. Vaswani, M. J. Thazhuthaveetil, Y. N. Srikant, and P. J. Joseph. Microarchitecture sensitive empirical models for compiler optimizations. In CGO, 2007.
[39]
R. Vuduc, J. W. Demmel, and J. A. Bilmes. Statistical models for empirical search-based performance tuning. Int. J. High Perform. Comput. Appl., 18(1):65--94, 2004.
[40]
K. Yotov, X. Li, G. Ren, M. Cibulskis, G. DeJong, M. Garzaran, D. Padua, K. Pingali, P. Stodghill, and P. Wu. A comparison of empirical and model-driven optimization. SIGPLAN Not., 38(5):63--76, 2003.
[41]
M. Zhao, B. Childers, and M. L. Soffa. Predicting the impact of optimizations for embedded systems. SIGPLAN Not., 38(7), 2003.
[42]
W. Zhao, B. Cai, D. Whalley, M. W. Bailey, R. van Engelen, X. Yuan, J. D. Hiser, J. W. Davidson, K. Gallivan, and D. L. Jones. Vista: a system for interactive code improvement. In LCTES/SCOPES, 2002.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
MICRO 42: Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
December 2009
601 pages
ISBN:9781605587981
DOI:10.1145/1669112
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 December 2009

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. architecture/compiler co-design
  2. design-space exploration
  3. machine learning

Qualifiers

  • Research-article

Conference

Micro-42
Sponsor:

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)27
  • Downloads (Last 6 weeks)1
Reflects downloads up to 31 Dec 2024

Other Metrics

Citations

Cited By

View all

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media