DOI: 10.1145/781131.781162
Article

Optimizing indirect branch prediction accuracy in virtual machine interpreters

Published: 09 May 2003

Abstract

Interpreters designed for efficiency execute a huge number of indirect branches and can spend more than half of the execution time in indirect branch mispredictions. Branch target buffers are the best widely available form of indirect branch prediction; however, their prediction accuracy for existing interpreters is only 2%--50%. In this paper we investigate two methods for improving the prediction accuracy of BTBs for interpreters: replicating virtual machine (VM) instructions and combining sequences of VM instructions into superinstructions. We investigate static (interpreter build-time) and dynamic (interpreter run-time) variants of these techniques and compare them and several combinations of these techniques. These techniques can eliminate nearly all of the dispatch branch mispredictions, and have other benefits, resulting in speedups by a factor of up to 3.17 over efficient threaded-code interpreters, and speedups by a factor of up to 1.3 over techniques relying on superinstructions alone.
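The effect described in the abstract can be illustrated with a toy model. The sketch below is not the authors' measurement setup: it assumes an idealized BTB with one entry per dispatch branch (no capacity or conflict misses) that always predicts the target seen last time, and it assumes threaded code in which every VM instruction routine ends in its own dispatch branch. The VM loop `A B A GOTO` and the names `A1`, `A2`, and `B_A` are hypothetical, chosen only for illustration.

```python
def count_mispredictions(program, iterations):
    """Count dispatch mispredictions for a VM loop under an idealized BTB.

    `program` is a list of routine names executed in order and repeated
    `iterations` times. Each routine ends in one indirect dispatch branch,
    so the routine name stands in for the branch address. The BTB predicts
    that a branch jumps to the same target as on its previous execution.
    """
    btb = {}  # dispatch branch (routine name) -> last observed target
    mispredictions = 0
    n = len(program)
    for step in range(iterations * n):
        current = program[step % n]
        target = program[(step + 1) % n]  # next routine; wraps for the loop
        if btb.get(current) != target:
            mispredictions += 1
        btb[current] = target
    return mispredictions


# Hypothetical VM loop: the routine A is shared by two occurrences, so its
# single dispatch branch alternates between the targets B and GOTO.
PLAIN = ["A", "B", "A", "GOTO"]

# Replication: each occurrence of A gets its own copy of the routine, and
# therefore its own dispatch branch and BTB entry.
REPLICATED = ["A1", "B", "A2", "GOTO"]

# Superinstruction: the sequence B A is combined into one routine B_A,
# which also removes one dispatch per loop iteration.
SUPER = ["A", "B_A", "GOTO"]

if __name__ == "__main__":
    for name, prog in [("plain", PLAIN),
                       ("replicated", REPLICATED),
                       ("superinstruction", SUPER)]:
        dispatches = 100 * len(prog)
        misses = count_mispredictions(prog, 100)
        print(f"{name}: {misses} mispredictions in {dispatches} dispatches")
```

In this model the shared routine `A` is mispredicted on every execution, which is about half of all dispatches in the plain loop, while the replicated and superinstruction variants mispredict only during the first, cold iteration. Real BTBs have limited capacity, so code growth from replication can reintroduce misses that this sketch does not capture.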



      Reviews

      Soundararajan Ezekiel

      Ertl and Gregg investigate two methods for improving the prediction accuracy of branch target buffers (BTBs) for interpreters: replicating virtual machine (VM) instructions and combining sequences of VM instructions into superinstructions. They investigate static and dynamic variants of each technique, as well as several combinations of them. By studying various performance monitoring counters, such as cycles, instructions, taken instructions, taken branch instructions that are mispredicted, instruction fetch misses, code bytes, and miss cycles, the authors claim that their methods eliminate most dispatch branch mispredictions and yield speedups of up to a factor of 3.17 over efficient threaded-code interpreters. This paper is very well researched and written.

      Online Computing Reviews Service


      Published In

      PLDI '03: Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
      June 2003
      360 pages
      ISBN:1581136625
      DOI:10.1145/781131
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. branch prediction
      2. branch target buffer
      3. code replication
      4. interpreter
      5. superinstruction

      Qualifiers

      • Article

      Conference

      PLDI03

      Acceptance Rates

      PLDI '03 Paper Acceptance Rate 28 of 131 submissions, 21%;
      Overall Acceptance Rate 406 of 2,067 submissions, 20%
