Automated design of finite state machine predictors for customized processors

Published: 01 May 2001

Abstract

Customized processors use compiler analysis and design automation techniques to take a generalized architectural model and create a specific instance of it that is optimized for a given application or set of applications. These processors promise to satisfy the high-performance needs of the embedded community while simultaneously shrinking design times.
Finite State Machines (FSMs) are a fundamental building block in computer architecture and are used to control and optimize all types of prediction and speculation, now even in the embedded space. They are used for branch prediction, cache replacement policies, and the confidence-estimation and accuracy counters that support a variety of optimizations.
In this paper, we present a framework for the automated design of small FSM predictors for customized processors. Our approach can automatically generate small FSM predictors that perform well over a suite of applications, or that are tailored to a specific application or even a specific instruction. We evaluate the use of these customized FSM predictors for branch prediction over a set of benchmarks.
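
To make the idea of a small FSM predictor concrete, the following is a minimal sketch of a state machine that predicts branch outcomes and is scored against a trace of taken/not-taken outcomes. It is not the paper's design framework. The names `MooreFSM`, `accuracy`, `TWO_BIT_COUNTER`, and `ALTERNATOR`, as well as the two traces, are hypothetical and chosen only for illustration. The two-bit saturating counter is the classic general-purpose predictor; the two-state alternator stands in for a machine tailored to one specific branch.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class MooreFSM:
    """A Moore machine: the prediction depends only on the current state."""
    start: int
    # transitions[state] = (next state if not taken, next state if taken)
    transitions: Dict[int, Tuple[int, int]]
    # output[state] = predicted outcome (True means "taken")
    output: Dict[int, bool]


def accuracy(fsm: MooreFSM, trace: Sequence[bool]) -> float:
    """Fraction of outcomes in `trace` that `fsm` predicts correctly."""
    state = fsm.start
    correct = 0
    for taken in trace:
        if fsm.output[state] == taken:
            correct += 1
        # Advance on the actual outcome, not on the prediction.
        state = fsm.transitions[state][1 if taken else 0]
    return correct / len(trace) if trace else 0.0


# Classic two-bit saturating counter: states 0-1 predict not taken,
# states 2-3 predict taken.
TWO_BIT_COUNTER = MooreFSM(
    start=1,
    transitions={0: (0, 1), 1: (0, 2), 2: (1, 3), 3: (2, 3)},
    output={0: False, 1: False, 2: True, 3: True},
)

# Hypothetical "customized" machine for a branch that strictly alternates:
# it flips state on every outcome, whatever that outcome is.
ALTERNATOR = MooreFSM(
    start=0,
    transitions={0: (1, 1), 1: (0, 0)},
    output={0: True, 1: False},
)

# Illustrative traces (assumptions, not benchmark data).
loop_trace = ([True] * 7 + [False]) * 4   # loop branch: 7 taken, 1 exit
alternating_trace = [True, False] * 8     # taken, not taken, taken, ...

if __name__ == "__main__":
    for name, fsm in (("two-bit counter", TWO_BIT_COUNTER),
                      ("alternator", ALTERNATOR)):
        print(name,
              "loop:", accuracy(fsm, loop_trace),
              "alternating:", accuracy(fsm, alternating_trace))
```

On the alternating trace the general-purpose counter mispredicts every outcome, while the two-state machine predicts all of them; on the loop trace the counter is the better of the two. This gap between a generic and a branch-specific machine is the kind of opportunity that application-specific predictors aim at.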

Published In

ACM SIGARCH Computer Architecture News, Volume 29, Issue 2
Special Issue: Proceedings of the 28th annual international symposium on Computer architecture (ISCA '01)
May 2001
262 pages
ISSN: 0163-5964
DOI: 10.1145/384285
  • ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture
    June 2001
    289 pages
    ISBN: 0769511627
    DOI: 10.1145/379240

Publisher

Association for Computing Machinery

New York, NY, United States
