DOI: 10.1145/1403375.1403559
research-article

Harnessing horizontal parallelism and vertical instruction packing of programs to improve system overall efficiency

Published: 10 March 2008

Abstract

Multi-issue processors can exploit the instruction-level parallelism (ILP) of programs to greatly improve performance. Reducing energy consumption while maintaining the high performance of programs running on multi-issue processors remains a challenging problem. In this paper, we propose a novel approach that applies the instruction register file (IRF) technique, originally developed for single-issue processors, to the VLIW architecture. Frequently executed instructions are selected and placed in the on-chip IRF for fast access during program execution. Synchronization violations among VLIW instruction slots are avoided by introducing new instruction formats and microarchitectural support. The enhanced VLIW architecture is thus able to orchestrate horizontal instruction parallelism and vertical instruction packing to improve overall system efficiency. Our experimental results show that the proposed processor architecture achieves both the performance advantage of the VLIW architecture and the high energy efficiency of the IRF-based instruction packing technique (e.g., a 71.1% reduction in fetch energy consumption for a 4-way VLIW architecture with 8-entry IRFs).
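The idea of placing frequently executed instructions in a small per-slot IRF can be illustrated with a minimal Python sketch. This is not the selection algorithm or the instruction format from the paper; it assumes a plain greedy choice by execution count, and the names `select_irf_entries`, `irf_hit_fraction`, and the toy two-slot trace are hypothetical. It shows only how one might estimate what fraction of operation fetches could be served from the IRFs instead of the instruction cache.

```python
from collections import Counter


def select_irf_entries(trace, irf_size=8):
    """Pick, for each issue slot, the irf_size most frequently executed operations.

    trace is a list of bundles; each bundle is a tuple with one operation
    string per VLIW issue slot. Greedy selection by count is an assumption
    made for illustration, not the paper's method.
    """
    if not trace:
        return []
    num_slots = len(trace[0])
    irfs = []
    for slot in range(num_slots):
        counts = Counter(bundle[slot] for bundle in trace)
        irfs.append({op for op, _ in counts.most_common(irf_size)})
    return irfs


def irf_hit_fraction(trace, irfs):
    """Fraction of executed operations that are found in their slot's IRF."""
    total = sum(len(bundle) for bundle in trace)
    if total == 0:
        return 0.0
    hits = sum(
        1
        for bundle in trace
        for slot, op in enumerate(bundle)
        if op in irfs[slot]
    )
    return hits / total


# Toy dynamic trace for a hypothetical 2-way machine.
trace = [
    ("add r1,r2,r3", "nop"),
    ("add r1,r2,r3", "ld r4,0(r5)"),
    ("sub r6,r1,r4", "nop"),
    ("add r1,r2,r3", "nop"),
]
irfs = select_irf_entries(trace, irf_size=1)
fraction = irf_hit_fraction(trace, irfs)
```

With one entry per slot, the sketch keeps the most common operation of each slot, and `fraction` reports how many of the eight executed operations hit in an IRF. A real design would also have to account for the encoding and synchronization constraints the abstract mentions, which this sketch ignores.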


Published In

DATE '08: Proceedings of the conference on Design, automation and test in Europe
March 2008
1575 pages
ISBN:9783981080131
DOI:10.1145/1403375

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

DATE '08
Sponsor:
  • EDAA
  • SIGDA
  • The Russian Academy of Sciences
DATE '08: Design, Automation and Test in Europe
March 10 - 14, 2008
Munich, Germany

Acceptance Rates

Overall Acceptance Rate 518 of 1,794 submissions, 29%
