research-article

Generalized Multiway Branch Unit for VLIW Microprocessors

Published: 01 August 1995

Abstract

VLIW processors use multiway branch instructions to evaluate control structures quickly and in parallel. This paper introduces a new multiway branch mechanism that resolves the branch target in constant time for an arbitrary condition tree. The distinguishing feature of this mechanism is its target selection unit, which selects a branch target from a set of condition bit values and a description of the condition tree. We describe a representation of condition trees that yields a compact target selection unit, and we show the logic diagram of a target selection unit that provides four-way branching. Our experimental results on nontrivial integer benchmarks indicate that the proposed multiway branch unit can improve the performance of VLIW machines substantially, by a geometric mean of as much as 35%, compared with conventional two-way branching.
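
To make the idea of target selection concrete, the following is a minimal software sketch, not the paper's hardware design or its condition tree encoding. It assumes a binary condition tree whose internal nodes test one condition bit and whose leaves name a branch target. The names `Leaf`, `Node`, `path_predicates`, and `select_target` are hypothetical and chosen only for illustration. Each leaf gets a path predicate (the condition values required to reach it); exactly one predicate holds for any setting of the condition bits. Hardware could evaluate all such predicates in parallel, whereas this sketch simply enumerates them.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    target: str  # hypothetical label for a branch target


@dataclass(frozen=True)
class Node:
    cond: int          # index of the condition bit tested at this node
    if_true: "Tree"    # subtree taken when the bit is set
    if_false: "Tree"   # subtree taken when the bit is clear


Tree = Union[Leaf, Node]


def path_predicates(tree, path=()):
    """Return one (target, required condition values) pair per leaf."""
    if isinstance(tree, Leaf):
        return [(tree.target, path)]
    return (path_predicates(tree.if_true, path + ((tree.cond, True),))
            + path_predicates(tree.if_false, path + ((tree.cond, False),)))


def select_target(tree, bits):
    """Pick the single target whose path predicate matches the condition bits."""
    matches = [target for target, path in path_predicates(tree)
               if all(bits[i] == value for i, value in path)]
    assert len(matches) == 1  # paths of a binary tree are mutually exclusive
    return matches[0]


# An illustrative four-way branch over three condition bits (assumed shape).
tree = Node(0, Leaf("T0"),
            Node(1, Leaf("T1"),
                 Node(2, Leaf("T2"), Leaf("T3"))))
```

With this tree, `select_target(tree, [False, True, False])` takes the false edge of condition 0 and the true edge of condition 1, so it selects `T1`; condition 2 is never consulted on that path.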

Published In

IEEE Transactions on Parallel and Distributed Systems, Volume 6, Issue 8
August 1995
129 pages

Publisher

IEEE Press

Author Tags

  1. Instruction-level parallelism
  2. VLIW compiler
  3. VLIW microprocessor
  4. condition tree
  5. generalized multiway branching
  6. mirror normalization
  7. superscalar microprocessor
