Generalized Multiway Branch Unit for VLIW Microprocessors

Published: 01 August 1995


VLIW processors use multiway branch instructions to achieve high-speed, parallel evaluation of control structures. This paper introduces a new multiway branch mechanism that allows constant-time branch-target resolution based on an arbitrary condition tree. The unique feature of this mechanism is its target selection unit, which yields a branch target from a set of condition bit values and a condition tree description. A representation of condition trees that results in a compact target selection unit is described, and the logic diagram of a target selection unit that provides four-way branching is shown. Our experimental results on nontrivial integer benchmarks indicate that the proposed multiway branch unit can improve the performance of VLIW machines substantially (by as much as 35% in geometric mean) compared with conventional two-way branching.
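
To make the idea of target selection concrete, here is a minimal software sketch of choosing one branch target from condition bits and a condition tree. It is an illustration under stated assumptions, not the paper's encoding or circuit. The names `Leaf`, `Test`, `leaf_patterns`, and `select_target` are hypothetical, as is the (mask, value) form used for each leaf. Each leaf's path through the tree is flattened into a predicate over the condition bits, so every leaf can be checked independently of the others. This is roughly how parallel hardware could resolve the target without walking the tree node by node.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    target: str  # hypothetical branch-target label


@dataclass(frozen=True)
class Test:
    bit: int  # index of the condition bit tested at this node
    if_true: "Node"
    if_false: "Node"


Node = Union[Leaf, Test]
Pattern = Tuple[int, int, str]  # (mask, value, target); assumed representation


def leaf_patterns(node: Node, mask: int = 0, value: int = 0) -> List[Pattern]:
    """Flatten a condition tree into one (mask, value, target) triple per leaf.

    mask marks the condition bits tested on the path to the leaf;
    value holds the outcome each of those bits must have.
    """
    if isinstance(node, Leaf):
        return [(mask, value, node.target)]
    bit = 1 << node.bit
    return (leaf_patterns(node.if_true, mask | bit, value | bit)
            + leaf_patterns(node.if_false, mask | bit, value))


def select_target(patterns: List[Pattern], cond_bits: int) -> str:
    """Return the target of the single leaf whose path predicate holds."""
    matches = [t for mask, value, t in patterns if (cond_bits & mask) == value]
    assert len(matches) == 1, "a well-formed tree selects exactly one leaf"
    return matches[0]


# Four-way branch over three condition bits c0, c1, c2 (bit 0 is c0):
# if c0: (T1 if c1 else T2) else: (T3 if c2 else T4)
tree = Test(0,
            Test(1, Leaf("T1"), Leaf("T2")),
            Test(2, Leaf("T3"), Leaf("T4")))
patterns = leaf_patterns(tree)
```

With this tree, `select_target(patterns, 0b001)` (c0 set, c1 clear) picks `T2`, and `select_target(patterns, 0b100)` (c0 clear, c2 set) picks `T3`. The per-leaf checks do not depend on one another, which is the property that permits constant-time selection in this sketch.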


Published In

IEEE Transactions on Parallel and Distributed Systems, Volume 6, Issue 8
August 1995
129 pages


IEEE Press

Author Tags

  1. Instruction-level parallelism
  2. VLIW compiler
  3. VLIW microprocessor
  4. condition tree
  5. generalized multiway branching
  6. mirror normalization
  7. superscalar microprocessor


  • Research-article

