Destination-based congestion awareness for adaptive routing in 2D mesh networks

Published: 25 October 2013

Abstract

The choice of routing algorithm plays a vital role in the performance of on-chip interconnection networks. Adaptive routing is appealing because it offers better latency and throughput than oblivious routing, especially under nonuniform and bursty traffic. The performance of an adaptive routing algorithm is determined by its ability to accurately estimate congestion in the network. In this regard, maintaining global congestion state using a separate monitoring network offers better congestion visibility into distant parts of the network compared to solutions relying only on local congestion. However, the main challenge in designing such routing schemes is to keep the logic and bandwidth overhead as low as possible to fit into the tight power, area, and delay budgets of on-chip routers. In this article, we propose a minimal destination-based adaptive routing strategy (DAR), where every node estimates the delay to every other node in the network, and routing decisions are based on these per-destination delay estimates. DAR outperforms Regional Congestion Awareness (RCA), the best previously known adaptive routing algorithm that uses nonlocal congestion state. The performance improvement is brought about by maintaining fine-grained per-destination delay estimates in DAR that are more accurate than regional congestion metrics measured in RCA. The increased accuracy is a consequence of the fact that the per-destination delay estimates are not corrupted by congestion on links outside the admissible routing paths to the destination. A scalable version of DAR, referred to as SDAR, is also proposed for minimizing the overheads associated with DAR in large network topologies. We show that DAR outperforms local adaptive routing by up to 79% and RCA by up to 58% in terms of latency on SPLASH-2 benchmarks. 
DAR and SDAR also outperform existing adaptive and oblivious routing algorithms in latency and throughput under synthetic traffic patterns on 8×8 and 16×16 mesh topologies, respectively.
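
The abstract describes minimal routing driven by per-destination delay estimates but gives no implementation details. The following is only a minimal sketch of that idea, not the authors' design: the port names, the `delay_estimate` table layout, and the helper functions are all hypothetical, and the sketch does not address how estimates are collected or how deadlock freedom is ensured.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]  # (x, y) coordinates of a router in the 2D mesh


def admissible_ports(cur: Node, dst: Node) -> List[str]:
    """Return the output ports that lie on a minimal path from cur to dst.

    In a 2D mesh there are at most two such ports: one per dimension
    in which the current node still differs from the destination.
    """
    ports = []
    if dst[0] > cur[0]:
        ports.append("E")
    elif dst[0] < cur[0]:
        ports.append("W")
    if dst[1] > cur[1]:
        ports.append("N")
    elif dst[1] < cur[1]:
        ports.append("S")
    return ports


def select_port(cur: Node, dst: Node,
                delay_estimate: Dict[Tuple[Node, str], float]) -> str:
    """Pick the admissible port with the lowest estimated delay to dst.

    delay_estimate is a hypothetical per-node table keyed by
    (destination, output port); missing entries are treated as 0.0.
    """
    ports = admissible_ports(cur, dst)
    if not ports:
        return "LOCAL"  # packet has arrived; eject to the local node
    return min(ports, key=lambda p: delay_estimate.get((dst, p), 0.0))


# Example: from (1, 1) to (3, 4), both "E" and "N" are minimal choices;
# the port with the smaller per-destination estimate is selected.
estimates = {((3, 4), "E"): 12.0, ((3, 4), "N"): 7.5}
chosen = select_port((1, 1), (3, 4), estimates)
```

Because the table is keyed by destination, congestion on links that cannot appear on any minimal path to that destination never enters the comparison, which is the property the abstract credits for the improved accuracy over regional metrics.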

    Published In

    ACM Transactions on Design Automation of Electronic Systems, Volume 18, Issue 4
    Special Section on Networks on Chip: Architecture, Tools, and Methodologies
    October 2013
    380 pages
    ISSN: 1084-4309
    EISSN: 1557-7309
    DOI: 10.1145/2541012
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 25 October 2013
    Accepted: 01 April 2013
    Revised: 01 August 2012
    Received: 01 February 2012
    Published in TODAES Volume 18, Issue 4

    Author Tags

    1. On-chip networks
    2. adaptive routing

    Qualifiers

    • Research-article
    • Research
    • Refereed
