DOI: 10.1145/782814.782855
Article

High performance RDMA-based MPI implementation over InfiniBand

Published: 23 June 2003

Abstract

Although the InfiniBand Architecture is relatively new in the high performance computing area, it offers many features that help improve the performance of communication subsystems. One of these features is Remote Direct Memory Access (RDMA) operations. In this paper, we propose a new design of MPI over InfiniBand which brings the benefit of RDMA to not only large messages, but also small and control messages. We also achieve better scalability by exploiting application communication patterns and combining send/receive operations with RDMA operations. Our RDMA-based MPI implementation currently delivers a latency of 6.8 microseconds for small messages and a peak bandwidth of 871 Million Bytes (831 Mega Bytes) per second. Performance evaluation at the MPI level shows that for small messages, our RDMA-based design can reduce the latency by 24%, increase the bandwidth by over 104%, and reduce the host overhead by up to 22%. For large messages, we improve performance by reducing the time for transferring control messages. We have also shown that our new design is beneficial to MPI collective communication and the NAS Parallel Benchmarks.
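The abstract only summarizes the design and the paper's implementation is not reproduced on this page. As a rough illustration of the primitive an RDMA-based eager path of this kind relies on, the sketch below posts a small message into a pre-registered, persistent remote buffer with a single RDMA write and lets the receiver detect arrival by polling a flag byte in host memory. It uses the modern libibverbs API rather than the Mellanox VAPI interface available in 2003, and every name in it (eager_slot, post_eager_rdma_write, the slot layout, the flag-byte convention) is a hypothetical stand-in for illustration, not the paper's actual code.

```c
#include <infiniband/verbs.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of one small-message ("eager") slot inside a
 * persistent, pre-registered receive buffer.  The flag byte sits at the
 * end of the region actually transferred, so the receiver can poll it to
 * detect that the payload has landed. */
#define EAGER_PAYLOAD_MAX 1024

struct eager_slot {
    uint32_t length;                     /* valid bytes in payload        */
    uint8_t  payload[EAGER_PAYLOAD_MAX]; /* message data                  */
    uint8_t  arrival_flag;               /* polled by the receiver        */
};

/* Sender side: copy the message into a registered local slot and push it
 * to the peer's slot with one RDMA write.  'lkey' and 'rkey' come from
 * ibv_reg_mr() on the local and remote buffers; 'remote_addr' is the
 * remote slot's address, exchanged out of band at connection setup. */
static int post_eager_rdma_write(struct ibv_qp *qp,
                                 struct eager_slot *local_slot,
                                 uint32_t lkey,
                                 uint64_t remote_addr,
                                 uint32_t rkey,
                                 const void *msg, uint32_t len)
{
    struct ibv_sge sge;
    struct ibv_send_wr wr, *bad_wr = NULL;

    if (len > EAGER_PAYLOAD_MAX)
        return -1;                       /* larger messages take another path */

    memcpy(local_slot->payload, msg, len);
    local_slot->length = len;
    local_slot->arrival_flag = 1;

    memset(&sge, 0, sizeof(sge));
    sge.addr   = (uintptr_t)local_slot;
    /* Transfer only up to and including the flag, so the flag is the last
     * byte of the RDMA write. */
    sge.length = offsetof(struct eager_slot, arrival_flag) + 1;
    sge.lkey   = lkey;

    memset(&wr, 0, sizeof(wr));
    wr.wr_id               = (uintptr_t)local_slot;
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.opcode              = IBV_WR_RDMA_WRITE;  /* no receive WQE, no receiver CPU */
    wr.send_flags          = IBV_SEND_SIGNALED;
    wr.wr.rdma.remote_addr = remote_addr;
    wr.wr.rdma.rkey        = rkey;

    return ibv_post_send(qp, &wr, &bad_wr);
}

/* Receiver side: an incoming RDMA write generates no completion-queue
 * entry, so arrival is detected by polling the flag in memory. */
static int eager_slot_ready(volatile struct eager_slot *slot)
{
    return slot->arrival_flag != 0;
}
```

This sketch assumes the flag byte, written at the highest address of the RDMA write, becomes visible after the payload, which is what persistent-buffer polling schemes rely on. The paper's actual design is considerably richer (management of the persistent buffers, a hybrid of RDMA and send/receive channels for scalability, and faster control messages for the large-message rendezvous path), none of which is shown here.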

References

[1] F. J. Alfaro, J. L. Sanchez, J. Duato, and C. R. Das. A Strategy to Compute the InfiniBand Arbitration Tables. In Int'l Parallel and Distributed Processing Symposium (IPDPS'02), April 2002.
[2] M. Banikazemi, R. K. Govindaraju, R. Blackmore, and D. K. Panda. MPI-LAPI: An Efficient Implementation of MPI for IBM RS/6000 SP Systems. IEEE Transactions on Parallel and Distributed Systems, pages 1081--1093, October 2001.
[3] E. V. Carrera, S. Rao, L. Iftode, and R. Bianchini. User-level communication in cluster-based servers. In Proceedings of the Eighth Symposium on High-Performance Computer Architecture (HPCA'02), February 2002.
[4] D. E. Culler, R. M. Karp, D. A. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken. LogP: Towards a realistic model of parallel computation. In Principles and Practice of Parallel Programming, pages 1--12, 1993.
[5] R. Dimitrov and A. Skjellum. An Efficient MPI Implementation for Virtual Interface (VI) Architecture-Enabled Cluster Computing. http://www.mpi-softtech.com/publications/, 1998.
[6] D. Dunning, G. Regnier, G. McAlpine, D. Cameron, B. Shubert, F. Berry, A. Merritt, E. Gronke, and C. Dodd. The Virtual Interface Architecture. IEEE Micro, pages 66--76, March/April 1998.
[7] W. Gropp, E. Lusk, N. Doss, and A. Skjellum. A high-performance, portable implementation of the MPI message passing interface standard. Parallel Computing, 22(6):789--828, 1996.
[8] H. Tezuka, F. O'Carroll, A. Hori, and Y. Ishikawa. Pin-down Cache: A Virtual Memory Management Technique for Zero-copy Communication. In Proceedings of the 12th International Parallel Processing Symposium, 1998.
[9] P. Husbands and J. C. Hoe. MPI-StarT: Delivering Network Performance to Numerical Applications. In Proceedings of Supercomputing, 1998.
[10] InfiniBand Trade Association. InfiniBand Architecture Specification, Release 1.0, October 24, 2000.
[11] Lawrence Livermore National Laboratory. MVICH: MPI for Virtual Interface Architecture, August 2001.
[12] J. Liu, J. Wu, S. P. Kini, D. Buntinas, W. Yu, B. Chandrasekaran, R. Noronha, P. Wyckoff, and D. K. Panda. MPI over InfiniBand: Early Experiences. Technical Report OSU-CISRC-10/02-TR25, Computer and Information Science, The Ohio State University, January 2003.
[13] K. Magoutis, S. Addetia, A. Fedorova, M. Seltzer, J. Chase, A. Gallatin, R. Kisley, R. Wickremesinghe, and E. Gabber. Structure and performance of the Direct Access File System. In Proceedings of the USENIX 2002 Annual Technical Conference, Monterey, CA, pages 1--14, June 2002.
[14] R. Martin, A. Vahdat, D. Culler, and T. Anderson. Effects of Communication Latency, Overhead, and Bandwidth in a Cluster Architecture. In Proceedings of the International Symposium on Computer Architecture, 1997.
[15] Mellanox Technologies. Mellanox InfiniBand InfiniHost Adapters, July 2002.
[16] NASA. NAS Parallel Benchmarks.
[17] Pallas. Pallas MPI Benchmarks. http://www.pallas.com/e/products/pmb/.
[18] R. Gupta, P. Balaji, D. K. Panda, and J. Nieplocha. Efficient Collective Operations Using Remote Memory Operations on VIA-Based Clusters. In Int'l Parallel and Distributed Processing Symposium (IPDPS'03), April 2003.
[19] S. J. Sistare and C. J. Jackson. Ultra-High Performance Communication with MPI and the Sun Fire Link Interconnect. In Proceedings of Supercomputing, 2002.
[20] J. S. Vetter and F. Mueller. Communication Characteristics of Large-Scale Scientific Applications for Contemporary Cluster Architectures. In Int'l Parallel and Distributed Processing Symposium (IPDPS'02), April 2002.
[21] J. Wu, J. Liu, P. Wyckoff, and D. K. Panda. Impact of On-Demand Connection Management in MPI over VIA. In Proceedings of the IEEE International Conference on Cluster Computing, 2002.
[22] Y. Zhou, A. Bilas, S. Jagannathan, C. Dubnicki, J. F. Philbin, and K. Li. Experiences with VI communication for database storage. In Proceedings of the International Symposium on Computer Architecture (ISCA'02), 2002.

    Published In

    ICS '03: Proceedings of the 17th annual international conference on Supercomputing
    June 2003
    380 pages
    ISBN: 1581137338
    DOI: 10.1145/782814

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 June 2003


    Author Tags

    1. InfiniBand
    2. MPI
    3. cluster computing
    4. high performance computing

    Qualifiers

    • Article

    Conference

    ICS03: International Conference on Supercomputing 2003
    June 23 - 26, 2003
    San Francisco, CA, USA

    Acceptance Rates

    ICS '03 Paper Acceptance Rate: 36 of 171 submissions, 21%
    Overall Acceptance Rate: 629 of 2,180 submissions, 29%
