Integrated network interfaces for high-bandwidth TCP/IP

Published: 20 October 2006

Abstract

This paper proposes new network interface controller (NIC) designs that take advantage of integration with the host CPU to provide increased flexibility for operating system kernel-based performance optimization. We believe that this approach is more likely to meet the needs of current and future high-bandwidth TCP/IP networking on end hosts than the current trend of putting more complexity in the NIC, and that it does so without requiring changes to applications or protocols. This paper presents two such NICs. The first, the simple integrated NIC (SINIC), is a minimally complex design that moves the responsibility for managing the network FIFOs from the NIC to the kernel. Despite this closer interaction between the kernel and the NIC, SINIC provides performance equivalent to a conventional DMA-based NIC without increasing CPU overhead. The second design, V-SINIC, adds virtual per-packet registers to SINIC, enabling parallel packet processing while maintaining a FIFO model. V-SINIC allows the kernel to decouple examining a packet's header from copying its payload to memory. We exploit this capability to implement a true zero-copy receive optimization in the Linux 2.6 kernel, providing bandwidth improvements of over 50% on unmodified sockets-based receive-intensive benchmarks.
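The decoupling described in the abstract can be illustrated with a toy model. The sketch below is not the paper's hardware interface or its Linux driver: the class `ToyVSinic`, its methods, and the 4-byte header layout (destination port, payload length) are all hypothetical names and assumptions chosen for illustration. It shows only the ordering idea, namely that the kernel reads a packet's header first, uses it to select the destination buffer, and then copies the payload once, directly into that buffer. It does not model parallel packet processing or real DMA.

```python
from collections import deque
import struct

# Assumed toy header: 2-byte destination port, 2-byte payload length.
HEADER = struct.Struct("!HH")


class ToyVSinic:
    """Hypothetical FIFO NIC model that exposes each packet separately."""

    def __init__(self):
        self.fifo = deque()  # packets held by the NIC, in arrival order
        self.copies = 0      # number of payload copies performed

    def arrive(self, port, payload):
        """Simulate a packet arriving from the wire."""
        self.fifo.append(HEADER.pack(port, len(payload)) + payload)

    def next_packet(self):
        """Hand the kernel the oldest packet, preserving FIFO order."""
        return self.fifo.popleft()

    def read_header(self, pkt):
        """Step 1: examine only the header; the payload is not copied."""
        return HEADER.unpack_from(pkt)

    def copy_payload(self, pkt, dest):
        """Step 2: copy the payload once, into the buffer chosen in step 1."""
        dest.extend(pkt[HEADER.size:])
        self.copies += 1


def receive_all(nic, user_buffers):
    """Kernel-side loop: choose the destination from the header, then copy."""
    while nic.fifo:
        pkt = nic.next_packet()
        port, length = nic.read_header(pkt)
        assert length == len(pkt) - HEADER.size
        nic.copy_payload(pkt, user_buffers[port])


nic = ToyVSinic()
nic.arrive(80, b"GET /")
nic.arrive(22, b"ssh")
nic.arrive(80, b" HTTP/1.1")
buffers = {80: bytearray(), 22: bytearray()}
receive_all(nic, buffers)
```

In this model each payload is copied exactly once, from the NIC into the receiving application's buffer, with no intermediate kernel buffer. That single copy is the property a zero-copy receive path aims for.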

Published In

ACM SIGOPS Operating Systems Review, Volume 40, Issue 5
Proceedings of the 2006 ASPLOS Conference
December 2006
425 pages
ISSN: 0163-5980
DOI: 10.1145/1168917

ASPLOS XII: Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems
October 2006
440 pages
ISBN: 1595934510
DOI: 10.1145/1168857

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. TCP/IP performance
  2. interfaces
  3. network
  4. zero-copy

Qualifiers

  • Article
