Guay et al., 2011 - Google Patents

A scalable method for signalling dynamic reconfiguration events with opensm

Guay et al., 2011

Document ID
3911354495632008386
Author
Guay W
Reinemo S
Publication year
Publication venue
2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

External Links

Snippet

Rerouting around faulty components, on-the-fly policy changes, and migration of jobs all require reconfiguration of data structures in the Queue Pairs residing in the hosts on an InfiniBand cluster. In addition to a proper implementation at the host, the subnet manager …
Continue reading at ieeexplore.ieee.org (other versions)

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance or administration or management of packet switching networks
    • H04L41/06Arrangements for maintenance or administration or management of packet switching networks involving management of faults or events or alarms
    • H04L41/0654Network fault recovery
    • H04L41/0659Network fault recovery by isolating the faulty entity
    • H04L41/0663Network fault recovery by isolating the faulty entity involving offline failover planning
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/02Details
    • H04L12/26Monitoring arrangements; Testing arrangements
    • H04L12/2602Monitoring arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing packet switching networks
    • H04L43/08Monitoring based on specific metrics
    • H04L43/0805Availability
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network-specific arrangements or communication protocols supporting networked applications
    • H04L67/10Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network
    • H04L67/1002Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers, e.g. load balancing
    • H04L67/1004Server selection in load balancing
    • H04L67/1023Server selection in load balancing based on other criteria, e.g. hash applied to IP address, specific algorithms or cost
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance or administration or management of packet switching networks
    • H04L41/12Arrangements for maintenance or administration or management of packet switching networks network topology discovery or management
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/02Topology update or discovery
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Application independent communication protocol aspects or techniques in packet data networks
    • H04L69/40Techniques for recovering from a failure of a protocol instance or entity, e.g. failover routines, service redundancy protocols, protocol state redundancy or protocol service redirection in case of a failure or disaster recovery
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing packet switching networks
    • H04L43/10Arrangements for monitoring or testing packet switching networks using active monitoring, e.g. heartbeat protocols, polling, ping, trace-route
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/22Alternate routing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance or administration or management of packet switching networks
    • H04L41/02Arrangements for maintenance or administration or management of packet switching networks involving integration or standardization
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/58Association of routers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance or administration or management of packet switching networks
    • H04L41/22Arrangements for maintenance or administration or management of packet switching networks using GUI [Graphical User Interface]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/60Hybrid or multiprotocol packet, ATM or frame switches
    • H04L49/602Multilayer or multiprotocol switching, e.g. IP switching
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/25Routing or path finding through a switch fabric
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/35Application specific switches

Similar Documents

Publication Publication Date Title
Liu et al. F10: A {Fault-Tolerant} engineered network
CN107360092B (en) System and method for balancing load in data network
Liu et al. Data center networks: Topologies, architectures and fault-tolerance characteristics
US8879396B2 (en) System and method for using dynamic allocation of virtual lanes to alleviate congestion in a fat-tree topology
JP6835444B2 (en) Software-defined data center and service cluster scheduling method and traffic monitoring method for that purpose
Koerner et al. Multiple service load-balancing with OpenFlow
JP6169251B2 (en) Asymmetric packet flow in distributed load balancers
EP2891286B1 (en) System and method for supporting discovery and routing degraded fat-trees in a middleware machine environment
Sidki et al. Fault tolerant mechanisms for SDN controllers
Xie et al. An incrementally scalable and cost-efficient interconnection structure for data centers
US8880932B2 (en) System and method for signaling dynamic reconfiguration events in a middleware machine environment
JP6604336B2 (en) Information processing apparatus, information processing method, and program
Lin et al. Fast, scalable and robust centralized routing for data center networks
Bermudez et al. On the infiniband subnet discovery process
Gray et al. A priori state synchronization for fast failover of stateful firewall VNFs
Guay et al. dFtree: a fat-tree routing algorithm using dynamic allocation of virtual lanes to alleviate congestion in infiniband networks
JP2006148376A (en) Network monitoring system, network superordinate monitoring system, network subordinate monitoring system, and network monitoring method
US11853254B1 (en) Methods, systems, and computer readable media for exposing data processing unit (DPU) traffic in a smartswitch
US12063140B2 (en) Methods, systems, and computer readable media for test system agent deployment in a smartswitch computing environment
Verdi et al. InFaRR: In-Network Fast ReRouting
Guay et al. A scalable method for signalling dynamic reconfiguration events with opensm
Guay et al. Host side dynamic reconfiguration with infiniband
Reinemo et al. Multi-homed fat-tree routing with InfiniBand
JP2023529462A (en) High Availability Cluster Leader Election in Distributed Routing Systems
Mustafa et al. Genix: A geni-based ixp emulation