CoDel (Controlled Delay; pronounced "coddle") is an active queue management (AQM) algorithm in network routing, developed by Van Jacobson and Kathleen Nichols and published as RFC8289. [1] It is designed to overcome bufferbloat in networking hardware, such as routers, by setting limits on the delay network packets experience as they pass through buffers in this equipment. CoDel aims to improve on the overall performance of the random early detection (RED) algorithm by addressing some of its fundamental misconceptions, as perceived by Jacobson, and by being easier to manage.
In 2012, an implementation of CoDel was written by Dave Täht and Eric Dumazet for the Linux kernel and dual licensed under the GNU General Public License and the 3-clause BSD license. Dumazet's improvement on CoDel is called FQ-CoDel, standing for "Fair/Flow Queue CoDel"; it was first adopted as the standard AQM and packet scheduling solution in 2014 in the OpenWrt 14.07 release called "Barrier Breaker". From there, CoDel and FQ-CoDel have migrated into various downstream projects such as Tomato, dd-wrt, OPNsense and Ubiquiti's "Smart Queues" feature.
CoDel is based on observations of packet behavior in packet-switched networks under the influence of data buffers. Some of these observations are about the fundamental nature of queueing and the causes of bufferbloat, others relate to weaknesses of alternative queue management algorithms. CoDel was developed as an attempt to address the problem of bufferbloat. [2]
The flow of packets slows down while traveling through a network link between a fast and a slow network, especially at the start of a TCP session, when there is a sudden burst of packets and the slower network may not be able to accept the burst quickly enough. Buffers exist to ease this problem by giving the fast network a place to store packets to be read by the slower network at its own pace. [3] In other words, buffers act like shock absorbers to convert bursty arrivals into smooth, steady departures. However, a buffer has limited capacity. The ideal buffer is sized so it can handle a sudden burst of communication and match the speed of that burst to the speed of the slower network. Ideally, the shock-absorbing situation is characterized by a temporary delay for packets in the buffer during the transmission burst, after which the delay rapidly disappears and the network reaches a balance in offering and handling packets. [3]
The TCP congestion control algorithm relies on packet drops to determine the available bandwidth between two communicating devices. It speeds up the data transfer until packets start to drop, and then slows down the transmission rate. Ideally, it keeps speeding up and slowing down as it finds equilibrium at the speed of the link. For this to work, the packet drops must occur in a timely manner so that the algorithm can responsively select a suitable transfer speed. With packets held in an overly-large buffer, the packets will arrive at their destination but with a higher latency but no packets are dropped so TCP does not slow down. Under these conditions, TCP may even decide that the path of the connection has changed and repeat the search for a new equilibrium. [4] [5]
Having a big and constantly full buffer that causes increased transmission delays and reduced interactivity, especially when looking at two or more simultaneous transmissions over the same channel, is called bufferbloat. Available channel bandwidth can also end up being unused, as some fast destinations may not be reached due to buffers being clogged with data awaiting delivery to slow destinations.
CoDel distinguishes between two types of queue: [3] [6] A good queue is one that exhibits no bufferbloat. Communication bursts cause no more than a temporary increase in queue delay. The network link utilization is maximized. A bad queue exhibits bufferbloat. Communication bursts cause the buffer to fill up and stay filled, resulting in low utilization and a constantly high buffer delay. In order to be effective against bufferbloat, a solution in the form of an active queue management (AQM) algorithm must be able to recognize an occurrence of bufferbloat and react by deploying effective countermeasures.
Van Jacobson asserted in 2006 that existing algorithms have been using incorrect means of recognizing bufferbloat. [5] Algorithms like RED measure the average queue length and consider it a case of bufferbloat if the average grows too large. Jacobson demonstrated in 2006 that this measurement is not a good metric, as the average queue length rises sharply in the case of a communications burst. The queue can then dissipate quickly (good queue) or become a standing queue (bad queue). Other factors in network traffic can also cause false positives or negatives, causing countermeasures to be deployed unnecessarily. Jacobson suggested that average queue length actually contains no information at all about packet demand or network load. [3] [5] He suggested that a better metric might be the minimum queue length during a sliding time window. [3]
Based on Jacobson's notion from 2006, CoDel was developed to manage queues under control of the minimum delay experienced by packets in the running buffer window. The goal is to keep this minimum delay below 5 milliseconds. If the minimum delay rises to too high a value, packets are dropped from the queue until the delay drops below the maximum level. [3] Nichols and Jacobson cite several advantages to using nothing more than this metric: [3]
CoDel does nothing to manage the buffer if the minimum delay for the buffer window is below the maximum allowed value. It also does nothing if the buffer is relatively empty (if there are fewer than one MTU's worth of bytes in the buffer). [3] If these conditions do not hold, then CoDel drops packets probabilistically. [3]
The algorithm is independently computed at each network hop. The algorithm operates over an interval, initially 100 milliseconds. Per-packet queuing delay is monitored through the hop. As each packet is dequeued for forwarding, the queuing delay (amount of time the packet spent waiting in the queue) is calculated. The lowest queuing delay for the interval is stored. When the last packet of the interval is dequeued, if the lowest queuing delay for the interval is greater than 5 milliseconds, this single packet is dropped and the interval used for the next group of packets is shortened. If the lowest queuing delay for the interval is less than 5 milliseconds, the packet is forwarded and the interval is reset to 100 milliseconds.
When the interval is shortened, it is done so in accordance with the inverse square root of the number of successive intervals in which packets were dropped due to excessive queuing delay. The sequence of intervals is , , , , ...
CoDel has been tested in simulation tests by Nichols and Jacobson, at different MTUs and link rates and other variations of conditions. In general, results indicate: [3] [7]
Simulation was also performed by Greg White and Joey Padden at CableLabs. [8]
A full implementation of CoDel was realized in May 2012 and made available as open-source software. [3] It was implemented within the Linux kernel (starting with the 3.5 mainline). [9] Dave Täht back-ported CoDel to Linux kernel 3.3 for project CeroWrt, which concerns itself among other things with bufferbloat, [10] where it was exhaustively tested. CoDel began to appear as an option in some proprietary/turnkey bandwidth management platforms in 2013. [11] FreeBSD had CoDel integrated into the 11.x [12] and 10.x [13] code branches in 2016. [14] An implementation is distributed with OpenBSD since version 6.2. [15]
Fair/Flow Queue CoDel (FQ-CoDel; fq_codel in Linux code) adds flow queuing to CoDel so that it differentiates between multiple simultaneous connections and works fairly. It gives the first packet in each stream priority, so that small streams can start and finish quickly for better use of network resources. CoDel co-author Van Jacobson recommends the use of fq_codel over codel where it's available. [16] FQ-CoDel is published as RFC8290. It is written by T. Hoeiland-Joergensen, P. McKenney, D. Täht, J. Gettys, and E. Dumazet, all members of the "bufferbloat project". [17]
Common Applications Kept Enhanced (CAKE; sch_cake in Linux code) is a combined traffic shaper and AQM algorithm presented by the bufferbloat project in 2018. It builds on the experience of using fq_codel with the HTB (Hierarchy Token Bucket) traffic shaper. It improves over the Linux htb+fq_codel implementation by reducing hash collisions between flows, reducing CPU utilization in traffic shaping, and in a few other ways. [18]
In 2022, Dave Täht reviewed the state of fq_codel and sch_cake implementations in the wild. He found that while many systems have switched to either as the default AQM, several implementations have dubious deviations from the standard. For example, Apple's implementation of fq_codel (default in iOS) has a very large number of users but no "codel" component. Täht also notes the general lack of hardware offloading, made more important by the increase in network traffic brought by the COVID-19 pandemic. [19]
Explicit Congestion Notification (ECN) is an extension to the Internet Protocol and to the Transmission Control Protocol and is defined in RFC 3168 (2001). ECN allows end-to-end notification of network congestion without dropping packets. ECN is an optional feature that may be used between two ECN-enabled endpoints when the underlying network infrastructure also supports it.
Network congestion in data networking and queueing theory is the reduced quality of service that occurs when a network node or link is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of congestion is that an incremental increase in offered load leads either only to a small increase or even a decrease in network throughput.
FAST TCP is a TCP congestion avoidance algorithm especially targeted at long-distance, high latency links, developed at the Netlab, California Institute of Technology and now being commercialized by FastSoft. FastSoft was acquired by Akamai Technologies in 2012.
The leaky bucket is an algorithm based on an analogy of how a bucket with a constant leak will overflow if either the average rate at which water is poured in exceeds the rate at which the bucket leaks or if more water than the capacity of the bucket is poured in all at once. It can be used to determine whether some sequence of discrete events conforms to defined limits on their average and peak rates or frequencies, e.g. to limit the actions associated with these events to these rates or delay them until they do conform to the rates. It may also be used to check conformance or limit to an average rate alone, i.e. remove any variation from the average.
The token bucket is an algorithm used in packet-switched and telecommunications networks. It can be used to check that data transmissions, in the form of packets, conform to defined limits on bandwidth and burstiness. It can also be used as a scheduling algorithm to determine the timing of transmissions that will comply with the limits set for the bandwidth and burstiness: see network scheduler.
Random early detection (RED), also known as random early discard or random early drop, is a queuing discipline for a network scheduler suited for congestion avoidance.
ALTQ is the network scheduler for Berkeley Software Distribution. ALTQ provides queueing disciplines, and other components related to quality of service (QoS), required to realize resource sharing. It is most commonly implemented on BSD-based routers. ALTQ is included in the base distribution of FreeBSD, NetBSD, and DragonFly BSD, and was integrated into the pf packet filter of OpenBSD but later replaced by a new queueing subsystem.
Transmission Control Protocol (TCP) uses a congestion control algorithm that includes various aspects of an additive increase/multiplicative decrease (AIMD) scheme, along with other schemes including slow start and a congestion window (CWND), to achieve congestion avoidance. The TCP congestion-avoidance algorithm is the primary basis for congestion control in the Internet. Per the end-to-end principle, congestion control is largely a function of internet hosts, not the network itself. There are several variations and versions of the algorithm implemented in protocol stacks of operating systems of computers that connect to the Internet.
Nagle's algorithm is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. It was defined by John Nagle while working for Ford Aerospace. It was published in 1984 as a Request for Comments (RFC) with title Congestion Control in IP/TCP Internetworks in RFC 896.
TCP Vegas is a TCP congestion avoidance algorithm that emphasizes packet delay, rather than packet loss, as a signal to help determine the rate at which to send packets. It was developed at the University of Arizona by Lawrence Brakmo and Larry L. Peterson and introduced in 1994.
TCP tuning techniques adjust the network congestion avoidance parameters of Transmission Control Protocol (TCP) connections over high-bandwidth, high-latency networks. Well-tuned networks can perform up to 10 times faster in some cases. However, blindly following instructions without understanding their real consequences can hurt performance as well.
The Media Delivery Index (MDI) is a set of measures that can be used to monitor both the quality of a delivered video stream as well as to show system margin for IPTV systems by providing an accurate measurement of jitter and delay at network level (Internet Protocol, IP), which are the main causes for quality loss. Identifying and quantizing such problems in this kind of networks is key to maintaining high quality video delivery and providing indications that warn system operators with enough advance notice to allow corrective action.
Bandwidth management is the process of measuring and controlling the communications on a network link, to avoid filling the link to capacity or overfilling the link, which would result in network congestion and poor performance of the network. Bandwidth is described by bit rate and measured in units of bits per second (bit/s) or bytes per second (B/s).
In routers and switches, active queue management (AQM) is the policy of dropping packets inside a buffer associated with a network interface controller (NIC) before that buffer becomes full, often with the goal of reducing network congestion or improving end-to-end latency. This task is performed by the network scheduler, which for this purpose uses various algorithms such as random early detection (RED), Explicit Congestion Notification (ECN), or controlled delay (CoDel). RFC 7567 recommends active queue management as a best practice.
The TCP window scale option is an option to increase the receive window size allowed in Transmission Control Protocol above its former maximum value of 65,535 bytes. This TCP option, along with several others, is defined in RFC 7323 which deals with long fat networks (LFNs).
Blue is a scheduling discipline for the network scheduler developed by graduate student Wu-chang Feng for Professor Kang G. Shin at the University of Michigan and others at the Thomas J. Watson Research Center of IBM in 1999.
Bufferbloat is a cause of high latency and jitter in packet-switched networks caused by excess buffering of packets. Bufferbloat can also cause packet delay variation, as well as reduce the overall network throughput. When a router or switch is configured to use excessively large buffers, even very high-speed networks can become practically unusable for many interactive applications like voice over IP (VoIP), audio streaming, online gaming, and even ordinary web browsing.
A network scheduler, also called packet scheduler, queueing discipline (qdisc) or queueing algorithm, is an arbiter on a node in a packet switching communication network. It manages the sequence of network packets in the transmit and receive queues of the protocol stack and network interface controller. There are several network schedulers available for the different operating systems, that implement many of the existing network scheduling algorithms.
Kathleen Nichols is an American computer scientist and computer networking expert. Nichols is the founder and CEO of Pollere, Inc, a network architecture and performance company based in California, US. Before founding Pollere, Nichols was VP of Network Science at Packet Design, where she was part of the founding team. Prior to Packet Design she was director of advanced Internet architectures in the Office of CTO at Cisco Systems.
Dave Täht is an American network engineer, musician, lecturer, asteroid exploration advocate, and Internet activist. He is the chief executive officer of TekLibre.