Linux Kernel TCP Congestion Control Algorithms

TCP congestion control :: https://en.wikipedia.org/wiki/TCP_congestion_control
Linux Kernel Source:
Linux Kernel - IPv4 Stack /net/ipv4 :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4
TCP CUBIC: Binary Increase Congestion control for TCP - /net/ipv4/tcp_cubic.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c
TCP-HYBLA Congestion control algorithm - /net/ipv4/tcp_hybla.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_hybla.c
* A TCP Enhancement for Heterogeneous Networks
TCP Illinois congestion control - /net/ipv4/tcp_illinois.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_illinois.c
TCP Westwood+: end-to-end bandwidth estimation for TCP - /net/ipv4/tcp_westwood.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_westwood.c
TCP Vegas congestion control - /net/ipv4/tcp_vegas.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_vegas.c
TCP Veno congestion control - /net/ipv4/tcp_veno.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_veno.c
H-TCP congestion control: TCP for high-speed and long-distance networks - /net/ipv4/tcp_htcp.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_htcp.c

Also refer to the difference in Kernel Architecture between a standard Linux system and Android OS, discussed in this video:
Kernel Architecture - Generic Linux System vs Android - The Linux Channel

CUBIC TCP - https://en.wikipedia.org/wiki/CUBIC_TCP
BIC TCP - https://en.wikipedia.org/wiki/BIC_TCP
LFN: long fat networks - https://en.wikipedia.org/wiki/Bandwidth-delay_product
Bandwidth-delay product - https://en.wikipedia.org/wiki/Bandwidth-delay_product
Linux Kernel Source:
TCP CUBIC: Binary Increase Congestion control for TCP v2.3 - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c
static struct tcp_congestion_ops cubictcp - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c#L457
static int __init cubictcp_register(void) - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c#L469
static void __exit cubictcp_unregister(void) - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c#L504
tcp_register_congestion_control(), tcp_unregister_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/include/net/tcp.h#L1029
Pluggable TCP congestion control support - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c
struct tcp_congestion_ops data-structure - http://elixir.free-electrons.com/linux/latest/source/include/net/tcp.h#L989

Here is the struct tcp_congestion_ops data-structure (/include/net/tcp.h) from the Kernel-source version 4.14 for quick reference:

struct tcp_congestion_ops {
	struct list_head	list;
	u32 key;
	u32 flags;

	/* initialize private data (optional) */
	void (*init)(struct sock *sk);
	/* cleanup private data  (optional) */
	void (*release)(struct sock *sk);

	/* return slow start threshold (required) */
	u32 (*ssthresh)(struct sock *sk);
	/* do new cwnd calculation (required) */
	void (*cong_avoid)(struct sock *sk, u32 ack, u32 acked);
	/* call before changing ca_state (optional) */
	void (*set_state)(struct sock *sk, u8 new_state);
	/* call when cwnd event occurs (optional) */
	void (*cwnd_event)(struct sock *sk, enum tcp_ca_event ev);
	/* call when ack arrives (optional) */
	void (*in_ack_event)(struct sock *sk, u32 flags);
	/* new value of cwnd after loss (required) */
	u32  (*undo_cwnd)(struct sock *sk);
	/* hook for packet ack accounting (optional) */
	void (*pkts_acked)(struct sock *sk, const struct ack_sample *sample);
	/* suggest number of segments for each skb to transmit (optional) */
	u32 (*tso_segs_goal)(struct sock *sk);
	/* returns the multiplier used in tcp_sndbuf_expand (optional) */
	u32 (*sndbuf_expand)(struct sock *sk);
	/* call when packets are delivered to update cwnd and pacing rate,
	 * after all the ca_state processing. (optional)
	 */
	void (*cong_control)(struct sock *sk, const struct rate_sample *rs);
	/* get info for inet_diag (optional) */
	size_t (*get_info)(struct sock *sk, u32 ext, int *attr,
			   union tcp_cc_info *info);

	char 		name[TCP_CA_NAME_MAX];
	struct module 	*owner;
};

Here are the tcp_register_congestion_control() and tcp_unregister_congestion_control() pluggable congestion control registration APIs (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

/*
 * Attach new congestion control algorithm to the list
 * of available options.
 */
int tcp_register_congestion_control(struct tcp_congestion_ops *ca)
{
	int ret = 0;

	/* all algorithms must implement these */
	if (!ca->ssthresh || !ca->undo_cwnd ||
	    !(ca->cong_avoid || ca->cong_control)) {
		pr_err("%s does not implement required ops\n", ca->name);
		return -EINVAL;
	}

	ca->key = jhash(ca->name, sizeof(ca->name), strlen(ca->name));

	spin_lock(&tcp_cong_list_lock);
	if (ca->key == TCP_CA_UNSPEC || tcp_ca_find_key(ca->key)) {
		pr_notice("%s already registered or non-unique key\n",
			  ca->name);
		ret = -EEXIST;
	} else {
		list_add_tail_rcu(&ca->list, &tcp_cong_list);
		pr_debug("%s registered\n", ca->name);
	}
	spin_unlock(&tcp_cong_list_lock);

	return ret;
}

/*
 * Remove congestion control algorithm, called from
 * the module's remove function.  Module ref counts are used
 * to ensure that this can't be done till all sockets using
 * that method are closed.
 */
void tcp_unregister_congestion_control(struct tcp_congestion_ops *ca)
{
	spin_lock(&tcp_cong_list_lock);
	list_del_rcu(&ca->list);
	spin_unlock(&tcp_cong_list_lock);

	/* Wait for outstanding readers to complete before the
	 * module gets removed entirely.
	 *
	 * A try_module_get() should fail by now as our module is
	 * in "going" state since no refs are held anymore and
	 * module_exit() handler being called.
	 */
	synchronize_rcu();
}
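
Here is a small user-space sketch of the required-ops check performed by tcp_register_congestion_control(), for quick experimentation outside the Kernel. Everything in it (struct cc_ops, cc_register(), the demo_* hooks, the fixed-size table) is hypothetical and only mimics the shape of the Kernel code: the real function compares a jhash key under a spinlock and keeps the entries on an RCU list, while this sketch simply compares names in a plain array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define CA_NAME_MAX 16   /* stands in for TCP_CA_NAME_MAX */
#define MAX_ALGOS   8    /* hypothetical table size for this sketch */

/* Hypothetical, trimmed-down ops table: only the hooks the check looks at. */
struct cc_ops {
	unsigned (*ssthresh)(void *sk);
	unsigned (*undo_cwnd)(void *sk);
	void (*cong_avoid)(void *sk, unsigned ack, unsigned acked);
	void (*cong_control)(void *sk);
	char name[CA_NAME_MAX];
};

static const struct cc_ops *cc_table[MAX_ALGOS];
static size_t cc_count;

/* Same rule as the Kernel check: ssthresh and undo_cwnd are mandatory,
 * plus at least one of cong_avoid / cong_control.
 */
int cc_register(const struct cc_ops *ca)
{
	size_t i;

	assert(ca != NULL);
	if (!ca->ssthresh || !ca->undo_cwnd ||
	    !(ca->cong_avoid || ca->cong_control))
		return -EINVAL;

	/* Simplification: compare names directly instead of a hash key. */
	for (i = 0; i < cc_count; i++)
		if (strcmp(cc_table[i]->name, ca->name) == 0)
			return -EEXIST;

	if (cc_count == MAX_ALGOS)
		return -ENOSPC;   /* sketch-only limit, not a Kernel error path */

	cc_table[cc_count++] = ca;
	return 0;
}

/* Dummy hooks so that a caller can build a valid table entry. */
unsigned demo_ssthresh(void *sk) { (void)sk; return 2; }
unsigned demo_undo_cwnd(void *sk) { (void)sk; return 10; }
void demo_cong_avoid(void *sk, unsigned ack, unsigned acked)
{
	(void)sk; (void)ack; (void)acked;
}
```

An entry that leaves undo_cwnd as NULL is rejected with -EINVAL, and registering the same name twice returns -EEXIST, which matches the two error paths in the Kernel function.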

Here are the struct tcp_congestion_ops cubictcp data-structure instance and the cubictcp_register(), cubictcp_unregister() APIs (/net/ipv4/tcp_cubic.c) from the Kernel-source version 4.14 for quick reference:

static struct tcp_congestion_ops cubictcp __read_mostly = {
	.init		= bictcp_init,
	.ssthresh	= bictcp_recalc_ssthresh,
	.cong_avoid	= bictcp_cong_avoid,
	.set_state	= bictcp_state,
	.undo_cwnd	= tcp_reno_undo_cwnd,
	.cwnd_event	= bictcp_cwnd_event,
	.pkts_acked     = bictcp_acked,
	.owner		= THIS_MODULE,
	.name		= "cubic",
};

static int __init cubictcp_register(void)
{
	BUILD_BUG_ON(sizeof(struct bictcp) > ICSK_CA_PRIV_SIZE);

	/* Precompute a bunch of the scaling factors that are used per-packet
	 * based on SRTT of 100ms
	 */

	beta_scale = 8*(BICTCP_BETA_SCALE+beta) / 3
		/ (BICTCP_BETA_SCALE - beta);

	cube_rtt_scale = (bic_scale * 10);	/* 1024*c/rtt */

	/* calculate the "K" for (wmax-cwnd) = c/rtt * K^3
	 *  so K = cubic_root( (wmax-cwnd)*rtt/c )
	 * the unit of K is bictcp_HZ=2^10, not HZ
	 *
	 *  c = bic_scale >> 10
	 *  rtt = 100ms
	 *
	 * the following code has been designed and tested for
	 * cwnd < 1 million packets
	 * RTT < 100 seconds
	 * HZ < 1,000,00  (corresponding to 10 nano-second)
	 */

	/* 1/c * 2^2*bictcp_HZ * srtt */
	cube_factor = 1ull << (10+3*BICTCP_HZ); /* 2^40 */

	/* divide by bic_scale and by constant Srtt (100ms) */
	do_div(cube_factor, bic_scale * 10);

	return tcp_register_congestion_control(&cubictcp);
}

static void __exit cubictcp_unregister(void)
{
	tcp_unregister_congestion_control(&cubictcp);
}

module_init(cubictcp_register);
module_exit(cubictcp_unregister);

MODULE_AUTHOR("Sangtae Ha, Stephen Hemminger");
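
Here is a user-space sketch that repeats the scaling-factor arithmetic of cubictcp_register(), so the precomputed values can be checked by hand. The function name cubic_precompute() and struct cubic_consts are made up for this sketch. The two macro values are assumptions taken from the top of tcp_cubic.c, which the excerpt above does not show, and plain 64-bit division stands in for do_div().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values of the macros defined near the top of tcp_cubic.c (4.14);
 * they do not appear in the excerpt above.
 */
#define BICTCP_BETA_SCALE 1024
#define BICTCP_HZ         10

struct cubic_consts {
	uint32_t beta_scale;
	uint32_t cube_rtt_scale;
	uint64_t cube_factor;
};

/* Same integer arithmetic as cubictcp_register(), with beta and bic_scale
 * passed in instead of being module parameters.
 */
struct cubic_consts cubic_precompute(int beta, int bic_scale)
{
	struct cubic_consts c;

	assert(beta > 0 && beta < BICTCP_BETA_SCALE && bic_scale > 0);

	c.beta_scale = 8 * (BICTCP_BETA_SCALE + beta) / 3
		/ (BICTCP_BETA_SCALE - beta);
	c.cube_rtt_scale = bic_scale * 10;              /* 1024*c/rtt */
	c.cube_factor = 1ull << (10 + 3 * BICTCP_HZ);   /* 2^40 */
	c.cube_factor /= (uint64_t)(bic_scale * 10);    /* do_div() in the Kernel */
	return c;
}
```

Assuming the module defaults beta = 717 and bic_scale = 41, cubic_precompute(717, 41) gives beta_scale = 15, cube_rtt_scale = 410 and cube_factor = 2681735677.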

Linux Kernel Source:
Main TCP congestion control support implementation file:
/net/ipv4/tcp_cong.c - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c
tcp_register_congestion_control(), tcp_unregister_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/include/net/tcp.h#L1029
tcp_ca_find() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L23
tcp_set_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L341
tcp_init_congestion_control(), tcp_reinit_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L179
tcp_get_allowed_congestion_control(), tcp_set_allowed_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L280
tcp_set_default_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L217
tcp_get_available_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L252
Data-structures: struct tcp_congestion_ops data-structure - http://elixir.free-electrons.com/linux/latest/source/include/net/tcp.h#L989
static LIST_HEAD(tcp_cong_list); - Linked List - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L20
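
From user space, the algorithm attached to a socket can be read with the TCP_CONGESTION socket option described in tcp(7); setting the same option is the path that reaches tcp_set_congestion_control(). Here is a minimal sketch of the read side. The helper name current_tcp_cc() is hypothetical, and it assumes a Linux system where a TCP socket can be created.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Copies the name of the congestion control algorithm used by a fresh
 * TCP socket into buf. Returns 0 on success, -1 on any failure.
 */
int current_tcp_cc(char *buf, size_t buflen)
{
	socklen_t len = (socklen_t)buflen;
	int fd, rc;

	assert(buf != NULL && buflen > 0);
	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(buf, 0, buflen);
	rc = getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len);
	close(fd);
	if (rc != 0)
		return -1;
	buf[buflen - 1] = '\0';   /* make sure the result is terminated */
	return 0;
}
```

A caller passes a buffer of at least 16 bytes (the size of TCP_CA_NAME_MAX) and, on success, gets back a name such as the system default.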

Here are the LIST_HEAD(tcp_cong_list) linked list and the tcp_ca_find() API (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

static DEFINE_SPINLOCK(tcp_cong_list_lock);
static LIST_HEAD(tcp_cong_list);

/* Simple linear search, don't expect many entries! */
static struct tcp_congestion_ops *tcp_ca_find(const char *name)
{
	struct tcp_congestion_ops *e;

	list_for_each_entry_rcu(e, &tcp_cong_list, list) {
		if (strcmp(e->name, name) == 0)
			return e;
	}

	return NULL;
}

Here are the tcp_init_congestion_control() and tcp_reinit_congestion_control() APIs (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

void tcp_init_congestion_control(struct sock *sk)
{
	const struct inet_connection_sock *icsk = inet_csk(sk);

	tcp_sk(sk)->prior_ssthresh = 0;
	if (icsk->icsk_ca_ops->init)
		icsk->icsk_ca_ops->init(sk);
	if (tcp_ca_needs_ecn(sk))
		INET_ECN_xmit(sk);
	else
		INET_ECN_dontxmit(sk);
}

static void tcp_reinit_congestion_control(struct sock *sk,
					  const struct tcp_congestion_ops *ca)
{
	struct inet_connection_sock *icsk = inet_csk(sk);

	tcp_cleanup_congestion_control(sk);
	icsk->icsk_ca_ops = ca;
	icsk->icsk_ca_setsockopt = 1;
	memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));

	if (sk->sk_state != TCP_CLOSE)
		tcp_init_congestion_control(sk);
}
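
Here is a user-space sketch of the idea behind icsk_ca_priv: each connection carries a fixed-size scratch area, every algorithm keeps its private state inside it, and switching algorithms wipes the area before the new init hook runs. All names (struct conn, conn_set_cc(), demo_state, CA_PRIV_SIZE and its value) are hypothetical. The sketch leaves out the cleanup of the previous algorithm and the ECN handling, and it copies the state in and out with memcpy() rather than casting the buffer as Kernel code does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CA_PRIV_SIZE 64   /* hypothetical; stands in for ICSK_CA_PRIV_SIZE */

struct conn;

struct cc_ops {
	const char *name;
	void (*init)(struct conn *c);   /* optional, may be NULL */
};

struct conn {
	const struct cc_ops *ops;
	unsigned char priv[CA_PRIV_SIZE];   /* per-connection scratch area */
};

/* Private state of a made-up algorithm, stored inside conn->priv. */
struct demo_state {
	uint32_t last_max_cwnd;
	uint32_t epoch_start;
};

/* Compile-time guard in the spirit of BUILD_BUG_ON(): the state must fit. */
_Static_assert(sizeof(struct demo_state) <= CA_PRIV_SIZE,
	       "demo_state does not fit in the private area");

static void demo_init(struct conn *c)
{
	struct demo_state st = { .last_max_cwnd = 10, .epoch_start = 0 };

	memcpy(c->priv, &st, sizeof(st));
}

/* Reads the private state back out of the scratch area. */
struct demo_state demo_get(const struct conn *c)
{
	struct demo_state st;

	memcpy(&st, c->priv, sizeof(st));
	return st;
}

const struct cc_ops demo_ops = { "demo", demo_init };
const struct cc_ops plain_ops = { "plain", NULL };

/* Switch algorithms: install the new ops, wipe the old private state,
 * then let the new algorithm initialise itself if it has an init hook.
 */
void conn_set_cc(struct conn *c, const struct cc_ops *ops)
{
	assert(c != NULL && ops != NULL);
	c->ops = ops;
	memset(c->priv, 0, sizeof(c->priv));
	if (c->ops->init)
		c->ops->init(c);
}
```

Switching a connection from demo_ops to plain_ops leaves the scratch area zeroed, so the next algorithm never sees stale state from the previous one.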

Here are the tcp_set_default_congestion_control(), tcp_get_available_congestion_control(), tcp_get_default_congestion_control() and tcp_set_allowed_congestion_control() APIs (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

/* Used by sysctl to change default congestion control */
int tcp_set_default_congestion_control(const char *name)
{
	struct tcp_congestion_ops *ca;
	int ret = -ENOENT;

	spin_lock(&tcp_cong_list_lock);
	ca = tcp_ca_find(name);
#ifdef CONFIG_MODULES
	if (!ca && capable(CAP_NET_ADMIN)) {
		spin_unlock(&tcp_cong_list_lock);

		request_module("tcp_%s", name);
		spin_lock(&tcp_cong_list_lock);
		ca = tcp_ca_find(name);
	}
#endif

	if (ca) {
		ca->flags |= TCP_CONG_NON_RESTRICTED;	/* default is always allowed */
		list_move(&ca->list, &tcp_cong_list);
		ret = 0;
	}
	spin_unlock(&tcp_cong_list_lock);

	return ret;
}

/* Set default value from kernel configuration at bootup */
static int __init tcp_congestion_default(void)
{
	return tcp_set_default_congestion_control(CONFIG_DEFAULT_TCP_CONG);
}
late_initcall(tcp_congestion_default);

/* Build string with list of available congestion control values */
void tcp_get_available_congestion_control(char *buf, size_t maxlen)
{
	struct tcp_congestion_ops *ca;
	size_t offs = 0;

	rcu_read_lock();
	list_for_each_entry_rcu(ca, &tcp_cong_list, list) {
		offs += snprintf(buf + offs, maxlen - offs,
				 "%s%s",
				 offs == 0 ? "" : " ", ca->name);
	}
	rcu_read_unlock();
}

/* Get current default congestion control */
void tcp_get_default_congestion_control(char *name)
{
	struct tcp_congestion_ops *ca;
	/* We will always have reno... */
	BUG_ON(list_empty(&tcp_cong_list));

	rcu_read_lock();
	ca = list_entry(tcp_cong_list.next, struct tcp_congestion_ops, list);
	strncpy(name, ca->name, TCP_CA_NAME_MAX);
	rcu_read_unlock();
}

/* Change list of non-restricted congestion control */
int tcp_set_allowed_congestion_control(char *val)
{
	struct tcp_congestion_ops *ca;
	char *saved_clone, *clone, *name;
	int ret = 0;

	saved_clone = clone = kstrdup(val, GFP_USER);
	if (!clone)
		return -ENOMEM;

	spin_lock(&tcp_cong_list_lock);
	/* pass 1 check for bad entries */
	while ((name = strsep(&clone, " ")) && *name) {
		ca = tcp_ca_find(name);
		if (!ca) {
			ret = -ENOENT;
			goto out;
		}
	}

	/* pass 2 clear old values */
	list_for_each_entry_rcu(ca, &tcp_cong_list, list)
		ca->flags &= ~TCP_CONG_NON_RESTRICTED;

	/* pass 3 mark as allowed */
	while ((name = strsep(&val, " ")) && *name) {
		ca = tcp_ca_find(name);
		WARN_ON(!ca);
		if (ca)
			ca->flags |= TCP_CONG_NON_RESTRICTED;
	}
out:
	spin_unlock(&tcp_cong_list_lock);
	kfree(saved_clone);

	return ret;
}
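
Here is a user-space sketch of the two list policies used above: the default algorithm is simply the entry at the head of the list, and the allowed list is updated in three passes (validate every name, clear all flags, mark the requested names) so that a bad name leaves the old settings untouched. The registry contents and the names set_default(), get_default(), set_allowed() and is_allowed() are hypothetical. The sketch uses an array instead of an RCU list and a non-destructive strcspn() scan instead of strsep() on a cloned string.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define NON_RESTRICTED 0x1u   /* stands in for TCP_CONG_NON_RESTRICTED */

struct algo {
	const char *name;
	unsigned flags;
};

/* Hypothetical registry; index 0 plays the role of the list head (default). */
static struct algo registry[] = {
	{ "cubic", NON_RESTRICTED },
	{ "reno",  NON_RESTRICTED },
	{ "vegas", 0 },
};
#define N_ALGOS (sizeof(registry) / sizeof(registry[0]))

/* Simple linear search over a name given as pointer plus length. */
static struct algo *find_algo(const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < N_ALGOS; i++)
		if (strlen(registry[i].name) == len &&
		    memcmp(registry[i].name, name, len) == 0)
			return &registry[i];
	return NULL;
}

/* Default selection: mark the entry allowed and move it to the front. */
int set_default(const char *name)
{
	struct algo *a = find_algo(name, strlen(name));
	struct algo tmp;

	if (!a)
		return -ENOENT;
	a->flags |= NON_RESTRICTED;
	tmp = *a;
	memmove(&registry[1], &registry[0],
		(size_t)(a - registry) * sizeof(tmp));
	registry[0] = tmp;
	return 0;
}

const char *get_default(void)
{
	return registry[0].name;
}

int is_allowed(const char *name)
{
	struct algo *a = find_algo(name, strlen(name));

	return a && (a->flags & NON_RESTRICTED);
}

/* Allowed list: validate every name first, then clear, then mark. */
int set_allowed(const char *val)
{
	const char *p;
	size_t i, len;

	assert(val != NULL);
	/* pass 1: reject the whole request if any name is unknown */
	for (p = val; *p; p += len + (p[len] == ' ')) {
		len = strcspn(p, " ");
		if (len == 0)
			break;
		if (!find_algo(p, len))
			return -ENOENT;
	}

	/* pass 2: clear the flag everywhere */
	for (i = 0; i < N_ALGOS; i++)
		registry[i].flags &= ~NON_RESTRICTED;

	/* pass 3: mark the requested names (all known after pass 1) */
	for (p = val; *p; p += len + (p[len] == ' ')) {
		len = strcspn(p, " ");
		if (len == 0)
			break;
		find_algo(p, len)->flags |= NON_RESTRICTED;
	}
	return 0;
}
```

Calling set_default("vegas") moves vegas to index 0 and marks it allowed, while set_allowed("reno nosuch") fails with -ENOENT in pass 1 and changes nothing.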
