Linux Kernel TCP Congestion Control Algorithms

TCP congestion control
Linux Kernel Source:
Linux Kernel - IPv4 Stack /net/ipv4
TCP CUBIC: Binary Increase Congestion control for TCP - /net/ipv4/tcp_cubic.c
TCP-HYBLA Congestion control algorithm - /net/ipv4/tcp_hybla.c
* A TCP Enhancement for Heterogeneous Networks
TCP Illinois congestion control - /net/ipv4/tcp_illinois.c
TCP Westwood+: end-to-end bandwidth estimation for TCP - /net/ipv4/tcp_westwood.c
TCP Vegas congestion control - /net/ipv4/tcp_vegas.c
TCP Veno congestion control - /net/ipv4/tcp_veno.c
H-TCP congestion control: TCP for high-speed and long-distance networks - /net/ipv4/tcp_htcp.c

Also refer to the kernel architecture differences between a standard Linux system and Android OS, discussed in this video:
Kernel Architecture - Generic Linux System vs Android - The Linux Channel

LFN: long fat networks
Bandwidth-delay product
Linux Kernel Source:
TCP CUBIC: Binary Increase Congestion control for TCP v2.3
static struct tcp_congestion_ops cubictcp
static int __init cubictcp_register(void)
static void __exit cubictcp_unregister(void)
tcp_register_congestion_control(), tcp_unregister_congestion_control()
Pluggable TCP congestion control support
struct tcp_congestion_ops data-structure
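
To make the LFN and bandwidth-delay product entries above concrete, here is a small user-space C sketch (not kernel code) that computes how many bytes and segments must be in flight to fill a path. The helper names bdp_bytes() and bdp_segments() and the 1460-byte MSS in the usage note are illustrative assumptions, not taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidth-delay product in bytes for a link rate in bits per second
 * and a round-trip time in milliseconds (hypothetical helper). */
static uint64_t bdp_bytes(uint64_t bandwidth_bps, uint32_t rtt_ms)
{
	/* keep the intermediate product well inside 64 bits */
	assert(rtt_ms <= 100000);

	/* bits in flight = rate * RTT; divide by 8 for bytes, by 1000 for ms */
	return bandwidth_bps / 8 * rtt_ms / 1000;
}

/* Number of MSS-sized segments needed to cover that many bytes (rounded up). */
static uint64_t bdp_segments(uint64_t bytes, uint32_t mss)
{
	assert(mss > 0);
	return (bytes + mss - 1) / mss;
}
```

For a 1 Gbit/s path with a 100 ms RTT, bdp_bytes(1000000000, 100) gives 12,500,000 bytes, which is 8562 segments at an assumed MSS of 1460 bytes. A congestion window that cannot grow to roughly that size leaves such a long fat network underused, which is the problem algorithms such as CUBIC and H-TCP target.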

Here is the struct tcp_congestion_ops data-structure (/include/net/tcp.h) from the Kernel-source version 4.14 for quick reference:

struct tcp_congestion_ops {
	struct list_head	list;
	u32 key;
	u32 flags;

	/* initialize private data (optional) */
	void (*init)(struct sock *sk);
	/* cleanup private data  (optional) */
	void (*release)(struct sock *sk);

	/* return slow start threshold (required) */
	u32 (*ssthresh)(struct sock *sk);
	/* do new cwnd calculation (required) */
	void (*cong_avoid)(struct sock *sk, u32 ack, u32 acked);
	/* call before changing ca_state (optional) */
	void (*set_state)(struct sock *sk, u8 new_state);
	/* call when cwnd event occurs (optional) */
	void (*cwnd_event)(struct sock *sk, enum tcp_ca_event ev);
	/* call when ack arrives (optional) */
	void (*in_ack_event)(struct sock *sk, u32 flags);
	/* new value of cwnd after loss (required) */
	u32  (*undo_cwnd)(struct sock *sk);
	/* hook for packet ack accounting (optional) */
	void (*pkts_acked)(struct sock *sk, const struct ack_sample *sample);
	/* suggest number of segments for each skb to transmit (optional) */
	u32 (*tso_segs_goal)(struct sock *sk);
	/* returns the multiplier used in tcp_sndbuf_expand (optional) */
	u32 (*sndbuf_expand)(struct sock *sk);
	/* call when packets are delivered to update cwnd and pacing rate,
	 * after all the ca_state processing. (optional)
	 */
	void (*cong_control)(struct sock *sk, const struct rate_sample *rs);
	/* get info for inet_diag (optional) */
	size_t (*get_info)(struct sock *sk, u32 ext, int *attr,
			   union tcp_cc_info *info);

	char 		name[TCP_CA_NAME_MAX];
	struct module 	*owner;
};
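
To show what the required ssthresh and cong_avoid hooks are expected to compute, here is a user-space C sketch of a Reno-style algorithm plugged into a cut-down ops table. Everything in it (struct flow, struct cc_ops, the reno_* functions, flow_start(), flow_on_loss()) is a hypothetical stand-in written for illustration; it is not the kernel's Reno implementation, and setting cwnd to ssthresh on loss is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection state; the kernel keeps this in struct tcp_sock. */
struct flow {
	uint32_t cwnd;		/* congestion window, in segments */
	uint32_t ssthresh;	/* slow start threshold, in segments */
	uint32_t cwnd_cnt;	/* ACKed segments counted towards the next increase */
};

/* Cut-down stand-in for struct tcp_congestion_ops. */
struct cc_ops {
	void (*init)(struct flow *f);				/* optional */
	uint32_t (*ssthresh)(const struct flow *f);		/* required */
	void (*cong_avoid)(struct flow *f, uint32_t acked);	/* required */
	const char *name;
};

/* On loss, halve the window but never go below two segments. */
static uint32_t reno_ssthresh(const struct flow *f)
{
	uint32_t half = f->cwnd >> 1;

	return half > 2 ? half : 2;
}

static void reno_cong_avoid(struct flow *f, uint32_t acked)
{
	assert(f->cwnd > 0);

	if (f->cwnd < f->ssthresh) {
		/* slow start: one segment per ACKed segment, capped at ssthresh */
		uint32_t room = f->ssthresh - f->cwnd;
		uint32_t step = acked < room ? acked : room;

		f->cwnd += step;
		acked -= step;
	}

	/* congestion avoidance: one extra segment per full window of ACKs */
	f->cwnd_cnt += acked;
	while (f->cwnd_cnt >= f->cwnd) {
		f->cwnd_cnt -= f->cwnd;
		f->cwnd++;
	}
}

static const struct cc_ops reno_like = {
	.ssthresh	= reno_ssthresh,
	.cong_avoid	= reno_cong_avoid,
	.name		= "reno-like",
};

/* Optional hooks are NULL-checked before the call. */
static void flow_start(struct flow *f, const struct cc_ops *ops)
{
	if (ops->init)
		ops->init(f);
}

/* Simplified loss reaction: ask the algorithm for the new threshold. */
static void flow_on_loss(struct flow *f, const struct cc_ops *ops)
{
	f->ssthresh = ops->ssthresh(f);
	f->cwnd = f->ssthresh;
	f->cwnd_cnt = 0;
}
```

Starting from cwnd = 10 and ssthresh = 16, ACKs for 4, 4 and 14 segments move the window to 14, 16 and then 17; a loss at that point yields ssthresh = 8. The optional init hook is left NULL here, which is why callers check it before calling.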

Here are the tcp_register_congestion_control() and tcp_unregister_congestion_control() pluggable congestion control registration APIs (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

/*
 * Attach new congestion control algorithm to the list
 * of available options.
 */
int tcp_register_congestion_control(struct tcp_congestion_ops *ca)
{
	int ret = 0;

	/* all algorithms must implement these */
	if (!ca->ssthresh || !ca->undo_cwnd ||
	    !(ca->cong_avoid || ca->cong_control)) {
		pr_err("%s does not implement required ops\n", ca->name);
		return -EINVAL;
	}

	ca->key = jhash(ca->name, sizeof(ca->name), strlen(ca->name));

	spin_lock(&tcp_cong_list_lock);
	if (ca->key == TCP_CA_UNSPEC || tcp_ca_find_key(ca->key)) {
		pr_notice("%s already registered or non-unique key\n",
			  ca->name);
		ret = -EEXIST;
	} else {
		list_add_tail_rcu(&ca->list, &tcp_cong_list);
		pr_debug("%s registered\n", ca->name);
	}
	spin_unlock(&tcp_cong_list_lock);

	return ret;
}

/*
 * Remove congestion control algorithm, called from
 * the module's remove function.  Module ref counts are used
 * to ensure that this can't be done till all sockets using
 * that method are closed.
 */
void tcp_unregister_congestion_control(struct tcp_congestion_ops *ca)
{
	spin_lock(&tcp_cong_list_lock);
	list_del_rcu(&ca->list);
	spin_unlock(&tcp_cong_list_lock);

	/* Wait for outstanding readers to complete before the
	 * module gets removed entirely.
	 *
	 * A try_module_get() should fail by now as our module is
	 * in "going" state since no refs are held anymore and
	 * module_exit() handler being called.
	 */
	synchronize_rcu();
}
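
The registration rules above (required hooks present, no duplicate entry, append at the tail, unlink on unregister) can be tried out in user space with the following C sketch. It is a simplified model under stated assumptions: struct ca_ops, ca_register(), ca_unregister(), ca_find() and the demo_* objects are hypothetical, duplicates are detected by name rather than by the jhash() key the kernel uses, and there is no locking or RCU.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define CA_NAME_MAX 16

/* Hypothetical user-space stand-in for struct tcp_congestion_ops. */
struct ca_ops {
	struct ca_ops *next;
	unsigned int (*ssthresh)(void *sk);
	unsigned int (*undo_cwnd)(void *sk);
	void (*cong_avoid)(void *sk, unsigned int ack, unsigned int acked);
	void (*cong_control)(void *sk);
	char name[CA_NAME_MAX];
};

static struct ca_ops *ca_list;	/* head of the registered list */

/* Simple linear search by name. */
static struct ca_ops *ca_find(const char *name)
{
	struct ca_ops *e;

	for (e = ca_list; e; e = e->next)
		if (strcmp(e->name, name) == 0)
			return e;
	return NULL;
}

static int ca_register(struct ca_ops *ca)
{
	struct ca_ops **tail;

	assert(ca != NULL);

	/* all algorithms must implement these */
	if (!ca->ssthresh || !ca->undo_cwnd ||
	    !(ca->cong_avoid || ca->cong_control))
		return -EINVAL;

	/* assumption: compare names instead of hashed keys */
	if (ca_find(ca->name))
		return -EEXIST;

	/* append at the tail so earlier entries keep their position */
	ca->next = NULL;
	for (tail = &ca_list; *tail; tail = &(*tail)->next)
		;
	*tail = ca;
	return 0;
}

static void ca_unregister(struct ca_ops *ca)
{
	struct ca_ops **pp;

	for (pp = &ca_list; *pp; pp = &(*pp)->next) {
		if (*pp == ca) {
			*pp = ca->next;
			ca->next = NULL;
			return;
		}
	}
}

/* Demo hooks and instances used to exercise the sketch. */
static unsigned int demo_ssthresh(void *sk) { (void)sk; return 2; }
static unsigned int demo_undo_cwnd(void *sk) { (void)sk; return 10; }
static void demo_cong_avoid(void *sk, unsigned int ack, unsigned int acked)
{
	(void)sk; (void)ack; (void)acked;
}

static struct ca_ops demo_reno = {
	.ssthresh	= demo_ssthresh,
	.undo_cwnd	= demo_undo_cwnd,
	.cong_avoid	= demo_cong_avoid,
	.name		= "reno",
};

static struct ca_ops demo_broken = { .name = "broken" };	/* no hooks at all */
```

Registering demo_broken fails with -EINVAL because the required hooks are missing, the first registration of demo_reno succeeds, and a second attempt returns -EEXIST.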

Here are the struct tcp_congestion_ops cubictcp data-structure instance and the cubictcp_register(), cubictcp_unregister() APIs (/net/ipv4/tcp_cubic.c) from the Kernel-source version 4.14 for quick reference:

static struct tcp_congestion_ops cubictcp __read_mostly = {
	.init		= bictcp_init,
	.ssthresh	= bictcp_recalc_ssthresh,
	.cong_avoid	= bictcp_cong_avoid,
	.set_state	= bictcp_state,
	.undo_cwnd	= tcp_reno_undo_cwnd,
	.cwnd_event	= bictcp_cwnd_event,
	.pkts_acked     = bictcp_acked,
	.owner		= THIS_MODULE,
	.name		= "cubic",
};

static int __init cubictcp_register(void)
{
	BUILD_BUG_ON(sizeof(struct bictcp) > ICSK_CA_PRIV_SIZE);

	/* Precompute a bunch of the scaling factors that are used per-packet
	 * based on SRTT of 100ms
	 */

	beta_scale = 8*(BICTCP_BETA_SCALE+beta) / 3
		/ (BICTCP_BETA_SCALE - beta);

	cube_rtt_scale = (bic_scale * 10);	/* 1024*c/rtt */

	/* calculate the "K" for (wmax-cwnd) = c/rtt * K^3
	 *  so K = cubic_root( (wmax-cwnd)*rtt/c )
	 * the unit of K is bictcp_HZ=2^10, not HZ
	 *
	 *  c = bic_scale >> 10
	 *  rtt = 100ms
	 *
	 * the following code has been designed and tested for
	 * cwnd < 1 million packets
	 * RTT < 100 seconds
	 * HZ < 1,000,00  (corresponding to 10 nano-second)
	 */

	/* 1/c * 2^2*bictcp_HZ * srtt */
	cube_factor = 1ull << (10+3*BICTCP_HZ); /* 2^40 */

	/* divide by bic_scale and by constant Srtt (100ms) */
	do_div(cube_factor, bic_scale * 10);

	return tcp_register_congestion_control(&cubictcp);
}

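static void __exit cubictcp_unregister(void)
{
	tcp_unregister_congestion_control(&cubictcp);
}

module_init(cubictcp_register);
module_exit(cubictcp_unregister);

MODULE_AUTHOR("Sangtae Ha, Stephen Hemminger");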
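
As a worked example of the precomputation in cubictcp_register(), here is a user-space C sketch that evaluates the same three expressions. The values BICTCP_BETA_SCALE = 1024 and BICTCP_HZ = 10, and the module parameter defaults beta = 717 and bic_scale = 41 used in the usage note, are not shown in the listing above, so treat them as assumptions; struct cubic_consts and cubic_precompute() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not shown in the listing above. */
#define BICTCP_BETA_SCALE	1024	/* beta is expressed in 1/1024 units */
#define BICTCP_HZ		10	/* time unit of K is 2^10 per second */

/* Hypothetical container for the three precomputed factors. */
struct cubic_consts {
	uint32_t beta_scale;
	uint32_t cube_rtt_scale;
	uint64_t cube_factor;
};

static struct cubic_consts cubic_precompute(int beta, int bic_scale)
{
	struct cubic_consts c;

	assert(beta > 0 && beta < BICTCP_BETA_SCALE && bic_scale > 0);

	/* same integer arithmetic as the kernel expression */
	c.beta_scale = 8 * (BICTCP_BETA_SCALE + beta) / 3
		/ (BICTCP_BETA_SCALE - beta);

	c.cube_rtt_scale = bic_scale * 10;	/* 1024*c/rtt */

	/* 2^40, then divide by bic_scale and by the constant SRTT (100ms);
	 * the kernel uses do_div() for this 64-bit division */
	c.cube_factor = 1ull << (10 + 3 * BICTCP_HZ);
	c.cube_factor /= (uint64_t)(bic_scale * 10);

	return c;
}
```

With the assumed defaults, cubic_precompute(717, 41) yields beta_scale = 15, cube_rtt_scale = 410 and cube_factor = 2681735677, all computed once at module load so that the per-ACK path only needs shifts and multiplications.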

Linux Kernel Source:
Main TCP congestion control support implementation file:
/net/ipv4/tcp_cong.c -
tcp_register_congestion_control(), tcp_unregister_congestion_control() -
tcp_ca_find() -
tcp_set_congestion_control() -
tcp_init_congestion_control(), tcp_reinit_congestion_control() -
tcp_get_allowed_congestion_control(), tcp_set_allowed_congestion_control() -
tcp_set_default_congestion_control() -
tcp_get_available_congestion_control() -
Data-structures: struct tcp_congestion_ops data-structure -
static LIST_HEAD(tcp_cong_list); - Linked List -

Here are the LIST_HEAD(tcp_cong_list) linked list and the tcp_ca_find() API (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

static DEFINE_SPINLOCK(tcp_cong_list_lock);
static LIST_HEAD(tcp_cong_list);

/* Simple linear search, don't expect many entries! */
static struct tcp_congestion_ops *tcp_ca_find(const char *name)
{
	struct tcp_congestion_ops *e;

	list_for_each_entry_rcu(e, &tcp_cong_list, list) {
		if (strcmp(e->name, name) == 0)
			return e;
	}

	return NULL;
}

Here are the tcp_init_congestion_control() and tcp_reinit_congestion_control() APIs (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

void tcp_init_congestion_control(struct sock *sk)
{
	const struct inet_connection_sock *icsk = inet_csk(sk);

	tcp_sk(sk)->prior_ssthresh = 0;
	if (icsk->icsk_ca_ops->init)
		icsk->icsk_ca_ops->init(sk);
	if (tcp_ca_needs_ecn(sk))
		INET_ECN_xmit(sk);
	else
		INET_ECN_dontxmit(sk);
}

static void tcp_reinit_congestion_control(struct sock *sk,
					  const struct tcp_congestion_ops *ca)
{
	struct inet_connection_sock *icsk = inet_csk(sk);

	tcp_cleanup_congestion_control(sk);
	icsk->icsk_ca_ops = ca;
	icsk->icsk_ca_setsockopt = 1;
	memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));

	if (sk->sk_state != TCP_CLOSE)
		tcp_init_congestion_control(sk);
}
Here are the tcp_set_default_congestion_control(), tcp_get_available_congestion_control(), tcp_get_default_congestion_control() and tcp_set_allowed_congestion_control() APIs (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

/* Used by sysctl to change default congestion control */
int tcp_set_default_congestion_control(const char *name)
{
	struct tcp_congestion_ops *ca;
	int ret = -ENOENT;

	spin_lock(&tcp_cong_list_lock);
	ca = tcp_ca_find(name);
#ifdef CONFIG_MODULES
	if (!ca && capable(CAP_NET_ADMIN)) {
		spin_unlock(&tcp_cong_list_lock);

		request_module("tcp_%s", name);
		spin_lock(&tcp_cong_list_lock);
		ca = tcp_ca_find(name);
	}
#endif

	if (ca) {
		ca->flags |= TCP_CONG_NON_RESTRICTED;	/* default is always allowed */
		list_move(&ca->list, &tcp_cong_list);
		ret = 0;
	}
	spin_unlock(&tcp_cong_list_lock);

	return ret;
}

/* Set default value from kernel configuration at bootup */
static int __init tcp_congestion_default(void)
{
	return tcp_set_default_congestion_control(CONFIG_DEFAULT_TCP_CONG);
}
late_initcall(tcp_congestion_default);

/* Build string with list of available congestion control values */
void tcp_get_available_congestion_control(char *buf, size_t maxlen)
{
	struct tcp_congestion_ops *ca;
	size_t offs = 0;

	rcu_read_lock();
	list_for_each_entry_rcu(ca, &tcp_cong_list, list) {
		offs += snprintf(buf + offs, maxlen - offs,
				 "%s%s",
				 offs == 0 ? "" : " ", ca->name);
	}
	rcu_read_unlock();
}

/* Get current default congestion control */
void tcp_get_default_congestion_control(char *name)
{
	struct tcp_congestion_ops *ca;
	/* We will always have reno... */
	BUG_ON(list_empty(&tcp_cong_list));

	rcu_read_lock();
	ca = list_entry(tcp_cong_list.next, struct tcp_congestion_ops, list);
	strncpy(name, ca->name, TCP_CA_NAME_MAX);
	rcu_read_unlock();
}

/* Change list of non-restricted congestion control */
int tcp_set_allowed_congestion_control(char *val)
{
	struct tcp_congestion_ops *ca;
	char *saved_clone, *clone, *name;
	int ret = 0;

	saved_clone = clone = kstrdup(val, GFP_USER);
	if (!clone)
		return -ENOMEM;

	spin_lock(&tcp_cong_list_lock);
	/* pass 1 check for bad entries */
	while ((name = strsep(&clone, " ")) && *name) {
		ca = tcp_ca_find(name);
		if (!ca) {
			ret = -ENOENT;
			goto out;
		}
	}

	/* pass 2 clear old values */
	list_for_each_entry_rcu(ca, &tcp_cong_list, list)
		ca->flags &= ~TCP_CONG_NON_RESTRICTED;

	/* pass 3 mark as allowed */
	while ((name = strsep(&val, " ")) && *name) {
		ca = tcp_ca_find(name);
		WARN_ON(!ca);
		if (ca)
			ca->flags |= TCP_CONG_NON_RESTRICTED;
	}
out:
	spin_unlock(&tcp_cong_list_lock);
	kfree(saved_clone);

	return ret;
}
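
The three ideas in the listing above (the default is whatever sits at the head of the list, the available string is a space-separated join, and the allowed list is validated in full before any flag changes) can be tried in user space with this C sketch. It is a simplified model under stated assumptions: a small array replaces the RCU list, strcspn() replaces strsep() so the input string is not modified, there is no locking, and all names (ca_tab, ca_find_n(), ca_set_default(), ca_get_available(), ca_set_allowed(), CA_NON_RESTRICTED) are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define CA_NON_RESTRICTED 0x1	/* stand-in for TCP_CONG_NON_RESTRICTED */
#define CA_MAX 8

struct ca_entry {
	const char *name;
	unsigned int flags;
};

/* Index 0 plays the role of the list head, i.e. the current default. */
static struct ca_entry ca_tab[CA_MAX] = {
	{ "cubic", CA_NON_RESTRICTED },
	{ "reno",  CA_NON_RESTRICTED },
	{ "vegas", 0 },
};
static size_t ca_count = 3;

/* Linear search for a name given as pointer plus length. */
static struct ca_entry *ca_find_n(const char *s, size_t n)
{
	for (size_t i = 0; i < ca_count; i++)
		if (strlen(ca_tab[i].name) == n &&
		    memcmp(ca_tab[i].name, s, n) == 0)
			return &ca_tab[i];
	return NULL;
}

static struct ca_entry *ca_find(const char *name)
{
	return ca_find_n(name, strlen(name));
}

/* Move the named entry to the front, like list_move() to the list head. */
static int ca_set_default(const char *name)
{
	struct ca_entry *ca = ca_find(name), tmp;

	if (!ca)
		return -ENOENT;

	tmp = *ca;
	tmp.flags |= CA_NON_RESTRICTED;	/* default is always allowed */
	memmove(&ca_tab[1], &ca_tab[0], (size_t)(ca - ca_tab) * sizeof(tmp));
	ca_tab[0] = tmp;
	return 0;
}

/* Join all names with single spaces, stopping once the buffer is full. */
static void ca_get_available(char *buf, size_t maxlen)
{
	size_t offs = 0;

	assert(maxlen > 0);
	buf[0] = '\0';
	for (size_t i = 0; i < ca_count && offs < maxlen; i++) {
		int n = snprintf(buf + offs, maxlen - offs, "%s%s",
				 offs == 0 ? "" : " ", ca_tab[i].name);
		if (n < 0)
			break;
		offs += (size_t)n;
	}
}

/* Validate every name first, then clear and re-mark the allowed flags. */
static int ca_set_allowed(const char *val)
{
	const char *p;
	size_t n;

	/* pass 1: check for bad entries before touching any flag */
	for (p = val; (n = strcspn(p, " ")) > 0; p += n + (p[n] == ' '))
		if (!ca_find_n(p, n))
			return -ENOENT;

	/* pass 2: clear old values */
	for (size_t i = 0; i < ca_count; i++)
		ca_tab[i].flags &= ~CA_NON_RESTRICTED;

	/* pass 3: mark as allowed; every name was validated in pass 1 */
	for (p = val; (n = strcspn(p, " ")) > 0; p += n + (p[n] == ' '))
		ca_find_n(p, n)->flags |= CA_NON_RESTRICTED;

	return 0;
}
```

With the initial table, ca_get_available() produces "cubic reno vegas"; after ca_set_default("vegas") it produces "vegas cubic reno". A call such as ca_set_allowed("reno bogus") returns -ENOENT and leaves all flags untouched, which is the point of running the validation pass before the clearing pass.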
