Linux Kernel TCP Congestion Control Algorithms

Refer:
TCP congestion control :: https://en.wikipedia.org/wiki/TCP_congestion_control
-----
Linux Kernel Source:
Linux Kernel - IPv4 Stack /net/ipv4 :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4
TCP CUBIC: Binary Increase Congestion control for TCP - /net/ipv4/tcp_cubic.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c
TCP-HYBLA Congestion control algorithm - /net/ipv4/tcp_hybla.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_hybla.c
* A TCP Enhancement for Heterogeneous Networks
TCP Illinois congestion control - /net/ipv4/tcp_illinois.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_illinois.c
TCP Westwood+: end-to-end bandwidth estimation for TCP - /net/ipv4/tcp_westwood.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_westwood.c
TCP Vegas congestion control - /net/ipv4/tcp_vegas.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_vegas.c
TCP Veno congestion control - /net/ipv4/tcp_veno.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_veno.c
H-TCP congestion control: TCP for high-speed and long-distance networks - /net/ipv4/tcp_htcp.c :: http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_htcp.c

Also refer to the kernel architecture differences between a standard Linux system and Android OS, discussed in this video:
Kernel Architecture - Generic Linux System vs Android - The Linux Channel

Refer:
Wiki:
CUBIC TCP - https://en.wikipedia.org/wiki/CUBIC_TCP
BIC TCP - https://en.wikipedia.org/wiki/BIC_TCP
LFN: long fat networks - https://en.wikipedia.org/wiki/Bandwidth-delay_product
Bandwidth-delay product - https://en.wikipedia.org/wiki/Bandwidth-delay_product
-----
Linux Kernel Source:
TCP CUBIC: Binary Increase Congestion control for TCP v2.3 - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c
static struct tcp_congestion_ops cubictcp - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c#L457
static int __init cubictcp_register(void) - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c#L469
static void __exit cubictcp_unregister(void) - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cubic.c#L504
tcp_register_congestion_control(), tcp_unregister_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/include/net/tcp.h#L1029
Pluggable TCP congestion control support - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c
struct tcp_congestion_ops data-structure - http://elixir.free-electrons.com/linux/latest/source/include/net/tcp.h#L989

Here is the struct tcp_congestion_ops data-structure (/include/net/tcp.h) from the Kernel-source version 4.14 for quick reference:

struct tcp_congestion_ops {
	struct list_head	list;
	u32 key;
	u32 flags;

	/* initialize private data (optional) */
	void (*init)(struct sock *sk);
	/* cleanup private data  (optional) */
	void (*release)(struct sock *sk);

	/* return slow start threshold (required) */
	u32 (*ssthresh)(struct sock *sk);
	/* do new cwnd calculation (required) */
	void (*cong_avoid)(struct sock *sk, u32 ack, u32 acked);
	/* call before changing ca_state (optional) */
	void (*set_state)(struct sock *sk, u8 new_state);
	/* call when cwnd event occurs (optional) */
	void (*cwnd_event)(struct sock *sk, enum tcp_ca_event ev);
	/* call when ack arrives (optional) */
	void (*in_ack_event)(struct sock *sk, u32 flags);
	/* new value of cwnd after loss (required) */
	u32  (*undo_cwnd)(struct sock *sk);
	/* hook for packet ack accounting (optional) */
	void (*pkts_acked)(struct sock *sk, const struct ack_sample *sample);
	/* suggest number of segments for each skb to transmit (optional) */
	u32 (*tso_segs_goal)(struct sock *sk);
	/* returns the multiplier used in tcp_sndbuf_expand (optional) */
	u32 (*sndbuf_expand)(struct sock *sk);
	/* call when packets are delivered to update cwnd and pacing rate,
	 * after all the ca_state processing. (optional)
	 */
	void (*cong_control)(struct sock *sk, const struct rate_sample *rs);
	/* get info for inet_diag (optional) */
	size_t (*get_info)(struct sock *sk, u32 ext, int *attr,
			   union tcp_cc_info *info);

	char 		name[TCP_CA_NAME_MAX];
	struct module 	*owner;
};
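
To make the split between required and optional callbacks concrete, here is a minimal user-space sketch (not kernel code) of a callback table in the same spirit. The names mini_sock, mini_cc_ops, mini_reno and the mini_* helpers are hypothetical and exist only for this illustration. The Reno-style rules used here (halve the window on loss with a floor of 2 segments, grow by the number of ACKed segments in slow start) are a deliberately simplified assumption and not a copy of the kernel's Reno code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-connection state kept behind struct sock. */
struct mini_sock {
	uint32_t cwnd;     /* congestion window, in segments */
	uint32_t ssthresh; /* slow start threshold, in segments */
	uint32_t cwnd_cnt; /* ACKed segments accumulated in congestion avoidance */
};

/* Hypothetical, trimmed-down mirror of struct tcp_congestion_ops. */
struct mini_cc_ops {
	void     (*init)(struct mini_sock *sk);                       /* optional */
	uint32_t (*ssthresh)(struct mini_sock *sk);                   /* required */
	void     (*cong_avoid)(struct mini_sock *sk, uint32_t acked); /* required */
	const char *name;
};

/* On loss: halve the window, never going below 2 segments. */
static uint32_t mini_reno_ssthresh(struct mini_sock *sk)
{
	uint32_t half = sk->cwnd >> 1;

	return half > 2 ? half : 2;
}

/* Slow start below ssthresh; about one extra segment per window above it. */
static void mini_reno_cong_avoid(struct mini_sock *sk, uint32_t acked)
{
	if (sk->cwnd < sk->ssthresh) {
		sk->cwnd += acked;
		return;
	}
	sk->cwnd_cnt += acked;
	if (sk->cwnd_cnt >= sk->cwnd) {
		sk->cwnd_cnt -= sk->cwnd;
		sk->cwnd++;
	}
}

const struct mini_cc_ops mini_reno = {
	.init       = NULL, /* optional hook left unset on purpose */
	.ssthresh   = mini_reno_ssthresh,
	.cong_avoid = mini_reno_cong_avoid,
	.name       = "mini_reno",
};

/* Caller side: optional hooks are NULL-checked before use. */
void mini_attach(const struct mini_cc_ops *ops, struct mini_sock *sk)
{
	if (ops->init)
		ops->init(sk);
}

/* Required hooks are called unconditionally. */
void mini_on_ack(const struct mini_cc_ops *ops, struct mini_sock *sk,
		 uint32_t acked)
{
	ops->cong_avoid(sk, acked);
}

void mini_on_loss(const struct mini_cc_ops *ops, struct mini_sock *sk)
{
	sk->ssthresh = ops->ssthresh(sk);
	sk->cwnd = sk->ssthresh;
}
```

The point of the sketch is the calling convention: the stack only ever talks to the algorithm through the function-pointer table, so swapping algorithms means swapping one pointer.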

Here are the tcp_register_congestion_control() and tcp_unregister_congestion_control() pluggable congestion control registration APIs (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

/*
 * Attach new congestion control algorithm to the list
 * of available options.
 */
int tcp_register_congestion_control(struct tcp_congestion_ops *ca)
{
	int ret = 0;

	/* all algorithms must implement these */
	if (!ca->ssthresh || !ca->undo_cwnd ||
	    !(ca->cong_avoid || ca->cong_control)) {
		pr_err("%s does not implement required ops\n", ca->name);
		return -EINVAL;
	}

	ca->key = jhash(ca->name, sizeof(ca->name), strlen(ca->name));

	spin_lock(&tcp_cong_list_lock);
	if (ca->key == TCP_CA_UNSPEC || tcp_ca_find_key(ca->key)) {
		pr_notice("%s already registered or non-unique key\n",
			  ca->name);
		ret = -EEXIST;
	} else {
		list_add_tail_rcu(&ca->list, &tcp_cong_list);
		pr_debug("%s registered\n", ca->name);
	}
	spin_unlock(&tcp_cong_list_lock);

	return ret;
}
EXPORT_SYMBOL_GPL(tcp_register_congestion_control);

/*
 * Remove congestion control algorithm, called from
 * the module's remove function.  Module ref counts are used
 * to ensure that this can't be done till all sockets using
 * that method are closed.
 */
void tcp_unregister_congestion_control(struct tcp_congestion_ops *ca)
{
	spin_lock(&tcp_cong_list_lock);
	list_del_rcu(&ca->list);
	spin_unlock(&tcp_cong_list_lock);

	/* Wait for outstanding readers to complete before the
	 * module gets removed entirely.
	 *
	 * A try_module_get() should fail by now as our module is
	 * in "going" state since no refs are held anymore and
	 * module_exit() handler being called.
	 */
	synchronize_rcu();
}
EXPORT_SYMBOL_GPL(tcp_unregister_congestion_control);
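
The validation rules in tcp_register_congestion_control() above can be modelled in user space as follows. This is only a sketch under stated assumptions: struct mini_ca, the fixed-size registry array, MINI_CA_MAX and the demo_* callbacks are hypothetical, uniqueness is checked by comparing names instead of the jhash-based key, and there is no locking or RCU.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MINI_CA_MAX 8 /* hypothetical capacity of the toy registry */

/* Hypothetical mirror of the four hooks the kernel checks at registration. */
struct mini_ca {
	const char *name;
	unsigned (*ssthresh)(unsigned cwnd);
	unsigned (*undo_cwnd)(unsigned cwnd);
	void (*cong_avoid)(unsigned *cwnd);
	void (*cong_control)(unsigned *cwnd);
};

static const struct mini_ca *registry[MINI_CA_MAX];
static size_t registry_len;

/* Simple linear search by name, like tcp_ca_find(). */
const struct mini_ca *mini_ca_find(const char *name)
{
	size_t i;

	for (i = 0; i < registry_len; i++)
		if (strcmp(registry[i]->name, name) == 0)
			return registry[i];
	return NULL;
}

int mini_ca_register(const struct mini_ca *ca)
{
	/* Same rule as the kernel excerpt: ssthresh and undo_cwnd are
	 * mandatory, plus at least one of cong_avoid / cong_control. */
	if (!ca->ssthresh || !ca->undo_cwnd ||
	    !(ca->cong_avoid || ca->cong_control))
		return -EINVAL;
	if (mini_ca_find(ca->name))
		return -EEXIST;
	if (registry_len == MINI_CA_MAX)
		return -ENOSPC; /* limit of this sketch only */
	registry[registry_len++] = ca;
	return 0;
}

void mini_ca_unregister(const struct mini_ca *ca)
{
	size_t i;

	for (i = 0; i < registry_len; i++) {
		if (registry[i] == ca) {
			/* Close the gap left by the removed entry. */
			memmove(&registry[i], &registry[i + 1],
				(registry_len - i - 1) * sizeof(registry[0]));
			registry_len--;
			return;
		}
	}
}

/* Trivial demo callbacks so that a table can be filled in. */
unsigned demo_ssthresh(unsigned cwnd) { return cwnd / 2 > 2 ? cwnd / 2 : 2; }
unsigned demo_undo_cwnd(unsigned cwnd) { return cwnd; }
void demo_cong_avoid(unsigned *cwnd) { (*cwnd)++; }
```

A table that leaves undo_cwnd unset is rejected with -EINVAL, and registering the same name twice returns -EEXIST, matching the two error paths in the kernel excerpt.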

Here are the struct tcp_congestion_ops cubictcp data-structure instance and the cubictcp_register() and cubictcp_unregister() APIs (/net/ipv4/tcp_cubic.c) from the Kernel-source version 4.14 for quick reference:

static struct tcp_congestion_ops cubictcp __read_mostly = {
	.init		= bictcp_init,
	.ssthresh	= bictcp_recalc_ssthresh,
	.cong_avoid	= bictcp_cong_avoid,
	.set_state	= bictcp_state,
	.undo_cwnd	= tcp_reno_undo_cwnd,
	.cwnd_event	= bictcp_cwnd_event,
	.pkts_acked     = bictcp_acked,
	.owner		= THIS_MODULE,
	.name		= "cubic",
};

static int __init cubictcp_register(void)
{
	BUILD_BUG_ON(sizeof(struct bictcp) > ICSK_CA_PRIV_SIZE);

	/* Precompute a bunch of the scaling factors that are used per-packet
	 * based on SRTT of 100ms
	 */

	beta_scale = 8*(BICTCP_BETA_SCALE+beta) / 3
		/ (BICTCP_BETA_SCALE - beta);

	cube_rtt_scale = (bic_scale * 10);	/* 1024*c/rtt */

	/* calculate the "K" for (wmax-cwnd) = c/rtt * K^3
	 *  so K = cubic_root( (wmax-cwnd)*rtt/c )
	 * the unit of K is bictcp_HZ=2^10, not HZ
	 *
	 *  c = bic_scale >> 10
	 *  rtt = 100ms
	 *
	 * the following code has been designed and tested for
	 * cwnd < 1 million packets
	 * RTT < 100 seconds
	 * HZ < 1,000,00  (corresponding to 10 nano-second)
	 */

	/* 1/c * 2^2*bictcp_HZ * srtt */
	cube_factor = 1ull << (10+3*BICTCP_HZ); /* 2^40 */

	/* divide by bic_scale and by constant Srtt (100ms) */
	do_div(cube_factor, bic_scale * 10);

	return tcp_register_congestion_control(&cubictcp);
}

static void __exit cubictcp_unregister(void)
{
	tcp_unregister_congestion_control(&cubictcp);
}

module_init(cubictcp_register);
module_exit(cubictcp_unregister);

MODULE_AUTHOR("Sangtae Ha, Stephen Hemminger");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("CUBIC TCP");
MODULE_VERSION("2.3");
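
As a worked example of the precomputation in cubictcp_register() above, the same integer arithmetic can be reproduced in user space. The values beta = 717, bic_scale = 41, BICTCP_BETA_SCALE = 1024 and BICTCP_HZ = 10 are assumed here to be the 4.14 defaults from tcp_cubic.c; they are not shown in the excerpt, and beta and bic_scale are module parameters that may be overridden. The struct cubic_scales and cubic_precompute() names are hypothetical.

```c
#include <stdint.h>

/* Assumed 4.14 constants from tcp_cubic.c (not part of the excerpt above). */
#define BICTCP_BETA_SCALE 1024
#define BICTCP_HZ         10

/* Hypothetical container for the three precomputed factors. */
struct cubic_scales {
	uint32_t beta_scale;
	uint32_t cube_rtt_scale;
	uint64_t cube_factor;
};

struct cubic_scales cubic_precompute(int beta, int bic_scale)
{
	struct cubic_scales s;

	/* Same integer expression as the kernel; every division truncates. */
	s.beta_scale = 8 * (BICTCP_BETA_SCALE + beta) / 3
		/ (BICTCP_BETA_SCALE - beta);

	s.cube_rtt_scale = bic_scale * 10;

	s.cube_factor = 1ull << (10 + 3 * BICTCP_HZ); /* 2^40 */

	/* Plain 64-bit division as a user-space stand-in for do_div(). */
	s.cube_factor /= (uint64_t)(bic_scale * 10);

	return s;
}
```

With the assumed defaults, cubic_precompute(717, 41) gives beta_scale = 15, cube_rtt_scale = 410 and cube_factor = 2681735677 (2^40 divided by 410, truncated).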

Refer:
Linux Kernel Source:
Main TCP congestion control support implementation file:
/net/ipv4/tcp_cong.c - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c
-------
APIs:
tcp_register_congestion_control(), tcp_unregister_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/include/net/tcp.h#L1029
tcp_ca_find() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L23
tcp_set_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L341
tcp_init_congestion_control(), tcp_reinit_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L179
tcp_get_allowed_congestion_control(), tcp_set_allowed_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L280
tcp_set_default_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L217
tcp_get_available_congestion_control() - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L252
-------
Data-structures:
struct tcp_congestion_ops data-structure - http://elixir.free-electrons.com/linux/latest/source/include/net/tcp.h#L989
static LIST_HEAD(tcp_cong_list); - Linked List - http://elixir.free-electrons.com/linux/latest/source/net/ipv4/tcp_cong.c#L20

Here are the LIST_HEAD(tcp_cong_list) linked list and the tcp_ca_find() API (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

static DEFINE_SPINLOCK(tcp_cong_list_lock);
static LIST_HEAD(tcp_cong_list);

/* Simple linear search, don't expect many entries! */
static struct tcp_congestion_ops *tcp_ca_find(const char *name)
{
	struct tcp_congestion_ops *e;

	list_for_each_entry_rcu(e, &tcp_cong_list, list) {
		if (strcmp(e->name, name) == 0)
			return e;
	}

	return NULL;
}

Here are the tcp_init_congestion_control() and tcp_reinit_congestion_control() APIs (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

void tcp_init_congestion_control(struct sock *sk)
{
	const struct inet_connection_sock *icsk = inet_csk(sk);

	tcp_sk(sk)->prior_ssthresh = 0;
	if (icsk->icsk_ca_ops->init)
		icsk->icsk_ca_ops->init(sk);
	if (tcp_ca_needs_ecn(sk))
		INET_ECN_xmit(sk);
	else
		INET_ECN_dontxmit(sk);
}

static void tcp_reinit_congestion_control(struct sock *sk,
					  const struct tcp_congestion_ops *ca)
{
	struct inet_connection_sock *icsk = inet_csk(sk);

	tcp_cleanup_congestion_control(sk);
	icsk->icsk_ca_ops = ca;
	icsk->icsk_ca_setsockopt = 1;
	memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));

	if (sk->sk_state != TCP_CLOSE)
		tcp_init_congestion_control(sk);
}
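
From user space, the per-socket algorithm is usually selected with the TCP_CONGESTION socket option described in tcp(7). That this option ends up in tcp_set_congestion_control() and tcp_reinit_congestion_control() is my reading of the 4.14 source and is not shown in the excerpt above. Here is a minimal sketch; the helper names cc_set() and cc_get() and the CC_NAME_MAX constant are hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define CC_NAME_MAX 16 /* assumed to match TCP_CA_NAME_MAX in the kernel */

/* Ask the kernel to switch this socket to the named algorithm.
 * Returns 0 on success, -1 with errno set otherwise (for example when
 * the algorithm is unknown or restricted for unprivileged users). */
int cc_set(int fd, const char *name)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name,
			  (socklen_t)strlen(name));
}

/* Read back the name of the algorithm currently attached to the socket. */
int cc_get(int fd, char name[CC_NAME_MAX])
{
	socklen_t len = CC_NAME_MAX;

	memset(name, 0, CC_NAME_MAX);
	return getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name, &len);
}
```

A typical use is to create a TCP socket, call cc_set(fd, "reno") or cc_set(fd, "cubic"), and confirm the result with cc_get(). Which names are accepted depends on the modules available and on the allowed list of the running system.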

Here are the tcp_set_default_congestion_control(), tcp_get_available_congestion_control(), tcp_get_default_congestion_control() and tcp_set_allowed_congestion_control() APIs (/net/ipv4/tcp_cong.c) from the Kernel-source version 4.14 for quick reference:

/* Used by sysctl to change default congestion control */
int tcp_set_default_congestion_control(const char *name)
{
	struct tcp_congestion_ops *ca;
	int ret = -ENOENT;

	spin_lock(&tcp_cong_list_lock);
	ca = tcp_ca_find(name);
#ifdef CONFIG_MODULES
	if (!ca && capable(CAP_NET_ADMIN)) {
		spin_unlock(&tcp_cong_list_lock);

		request_module("tcp_%s", name);
		spin_lock(&tcp_cong_list_lock);
		ca = tcp_ca_find(name);
	}
#endif

	if (ca) {
		ca->flags |= TCP_CONG_NON_RESTRICTED;	/* default is always allowed */
		list_move(&ca->list, &tcp_cong_list);
		ret = 0;
	}
	spin_unlock(&tcp_cong_list_lock);

	return ret;
}

/* Set default value from kernel configuration at bootup */
static int __init tcp_congestion_default(void)
{
	return tcp_set_default_congestion_control(CONFIG_DEFAULT_TCP_CONG);
}
late_initcall(tcp_congestion_default);

/* Build string with list of available congestion control values */
void tcp_get_available_congestion_control(char *buf, size_t maxlen)
{
	struct tcp_congestion_ops *ca;
	size_t offs = 0;

	rcu_read_lock();
	list_for_each_entry_rcu(ca, &tcp_cong_list, list) {
		offs += snprintf(buf + offs, maxlen - offs,
				 "%s%s",
				 offs == 0 ? "" : " ", ca->name);
	}
	rcu_read_unlock();
}

/* Get current default congestion control */
void tcp_get_default_congestion_control(char *name)
{
	struct tcp_congestion_ops *ca;
	/* We will always have reno... */
	BUG_ON(list_empty(&tcp_cong_list));

	rcu_read_lock();
	ca = list_entry(tcp_cong_list.next, struct tcp_congestion_ops, list);
	strncpy(name, ca->name, TCP_CA_NAME_MAX);
	rcu_read_unlock();
}

/* Change list of non-restricted congestion control */
int tcp_set_allowed_congestion_control(char *val)
{
	struct tcp_congestion_ops *ca;
	char *saved_clone, *clone, *name;
	int ret = 0;

	saved_clone = clone = kstrdup(val, GFP_USER);
	if (!clone)
		return -ENOMEM;

	spin_lock(&tcp_cong_list_lock);
	/* pass 1 check for bad entries */
	while ((name = strsep(&clone, " ")) && *name) {
		ca = tcp_ca_find(name);
		if (!ca) {
			ret = -ENOENT;
			goto out;
		}
	}

	/* pass 2 clear old values */
	list_for_each_entry_rcu(ca, &tcp_cong_list, list)
		ca->flags &= ~TCP_CONG_NON_RESTRICTED;

	/* pass 3 mark as allowed */
	while ((name = strsep(&val, " ")) && *name) {
		ca = tcp_ca_find(name);
		WARN_ON(!ca);
		if (ca)
			ca->flags |= TCP_CONG_NON_RESTRICTED;
	}
out:
	spin_unlock(&tcp_cong_list_lock);
	kfree(saved_clone);

	return ret;
}
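
The three passes in tcp_set_allowed_congestion_control() above make the update all-or-nothing: if any name in the list is unknown, no flag is touched. Here is a minimal user-space sketch of that idea. The names struct mini_algo, walk_list() and mini_set_allowed() are hypothetical, the table is a plain array instead of the kernel list, and the parser skips repeated spaces, which is a simplification compared with the strsep() loop in the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: an algorithm name and its "allowed" flag. */
struct mini_algo {
	const char *name;
	int allowed;
};

/* Look up a token of length len that is not NUL-terminated. */
static struct mini_algo *find_token(struct mini_algo *t, size_t n,
				    const char *s, size_t len)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strlen(t[i].name) == len && memcmp(t[i].name, s, len) == 0)
			return &t[i];
	return NULL;
}

/* Walk the space-separated list. With apply == 0 only validate the
 * names; with apply == 1 also mark each named entry as allowed. */
static int walk_list(struct mini_algo *t, size_t n, const char *list,
		     int apply)
{
	const char *p = list;

	while (*p) {
		struct mini_algo *a;
		size_t len;

		if (*p == ' ') {
			p++;
			continue;
		}
		len = strcspn(p, " ");
		a = find_token(t, n, p, len);
		if (!a)
			return -ENOENT;
		if (apply)
			a->allowed = 1;
		p += len;
	}
	return 0;
}

int mini_set_allowed(struct mini_algo *t, size_t n, const char *list)
{
	size_t i;
	int ret;

	/* pass 1: reject the whole list if any name is unknown */
	ret = walk_list(t, n, list, 0);
	if (ret)
		return ret;

	/* pass 2: clear old values */
	for (i = 0; i < n; i++)
		t[i].allowed = 0;

	/* pass 3: mark as allowed */
	return walk_list(t, n, list, 1);
}
```

Because the first pass never modifies the table, a list such as "reno bogus" fails with -ENOENT and leaves the previous allowed set exactly as it was.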


