VIDEOS 》 Linux Kernel struct socket and struct sock data-structure
For more details refer Linux Kernel Source:
https://wiki.linuxfoundation.org/images/1/1c/Network_data_flow_through_kernel.png
struct socket - http://elixir.free-electrons.com/linux/latest/source/include/linux/net.h
struct sock - http://elixir.free-electrons.com/linux/latest/source/include/net/sock.h
struct socket implementation and APIs - http://elixir.free-electrons.com/linux/v4.13.10/source/net/socket.c
struct sock implementation and APIs - http://elixir.free-electrons.com/linux/v4.13.10/source/net/core/sock.c
Big-picture Linux Kernel Networking subsystem - http://the-linux-channel.the-toffee-project.org/index.php?page=45-videos-linux-kernel-networking-sub-system
Socket User-space/user-side:
TUTORIALS :: UDP sample socket code for Systems and Network software developers - http://the-linux-channel.the-toffee-project.org/index.php?page=2-tutorials-udp-sample-socket-code-for-systems-and-network-software-developers
And here is the copy paste of struct socket data-structure
(/include/linux/net.h)
from the Kernel-source version 4.13 for quick reference:
/** * struct socket - general BSD socket * @state: socket state (%SS_CONNECTED, etc) * @type: socket type (%SOCK_STREAM, etc) * @flags: socket flags (%SOCK_NOSPACE, etc) * @ops: protocol specific socket operations * @file: File back pointer for gc * @sk: internal networking protocol agnostic socket representation * @wq: wait queue for several uses */ struct socket { socket_state state; kmemcheck_bitfield_begin(type); short type; kmemcheck_bitfield_end(type); unsigned long flags; struct socket_wq __rcu *wq; struct file *file; struct sock *sk; const struct proto_ops *ops; };
And here is the copy paste of sample struct socket data-structure API in
(/source/net/socket.c (struct socket implementation and APIs))
from the Kernel-source version 4.13 for quick reference:
/** * sock_alloc - allocate a socket * * Allocate a new inode and socket object. The two are bound together * and initialised. The socket is then returned. If we are out of inodes * NULL is returned. */ struct socket *sock_alloc(void) { struct inode *inode; struct socket *sock; inode = new_inode_pseudo(sock_mnt->mnt_sb); if (!inode) return NULL; sock = SOCKET_I(inode); kmemcheck_annotate_bitfield(sock, type); inode->i_ino = get_next_ino(); inode->i_mode = S_IFSOCK | S_IRWXUGO; inode->i_uid = current_fsuid(); inode->i_gid = current_fsgid(); inode->i_op = &sockfs_inode_ops; this_cpu_add(sockets_in_use, 1); return sock; } EXPORT_SYMBOL(sock_alloc);
And here is the copy paste of struct sock data-structure
(/include/net/sock.h)
from the Kernel-source version 4.13 for quick reference:
/** * struct sock - network layer representation of sockets * @__sk_common: shared layout with inet_timewait_sock * @sk_shutdown: mask of %SEND_SHUTDOWN and/or %RCV_SHUTDOWN * @sk_userlocks: %SO_SNDBUF and %SO_RCVBUF settings * @sk_lock: synchronizer * @sk_kern_sock: True if sock is using kernel lock classes * @sk_rcvbuf: size of receive buffer in bytes * @sk_wq: sock wait queue and async head * @sk_rx_dst: receive input route used by early demux * @sk_dst_cache: destination cache * @sk_dst_pending_confirm: need to confirm neighbour * @sk_policy: flow policy * @sk_receive_queue: incoming packets * @sk_wmem_alloc: transmit queue bytes committed * @sk_tsq_flags: TCP Small Queues flags * @sk_write_queue: Packet sending queue * @sk_omem_alloc: "o" is "option" or "other" * @sk_wmem_queued: persistent queue size * @sk_forward_alloc: space allocated forward * @sk_napi_id: id of the last napi context to receive data for sk * @sk_ll_usec: usecs to busypoll when there is no data * @sk_allocation: allocation mode * @sk_pacing_rate: Pacing rate (if supported by transport/packet scheduler) * @sk_pacing_status: Pacing status (requested, handled by sch_fq) * @sk_max_pacing_rate: Maximum pacing rate (%SO_MAX_PACING_RATE) * @sk_sndbuf: size of send buffer in bytes * @__sk_flags_offset: empty field used to determine location of bitfield * @sk_padding: unused element for alignment * @sk_no_check_tx: %SO_NO_CHECK setting, set checksum in TX packets * @sk_no_check_rx: allow zero checksum in RX packets * @sk_route_caps: route capabilities (e.g. %NETIF_F_TSO) * @sk_route_nocaps: forbidden route capabilities (e.g NETIF_F_GSO_MASK) * @sk_gso_type: GSO type (e.g. %SKB_GSO_TCPV4) * @sk_gso_max_size: Maximum GSO segment size to build * @sk_gso_max_segs: Maximum number of GSO segments * @sk_lingertime: %SO_LINGER l_linger setting * @sk_backlog: always used with the per-socket spinlock held * @sk_callback_lock: used with the callbacks in the end of this struct * @sk_error_queue: rarely used * @sk_prot_creator: sk_prot of original sock creator (see ipv6_setsockopt, * IPV6_ADDRFORM for instance) * @sk_err: last error * @sk_err_soft: errors that don't cause failure but are the cause of a * persistent failure not just 'timed out' * @sk_drops: raw/udp drops counter * @sk_ack_backlog: current listen backlog * @sk_max_ack_backlog: listen backlog set in listen() * @sk_uid: user id of owner * @sk_priority: %SO_PRIORITY setting * @sk_type: socket type (%SOCK_STREAM, etc) * @sk_protocol: which protocol this socket belongs in this network family * @sk_peer_pid: &struct pid for this socket's peer * @sk_peer_cred: %SO_PEERCRED setting * @sk_rcvlowat: %SO_RCVLOWAT setting * @sk_rcvtimeo: %SO_RCVTIMEO setting * @sk_sndtimeo: %SO_SNDTIMEO setting * @sk_txhash: computed flow hash for use on transmit * @sk_filter: socket filtering instructions * @sk_timer: sock cleanup timer * @sk_stamp: time stamp of last packet received * @sk_tsflags: SO_TIMESTAMPING socket options * @sk_tskey: counter to disambiguate concurrent tstamp requests * @sk_socket: Identd and reporting IO signals * @sk_user_data: RPC layer private data * @sk_frag: cached page frag * @sk_peek_off: current peek_offset value * @sk_send_head: front of stuff to transmit * @sk_security: used by security modules * @sk_mark: generic packet mark * @sk_cgrp_data: cgroup data for this cgroup * @sk_memcg: this socket's memory cgroup association * @sk_write_pending: a write to stream socket waits to start * @sk_state_change: callback to indicate change in the state of the sock * @sk_data_ready: callback to indicate there is data to be processed * @sk_write_space: callback to indicate there is bf sending space available * @sk_error_report: callback to indicate errors (e.g. %MSG_ERRQUEUE) * @sk_backlog_rcv: callback to process the backlog * @sk_destruct: called at sock freeing time, i.e. when all refcnt == 0 * @sk_reuseport_cb: reuseport group container * @sk_rcu: used during RCU grace period */ struct sock { /* * Now struct inet_timewait_sock also uses sock_common, so please just * don't add nothing before this first member (__sk_common) --acme */ struct sock_common __sk_common; #define sk_node __sk_common.skc_node #define sk_nulls_node __sk_common.skc_nulls_node #define sk_refcnt __sk_common.skc_refcnt #define sk_tx_queue_mapping __sk_common.skc_tx_queue_mapping #define sk_dontcopy_begin __sk_common.skc_dontcopy_begin #define sk_dontcopy_end __sk_common.skc_dontcopy_end #define sk_hash __sk_common.skc_hash #define sk_portpair __sk_common.skc_portpair #define sk_num __sk_common.skc_num #define sk_dport __sk_common.skc_dport #define sk_addrpair __sk_common.skc_addrpair #define sk_daddr __sk_common.skc_daddr #define sk_rcv_saddr __sk_common.skc_rcv_saddr #define sk_family __sk_common.skc_family #define sk_state __sk_common.skc_state #define sk_reuse __sk_common.skc_reuse #define sk_reuseport __sk_common.skc_reuseport #define sk_ipv6only __sk_common.skc_ipv6only #define sk_net_refcnt __sk_common.skc_net_refcnt #define sk_bound_dev_if __sk_common.skc_bound_dev_if #define sk_bind_node __sk_common.skc_bind_node #define sk_prot __sk_common.skc_prot #define sk_net __sk_common.skc_net #define sk_v6_daddr __sk_common.skc_v6_daddr #define sk_v6_rcv_saddr __sk_common.skc_v6_rcv_saddr #define sk_cookie __sk_common.skc_cookie #define sk_incoming_cpu __sk_common.skc_incoming_cpu #define sk_flags __sk_common.skc_flags #define sk_rxhash __sk_common.skc_rxhash socket_lock_t sk_lock; atomic_t sk_drops; int sk_rcvlowat; struct sk_buff_head sk_error_queue; struct sk_buff_head sk_receive_queue; /* * The backlog queue is special, it is always used with * the per-socket spinlock held and requires low latency * access. Therefore we special case it's implementation. * Note : rmem_alloc is in this structure to fill a hole * on 64bit arches, not because its logically part of * backlog. */ struct { atomic_t rmem_alloc; int len; struct sk_buff *head; struct sk_buff *tail; } sk_backlog; #define sk_rmem_alloc sk_backlog.rmem_alloc int sk_forward_alloc; #ifdef CONFIG_NET_RX_BUSY_POLL unsigned int sk_ll_usec; /* ===== mostly read cache line ===== */ unsigned int sk_napi_id; #endif int sk_rcvbuf; struct sk_filter __rcu *sk_filter; union { struct socket_wq __rcu *sk_wq; struct socket_wq *sk_wq_raw; }; #ifdef CONFIG_XFRM struct xfrm_policy __rcu *sk_policy[2]; #endif struct dst_entry *sk_rx_dst; struct dst_entry __rcu *sk_dst_cache; atomic_t sk_omem_alloc; int sk_sndbuf; /* ===== cache line for TX ===== */ int sk_wmem_queued; refcount_t sk_wmem_alloc; unsigned long sk_tsq_flags; struct sk_buff *sk_send_head; struct sk_buff_head sk_write_queue; __s32 sk_peek_off; int sk_write_pending; __u32 sk_dst_pending_confirm; u32 sk_pacing_status; /* see enum sk_pacing */ long sk_sndtimeo; struct timer_list sk_timer; __u32 sk_priority; __u32 sk_mark; u32 sk_pacing_rate; /* bytes per second */ u32 sk_max_pacing_rate; struct page_frag sk_frag; netdev_features_t sk_route_caps; netdev_features_t sk_route_nocaps; int sk_gso_type; unsigned int sk_gso_max_size; gfp_t sk_allocation; __u32 sk_txhash; /* * Because of non atomicity rules, all * changes are protected by socket lock. */ unsigned int __sk_flags_offset[0]; #ifdef __BIG_ENDIAN_BITFIELD #define SK_FL_PROTO_SHIFT 16 #define SK_FL_PROTO_MASK 0x00ff0000 #define SK_FL_TYPE_SHIFT 0 #define SK_FL_TYPE_MASK 0x0000ffff #else #define SK_FL_PROTO_SHIFT 8 #define SK_FL_PROTO_MASK 0x0000ff00 #define SK_FL_TYPE_SHIFT 16 #define SK_FL_TYPE_MASK 0xffff0000 #endif kmemcheck_bitfield_begin(flags); unsigned int sk_padding : 1, sk_kern_sock : 1, sk_no_check_tx : 1, sk_no_check_rx : 1, sk_userlocks : 4, sk_protocol : 8, sk_type : 16; #define SK_PROTOCOL_MAX U8_MAX kmemcheck_bitfield_end(flags); u16 sk_gso_max_segs; unsigned long sk_lingertime; struct proto *sk_prot_creator; rwlock_t sk_callback_lock; int sk_err, sk_err_soft; u32 sk_ack_backlog; u32 sk_max_ack_backlog; kuid_t sk_uid; struct pid *sk_peer_pid; const struct cred *sk_peer_cred; long sk_rcvtimeo; ktime_t sk_stamp; u16 sk_tsflags; u8 sk_shutdown; u32 sk_tskey; struct socket *sk_socket; void *sk_user_data; #ifdef CONFIG_SECURITY void *sk_security; #endif struct sock_cgroup_data sk_cgrp_data; struct mem_cgroup *sk_memcg; void (*sk_state_change)(struct sock *sk); void (*sk_data_ready)(struct sock *sk); void (*sk_write_space)(struct sock *sk); void (*sk_error_report)(struct sock *sk); int (*sk_backlog_rcv)(struct sock *sk, struct sk_buff *skb); void (*sk_destruct)(struct sock *sk); struct sock_reuseport __rcu *sk_reuseport_cb; struct rcu_head sk_rcu; }; enum sk_pacing { SK_PACING_NONE = 0, SK_PACING_NEEDED = 1, SK_PACING_FQ = 2, };
Here is the Network data/API flow through kernel for quick reference:
Source: https://wiki.linuxfoundation.org/images/1/1c/Network_data_flow_through_kernel.png
For more details refer Linux Kernel Source:
struct socket - http://elixir.free-electrons.com/linux/latest/source/include/linux/net.h
struct socket implementation and APIs - http://elixir.free-electrons.com/linux/v4.13.10/source/net/socket.c
sock_sendmsg() - http://elixir.free-electrons.com/linux/v4.13.10/source/net/socket.c#L638
sock_recvmsg() - http://elixir.free-electrons.com/linux/v4.13.10/source/net/socket.c#L795
----
udp_sendmsg(), udp_recvmsg() - http://elixir.free-electrons.com/linux/v4.13.10/source/net/ipv4/udp.c#L2548
int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int noblock, int flags, int *addr_len) - http://elixir.free-electrons.com/linux/v4.13.10/source/net/ipv4/udp.c#L1564
int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) - http://elixir.free-electrons.com/linux/v4.13.10/source/net/ipv4/udp.c#L862
----
tcp_sendmsg(), tcp_recvmsg() - http://elixir.free-electrons.com/linux/v4.13.10/source/net/ipv4/tcp_ipv4.c#L2440
int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock, int flags, int *addr_len) - http://elixir.free-electrons.com/linux/v4.13.10/source/net/ipv4/tcp.c#L1663
int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) - http://elixir.free-electrons.com/linux/v4.13.10/source/net/ipv4/tcp.c#L1147
----
sctp_sendmsg(), sctp_recvmsg() - http://elixir.free-electrons.com/linux/v4.13.10/source/net/sctp/socket.c#L8229
static int sctp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int noblock, int flags, int *addr_len) - http://elixir.free-electrons.com/linux/v4.13.10/source/net/sctp/socket.c#L2080
static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len) - http://elixir.free-electrons.com/linux/v4.13.10/source/net/sctp/socket.c#L1598
And here is the copy paste of sock_sendmsg()
(/net/socket.c)
from the Kernel-source version 4.13 for quick reference:
static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg) { int ret = sock->ops->sendmsg(sock, msg, msg_data_left(msg)); BUG_ON(ret == -EIOCBQUEUED); return ret; } int sock_sendmsg(struct socket *sock, struct msghdr *msg) { int err = security_socket_sendmsg(sock, msg, msg_data_left(msg)); return err ?: sock_sendmsg_nosec(sock, msg); } EXPORT_SYMBOL(sock_sendmsg);
And here is the copy paste of sock_recvmsg()
(/net/socket.c)
from the Kernel-source version 4.13 for quick reference:
static inline int sock_recvmsg_nosec(struct socket *sock, struct msghdr *msg, int flags) { return sock->ops->recvmsg(sock, msg, msg_data_left(msg), flags); } int sock_recvmsg(struct socket *sock, struct msghdr *msg, int flags) { int err = security_socket_recvmsg(sock, msg, msg_data_left(msg), flags); return err ?: sock_recvmsg_nosec(sock, msg, flags); } EXPORT_SYMBOL(sock_recvmsg);
Suggested Topics:
Video Episodes :: Linux Kernel programming
Saturday' 13-Mar-2021
Wednesday' 18-May-2022
Wednesday' 18-May-2022
Saturday' 13-Mar-2021
Saturday' 13-Mar-2021
Saturday' 14-May-2022
Join The Linux Channel :: Facebook Group ↗
Visit The Linux Channel :: on Youtube ↗
💗 Help shape the future: Sponsor/Donate
Recommended Topics:
Featured Video:
Saturday' 13-Mar-2021
Trending Video:
Recommended Video: