<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/net/vmw_vsock/af_vsock.c, branch docs-mw</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-mw</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-mw'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2026-04-14T19:04:00+00:00</updated>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2026-04-14T19:04:00+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2026-04-14T18:54:21+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=35c2c39832e569449b9192fa1afbbc4c66227af7'/>
<id>urn:sha1:35c2c39832e569449b9192fa1afbbc4c66227af7</id>
<content type='text'>
Merge in late fixes in preparation for the net-next PR.

Conflicts:

include/net/sch_generic.h
  a6bd339dbb351 ("net_sched: fix skb memory leak in deferred qdisc drops")
  ff2998f29f390 ("net: sched: introduce qdisc-specific drop reason tracing")
https://lore.kernel.org/adz0iX85FHMz0HdO@sirena.org.uk

drivers/net/ethernet/airoha/airoha_eth.c
  1acdfbdb516b ("net: airoha: Fix VIP configuration for AN7583 SoC")
  bf3471e6e6c0 ("net: airoha: Make flow control source port mapping dependent on nbq parameter")

Adjacent changes:

drivers/net/ethernet/airoha/airoha_ppe.c
  f44218cd5e6a ("net: airoha: Reset PPE cpu port configuration in airoha_ppe_hw_init()")
  7da62262ec96 ("inet: add ip_local_port_step_width sysctl to improve port usage distribution")

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>vsock: fix buffer size clamping order</title>
<updated>2026-04-12T21:31:50+00:00</updated>
<author>
<name>Norbert Szetei</name>
<email>norbert@doyensec.com</email>
</author>
<published>2026-04-09T16:34:12+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=d114bfdc9b76bf93b881e195b7ec957c14227bab'/>
<id>urn:sha1:d114bfdc9b76bf93b881e195b7ec957c14227bab</id>
<content type='text'>
In vsock_update_buffer_size(), the buffer size was being clamped to the
maximum first, and then to the minimum. If a user sets a minimum buffer
size larger than the maximum, the minimum check overrides the maximum
check, inverting the constraint.

This breaks the intended socket memory boundaries by allowing the
vsk-&gt;buffer_size to grow beyond the configured vsk-&gt;buffer_max_size.

Fix this by checking the minimum first, and then the maximum. This
ensures the buffer size never exceeds the buffer_max_size.

Fixes: b9f2b0ffde0c ("vsock: handle buffer_size sockopts in the core")
Suggested-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: Norbert Szetei &lt;norbert@doyensec.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Link: https://patch.msgid.link/180118C5-8BCF-4A63-A305-4EE53A34AB9C@doyensec.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>vsock: avoid timeout for non-blocking accept() with empty backlog</title>
<updated>2026-04-07T01:29:01+00:00</updated>
<author>
<name>Laurence Rowe</name>
<email>laurencerowe@gmail.com</email>
</author>
<published>2026-04-02T20:49:18+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=98f28d8d6e5a5ed058dd37854c19e9b3bae72eff'/>
<id>urn:sha1:98f28d8d6e5a5ed058dd37854c19e9b3bae72eff</id>
<content type='text'>
A common pattern in epoll network servers is to eagerly accept all
pending connections from the non-blocking listening socket after
epoll_wait indicates the socket is ready by calling accept in a loop
until EAGAIN is returned indicating that the backlog is empty.

Scheduling a timeout for a non-blocking accept with an empty backlog
meant AF_VSOCK sockets used by epoll network servers incurred hundreds
of microseconds of additional latency per accept loop compared to
AF_INET or AF_UNIX sockets.

Signed-off-by: Laurence Rowe &lt;laurencerowe@gmail.com&gt;
Reviewed-by: Bobby Eshleman &lt;bobbyeshleman@meta.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Link: https://patch.msgid.link/20260402204918.130395-1-laurencerowe@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2026-04-02T18:03:13+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2026-04-02T17:57:09+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=8ffb33d7709b59ff60560f48960a73bd8a55be95'/>
<id>urn:sha1:8ffb33d7709b59ff60560f48960a73bd8a55be95</id>
<content type='text'>
Cross-merge networking fixes after downstream PR (net-7.0-rc7).

Conflicts:

net/vmw_vsock/af_vsock.c
  b18c83388874 ("vsock: initialize child_ns_mode_locked in vsock_net_init()")
  0de607dc4fd8 ("vsock: add G2H fallback for CIDs not owned by H2G transport")

Adjacent changes:

drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
  ceee35e5674a ("bnxt_en: Refactor some basic ring setup and adjustment logic")
  57cdfe0dc70b ("bnxt_en: Resize RSS contexts on channel count change")

drivers/net/wireless/intel/iwlwifi/mld/mac80211.c
  4d56037a02bd ("wifi: iwlwifi: mld: block EMLSR during TDLS connections")
  687a95d204e7 ("wifi: iwlwifi: mld: correctly set wifi generation data")

drivers/net/wireless/intel/iwlwifi/mld/scan.h
  b6045c899e37 ("wifi: iwlwifi: mld: Refactor scan command handling")
  ec66ec6a5a8f ("wifi: iwlwifi: mld: Fix MLO scan timing")

drivers/net/wireless/intel/iwlwifi/mvm/fw.c
  078df640ef05 ("wifi: iwlwifi: mld: add support for iwl_mcc_allowed_ap_type_cmd v
2")
  323156c3541e ("wifi: iwlwifi: mvm: don't send a 6E related command when not supported")

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>vsock: initialize child_ns_mode_locked in vsock_net_init()</title>
<updated>2026-04-02T15:18:56+00:00</updated>
<author>
<name>Stefano Garzarella</name>
<email>sgarzare@redhat.com</email>
</author>
<published>2026-04-01T09:21:53+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=b18c833888742ca9de80c250f9d40d0e97caa9f6'/>
<id>urn:sha1:b18c833888742ca9de80c250f9d40d0e97caa9f6</id>
<content type='text'>
The `child_ns_mode_locked` field lives in `struct net`, which persists
across vsock module reloads. When the module is unloaded and reloaded,
`vsock_net_init()` resets `mode` and `child_ns_mode` back to their
default values, but does not reset `child_ns_mode_locked`.

The stale lock from the previous module load causes subsequent writes
to `child_ns_mode` to silently fail: `vsock_net_set_child_mode()` sees
the old lock, skips updating the actual value, and returns success
when the requested mode matches the stale lock. The sysctl handler
reports no error, but `child_ns_mode` remains unchanged.

Steps to reproduce:
    $ modprobe vsock
    $ echo local &gt; /proc/sys/net/vsock/child_ns_mode
    $ cat /proc/sys/net/vsock/child_ns_mode
    local
    $ modprobe -r vsock
    $ modprobe vsock
    $ echo local &gt; /proc/sys/net/vsock/child_ns_mode
    $ cat /proc/sys/net/vsock/child_ns_mode
    global    &lt;--- expected "local"

Fix this by initializing `child_ns_mode_locked` to 0 (unlocked) in
`vsock_net_init()`, so the write-once mechanism works correctly after
module reload.

Fixes: 102eab95f025 ("vsock: lock down child_ns_mode as write-once")
Reported-by: Jin Liu &lt;jinl@redhat.com&gt;
Signed-off-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Reviewed-by: Bobby Eshleman &lt;bobbyeshleman@meta.com&gt;
Link: https://patch.msgid.link/20260401092153.28462-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>vsock: add G2H fallback for CIDs not owned by H2G transport</title>
<updated>2026-03-12T09:59:36+00:00</updated>
<author>
<name>Alexander Graf</name>
<email>graf@amazon.com</email>
</author>
<published>2026-03-04T23:00:27+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=0de607dc4fd80ede3b2a35e8a72f99c7a0bbc321'/>
<id>urn:sha1:0de607dc4fd80ede3b2a35e8a72f99c7a0bbc321</id>
<content type='text'>
When no H2G transport is loaded, vsock currently routes all CIDs to the
G2H transport (commit 65b422d9b61b ("vsock: forward all packets to the
host when no H2G is registered"). Extend that existing behavior: when
an H2G transport is loaded but does not claim a given CID, the
connection falls back to G2H in the same way.

This matters in environments like Nitro Enclaves, where an instance may
run nested VMs via vhost-vsock (H2G) while also needing to reach sibling
enclaves at higher CIDs through virtio-vsock-pci (G2H). With the old
code, any CID &gt; 2 was unconditionally routed to H2G when vhost was
loaded, making those enclaves unreachable without setting
VMADDR_FLAG_TO_HOST explicitly on every connect.

Requiring every application to set VMADDR_FLAG_TO_HOST creates friction:
tools like socat, iperf, and others would all need to learn about it.
The flag was introduced 6 years ago and I am still not aware of any tool
that supports it. Even if there was support, it would be cumbersome to
use. The most natural experience is a single CID address space where H2G
only wins for CIDs it actually owns, and everything else falls through to
G2H, extending the behavior that already exists when H2G is absent.

To give user space at least a hint that the kernel applied this logic,
automatically set the VMADDR_FLAG_TO_HOST on the remote address so it
can determine the path taken via getpeername().

Add a per-network namespace sysctl net.vsock.g2h_fallback (default 1).
At 0 it forces strict routing: H2G always wins for CID &gt; VMADDR_CID_HOST,
or ENODEV if H2G is not loaded.

Signed-off-by: Alexander Graf &lt;graf@amazon.com&gt;
Tested-by: syzbot@syzkaller.appspotmail.com
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Link: https://patch.msgid.link/20260304230027.59857-1-graf@amazon.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
</entry>
<entry>
<title>net: remove addr_len argument of recvmsg() handlers</title>
<updated>2026-03-03T02:17:17+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-02-27T15:11:20+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=8341c989ac77d712c7d6e2bce29e8a4bcb2eeae4'/>
<id>urn:sha1:8341c989ac77d712c7d6e2bce29e8a4bcb2eeae4</id>
<content type='text'>
Use msg-&gt;msg_namelen as a place holder instead of a
temporary variable, notably in inet[6]_recvmsg().

This removes stack canaries and allows tail-calls.

$ scripts/bloat-o-meter -t vmlinux.old vmlinux
add/remove: 0/0 grow/shrink: 2/19 up/down: 26/-532 (-506)
Function                                     old     new   delta
rawv6_recvmsg                                744     767     +23
vsock_dgram_recvmsg                           55      58      +3
vsock_connectible_recvmsg                     50      47      -3
unix_stream_recvmsg                          161     158      -3
unix_seqpacket_recvmsg                        62      59      -3
unix_dgram_recvmsg                            42      39      -3
tcp_recvmsg                                  546     543      -3
mptcp_recvmsg                               1568    1565      -3
ping_recvmsg                                 806     800      -6
tcp_bpf_recvmsg_parser                       983     974      -9
ip_recv_error                                588     576     -12
ipv6_recv_rxpmtu                             442     428     -14
udp_recvmsg                                 1243    1224     -19
ipv6_recv_error                             1046    1024     -22
udpv6_recvmsg                               1487    1461     -26
raw_recvmsg                                  465     437     -28
udp_bpf_recvmsg                             1027     984     -43
sock_common_recvmsg                          103      27     -76
inet_recvmsg                                 257     175     -82
inet6_recvmsg                                257     175     -82
tcp_bpf_recvmsg                              663     568     -95
Total: Before=25143834, After=25143328, chg -0.00%

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Link: https://patch.msgid.link/20260227151120.1346573-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>vsock: lock down child_ns_mode as write-once</title>
<updated>2026-02-26T10:10:03+00:00</updated>
<author>
<name>Bobby Eshleman</name>
<email>bobbyeshleman@meta.com</email>
</author>
<published>2026-02-23T22:38:33+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=102eab95f025b4d3f3a6c0a858400aca2af2fe52'/>
<id>urn:sha1:102eab95f025b4d3f3a6c0a858400aca2af2fe52</id>
<content type='text'>
Two administrator processes may race when setting child_ns_mode as one
process sets child_ns_mode to "local" and then creates a namespace, but
another process changes child_ns_mode to "global" between the write and
the namespace creation. The first process ends up with a namespace in
"global" mode instead of "local". While this can be detected after the
fact by reading ns_mode and retrying, it is fragile and error-prone.

Make child_ns_mode write-once so that a namespace manager can set it
once and be sure it won't change. Writing a different value after the
first write returns -EBUSY. This applies to all namespaces, including
init_net, where an init process can write "local" to lock all future
namespaces into local mode.

Fixes: eafb64f40ca4 ("vsock: add netns to vsock core")
Suggested-by: Daan De Meyer &lt;daan.j.demeyer@gmail.com&gt;
Suggested-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Co-developed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: Bobby Eshleman &lt;bobbyeshleman@meta.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Link: https://patch.msgid.link/20260223-vsock-ns-write-once-v3-2-c0cde6959923@meta.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
</entry>
<entry>
<title>vsock: Use container_of() to get net namespace in sysctl handlers</title>
<updated>2026-02-26T02:59:18+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2026-02-23T17:32:18+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=5cc619583c7e735c4fb801bede671fb6f9c79425'/>
<id>urn:sha1:5cc619583c7e735c4fb801bede671fb6f9c79425</id>
<content type='text'>
current-&gt;nsproxy is should not be accessed directly as syzbot has found
that it could be NULL at times, causing crashes.  Fix up the af_vsock
sysctl handlers to use container_of() to deal with the current net
namespace instead of attempting to rely on current.

This is the same type of change done in commit 7f5611cbc487 ("rds:
sysctl: rds_tcp_{rcv,snd}buf: avoid using current-&gt;nsproxy")

Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Bobby Eshleman &lt;bobbyeshleman@meta.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Fixes: eafb64f40ca4 ("vsock: add netns to vsock core")
Link: https://patch.msgid.link/2026022318-rearview-gallery-ae13@gregkh
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>vsock: prevent child netns mode switch from local to global</title>
<updated>2026-02-13T20:28:38+00:00</updated>
<author>
<name>Stefano Garzarella</name>
<email>sgarzare@redhat.com</email>
</author>
<published>2026-02-12T20:59:16+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=6a997f38bdf822d4c5cc10b445ff1cb26872580a'/>
<id>urn:sha1:6a997f38bdf822d4c5cc10b445ff1cb26872580a</id>
<content type='text'>
A "local" namespace can change its `child_ns_mode` sysctl to "global",
allowing nested namespaces to access global CIDs. This can be exploited
by an unprivileged user who gained CAP_NET_ADMIN through a user
namespace.

Prevent this by rejecting writes that attempt to set `child_ns_mode` to
"global" when the current namespace's mode is "local".

Fixes: eafb64f40ca4 ("vsock: add netns to vsock core")
Cc: bobbyeshleman@meta.com
Signed-off-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Reviewed-by: Bobby Eshleman &lt;bobbyeshleman@meta.com&gt;
Link: https://patch.msgid.link/20260212205916.97533-3-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
