| author | Linus Torvalds <torvalds@linux-foundation.org> | 2026-04-14 18:36:10 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2026-04-14 18:36:10 -0700 |
| commit | 91a4855d6c03e770e42f17c798a36a3c46e63de2 | |
| tree | 5103bfe3aea2aab7e8b358c5c9329539508f648d /net/ipv6 | |
| parent | f5ad4101009e7f5f5984ffea6923d4fcd470932a | |
| parent | 35c2c39832e569449b9192fa1afbbc4c66227af7 | |
Merge tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Support HW queue leasing, allowing containers to be granted access
to HW queues for zero-copy operations and AF_XDP
- Number of code moves to help the compiler with inlining. Avoid
output arguments for returning drop reason where possible
- Rework drop handling within qdiscs to include more metadata about
the reason and dropping qdisc in the tracepoints
- Remove the rtnl_lock use from IP Multicast Routing
- Pack size information into the Rx Flow Steering table pointer
itself. This allows making the table itself a flat array of u32s,
thus making the table allocation size a power of two
- Report TCP delayed ack timer information via socket diag
- Add ip_local_port_step_width sysctl to allow distributing the
randomly selected ports more evenly throughout the allowed space
- Add support for per-route tunsrc in IPv6 segment routing
- Start work of switching sockopt handling to iov_iter
- Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid
buffer size drifting up
- Support MSG_EOR in MPTCP
- Add stp_mode attribute to the bridge driver for STP mode selection.
This addresses concerns about call_usermodehelper() usage
- Remove UDP-Lite support (as announced in 2023)
- Remove support for building IPv6 as a module. Remove the now
unnecessary function calling indirection
Cross-tree stuff:
- Move Michael MIC code from generic crypto into wireless, it's
considered insecure but some WiFi networks still need it
Netfilter:
- Switch nft_fib_ipv6 module to no longer need temporary dst_entry
object allocations by using fib6_lookup() + RCU.
Florian W reports this gets us ~13% higher packet rate
- Convert IPVS's global __ip_vs_mutex to per-net service_mutex and
switch the service tables to be per-net. Convert some code that
walks the service lists to use RCU instead of the service_mutex
- Add more opinionated input validation to lower security exposure
- Make IPVS hash tables to be per-netns and resizable
Wireless:
- Finished assoc frame encryption/EPPKE/802.1X-over-auth
- Radar detection improvements
- Add 6 GHz incumbent signal detection APIs
- Multi-link support for FILS, probe response templates and client
probing
- New APIs and mac80211 support for NAN (Neighbor Aware Networking,
aka Wi-Fi Aware) so less work must be in firmware
Driver API:
- Add numerical ID for devlink instances (to avoid having to create
fake bus/device pairs just to have an ID). Support shared devlink
instances which span multiple PFs
- Add standard counters for reporting pause storm events (implement
in mlx5 and fbnic)
- Add configuration API for completion writeback buffering (implement
in mana)
- Support driver-initiated change of RSS context sizes
- Support DPLL monitoring input frequency (implement in zl3073x)
- Support per-port resources in devlink (implement in mlx5)
Misc:
- Expand the YAML spec for Netfilter
Drivers
- Software:
- macvlan: support multicast rx for bridge ports with shared
source MAC address
- team: decouple receive and transmit enablement for IEEE 802.3ad
LACP "independent control"
- Ethernet high-speed NICs:
- nVidia/Mellanox:
- support high order pages in zero-copy mode (for payload
coalescing)
- support multiple packets in a page (for systems with 64kB
pages)
- Broadcom 25-400GE (bnxt):
- implement XDP RSS hash metadata extraction
- add software fallback for UDP GSO, lowering the IOMMU cost
- Broadcom 800GE (bnge):
- add link status and configuration handling
- add various HW and SW statistics
- Marvell/Cavium:
- NPC HW block support for cn20k
- Huawei (hinic3):
- add mailbox / control queue
- add rx VLAN offload
- add driver info and link management
- Ethernet NICs:
- Marvell/Aquantia:
- support reading SFP module info on some AQC100 cards
- Realtek PCI (r8169):
- add support for RTL8125cp
- Realtek USB (r8152):
- support for the RTL8157 5Gbit chip
- add 2500baseT EEE status/configuration support
- Ethernet NICs embedded and off-the-shelf IP:
- Synopsys (stmmac):
- cleanup and reorganize SerDes handling and PCS support
- cleanup descriptor handling and per-platform data
- cleanup and consolidate MDIO defines and handling
- shrink driver memory use for internal structures
- improve Tx IRQ coalescing
- improve TCP segmentation handling
- add support for Spacemit K3
- Cadence (macb):
- support PHYs that have inband autoneg disabled with GEM
- support IEEE 802.3az EEE
- rework usrio capabilities and handling
- AMD (xgbe):
- improve power management for S0i3
- improve TX resilience for link-down handling
- Virtual:
- Google cloud vNIC:
- support larger ring sizes in DQO-QPL mode
- improve HW-GRO handling
- support UDP GSO for DQO format
- PCIe NTB:
- support queue count configuration
- Ethernet PHYs:
- automatically disable PHY autonomous EEE if MAC is in charge
- Broadcom:
- add BCM84891/BCM84892 support
- Micrel:
- support for LAN9645X internal PHY
- Realtek:
- add RTL8224 pair order support
- support PHY LEDs on RTL8211F-VD
- support spread spectrum clocking (SSC)
- Maxlinear:
- add PHY-level statistics via ethtool
- Ethernet switches:
- Maxlinear (mxl862xx):
- support for bridge offloading
- support for VLANs
- support driver statistics
- Bluetooth:
- large number of fixes and new device IDs
- Mediatek:
- support MT6639 (MT7927)
- support MT7902 SDIO
- WiFi:
- Intel (iwlwifi):
- UNII-9 and continuing UHR work
- MediaTek (mt76):
- mt7996/mt7925 MLO fixes/improvements
- mt7996 NPU support (HW eth/wifi traffic offload)
- Qualcomm (ath12k):
- monitor mode support on IPQ5332
- basic hwmon temperature reporting
- support IPQ5424
- Realtek:
- add USB RX aggregation to improve performance
- add USB TX flow control by tracking in-flight URBs
- Cellular:
- IPA v5.2 support"
* tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1561 commits)
net: pse-pd: fix kernel-doc function name for pse_control_find_by_id()
wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
wireguard: allowedips: remove redundant space
tools: ynl: add sample for wireguard
wireguard: allowedips: Use kfree_rcu() instead of call_rcu()
MAINTAINERS: Add netkit selftest files
selftests/net: Add additional test coverage in nk_qlease
selftests/net: Split netdevsim tests from HW tests in nk_qlease
tools/ynl: Make YnlFamily closeable as a context manager
net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
net: airoha: Fix VIP configuration for AN7583 SoC
net: caif: clear client service pointer on teardown
net: strparser: fix skb_head leak in strp_abort_strp()
net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete()
selftests/bpf: add test for xdp_master_redirect with bond not up
net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
net: airoha: Remove PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration
sctp: disable BH before calling udp_tunnel_xmit_skb()
sctp: fix missing encap_port propagation for GSO fragments
net: airoha: Rely on net_device pointer in ETS callbacks
...
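The pull message mentions removing "the now unnecessary function calling indirection" once IPv6 can no longer be built as a module. The following is a minimal userspace sketch of that idea, not kernel code: `struct v6_ops`, `active_ops`, `v6_module_init()`, `caller_via_stub()` and `caller_direct()` are hypothetical names, and the length check in `v6_route_input()` is purely illustrative. It assumes the old scheme worked roughly like an operations table that defaults to `-EAFNOSUPPORT` fallbacks and is swapped for the real table at module init.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operation table, standing in for struct ipv6_stub. */
struct v6_ops {
	int (*route_input)(int pkt_len);
};

/* Fallback used while the (modular) protocol is not loaded. */
static int eafnosupport_route_input(int pkt_len)
{
	(void)pkt_len;
	return -EAFNOSUPPORT;
}

/* The real implementation; the length check is only illustrative. */
int v6_route_input(int pkt_len)
{
	return pkt_len >= 40 ? 0 : -EINVAL;
}

static const struct v6_ops fallback_ops = {
	.route_input = eafnosupport_route_input,
};
static const struct v6_ops real_ops = {
	.route_input = v6_route_input,
};

/* Core code can only reach the protocol through this pointer. */
static const struct v6_ops *active_ops = &fallback_ops;

/* Old scheme, step 1: module init publishes the real table. */
void v6_module_init(void)
{
	active_ops = &real_ops;
}

/* Old scheme, step 2: every caller pays for an indirect call. */
int caller_via_stub(int pkt_len)
{
	assert(active_ops != NULL);
	return active_ops->route_input(pkt_len);
}

/* New scheme: the protocol is always built in, so call it directly. */
int caller_direct(int pkt_len)
{
	return v6_route_input(pkt_len);
}
```

Before `v6_module_init()` runs, `caller_via_stub()` returns `-EAFNOSUPPORT`; afterwards both callers reach the same function. With a built-in protocol the table and the publish step have no remaining purpose, which is presumably why the diff deletes `ipv6_stub`, `ipv6_stub_impl` and the `eafnosupport_*()` helpers.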
Diffstat (limited to 'net/ipv6')
43 files changed, 520 insertions, 950 deletions
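One of the changes in the diff below rewrites `inet6_ehashfn()` to feed the local address, foreign address and foreign port through three rounds of jhash mixing, and then adds the local port to the result instead of hashing it. The sketch below reproduces that structure in userspace C so the additive `lport` property can be checked. It is an illustration under stated assumptions: `ehash6_sketch()`, `JHASH_MIX` and `JHASH_FINAL` are hypothetical names, `secret` stands in for `tcp_ipv6_hash_secret`, `net_mix` stands in for `net_hash_mix(net)`, and the mixing steps are those of Bob Jenkins' lookup3 hash, which the kernel exposes as `__jhash_mix()` and `__jhash_final()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word left by k bits (0 < k < 32). */
static inline uint32_t rol32(uint32_t x, unsigned int k)
{
	return (x << k) | (x >> (32 - k));
}

/* lookup3 mixing step, applied after each full group of three words. */
#define JHASH_MIX(a, b, c) do {				\
	a -= c;  a ^= rol32(c, 4);  c += b;		\
	b -= a;  b ^= rol32(a, 6);  a += c;		\
	c -= b;  c ^= rol32(b, 8);  b += a;		\
	a -= c;  a ^= rol32(c, 16); c += b;		\
	b -= a;  b ^= rol32(a, 19); a += c;		\
	c -= b;  c ^= rol32(b, 4);  b += a;		\
} while (0)

/* lookup3 finalization step, applied after the last group of words. */
#define JHASH_FINAL(a, b, c) do {			\
	c ^= b; c -= rol32(b, 14);			\
	a ^= c; a -= rol32(c, 11);			\
	b ^= a; b -= rol32(a, 25);			\
	c ^= b; c -= rol32(b, 16);			\
	a ^= c; a -= rol32(c, 4);			\
	b ^= a; b -= rol32(a, 14);			\
	c ^= b; c -= rol32(b, 24);			\
} while (0)

/* Hypothetical stand-in for inet6_ehashfn(); see the lead-in for the
 * meaning of `secret` and `net_mix`.
 */
uint32_t ehash6_sketch(uint32_t secret, uint32_t net_mix,
		       const uint32_t laddr[4], uint16_t lport,
		       const uint32_t faddr[4], uint16_t fport)
{
	uint32_t a, b, c;

	assert(laddr != NULL && faddr != NULL);

	a = b = c = secret;

	/* Folding net_mix into laddr[0] keeps the input at 9 words. */
	a += laddr[0] ^ net_mix;
	b += laddr[1];
	c += laddr[2];
	JHASH_MIX(a, b, c);

	a += laddr[3];
	b += faddr[0];
	c += faddr[1];
	JHASH_MIX(a, b, c);

	a += faddr[2];
	b += faddr[3];
	c += fport;
	JHASH_FINAL(a, b, c);

	/* lport is added, not hashed: hash(lport) == hash(0) + lport. */
	return lport + c;
}
```

Because `lport` only enters through the final addition, a caller can compute the hash once with port 0 and derive the value for any candidate port by adding it, which is how `inet6_hash_connect()` uses `hash_port0` in the diff.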
diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index b8f9a8c0302e..c024aa77f25b 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -3,9 +3,8 @@
 # IPv6 configuration
 #
 
-# IPv6 as module will cause a CRASH if you try to unload it
 menuconfig IPV6
-	tristate "The IPv6 protocol"
+	bool "The IPv6 protocol"
 	default y
 	select CRYPTO_LIB_SHA1
 	help
@@ -17,9 +16,6 @@ menuconfig IPV6
 	  Documentation/networking/ipv6.rst and read the HOWTO at
 	  <https://www.tldp.org/HOWTO/Linux+IPv6-HOWTO/>
 
-	  To compile this protocol support as a module, choose M here: the
-	  module will be called ipv6.
-
 if IPV6
 
 config IPV6_ROUTER_PREF
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 0492f1a0b491..2c9ce2ccbde1 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -7,7 +7,7 @@ obj-$(CONFIG_IPV6) += ipv6.o
 
 ipv6-y := af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o \
 		addrlabel.o \
-		route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o udplite.o \
+		route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o \
 		raw.o icmp.o mcast.o reassembly.o tcp_ipv6.o ping.o \
 		exthdrs.o datagram.o ip6_flowlabel.o inet6_connection_sock.o \
 		udp_offload.o seg6.o fib6_notifier.o rpl.o ioam6.o
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 22c5cdffeae7..5476b6536eb7 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3586,15 +3586,15 @@ static int fixup_permanent_addr(struct net *net,
 		struct fib6_info *f6i, *prev;
 
 		f6i = addrconf_f6i_alloc(net, idev, &ifp->addr, false,
-					 GFP_ATOMIC, NULL);
+					 GFP_KERNEL, NULL);
 		if (IS_ERR(f6i))
 			return PTR_ERR(f6i);
 
 		/* ifp->rt can be accessed outside of rtnl */
-		spin_lock(&ifp->lock);
+		spin_lock_bh(&ifp->lock);
 		prev = ifp->rt;
 		ifp->rt = f6i;
-		spin_unlock(&ifp->lock);
+		spin_unlock_bh(&ifp->lock);
 
 		fib6_info_release(prev);
 	}
@@ -3602,7 +3602,7 @@ static int fixup_permanent_addr(struct net *net,
 	if (!(ifp->flags & IFA_F_NOPREFIXROUTE)) {
 		addrconf_prefix_route(&ifp->addr, ifp->prefix_len,
 				      ifp->rt_priority, idev->dev, 0, 0,
-				      GFP_ATOMIC);
+				      GFP_KERNEL);
 	}
 
 	if (ifp->state == INET6_IFADDR_STATE_PREDAD)
@@ -3613,29 +3613,36 @@ static int fixup_permanent_addr(struct net *net,
 
 static void addrconf_permanent_addr(struct net *net, struct net_device *dev)
 {
-	struct inet6_ifaddr *ifp, *tmp;
+	struct inet6_ifaddr *ifp;
+	LIST_HEAD(tmp_addr_list);
 	struct inet6_dev *idev;
 
+	/* Mutual exclusion with other if_list_aux users. */
+	ASSERT_RTNL();
+
 	idev = __in6_dev_get(dev);
 	if (!idev)
 		return;
 
 	write_lock_bh(&idev->lock);
+	list_for_each_entry(ifp, &idev->addr_list, if_list) {
+		if (ifp->flags & IFA_F_PERMANENT)
+			list_add_tail(&ifp->if_list_aux, &tmp_addr_list);
+	}
+	write_unlock_bh(&idev->lock);
 
-	list_for_each_entry_safe(ifp, tmp, &idev->addr_list, if_list) {
-		if ((ifp->flags & IFA_F_PERMANENT) &&
-		    fixup_permanent_addr(net, idev, ifp) < 0) {
-			write_unlock_bh(&idev->lock);
+	while (!list_empty(&tmp_addr_list)) {
+		ifp = list_first_entry(&tmp_addr_list,
+				       struct inet6_ifaddr, if_list_aux);
+		list_del(&ifp->if_list_aux);
+		if (fixup_permanent_addr(net, idev, ifp) < 0) {
 			net_info_ratelimited("%s: Failed to add prefix route for address %pI6c; dropping\n",
 					     idev->dev->name, &ifp->addr);
 			in6_ifa_hold(ifp);
 			ipv6_del_addr(ifp);
-			write_lock_bh(&idev->lock);
 		}
 	}
-
-	write_unlock_bh(&idev->lock);
 }
 
 static int addrconf_notify(struct notifier_block *this, unsigned long event,
diff --git a/net/ipv6/addrconf_core.c b/net/ipv6/addrconf_core.c
index c008d21925d7..fa27a90ab3cd 100644
--- a/net/ipv6/addrconf_core.c
+++ b/net/ipv6/addrconf_core.c
@@ -6,7 +6,6 @@
 
 #include <linux/export.h>
 #include <net/ipv6.h>
-#include <net/ipv6_stubs.h>
 #include <net/addrconf.h>
 #include <net/ip.h>
 
@@ -129,96 +128,6 @@ int inet6addr_validator_notifier_call_chain(unsigned long val, void *v)
 }
 EXPORT_SYMBOL(inet6addr_validator_notifier_call_chain);
 
-static struct dst_entry *eafnosupport_ipv6_dst_lookup_flow(struct net *net,
-							   const struct sock *sk,
-							   struct flowi6 *fl6,
-							   const struct in6_addr *final_dst)
-{
-	return ERR_PTR(-EAFNOSUPPORT);
-}
-
-static int eafnosupport_ipv6_route_input(struct sk_buff *skb)
-{
-	return -EAFNOSUPPORT;
-}
-
-static struct fib6_table *eafnosupport_fib6_get_table(struct net *net, u32 id)
-{
-	return NULL;
-}
-
-static int
-eafnosupport_fib6_table_lookup(struct net *net, struct fib6_table *table,
-			       int oif, struct flowi6 *fl6,
-			       struct fib6_result *res, int flags)
-{
-	return -EAFNOSUPPORT;
-}
-
-static int
-eafnosupport_fib6_lookup(struct net *net, int oif, struct flowi6 *fl6,
-			 struct fib6_result *res, int flags)
-{
-	return -EAFNOSUPPORT;
-}
-
-static void
-eafnosupport_fib6_select_path(const struct net *net, struct fib6_result *res,
-			      struct flowi6 *fl6, int oif, bool have_oif_match,
-			      const struct sk_buff *skb, int strict)
-{
-}
-
-static u32
-eafnosupport_ip6_mtu_from_fib6(const struct fib6_result *res,
-			       const struct in6_addr *daddr,
-			       const struct in6_addr *saddr)
-{
-	return 0;
-}
-
-static int eafnosupport_fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
-				     struct fib6_config *cfg, gfp_t gfp_flags,
-				     struct netlink_ext_ack *extack)
-{
-	NL_SET_ERR_MSG(extack, "IPv6 support not enabled in kernel");
-	return -EAFNOSUPPORT;
-}
-
-static int eafnosupport_ip6_del_rt(struct net *net, struct fib6_info *rt,
-				   bool skip_notify)
-{
-	return -EAFNOSUPPORT;
-}
-
-static int eafnosupport_ipv6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
-				      int (*output)(struct net *, struct sock *, struct sk_buff *))
-{
-	kfree_skb(skb);
-	return -EAFNOSUPPORT;
-}
-
-static struct net_device *eafnosupport_ipv6_dev_find(struct net *net, const struct in6_addr *addr,
-						     struct net_device *dev)
-{
-	return ERR_PTR(-EAFNOSUPPORT);
-}
-
-const struct ipv6_stub *ipv6_stub __read_mostly = &(struct ipv6_stub) {
-	.ipv6_dst_lookup_flow = eafnosupport_ipv6_dst_lookup_flow,
-	.ipv6_route_input = eafnosupport_ipv6_route_input,
-	.fib6_get_table = eafnosupport_fib6_get_table,
-	.fib6_table_lookup = eafnosupport_fib6_table_lookup,
-	.fib6_lookup = eafnosupport_fib6_lookup,
-	.fib6_select_path = eafnosupport_fib6_select_path,
-	.ip6_mtu_from_fib6 = eafnosupport_ip6_mtu_from_fib6,
-	.fib6_nh_init = eafnosupport_fib6_nh_init,
-	.ip6_del_rt = eafnosupport_ip6_del_rt,
-	.ipv6_fragment = eafnosupport_ipv6_fragment,
-	.ipv6_dev_find = eafnosupport_ipv6_dev_find,
-};
-EXPORT_SYMBOL_GPL(ipv6_stub);
-
 /* IPv6 Wildcard Address and Loopback Address defined by RFC2553 */
 const struct in6_addr in6addr_loopback __aligned(BITS_PER_LONG/8)
 	= IN6ADDR_LOOPBACK_INIT;
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 4cbd45b68088..0a88b376141d 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -38,12 +38,10 @@
 #include <linux/inet.h>
 #include <linux/netdevice.h>
 #include <linux/icmpv6.h>
-#include <linux/netfilter_ipv6.h>
 
 #include <net/ip.h>
 #include <net/ipv6.h>
 #include <net/udp.h>
-#include <net/udplite.h>
 #include <net/tcp.h>
 #include <net/ping.h>
 #include <net/protocol.h>
@@ -52,7 +50,6 @@
 #include <net/transp_v6.h>
 #include <net/ip6_route.h>
 #include <net/addrconf.h>
-#include <net/ipv6_stubs.h>
 #include <net/ndisc.h>
 #ifdef CONFIG_IPV6_TUNNEL
 #include <net/ip6_tunnel.h>
@@ -71,10 +68,6 @@
 
 #include "ip6_offload.h"
 
-MODULE_AUTHOR("Cast of dozens");
-MODULE_DESCRIPTION("IPv6 protocol stack for Linux");
-MODULE_LICENSE("GPL");
-
 /* The inetsw6 table contains everything that inet6_create needs to
  * build a new socket.
  */
@@ -269,8 +262,8 @@ out_sk_release:
 	goto out;
 }
 
-static int __inet6_bind(struct sock *sk, struct sockaddr_unsized *uaddr, int addr_len,
-			u32 flags)
+int __inet6_bind(struct sock *sk, struct sockaddr_unsized *uaddr, int addr_len,
+		 u32 flags)
 {
 	struct sockaddr_in6 *addr = (struct sockaddr_in6 *)uaddr;
 	struct inet_sock *inet = inet_sk(sk);
@@ -636,8 +629,6 @@ int inet6_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 EXPORT_SYMBOL_GPL(inet6_compat_ioctl);
 #endif /* CONFIG_COMPAT */
 
-INDIRECT_CALLABLE_DECLARE(int udpv6_sendmsg(struct sock *, struct msghdr *,
-					    size_t));
 int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 {
 	struct sock *sk = sock->sk;
@@ -652,26 +643,19 @@ int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 			       sk, msg, size);
 }
 
-INDIRECT_CALLABLE_DECLARE(int udpv6_recvmsg(struct sock *, struct msghdr *,
-					    size_t, int, int *));
 int inet6_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		  int flags)
 {
 	struct sock *sk = sock->sk;
 	const struct proto *prot;
-	int addr_len = 0;
-	int err;
 
 	if (likely(!(flags & MSG_ERRQUEUE)))
 		sock_rps_record_flow(sk);
 
 	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
 	prot = READ_ONCE(sk->sk_prot);
-	err = INDIRECT_CALL_2(prot->recvmsg, tcp_recvmsg, udpv6_recvmsg,
-			      sk, msg, size, flags, &addr_len);
-	if (err >= 0)
-		msg->msg_namelen = addr_len;
-	return err;
+	return INDIRECT_CALL_2(prot->recvmsg, tcp_recvmsg, udpv6_recvmsg,
+			       sk, msg, size, flags);
 }
 
 const struct proto_ops inet6_stream_ops = {
@@ -706,6 +690,7 @@ const struct proto_ops inet6_stream_ops = {
 	.compat_ioctl = inet6_compat_ioctl,
 #endif
 	.set_rcvlowat = tcp_set_rcvlowat,
+	.set_rcvbuf = tcp_set_rcvbuf,
 };
 EXPORT_SYMBOL_GPL(inet6_stream_ops);
 
@@ -896,9 +881,7 @@ static int __net_init ipv6_init_mibs(struct net *net)
 	net->mib.udp_stats_in6 = alloc_percpu(struct udp_mib);
 	if (!net->mib.udp_stats_in6)
 		return -ENOMEM;
-	net->mib.udplite_stats_in6 = alloc_percpu(struct udp_mib);
-	if (!net->mib.udplite_stats_in6)
-		goto err_udplite_mib;
+
 	net->mib.ipv6_statistics = alloc_percpu(struct ipstats_mib);
 	if (!net->mib.ipv6_statistics)
 		goto err_ip_mib;
@@ -909,10 +892,10 @@ static int __net_init ipv6_init_mibs(struct net *net)
 		u64_stats_init(&af_inet6_stats->syncp);
 	}
 
-
 	net->mib.icmpv6_statistics = alloc_percpu(struct icmpv6_mib);
 	if (!net->mib.icmpv6_statistics)
 		goto err_icmp_mib;
+
 	net->mib.icmpv6msg_statistics = kzalloc_obj(struct icmpv6msg_mib);
 	if (!net->mib.icmpv6msg_statistics)
 		goto err_icmpmsg_mib;
@@ -923,8 +906,6 @@ err_icmpmsg_mib:
 err_icmp_mib:
 	free_percpu(net->mib.ipv6_statistics);
 err_ip_mib:
-	free_percpu(net->mib.udplite_stats_in6);
-err_udplite_mib:
 	free_percpu(net->mib.udp_stats_in6);
 	return -ENOMEM;
 }
@@ -932,7 +913,6 @@ err_udplite_mib:
 static void ipv6_cleanup_mibs(struct net *net)
 {
 	free_percpu(net->mib.udp_stats_in6);
-	free_percpu(net->mib.udplite_stats_in6);
 	free_percpu(net->mib.ipv6_statistics);
 	free_percpu(net->mib.icmpv6_statistics);
 	kfree(net->mib.icmpv6msg_statistics);
@@ -1015,50 +995,6 @@ static struct pernet_operations inet6_net_ops = {
 	.exit = inet6_net_exit,
 };
 
-static int ipv6_route_input(struct sk_buff *skb)
-{
-	ip6_route_input(skb);
-	return skb_dst(skb)->error;
-}
-
-static const struct ipv6_stub ipv6_stub_impl = {
-	.ipv6_sock_mc_join = ipv6_sock_mc_join,
-	.ipv6_sock_mc_drop = ipv6_sock_mc_drop,
-	.ipv6_dst_lookup_flow = ip6_dst_lookup_flow,
-	.ipv6_route_input = ipv6_route_input,
-	.fib6_get_table = fib6_get_table,
-	.fib6_table_lookup = fib6_table_lookup,
-	.fib6_lookup = fib6_lookup,
-	.fib6_select_path = fib6_select_path,
-	.ip6_mtu_from_fib6 = ip6_mtu_from_fib6,
-	.fib6_nh_init = fib6_nh_init,
-	.fib6_nh_release = fib6_nh_release,
-	.fib6_nh_release_dsts = fib6_nh_release_dsts,
-	.fib6_update_sernum = fib6_update_sernum_stub,
-	.fib6_rt_update = fib6_rt_update,
-	.ip6_del_rt = ip6_del_rt,
-	.udpv6_encap_enable = udpv6_encap_enable,
-	.ndisc_send_na = ndisc_send_na,
-#if IS_ENABLED(CONFIG_XFRM)
-	.xfrm6_local_rxpmtu = xfrm6_local_rxpmtu,
-	.xfrm6_udp_encap_rcv = xfrm6_udp_encap_rcv,
-	.xfrm6_gro_udp_encap_rcv = xfrm6_gro_udp_encap_rcv,
-	.xfrm6_rcv_encap = xfrm6_rcv_encap,
-#endif
-	.nd_tbl = &nd_tbl,
-	.ipv6_fragment = ip6_fragment,
-	.ipv6_dev_find = ipv6_dev_find,
-	.ip6_xmit = ip6_xmit,
-};
-
-static const struct ipv6_bpf_stub ipv6_bpf_stub_impl = {
-	.inet6_bind = __inet6_bind,
-	.udp6_lib_lookup = __udp6_lib_lookup,
-	.ipv6_setsockopt = do_ipv6_setsockopt,
-	.ipv6_getsockopt = do_ipv6_getsockopt,
-	.ipv6_dev_get_saddr = ipv6_dev_get_saddr,
-};
-
 static int __init inet6_init(void)
 {
 	struct list_head *r;
@@ -1085,13 +1021,9 @@ static int __init inet6_init(void)
 	if (err)
 		goto out_unregister_tcp_proto;
 
-	err = proto_register(&udplitev6_prot, 1);
-	if (err)
-		goto out_unregister_udp_proto;
-
 	err = proto_register(&rawv6_prot, 1);
 	if (err)
-		goto out_unregister_udplite_proto;
+		goto out_unregister_udp_proto;
 
 	err = proto_register(&pingv6_prot, 1);
 	if (err)
@@ -1134,16 +1066,11 @@ static int __init inet6_init(void)
 	if (err)
 		goto igmp_fail;
 
-	err = ipv6_netfilter_init();
-	if (err)
-		goto netfilter_fail;
 	/* Create /proc/foo6 entries. */
 #ifdef CONFIG_PROC_FS
 	err = -ENOMEM;
 	if (raw6_proc_init())
 		goto proc_raw6_fail;
-	if (udplite6_proc_init())
-		goto proc_udplite6_fail;
 	if (ipv6_misc_proc_init())
 		goto proc_misc6_fail;
 	if (if6_proc_init())
@@ -1179,10 +1106,6 @@ static int __init inet6_init(void)
 	if (err)
 		goto udpv6_fail;
 
-	err = udplitev6_init();
-	if (err)
-		goto udplitev6_fail;
-
 	err = udpv6_offload_init();
 	if (err)
 		goto udpv6_offload_fail;
@@ -1225,10 +1148,6 @@ static int __init inet6_init(void)
 		goto sysctl_fail;
 #endif
 
-	/* ensure that ipv6 stubs are visible only after ipv6 is ready */
-	wmb();
-	ipv6_stub = &ipv6_stub_impl;
-	ipv6_bpf_stub = &ipv6_bpf_stub_impl;
 out:
 	return err;
 
@@ -1253,8 +1172,6 @@ ipv6_packet_fail:
 tcpv6_fail:
 	udpv6_offload_exit();
 udpv6_offload_fail:
-	udplitev6_exit();
-udplitev6_fail:
 	udpv6_exit();
 udpv6_fail:
 	ipv6_frag_exit();
@@ -1276,13 +1193,9 @@ ip6_route_fail:
 proc_if6_fail:
 	ipv6_misc_proc_exit();
 proc_misc6_fail:
-	udplite6_proc_exit();
-proc_udplite6_fail:
 	raw6_proc_exit();
 proc_raw6_fail:
 #endif
-	ipv6_netfilter_fini();
-netfilter_fail:
 	igmp6_cleanup();
 igmp_fail:
 	ndisc_cleanup();
@@ -1301,14 +1214,10 @@ out_unregister_ping_proto:
 	proto_unregister(&pingv6_prot);
 out_unregister_raw_proto:
 	proto_unregister(&rawv6_prot);
-out_unregister_udplite_proto:
-	proto_unregister(&udplitev6_prot);
 out_unregister_udp_proto:
 	proto_unregister(&udpv6_prot);
 out_unregister_tcp_proto:
 	proto_unregister(&tcpv6_prot);
 	goto out;
 }
-module_init(inet6_init);
-
-MODULE_ALIAS_NETPROTO(PF_INET6);
+device_initcall(inet6_init);
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2d7b59732f7e..ca3605acb433 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -452,7 +452,7 @@ static bool ip6_datagram_support_cmsg(struct sk_buff *skb,
 /*
  *	Handle MSG_ERRQUEUE
  */
-int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
+int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct sock_exterr_skb *serr;
@@ -503,7 +503,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 					       &sin->sin6_addr);
 			sin->sin6_scope_id = 0;
 		}
-		*addr_len = sizeof(*sin);
+		msg->msg_namelen = sizeof(*sin);
 	}
 
 	memcpy(&errhdr.ee, &serr->ee, sizeof(struct sock_extended_err));
@@ -545,8 +545,7 @@ EXPORT_SYMBOL_GPL(ipv6_recv_error);
 /*
  *	Handle IPV6_RECVPATHMTU
  */
-int ipv6_recv_rxpmtu(struct sock *sk, struct msghdr *msg, int len,
-		     int *addr_len)
+int ipv6_recv_rxpmtu(struct sock *sk, struct msghdr *msg, int len)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct sk_buff *skb;
@@ -579,7 +578,7 @@ int ipv6_recv_rxpmtu(struct sock *sk, struct msghdr *msg, int len,
 		sin->sin6_port = 0;
 		sin->sin6_scope_id = mtu_info.ip6m_addr.sin6_scope_id;
 		sin->sin6_addr = mtu_info.ip6m_addr.sin6_addr;
-		*addr_len = sizeof(*sin);
+		msg->msg_namelen = sizeof(*sin);
 	}
 
 	put_cmsg(msg, SOL_IPV6, IPV6_PATHMTU, sizeof(mtu_info), &mtu_info);
diff --git a/net/ipv6/fib6_notifier.c b/net/ipv6/fib6_notifier.c
index 949b72610df7..64cd4ed8864c 100644
--- a/net/ipv6/fib6_notifier.c
+++ b/net/ipv6/fib6_notifier.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 #include <linux/notifier.h>
 #include <linux/socket.h>
 #include <linux/kernel.h>
diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c
index fd5f7112a51f..e1b2b4fa6e18 100644
--- a/net/ipv6/fib6_rules.c
+++ b/net/ipv6/fib6_rules.c
@@ -92,6 +92,9 @@ int fib6_lookup(struct net *net, int oif, struct flowi6 *fl6,
 
 	return err;
 }
+#if IS_MODULE(CONFIG_NFT_FIB_IPV6)
+EXPORT_SYMBOL_GPL(fib6_lookup);
+#endif
 
 struct dst_entry *fib6_rule_lookup(struct net *net, struct flowi6 *fl6,
 				   const struct sk_buff *skb,
diff --git a/net/ipv6/fou6.c b/net/ipv6/fou6.c
index 430518ae26fa..157765259e2f 100644
--- a/net/ipv6/fou6.c
+++ b/net/ipv6/fou6.c
@@ -141,8 +141,7 @@ static int gue6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	 * recursion. Besides, this kind of encapsulation can't even be
 	 * configured currently. Discard this.
 	 */
-	if (guehdr->proto_ctype == IPPROTO_UDP ||
-	    guehdr->proto_ctype == IPPROTO_UDPLITE)
+	if (guehdr->proto_ctype == IPPROTO_UDP)
 		return -EOPNOTSUPP;
 
 	skb_set_transport_header(skb, -(int)sizeof(struct icmp6hdr));
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index d5d23a9296ea..799d9e9ac45d 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -1291,13 +1291,8 @@ int __init icmpv6_init(void)
 	if (inet6_add_protocol(&icmpv6_protocol, IPPROTO_ICMPV6) < 0)
 		goto fail;
 
-	err = inet6_register_icmp_sender(icmp6_send);
-	if (err)
-		goto sender_reg_err;
 	return 0;
 
-sender_reg_err:
-	inet6_del_protocol(&icmpv6_protocol, IPPROTO_ICMPV6);
 fail:
 	pr_err("Failed to register ICMP6 protocol\n");
 	return err;
@@ -1305,7 +1300,6 @@ fail:
 
 void icmpv6_cleanup(void)
 {
-	inet6_unregister_icmp_sender(icmp6_send);
 	inet6_del_protocol(&icmpv6_protocol, IPPROTO_ICMPV6);
 }
 
diff --git a/net/ipv6/ila/ila_common.c b/net/ipv6/ila/ila_common.c
index b8d43ed4689d..e71571455c8a 100644
--- a/net/ipv6/ila/ila_common.c
+++ b/net/ipv6/ila/ila_common.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 #include <linux/errno.h>
 #include <linux/ip.h>
 #include <linux/kernel.h>
diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index 11fc2f7de2fe..37534e116899 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -56,8 +56,8 @@ struct dst_entry *inet6_csk_route_req(const struct sock *sk,
 	return dst;
 }
 
-static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
-						struct flowi6 *fl6)
+struct dst_entry *inet6_csk_route_socket(struct sock *sk,
+					 struct flowi6 *fl6)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct ipv6_pinfo *np = inet6_sk(sk);
@@ -118,18 +118,3 @@ int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl_unused
 	return res;
 }
 EXPORT_SYMBOL_GPL(inet6_csk_xmit);
-
-struct dst_entry *inet6_csk_update_pmtu(struct sock *sk, u32 mtu)
-{
-	struct flowi6 *fl6 = &inet_sk(sk)->cork.fl.u.ip6;
-	struct dst_entry *dst;
-
-	dst = inet6_csk_route_socket(sk, fl6);
-
-	if (IS_ERR(dst))
-		return NULL;
-	dst->ops->update_pmtu(dst, sk, NULL, mtu, true);
-
-	dst = inet6_csk_route_socket(sk, fl6);
-	return IS_ERR(dst) ? NULL : dst;
-}
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index 182d38e6d6d8..b111b51d69fc 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -23,20 +23,55 @@
 #include <net/sock_reuseport.h>
 #include <net/tcp.h>
 
+void inet6_init_ehash_secret(void)
+{
+	net_get_random_sleepable_once(&inet6_ehash_secret,
+				      sizeof(inet6_ehash_secret));
+	net_get_random_sleepable_once(&tcp_ipv6_hash_secret,
+				      sizeof(tcp_ipv6_hash_secret));
+}
+
 u32 inet6_ehashfn(const struct net *net,
 		  const struct in6_addr *laddr, const u16 lport,
 		  const struct in6_addr *faddr, const __be16 fport)
 {
-	u32 lhash, fhash;
+	u32 a, b, c;
+
+	/*
+	 * Please look at jhash() implementation for reference.
+	 * Hash laddr + faddr + lport/fport + net_hash_mix.
+	 * Notes:
+	 * We combine laddr[0] (high order 32 bits of local address)
+	 * with net_hash_mix() to hash a multiple of 3 words.
+	 *
+	 * We do not include JHASH_INITVAL + 36 contribution
+	 * to initial values of a, b, c.
+	 */
 
-	net_get_random_once(&inet6_ehash_secret, sizeof(inet6_ehash_secret));
-	net_get_random_once(&tcp_ipv6_hash_secret, sizeof(tcp_ipv6_hash_secret));
+	a = b = c = tcp_ipv6_hash_secret;
 
-	lhash = (__force u32)laddr->s6_addr32[3];
-	fhash = __ipv6_addr_jhash(faddr, tcp_ipv6_hash_secret);
+	a += (__force u32)laddr->s6_addr32[0] ^ net_hash_mix(net);
+	b += (__force u32)laddr->s6_addr32[1];
+	c += (__force u32)laddr->s6_addr32[2];
+	__jhash_mix(a, b, c);
 
-	return lport + __inet6_ehashfn(lhash, 0, fhash, fport,
-				       inet6_ehash_secret + net_hash_mix(net));
+	a += (__force u32)laddr->s6_addr32[3];
+	b += (__force u32)faddr->s6_addr32[0];
+	c += (__force u32)faddr->s6_addr32[1];
+	__jhash_mix(a, b, c);
+
+	a += (__force u32)faddr->s6_addr32[2];
+	b += (__force u32)faddr->s6_addr32[3];
+	c += (__force u32)fport;
+	__jhash_final(a, b, c);
+
+	/* Note: We need to add @lport instead of fully hashing it.
+	 * See commits 9544d60a2605 ("inet: change lport contribution
+	 * to inet_ehashfn() and inet6_ehashfn()") and d4438ce68bf1
+	 * ("inet: call inet6_ehashfn() once from inet6_hash_connect()")
+	 * for references.
+	 */
+	return lport + c;
 }
 EXPORT_SYMBOL_GPL(inet6_ehashfn);
 
@@ -363,6 +398,8 @@ int inet6_hash_connect(struct inet_timewait_death_row *death_row,
 	if (!inet_sk(sk)->inet_num)
 		port_offset = inet6_sk_port_offset(sk);
 
+	inet6_init_ehash_secret();
+
 	hash_port0 = inet6_ehashfn(net, daddr, 0, saddr, inet->inet_dport);
 
 	return __inet_hash_connect(death_row, sk, port_offset, hash_port0,
diff --git a/net/ipv6/ip6_checksum.c b/net/ipv6/ip6_checksum.c
index 377717045f8f..e1a594873675 100644
--- a/net/ipv6/ip6_checksum.c
+++ b/net/ipv6/ip6_checksum.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <net/ip.h>
+#include <net/ip6_checksum.h>
 #include <net/udp.h>
-#include <net/udplite.h>
 #include <asm/checksum.h>
 
 #ifndef _HAVE_ARCH_IPV6_CSUM
@@ -62,53 +62,6 @@ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 EXPORT_SYMBOL(csum_ipv6_magic);
 #endif
 
-int udp6_csum_init(struct sk_buff *skb, struct udphdr *uh, int proto)
-{
-	int err;
-
-	UDP_SKB_CB(skb)->partial_cov = 0;
-	UDP_SKB_CB(skb)->cscov = skb->len;
-
-	if (proto == IPPROTO_UDPLITE) {
-		err = udplite_checksum_init(skb, uh);
-		if (err)
-			return err;
-
-		if (UDP_SKB_CB(skb)->partial_cov) {
-			skb->csum = ip6_compute_pseudo(skb, proto);
-			return 0;
-		}
-	}
-
-	/* To support RFC 6936 (allow zero checksum in UDP/IPV6 for tunnels)
-	 * we accept a checksum of zero here. When we find the socket
-	 * for the UDP packet we'll check if that socket allows zero checksum
-	 * for IPv6 (set by socket option).
-	 *
-	 * Note, we are only interested in != 0 or == 0, thus the
-	 * force to int.
-	 */
-	err = (__force int)skb_checksum_init_zero_check(skb, proto, uh->check,
-							ip6_compute_pseudo);
-	if (err)
-		return err;
-
-	if (skb->ip_summed == CHECKSUM_COMPLETE && !skb->csum_valid) {
-		/* If SW calculated the value, we know it's bad */
-		if (skb->csum_complete_sw)
-			return 1;
-
-		/* HW says the value is bad. Let's validate that.
-		 * skb->csum is no longer the full packet checksum,
-		 * so don't treat is as such.
-		 */
-		skb_checksum_complete_unset(skb);
-	}
-
-	return 0;
-}
-EXPORT_SYMBOL(udp6_csum_init);
-
 /* Function to set UDP checksum for an IPv6 UDP packet. This is intended
  * for the simple case like when setting the checksum for a UDP tunnel.
  */
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 45ef4d65dcbc..b897b3c5023b 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -342,6 +342,9 @@ int fib6_lookup(struct net *net, int oif, struct flowi6 *fl6,
 	return fib6_table_lookup(net, net->ipv6.fib6_main_tbl, oif, fl6,
 				 res, flags);
 }
+#if IS_MODULE(CONFIG_NFT_FIB_IPV6)
+EXPORT_SYMBOL_GPL(fib6_lookup);
+#endif
 
 static void __net_init fib6_tables_init(struct net *net)
 {
@@ -1413,14 +1416,6 @@ void fib6_update_sernum_upto_root(struct net *net, struct fib6_info *rt)
 	__fib6_update_sernum_upto_root(rt, fib6_new_sernum(net));
 }
 
-/* allow ipv4 to update sernum via ipv6_stub */
-void fib6_update_sernum_stub(struct net *net, struct fib6_info *f6i)
-{
-	spin_lock_bh(&f6i->fib6_table->tb6_lock);
-	fib6_update_sernum_upto_root(net, f6i);
-	spin_unlock_bh(&f6i->fib6_table->tb6_lock);
-}
-
 /*
  *	Add routing information to the routing tree.
  *	<destination addr>/<source addr>
@@ -2779,7 +2774,7 @@ static void ipv6_route_native_seq_stop(struct seq_file *seq, void *v)
 	rcu_read_unlock();
 }
 
-#if IS_BUILTIN(CONFIG_IPV6) && defined(CONFIG_BPF_SYSCALL)
+#if defined(CONFIG_BPF_SYSCALL)
 static int ipv6_route_prog_seq_show(struct bpf_prog *prog,
 				    struct bpf_iter_meta *meta,
 				    void *v)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index dafcc0dcd77a..63fc8556b475 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -593,6 +593,7 @@ static int gre_rcv(struct sk_buff *skb)
 out:
 	icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_PORT_UNREACH, 0);
 drop:
+	dev_core_stats_rx_dropped_inc(skb->dev);
 	kfree_skb(skb);
 	return 0;
 }
diff --git a/net/ipv6/ip6_icmp.c b/net/ipv6/ip6_icmp.c
index 233914b63bdb..e43ea9492332 100644
--- a/net/ipv6/ip6_icmp.c
+++ b/net/ipv6/ip6_icmp.c
@@ -7,47 +7,8 @@
 
 #include <net/ipv6.h>
 
-#if IS_ENABLED(CONFIG_IPV6)
+#if IS_ENABLED(CONFIG_IPV6) && IS_ENABLED(CONFIG_NF_NAT)
 
-#if !IS_BUILTIN(CONFIG_IPV6)
-
-static ip6_icmp_send_t __rcu *ip6_icmp_send;
-
-int inet6_register_icmp_sender(ip6_icmp_send_t *fn)
-{
-	return (cmpxchg((ip6_icmp_send_t **)&ip6_icmp_send, NULL, fn) == NULL) ?
-		0 : -EBUSY;
-}
-EXPORT_SYMBOL(inet6_register_icmp_sender);
-
-int inet6_unregister_icmp_sender(ip6_icmp_send_t *fn)
-{
-	int ret;
-
-	ret = (cmpxchg((ip6_icmp_send_t **)&ip6_icmp_send, fn, NULL) == fn) ?
-	      0 : -EINVAL;
-
-	synchronize_net();
-
-	return ret;
-}
-EXPORT_SYMBOL(inet6_unregister_icmp_sender);
-
-void __icmpv6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info,
-		   const struct inet6_skb_parm *parm)
-{
-	ip6_icmp_send_t *send;
-
-	rcu_read_lock();
-	send = rcu_dereference(ip6_icmp_send);
-	if (send)
-		send(skb, type, code, info, NULL, parm);
-	rcu_read_unlock();
-}
-EXPORT_SYMBOL(__icmpv6_send);
-#endif
-
-#if IS_ENABLED(CONFIG_NF_NAT)
 #include <net/netfilter/nf_conntrack.h>
 void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info)
 {
@@ -60,7 +21,7 @@ void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info)
 
 	ct = nf_ct_get(skb_in, &ctinfo);
 	if (!ct || !(READ_ONCE(ct->status) & IPS_NAT_MASK)) {
-		__icmpv6_send(skb_in, type, code, info, &parm);
+		icmp6_send(skb_in, type, code, info, NULL, &parm);
 		return;
 	}
 
@@ -76,11 +37,10 @@ void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info)
 	orig_ip = ipv6_hdr(skb_in)->saddr;
 	dir = CTINFO2DIR(ctinfo);
 	ipv6_hdr(skb_in)->saddr = ct->tuplehash[dir].tuple.src.u3.in6;
-	__icmpv6_send(skb_in, type, code, info, &parm);
+	icmp6_send(skb_in, type, code, info, NULL, &parm);
 	ipv6_hdr(skb_in)->saddr = orig_ip;
 out:
 	consume_skb(cloned_skb);
 }
 EXPORT_SYMBOL(icmpv6_ndo_send);
 #endif
-#endif
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 2bcb981c91aa..967b07aeb683 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -44,6 +44,46 @@
 #include <net/xfrm.h>
 #include <net/inet_ecn.h>
 #include <net/dst_metadata.h>
+#include <net/inet6_hashtables.h>
+
+static void tcp_v6_early_demux(struct sk_buff *skb)
+{
+	struct net *net = dev_net_rcu(skb->dev);
+	const struct ipv6hdr *hdr;
+	const struct tcphdr *th;
+	struct sock *sk;
+
+	if (skb->pkt_type != PACKET_HOST)
+		return;
+
+	if (!pskb_may_pull(skb, skb_transport_offset(skb) +
+				sizeof(struct tcphdr)))
+		return;
+
+	hdr = ipv6_hdr(skb);
+	th = tcp_hdr(skb);
+
+	if (th->doff < sizeof(struct tcphdr) / 4)
+		return;
+
+	/* Note : We use inet6_iif() here, not tcp_v6_iif() */
+	sk = __inet6_lookup_established(net, &hdr->saddr, th->source,
+					&hdr->daddr, ntohs(th->dest),
+					inet6_iif(skb), inet6_sdif(skb));
+	if (sk) {
+		skb->sk = sk;
+		skb->destructor = sock_edemux;
+		if (sk_fullsock(sk)) {
+			struct dst_entry *dst = rcu_dereference(sk->sk_rx_dst);
+
+			if (dst)
+				dst = dst_check(dst, sk->sk_rx_dst_cookie);
+			if (dst &&
+			    sk->sk_rx_dst_ifindex == skb->skb_iif)
+				skb_dst_set_noref(skb, dst);
+		}
+	}
+}
 
 static void ip6_rcv_finish_core(struct net *net, struct sock *sk,
 				struct sk_buff *skb)
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index bd7f780e37a5..d8072ad6b8c4 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -286,7 +286,7 @@ not_same_flow:
 
 	if (likely(proto == IPPROTO_TCP))
 		pp = tcp6_gro_receive(head, skb);
-#if IS_BUILTIN(CONFIG_IPV6)
+#if IS_ENABLED(CONFIG_IPV6)
 	else if (likely(proto == IPPROTO_UDP))
 		pp = udp6_gro_receive(head, skb);
 #endif
@@ -346,7 +346,7 @@ INDIRECT_CALLABLE_SCOPE int ipv6_gro_complete(struct sk_buff *skb, int nhoff)
 
 	if (likely(ops == &net_hotdata.tcpv6_offload))
 		return tcp6_gro_complete(skb, nhoff);
-#if IS_BUILTIN(CONFIG_IPV6)
+#if IS_ENABLED(CONFIG_IPV6)
 	if (ops == &net_hotdata.udpv6_offload)
 		return udp6_gro_complete(skb, nhoff);
 #endif
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 8e2a6b28cea7..7e92909ab5be 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -259,6 +259,27 @@ bool ip6_autoflowlabel(struct net *net, const struct sock *sk)
 	return inet6_test_bit(AUTOFLOWLABEL, sk);
 }
 
+int ip6_dst_hoplimit(struct dst_entry *dst)
+{
+	int hoplimit = dst_metric_raw(dst, RTAX_HOPLIMIT);
+
+	rcu_read_lock();
+	if (hoplimit == 0) {
+		struct net_device *dev = dst_dev_rcu(dst);
+		struct inet6_dev *idev;
+
+		idev = __in6_dev_get(dev);
+		if (idev)
+			hoplimit = READ_ONCE(idev->cnf.hop_limit);
+		else
+			hoplimit = READ_ONCE(dev_net(dev)->ipv6.devconf_all->hop_limit);
+	}
+	rcu_read_unlock();
+
return hoplimit; +} +EXPORT_SYMBOL(ip6_dst_hoplimit); + /* * xmit an sk_buff (used by TCP and SCTP) * Note : socket lock is not held for SYNACK packets, but might be modified @@ -873,6 +894,11 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb, __be32 frag_id; u8 *prevhdr, nexthdr = 0; + if (!ipv6_mod_enabled()) { + kfree_skb(skb); + return -EAFNOSUPPORT; + } + err = ip6_find_1stfragopt(skb, &prevhdr); if (err < 0) goto fail; @@ -1045,6 +1071,7 @@ fail: kfree_skb(skb); return err; } +EXPORT_SYMBOL_GPL(ip6_fragment); static inline int ip6_rt_check(const struct rt6key *rt_key, const struct in6_addr *fl_addr, @@ -1256,6 +1283,8 @@ struct dst_entry *ip6_dst_lookup_flow(struct net *net, const struct sock *sk, st struct dst_entry *dst = NULL; int err; + if (!ipv6_mod_enabled()) + return ERR_PTR(-EAFNOSUPPORT); err = ip6_dst_lookup_tail(net, sk, &dst, fl6); if (err) return ERR_PTR(err); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 0b53488a9229..46bc06506470 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -96,9 +96,6 @@ static inline int ip6_tnl_mpls_supported(void) return IS_ENABLED(CONFIG_MPLS); } -#define for_each_ip6_tunnel_rcu(start) \ - for (t = rcu_dereference(start); t; t = rcu_dereference(t->next)) - /** * ip6_tnl_lookup - fetch tunnel matching the end-point addresses * @net: network namespace @@ -121,7 +118,7 @@ ip6_tnl_lookup(struct net *net, int link, struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id); struct in6_addr any; - for_each_ip6_tunnel_rcu(ip6n->tnls_r_l[hash]) { + for_each_ip_tunnel_rcu(t, ip6n->tnls_r_l[hash]) { if (!ipv6_addr_equal(local, &t->parms.laddr) || !ipv6_addr_equal(remote, &t->parms.raddr) || !(t->dev->flags & IFF_UP)) @@ -135,7 +132,7 @@ ip6_tnl_lookup(struct net *net, int link, memset(&any, 0, sizeof(any)); hash = HASH(&any, local); - for_each_ip6_tunnel_rcu(ip6n->tnls_r_l[hash]) { + for_each_ip_tunnel_rcu(t, ip6n->tnls_r_l[hash]) { if (!ipv6_addr_equal(local, 
&t->parms.laddr) || !ipv6_addr_any(&t->parms.raddr) || !(t->dev->flags & IFF_UP)) @@ -148,7 +145,7 @@ ip6_tnl_lookup(struct net *net, int link, } hash = HASH(remote, &any); - for_each_ip6_tunnel_rcu(ip6n->tnls_r_l[hash]) { + for_each_ip_tunnel_rcu(t, ip6n->tnls_r_l[hash]) { if (!ipv6_addr_equal(remote, &t->parms.raddr) || !ipv6_addr_any(&t->parms.laddr) || !(t->dev->flags & IFF_UP)) diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c index cef3e0210744..405ef1cb8864 100644 --- a/net/ipv6/ip6_udp_tunnel.c +++ b/net/ipv6/ip6_udp_tunnel.c @@ -162,8 +162,7 @@ struct dst_entry *udp_tunnel6_dst_lookup(struct sk_buff *skb, fl6.fl6_dport = dport; fl6.flowlabel = ip6_make_flowinfo(dsfield, key->label); - dst = ipv6_stub->ipv6_dst_lookup_flow(net, sock->sk, &fl6, - NULL); + dst = ip6_dst_lookup_flow(net, sock->sk, &fl6, NULL); if (IS_ERR(dst)) { netdev_dbg(dev, "no route to %pI6\n", &fl6.daddr); return ERR_PTR(-ENETUNREACH); diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c index e047a4680ab0..85010ff21c98 100644 --- a/net/ipv6/ip6mr.c +++ b/net/ipv6/ip6mr.c @@ -1280,7 +1280,7 @@ static int ip6mr_device_event(struct notifier_block *this, static unsigned int ip6mr_seq_read(const struct net *net) { - return READ_ONCE(net->ipv6.ipmr_seq) + ip6mr_rules_seq_read(net); + return atomic_read(&net->ipv6.ipmr_seq) + ip6mr_rules_seq_read(net); } static int ip6mr_dump(struct net *net, struct notifier_block *nb, @@ -1305,7 +1305,7 @@ static int __net_init ip6mr_notifier_init(struct net *net) { struct fib_notifier_ops *ops; - net->ipv6.ipmr_seq = 0; + atomic_set(&net->ipv6.ipmr_seq, 0); ops = fib_notifier_ops_register(&ip6mr_notifier_ops_template, net); if (IS_ERR(ops)) diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c index 02c4cab60c69..b4c977434c2e 100644 --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -45,7 +45,6 @@ #include <net/inet_common.h> #include <net/tcp.h> #include <net/udp.h> -#include <net/udplite.h> #include <net/xfrm.h> 
#include <net/compat.h> #include <net/seg6.h> @@ -563,10 +562,8 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname, if (sk->sk_type == SOCK_RAW) break; - if (sk->sk_protocol == IPPROTO_UDP || - sk->sk_protocol == IPPROTO_UDPLITE) { - struct udp_sock *up = udp_sk(sk); - if (up->pending == AF_INET6) { + if (sk->sk_protocol == IPPROTO_UDP) { + if (udp_sk(sk)->pending == AF_INET6) { retv = -EBUSY; break; } @@ -607,16 +604,11 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname, WRITE_ONCE(sk->sk_family, PF_INET); tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); } else { - struct proto *prot = &udp_prot; - - if (sk->sk_protocol == IPPROTO_UDPLITE) - prot = &udplite_prot; - sock_prot_inuse_add(net, sk->sk_prot, -1); - sock_prot_inuse_add(net, prot, 1); + sock_prot_inuse_add(net, &udp_prot, 1); /* Paired with READ_ONCE(sk->sk_prot) in inet6_dgram_ops */ - WRITE_ONCE(sk->sk_prot, prot); + WRITE_ONCE(sk->sk_prot, &udp_prot); WRITE_ONCE(sk->sk_socket->ops, &inet_dgram_ops); WRITE_ONCE(sk->sk_family, PF_INET); } @@ -1098,7 +1090,6 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname, switch (optname) { case IPV6_ADDRFORM: if (sk->sk_protocol != IPPROTO_UDP && - sk->sk_protocol != IPPROTO_UDPLITE && sk->sk_protocol != IPPROTO_TCP) return -ENOPROTOOPT; if (sk->sk_state != TCP_ESTABLISHED) diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index 186e60c79214..e7ad13c5bd26 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -576,6 +576,7 @@ void ndisc_send_na(struct net_device *dev, const struct in6_addr *daddr, ndisc_send_skb(skb, daddr, src_addr); } +EXPORT_SYMBOL_GPL(ndisc_send_na); static void ndisc_send_unsol_na(struct net_device *dev) { diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c index 46540a5a4331..6d80f85e55fa 100644 --- a/net/ipv6/netfilter.c +++ b/net/ipv6/netfilter.c @@ -1,7 +1,8 @@ +// SPDX-License-Identifier: GPL-2.0-or-later /* * IPv6 specific functions of netfilter core * - * Rusty Russell (C) 2000 -- This code is 
GPL. + * Rusty Russell (C) 2000 * Patrick McHardy (C) 2006-2012 */ #include <linux/kernel.h> @@ -85,21 +86,6 @@ int ip6_route_me_harder(struct net *net, struct sock *sk_partial, struct sk_buff } EXPORT_SYMBOL(ip6_route_me_harder); -static int nf_ip6_reroute(struct sk_buff *skb, - const struct nf_queue_entry *entry) -{ - struct ip6_rt_info *rt_info = nf_queue_entry_reroute(entry); - - if (entry->state.hook == NF_INET_LOCAL_OUT) { - const struct ipv6hdr *iph = ipv6_hdr(skb); - if (!ipv6_addr_equal(&iph->daddr, &rt_info->daddr) || - !ipv6_addr_equal(&iph->saddr, &rt_info->saddr) || - skb->mark != rt_info->mark) - return ip6_route_me_harder(entry->state.net, entry->state.sk, skb); - } - return 0; -} - int __nf_ip6_route(struct net *net, struct dst_entry **dst, struct flowi *fl, bool strict) { @@ -242,36 +228,3 @@ blackhole: return 0; } EXPORT_SYMBOL_GPL(br_ip6_fragment); - -static const struct nf_ipv6_ops ipv6ops = { -#if IS_MODULE(CONFIG_IPV6) - .chk_addr = ipv6_chk_addr, - .route_me_harder = ip6_route_me_harder, - .dev_get_saddr = ipv6_dev_get_saddr, - .route = __nf_ip6_route, -#if IS_ENABLED(CONFIG_SYN_COOKIES) - .cookie_init_sequence = __cookie_v6_init_sequence, - .cookie_v6_check = __cookie_v6_check, -#endif -#endif - .route_input = ip6_route_input, - .fragment = ip6_fragment, - .reroute = nf_ip6_reroute, -#if IS_MODULE(CONFIG_IPV6) - .br_fragment = br_ip6_fragment, -#endif -}; - -int __init ipv6_netfilter_init(void) -{ - RCU_INIT_POINTER(nf_ipv6_ops, &ipv6ops); - return 0; -} - -/* This can be called from inet6_init() on errors, so it cannot - * be marked __exit. 
-DaveM - */ -void ipv6_netfilter_fini(void) -{ - RCU_INIT_POINTER(nf_ipv6_ops, NULL); -} diff --git a/net/ipv6/netfilter/ip6t_eui64.c b/net/ipv6/netfilter/ip6t_eui64.c index da69a27e8332..bbb684f9964c 100644 --- a/net/ipv6/netfilter/ip6t_eui64.c +++ b/net/ipv6/netfilter/ip6t_eui64.c @@ -7,6 +7,7 @@ #include <linux/module.h> #include <linux/skbuff.h> #include <linux/ipv6.h> +#include <linux/if_arp.h> #include <linux/if_ether.h> #include <linux/netfilter/x_tables.h> @@ -21,8 +22,10 @@ eui64_mt6(const struct sk_buff *skb, struct xt_action_param *par) { unsigned char eui64[8]; - if (!(skb_mac_header(skb) >= skb->head && - skb_mac_header(skb) + ETH_HLEN <= skb->data)) { + if (!skb->dev || skb->dev->type != ARPHRD_ETHER) + return false; + + if (!skb_mac_header_was_set(skb) || skb_mac_header_len(skb) < ETH_HLEN) { par->hotdrop = true; return false; } diff --git a/net/ipv6/netfilter/nft_dup_ipv6.c b/net/ipv6/netfilter/nft_dup_ipv6.c index 492a811828a7..95ec27b3971c 100644 --- a/net/ipv6/netfilter/nft_dup_ipv6.c +++ b/net/ipv6/netfilter/nft_dup_ipv6.c @@ -74,7 +74,6 @@ static const struct nft_expr_ops nft_dup_ipv6_ops = { .eval = nft_dup_ipv6_eval, .init = nft_dup_ipv6_init, .dump = nft_dup_ipv6_dump, - .reduce = NFT_REDUCE_READONLY, }; static const struct nla_policy nft_dup_ipv6_policy[NFTA_DUP_MAX + 1] = { diff --git a/net/ipv6/netfilter/nft_fib_ipv6.c b/net/ipv6/netfilter/nft_fib_ipv6.c index 421036a3605b..8b2dba88ee96 100644 --- a/net/ipv6/netfilter/nft_fib_ipv6.c +++ b/net/ipv6/netfilter/nft_fib_ipv6.c @@ -52,7 +52,13 @@ static int nft_fib6_flowi_init(struct flowi6 *fl6, const struct nft_fib *priv, fl6->flowlabel = (*(__be32 *)iph) & IPV6_FLOWINFO_MASK; fl6->flowi6_l3mdev = nft_fib_l3mdev_master_ifindex_rcu(pkt, dev); - return lookup_flags; + return lookup_flags | RT6_LOOKUP_F_DST_NOREF; +} + +static int nft_fib6_lookup(struct net *net, struct flowi6 *fl6, + struct fib6_result *res, int flags) +{ + return fib6_lookup(net, fl6->flowi6_oif, fl6, res, flags); } static u32 
__nft_fib6_eval_type(const struct nft_fib *priv, @@ -60,13 +66,14 @@ static u32 __nft_fib6_eval_type(const struct nft_fib *priv, struct ipv6hdr *iph) { const struct net_device *dev = NULL; + struct fib6_result res = {}; int route_err, addrtype; - struct rt6_info *rt; struct flowi6 fl6 = { .flowi6_iif = LOOPBACK_IFINDEX, .flowi6_proto = pkt->tprot, .flowi6_uid = sock_net_uid(nft_net(pkt), NULL), }; + int lookup_flags; u32 ret = 0; if (priv->flags & NFTA_FIB_F_IIF) @@ -74,29 +81,23 @@ static u32 __nft_fib6_eval_type(const struct nft_fib *priv, else if (priv->flags & NFTA_FIB_F_OIF) dev = nft_out(pkt); - nft_fib6_flowi_init(&fl6, priv, pkt, dev, iph); + lookup_flags = nft_fib6_flowi_init(&fl6, priv, pkt, dev, iph); if (dev && nf_ipv6_chk_addr(nft_net(pkt), &fl6.daddr, dev, true)) ret = RTN_LOCAL; - route_err = nf_ip6_route(nft_net(pkt), (struct dst_entry **)&rt, - flowi6_to_flowi(&fl6), false); + route_err = nft_fib6_lookup(nft_net(pkt), &fl6, &res, lookup_flags); if (route_err) goto err; - if (rt->rt6i_flags & RTF_REJECT) { - route_err = rt->dst.error; - dst_release(&rt->dst); - goto err; - } + if (res.fib6_flags & RTF_REJECT) + return res.fib6_type; - if (ipv6_anycast_destination((struct dst_entry *)rt, &fl6.daddr)) + if (__ipv6_anycast_destination(&res.f6i->fib6_dst, res.fib6_flags, &fl6.daddr)) ret = RTN_ANYCAST; - else if (!dev && rt->rt6i_flags & RTF_LOCAL) + else if (!dev && res.fib6_flags & RTF_LOCAL) ret = RTN_LOCAL; - dst_release(&rt->dst); - if (ret) return ret; @@ -152,6 +153,33 @@ static bool nft_fib_v6_skip_icmpv6(const struct sk_buff *skb, u8 next, const str return ipv6_addr_type(&iph->daddr) & IPV6_ADDR_LINKLOCAL; } +static bool nft_fib6_info_nh_dev_match(const struct net_device *nh_dev, + const struct net_device *dev) +{ + return nh_dev == dev || + l3mdev_master_ifindex_rcu(nh_dev) == dev->ifindex; +} + +static bool nft_fib6_info_nh_uses_dev(struct fib6_info *rt, + const struct net_device *dev) +{ + const struct net_device *nh_dev; + struct fib6_info 
*iter; + + nh_dev = fib6_info_nh_dev(rt); + if (nft_fib6_info_nh_dev_match(nh_dev, dev)) + return true; + + list_for_each_entry(iter, &rt->fib6_siblings, fib6_siblings) { + nh_dev = fib6_info_nh_dev(iter); + + if (nft_fib6_info_nh_dev_match(nh_dev, dev)) + return true; + } + + return false; +} + void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt) { @@ -160,14 +188,14 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct net_device *found = NULL; const struct net_device *oif = NULL; u32 *dest = &regs->data[priv->dreg]; + struct fib6_result res = {}; struct ipv6hdr *iph, _iph; struct flowi6 fl6 = { .flowi6_iif = LOOPBACK_IFINDEX, .flowi6_proto = pkt->tprot, .flowi6_uid = sock_net_uid(nft_net(pkt), NULL), }; - struct rt6_info *rt; - int lookup_flags; + int lookup_flags, ret; if (nft_fib_can_skip(pkt)) { nft_fib_store_result(dest, priv, nft_in(pkt)); @@ -193,26 +221,17 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs, lookup_flags = nft_fib6_flowi_init(&fl6, priv, pkt, oif, iph); *dest = 0; - rt = (void *)ip6_route_lookup(nft_net(pkt), &fl6, pkt->skb, - lookup_flags); - if (rt->dst.error) - goto put_rt_err; - - /* Should not see RTF_LOCAL here */ - if (rt->rt6i_flags & (RTF_REJECT | RTF_ANYCAST | RTF_LOCAL)) - goto put_rt_err; + ret = nft_fib6_lookup(nft_net(pkt), &fl6, &res, lookup_flags); + if (ret || res.fib6_flags & (RTF_REJECT | RTF_ANYCAST | RTF_LOCAL)) + return; if (!oif) { - found = rt->rt6i_idev->dev; + found = fib6_info_nh_dev(res.f6i); } else { - if (oif == rt->rt6i_idev->dev || - l3mdev_master_ifindex_rcu(rt->rt6i_idev->dev) == oif->ifindex) + if (nft_fib6_info_nh_uses_dev(res.f6i, oif)) found = oif; } - nft_fib_store_result(dest, priv, found); - put_rt_err: - ip6_rt_put(rt); } EXPORT_SYMBOL_GPL(nft_fib6_eval); @@ -225,7 +244,6 @@ static const struct nft_expr_ops nft_fib6_type_ops = { .init = nft_fib_init, .dump = nft_fib_dump, .validate = 
nft_fib_validate, - .reduce = nft_fib_reduce, }; static const struct nft_expr_ops nft_fib6_ops = { @@ -235,7 +253,6 @@ static const struct nft_expr_ops nft_fib6_ops = { .init = nft_fib_init, .dump = nft_fib_dump, .validate = nft_fib_validate, - .reduce = nft_fib_reduce, }; static const struct nft_expr_ops * diff --git a/net/ipv6/netfilter/nft_reject_ipv6.c b/net/ipv6/netfilter/nft_reject_ipv6.c index 5c61294f410e..ed69c768797e 100644 --- a/net/ipv6/netfilter/nft_reject_ipv6.c +++ b/net/ipv6/netfilter/nft_reject_ipv6.c @@ -46,7 +46,6 @@ static const struct nft_expr_ops nft_reject_ipv6_ops = { .init = nft_reject_init, .dump = nft_reject_dump, .validate = nft_reject_validate, - .reduce = NFT_REDUCE_READONLY, }; static struct nft_expr_type nft_reject_ipv6_type __read_mostly = { diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c index cba1684a3f30..64b1eeb79b57 100644 --- a/net/ipv6/output_core.c +++ b/net/ipv6/output_core.c @@ -100,29 +100,6 @@ int ip6_find_1stfragopt(struct sk_buff *skb, u8 **nexthdr) } EXPORT_SYMBOL(ip6_find_1stfragopt); -#if IS_ENABLED(CONFIG_IPV6) -int ip6_dst_hoplimit(struct dst_entry *dst) -{ - int hoplimit = dst_metric_raw(dst, RTAX_HOPLIMIT); - - rcu_read_lock(); - if (hoplimit == 0) { - struct net_device *dev = dst_dev_rcu(dst); - struct inet6_dev *idev; - - idev = __in6_dev_get(dev); - if (idev) - hoplimit = READ_ONCE(idev->cnf.hop_limit); - else - hoplimit = READ_ONCE(dev_net(dev)->ipv6.devconf_all->hop_limit); - } - rcu_read_unlock(); - - return hoplimit; -} -EXPORT_SYMBOL(ip6_dst_hoplimit); -#endif - int __ip6_local_out(struct net *net, struct sock *sk, struct sk_buff *skb) { ipv6_set_payload_len(ipv6_hdr(skb), skb->len - sizeof(struct ipv6hdr)); diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c index e4afc651731a..6e90d0bf9f3d 100644 --- a/net/ipv6/ping.c +++ b/net/ipv6/ping.c @@ -24,8 +24,7 @@ #include <net/ping.h> /* Compatibility glue so we can support IPv6 when it's compiled as a module */ -static int 
dummy_ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, - int *addr_len) +static int dummy_ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len) { return -EAFNOSUPPORT; } diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c index 73296f38c252..813013ca4e75 100644 --- a/net/ipv6/proc.c +++ b/net/ipv6/proc.c @@ -39,8 +39,6 @@ static int sockstat6_seq_show(struct seq_file *seq, void *v) sock_prot_inuse_get(net, &tcpv6_prot)); seq_printf(seq, "UDP6: inuse %d\n", sock_prot_inuse_get(net, &udpv6_prot)); - seq_printf(seq, "UDPLITE6: inuse %d\n", - sock_prot_inuse_get(net, &udplitev6_prot)); seq_printf(seq, "RAW6: inuse %d\n", sock_prot_inuse_get(net, &rawv6_prot)); seq_printf(seq, "FRAG6: inuse %u memory %lu\n", @@ -110,17 +108,6 @@ static const struct snmp_mib snmp6_udp6_list[] = { SNMP_MIB_ITEM("Udp6MemErrors", UDP_MIB_MEMERRORS), }; -static const struct snmp_mib snmp6_udplite6_list[] = { - SNMP_MIB_ITEM("UdpLite6InDatagrams", UDP_MIB_INDATAGRAMS), - SNMP_MIB_ITEM("UdpLite6NoPorts", UDP_MIB_NOPORTS), - SNMP_MIB_ITEM("UdpLite6InErrors", UDP_MIB_INERRORS), - SNMP_MIB_ITEM("UdpLite6OutDatagrams", UDP_MIB_OUTDATAGRAMS), - SNMP_MIB_ITEM("UdpLite6RcvbufErrors", UDP_MIB_RCVBUFERRORS), - SNMP_MIB_ITEM("UdpLite6SndbufErrors", UDP_MIB_SNDBUFERRORS), - SNMP_MIB_ITEM("UdpLite6InCsumErrors", UDP_MIB_CSUMERRORS), - SNMP_MIB_ITEM("UdpLite6MemErrors", UDP_MIB_MEMERRORS), -}; - static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, atomic_long_t *smib) { char name[32]; @@ -228,9 +215,6 @@ static int snmp6_seq_show(struct seq_file *seq, void *v) snmp6_seq_show_item(seq, net->mib.udp_stats_in6, NULL, snmp6_udp6_list, ARRAY_SIZE(snmp6_udp6_list)); - snmp6_seq_show_item(seq, net->mib.udplite_stats_in6, - NULL, snmp6_udplite6_list, - ARRAY_SIZE(snmp6_udplite6_list)); return 0; } diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index 27a268059168..3cc58698cbbd 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -369,7 +369,8 @@ static inline int rawv6_rcv_skb(struct sock *sk, 
struct sk_buff *skb) /* Charge it to the socket. */ skb_dst_drop(skb); - if (sock_queue_rcv_skb_reason(sk, skb, &reason) < 0) { + reason = sock_queue_rcv_skb_reason(sk, skb); + if (reason) { sk_skb_reason_drop(sk, skb, reason); return NET_RX_DROP; } @@ -432,7 +433,7 @@ int rawv6_rcv(struct sock *sk, struct sk_buff *skb) */ static int rawv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, - int flags, int *addr_len) + int flags) { struct ipv6_pinfo *np = inet6_sk(sk); DECLARE_SOCKADDR(struct sockaddr_in6 *, sin6, msg->msg_name); @@ -444,10 +445,10 @@ static int rawv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, return -EOPNOTSUPP; if (flags & MSG_ERRQUEUE) - return ipv6_recv_error(sk, msg, len, addr_len); + return ipv6_recv_error(sk, msg, len); if (np->rxopt.bits.rxpmtu && READ_ONCE(np->rxpmtu)) - return ipv6_recv_rxpmtu(sk, msg, len, addr_len); + return ipv6_recv_rxpmtu(sk, msg, len); skb = skb_recv_datagram(sk, flags, &err); if (!skb) @@ -481,7 +482,7 @@ static int rawv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, sin6->sin6_flowinfo = 0; sin6->sin6_scope_id = ipv6_iface_scope_id(&sin6->sin6_addr, inet6_iif(skb)); - *addr_len = sizeof(*sin6); + msg->msg_namelen = sizeof(*sin6); } sock_recv_cmsgs(msg, sk, skb); diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c index 25ec8001898d..11f9144bebbe 100644 --- a/net/ipv6/reassembly.c +++ b/net/ipv6/reassembly.c @@ -132,6 +132,9 @@ static int ip6_frag_queue(struct net *net, /* note that if prob_offset is set, the skb is freed elsewhere, * we do not free it here. */ + inet_frag_kill(&fq->q, refs); + __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), + IPSTATS_MIB_REASMFAILS); return -1; } @@ -163,6 +166,9 @@ static int ip6_frag_queue(struct net *net, * this case. 
-DaveM */ *prob_offset = offsetof(struct ipv6hdr, payload_len); + inet_frag_kill(&fq->q, refs); + __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), + IPSTATS_MIB_REASMFAILS); return -1; } if (end > fq->q.len) { diff --git a/net/ipv6/route.c b/net/ipv6/route.c index cb521700cee7..19eb6b702227 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2655,6 +2655,7 @@ void ip6_route_input(struct sk_buff *skb) skb_dst_set_noref(skb, ip6_route_input_lookup(net, skb->dev, &fl6, skb, flags)); } +EXPORT_SYMBOL_GPL(ip6_route_input); INDIRECT_CALLABLE_SCOPE struct rt6_info *ip6_pol_route_output(struct net *net, struct fib6_table *table, @@ -3585,6 +3586,11 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh, struct inet6_dev *idev = NULL; int err; + if (!ipv6_mod_enabled()) { + NL_SET_ERR_MSG(extack, "IPv6 support not enabled in kernel"); + return -EAFNOSUPPORT; + } + fib6_nh->fib_nh_family = AF_INET6; #ifdef CONFIG_IPV6_ROUTER_PREF fib6_nh->last_probe = jiffies; @@ -6826,7 +6832,6 @@ void __init ip6_route_init_special_entries(void) #endif } -#if IS_BUILTIN(CONFIG_IPV6) #if defined(CONFIG_BPF_SYSCALL) && defined(CONFIG_PROC_FS) DEFINE_BPF_ITER_FUNC(ipv6_route, struct bpf_iter_meta *meta, struct fib6_info *rt) @@ -6860,7 +6865,6 @@ static void bpf_iter_unregister(void) bpf_iter_unreg_target(&ipv6_route_reg_info); } #endif -#endif static const struct rtnl_msg_handler ip6_route_rtnl_msg_handlers[] __initconst_or_module = { {.owner = THIS_MODULE, .protocol = PF_INET6, .msgtype = RTM_NEWROUTE, @@ -6921,13 +6925,11 @@ int __init ip6_route_init(void) if (ret) goto out_register_late_subsys; -#if IS_BUILTIN(CONFIG_IPV6) #if defined(CONFIG_BPF_SYSCALL) && defined(CONFIG_PROC_FS) ret = bpf_iter_register(); if (ret) goto out_register_late_subsys; #endif -#endif for_each_possible_cpu(cpu) { struct uncached_list *ul = per_cpu_ptr(&rt6_uncached_list, cpu); @@ -6961,11 +6963,9 @@ out_kmem_cache: void ip6_route_cleanup(void) { -#if IS_BUILTIN(CONFIG_IPV6) #if 
defined(CONFIG_BPF_SYSCALL) && defined(CONFIG_PROC_FS) bpf_iter_unregister(); #endif -#endif unregister_netdevice_notifier(&ip6_route_dev_notifier); unregister_pernet_subsys(&ip6_route_net_late_ops); fib6_rules_cleanup(); diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c index d6a0f7df9080..97b50d9b1365 100644 --- a/net/ipv6/seg6_iptunnel.c +++ b/net/ipv6/seg6_iptunnel.c @@ -50,6 +50,7 @@ static size_t seg6_lwt_headroom(struct seg6_iptunnel_encap *tuninfo) struct seg6_lwt { struct dst_cache cache_input; struct dst_cache cache_output; + struct in6_addr tunsrc; struct seg6_iptunnel_encap tuninfo[]; }; @@ -66,6 +67,7 @@ seg6_encap_lwtunnel(struct lwtunnel_state *lwt) static const struct nla_policy seg6_iptunnel_policy[SEG6_IPTUNNEL_MAX + 1] = { [SEG6_IPTUNNEL_SRH] = { .type = NLA_BINARY }, + [SEG6_IPTUNNEL_SRC] = NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)), }; static int nla_put_srh(struct sk_buff *skb, int attrtype, @@ -88,23 +90,32 @@ static int nla_put_srh(struct sk_buff *skb, int attrtype, } static void set_tun_src(struct net *net, struct net_device *dev, - struct in6_addr *daddr, struct in6_addr *saddr) + struct in6_addr *daddr, struct in6_addr *saddr, + struct in6_addr *route_tunsrc) { struct seg6_pernet_data *sdata = seg6_pernet(net); struct in6_addr *tun_src; - rcu_read_lock(); - - tun_src = rcu_dereference(sdata->tun_src); - - if (!ipv6_addr_any(tun_src)) { - memcpy(saddr, tun_src, sizeof(struct in6_addr)); + /* Priority order to select tunnel source address: + * 1. per route source address (if configured) + * 2. per network namespace source address (if configured) + * 3. 
dynamic resolution + */ + if (route_tunsrc && !ipv6_addr_any(route_tunsrc)) { + memcpy(saddr, route_tunsrc, sizeof(struct in6_addr)); } else { - ipv6_dev_get_saddr(net, dev, daddr, IPV6_PREFER_SRC_PUBLIC, - saddr); - } + rcu_read_lock(); + tun_src = rcu_dereference(sdata->tun_src); + + if (!ipv6_addr_any(tun_src)) { + memcpy(saddr, tun_src, sizeof(struct in6_addr)); + } else { + ipv6_dev_get_saddr(net, dev, daddr, + IPV6_PREFER_SRC_PUBLIC, saddr); + } - rcu_read_unlock(); + rcu_read_unlock(); + } } /* Compute flowlabel for outer IPv6 header */ @@ -126,7 +137,8 @@ static __be32 seg6_make_flowlabel(struct net *net, struct sk_buff *skb, } static int __seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, - int proto, struct dst_entry *cache_dst) + int proto, struct dst_entry *cache_dst, + struct in6_addr *route_tunsrc) { struct dst_entry *dst = skb_dst(skb); struct net_device *dev = dst_dev(dst); @@ -183,7 +195,7 @@ static int __seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, isrh->nexthdr = proto; hdr->daddr = isrh->segments[isrh->first_segment]; - set_tun_src(net, dev, &hdr->daddr, &hdr->saddr); + set_tun_src(net, dev, &hdr->daddr, &hdr->saddr, route_tunsrc); #ifdef CONFIG_IPV6_SEG6_HMAC if (sr_has_hmac(isrh)) { @@ -203,14 +215,15 @@ static int __seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, /* encapsulate an IPv6 packet within an outer IPv6 header with a given SRH */ int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto) { - return __seg6_do_srh_encap(skb, osrh, proto, NULL); + return __seg6_do_srh_encap(skb, osrh, proto, NULL, NULL); } EXPORT_SYMBOL_GPL(seg6_do_srh_encap); /* encapsulate an IPv6 packet within an outer IPv6 header with reduced SRH */ static int seg6_do_srh_encap_red(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto, - struct dst_entry *cache_dst) + struct dst_entry *cache_dst, + struct in6_addr *route_tunsrc) { __u8 first_seg = osrh->first_segment; struct dst_entry *dst 
= skb_dst(skb);
@@ -273,7 +286,7 @@ static int seg6_do_srh_encap_red(struct sk_buff *skb,
 	if (skip_srh) {
 		hdr->nexthdr = proto;
 
-		set_tun_src(net, dev, &hdr->daddr, &hdr->saddr);
+		set_tun_src(net, dev, &hdr->daddr, &hdr->saddr, route_tunsrc);
 		goto out;
 	}
 
@@ -309,7 +322,7 @@ static int seg6_do_srh_encap_red(struct sk_buff *skb,
 
 srcaddr:
 	isrh->nexthdr = proto;
-	set_tun_src(net, dev, &hdr->daddr, &hdr->saddr);
+	set_tun_src(net, dev, &hdr->daddr, &hdr->saddr, route_tunsrc);
 
 #ifdef CONFIG_IPV6_SEG6_HMAC
 	if (unlikely(!skip_srh && sr_has_hmac(isrh))) {
@@ -384,9 +397,11 @@ static int seg6_do_srh(struct sk_buff *skb, struct dst_entry *cache_dst)
 {
 	struct dst_entry *dst = skb_dst(skb);
 	struct seg6_iptunnel_encap *tinfo;
+	struct seg6_lwt *slwt;
 	int proto, err = 0;
 
-	tinfo = seg6_encap_lwtunnel(dst->lwtstate);
+	slwt = seg6_lwt_lwtunnel(dst->lwtstate);
+	tinfo = slwt->tuninfo;
 
 	switch (tinfo->mode) {
 	case SEG6_IPTUN_MODE_INLINE:
@@ -411,11 +426,11 @@ static int seg6_do_srh(struct sk_buff *skb, struct dst_entry *cache_dst)
 			return -EINVAL;
 
 		if (tinfo->mode == SEG6_IPTUN_MODE_ENCAP)
-			err = __seg6_do_srh_encap(skb, tinfo->srh,
-						  proto, cache_dst);
+			err = __seg6_do_srh_encap(skb, tinfo->srh, proto,
+						  cache_dst, &slwt->tunsrc);
 		else
-			err = seg6_do_srh_encap_red(skb, tinfo->srh,
-						    proto, cache_dst);
+			err = seg6_do_srh_encap_red(skb, tinfo->srh, proto,
+						    cache_dst, &slwt->tunsrc);
 
 		if (err)
 			return err;
@@ -437,12 +452,12 @@ static int seg6_do_srh(struct sk_buff *skb, struct dst_entry *cache_dst)
 
 		if (tinfo->mode == SEG6_IPTUN_MODE_L2ENCAP)
 			err = __seg6_do_srh_encap(skb, tinfo->srh,
-						  IPPROTO_ETHERNET,
-						  cache_dst);
+						  IPPROTO_ETHERNET, cache_dst,
+						  &slwt->tunsrc);
 		else
 			err = seg6_do_srh_encap_red(skb, tinfo->srh,
-						    IPPROTO_ETHERNET,
-						    cache_dst);
+						    IPPROTO_ETHERNET, cache_dst,
+						    &slwt->tunsrc);
 
 		if (err)
 			return err;
@@ -679,6 +694,10 @@ static int seg6_build_state(struct net *net, struct nlattr *nla,
 		if (family != AF_INET6)
 			return -EINVAL;
 
+		if (tb[SEG6_IPTUNNEL_SRC]) {
+			NL_SET_ERR_MSG(extack, "incompatible mode for tunsrc");
+			return -EINVAL;
+		}
 		break;
 	case SEG6_IPTUN_MODE_ENCAP:
 		break;
@@ -712,6 +731,18 @@ static int seg6_build_state(struct net *net, struct nlattr *nla,
 
 	memcpy(&slwt->tuninfo, tuninfo, tuninfo_len);
 
+	if (tb[SEG6_IPTUNNEL_SRC]) {
+		slwt->tunsrc = nla_get_in6_addr(tb[SEG6_IPTUNNEL_SRC]);
+
+		if (ipv6_addr_any(&slwt->tunsrc) ||
+		    ipv6_addr_is_multicast(&slwt->tunsrc) ||
+		    ipv6_addr_loopback(&slwt->tunsrc)) {
+			NL_SET_ERR_MSG(extack, "invalid tunsrc address");
+			err = -EINVAL;
+			goto err_destroy_output;
+		}
+	}
+
 	newts->type = LWTUNNEL_ENCAP_SEG6;
 	newts->flags |= LWTUNNEL_STATE_INPUT_REDIRECT;
 
@@ -724,6 +755,8 @@ static int seg6_build_state(struct net *net, struct nlattr *nla,
 
 	return 0;
 
+err_destroy_output:
+	dst_cache_destroy(&slwt->cache_output);
 err_destroy_input:
 	dst_cache_destroy(&slwt->cache_input);
 err_free_newts:
@@ -743,29 +776,46 @@ static int seg6_fill_encap_info(struct sk_buff *skb,
 				struct lwtunnel_state *lwtstate)
 {
 	struct seg6_iptunnel_encap *tuninfo = seg6_encap_lwtunnel(lwtstate);
+	struct seg6_lwt *slwt = seg6_lwt_lwtunnel(lwtstate);
 
 	if (nla_put_srh(skb, SEG6_IPTUNNEL_SRH, tuninfo))
 		return -EMSGSIZE;
 
+	if (!ipv6_addr_any(&slwt->tunsrc) &&
+	    nla_put_in6_addr(skb, SEG6_IPTUNNEL_SRC, &slwt->tunsrc))
+		return -EMSGSIZE;
+
 	return 0;
 }
 
 static int seg6_encap_nlsize(struct lwtunnel_state *lwtstate)
 {
 	struct seg6_iptunnel_encap *tuninfo = seg6_encap_lwtunnel(lwtstate);
+	struct seg6_lwt *slwt = seg6_lwt_lwtunnel(lwtstate);
+	int nlsize;
+
+	nlsize = nla_total_size(SEG6_IPTUN_ENCAP_SIZE(tuninfo));
+
+	if (!ipv6_addr_any(&slwt->tunsrc))
+		nlsize += nla_total_size(sizeof(slwt->tunsrc));
 
-	return nla_total_size(SEG6_IPTUN_ENCAP_SIZE(tuninfo));
+	return nlsize;
 }
 
 static int seg6_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b)
 {
 	struct seg6_iptunnel_encap *a_hdr = seg6_encap_lwtunnel(a);
 	struct seg6_iptunnel_encap *b_hdr = seg6_encap_lwtunnel(b);
+	struct seg6_lwt *a_slwt = seg6_lwt_lwtunnel(a);
+	struct seg6_lwt *b_slwt = seg6_lwt_lwtunnel(b);
 	int len = SEG6_IPTUN_ENCAP_SIZE(a_hdr);
 
 	if (len != SEG6_IPTUN_ENCAP_SIZE(b_hdr))
 		return 1;
 
+	if (!ipv6_addr_equal(&a_slwt->tunsrc, &b_slwt->tunsrc))
+		return 1;
+
 	return memcmp(a_hdr, b_hdr, len);
 }
 
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 6a7b8abb0477..201347b4e127 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -21,6 +21,7 @@
 #include <linux/types.h>
 #include <linux/socket.h>
 #include <linux/sockios.h>
+#include <linux/string.h>
 #include <linux/net.h>
 #include <linux/in6.h>
 #include <linux/netdevice.h>
@@ -256,9 +257,9 @@ static struct ip_tunnel *ipip6_tunnel_locate(struct net *net,
 	if (parms->name[0]) {
 		if (!dev_valid_name(parms->name))
 			goto failed;
-		strscpy(name, parms->name, IFNAMSIZ);
+		strscpy(name, parms->name);
 	} else {
-		strcpy(name, "sit%d");
+		strscpy(name, "sit%d");
 	}
 	dev = alloc_netdev(sizeof(*t), name, NET_NAME_UNKNOWN,
 			   ipip6_tunnel_setup);
@@ -275,7 +276,7 @@ static struct ip_tunnel *ipip6_tunnel_locate(struct net *net,
 		goto failed_free;
 
 	if (!parms->name[0])
-		strcpy(parms->name, dev->name);
+		strscpy(parms->name, dev->name);
 
 	return nt;
 
@@ -308,7 +309,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
 	struct ip_tunnel_prl kprl, *kp;
 	struct ip_tunnel_prl_entry *prl;
 	unsigned int cmax, c = 0, ca, len;
-	int ret = 0;
+	int ret;
 
 	if (dev == dev_to_sit_net(dev)->fb_tunnel_dev)
 		return -EINVAL;
@@ -1442,7 +1443,7 @@ static int ipip6_tunnel_init(struct net_device *dev)
 	int err;
 
 	tunnel->dev = dev;
-	strcpy(tunnel->parms.name, dev->name);
+	strscpy(tunnel->parms.name, dev->name);
 
 	ipip6_tunnel_bind_dev(dev);
 
@@ -1863,7 +1864,7 @@ static int __net_init sit_init_net(struct net *net)
 	ipip6_tunnel_clone_6rd(sitn->fb_tunnel_dev, sitn);
 	ipip6_fb_tunnel_init(sitn->fb_tunnel_dev);
 
-	strcpy(t->parms.name, sitn->fb_tunnel_dev->name);
+	strscpy(t->parms.name, sitn->fb_tunnel_dev->name);
 	return 0;
 
 err_reg_dev:
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 823bf4fff963..2c3f7a739709 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -105,7 +105,7 @@ static void inet6_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb)
 	}
 }
 
-static union tcp_seq_and_ts_off
+INDIRECT_CALLABLE_SCOPE union tcp_seq_and_ts_off
 tcp_v6_init_seq_and_ts_off(const struct net *net, const struct sk_buff *skb)
 {
 	return secure_tcpv6_seq_and_ts_off(net,
@@ -325,7 +325,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr_unsized *uaddr,
 						       inet->inet_dport);
 		if (!tp->write_seq)
 			WRITE_ONCE(tp->write_seq, st.seq);
-		tp->tsoffset = st.ts_off;
+		WRITE_ONCE(tp->tsoffset, st.ts_off);
 	}
 
 	if (tcp_fastopen_defer_connect(sk, &err))
@@ -348,6 +348,21 @@ failure:
 	return err;
 }
 
+static struct dst_entry *inet6_csk_update_pmtu(struct sock *sk, u32 mtu)
+{
+	struct flowi6 *fl6 = &inet_sk(sk)->cork.fl.u.ip6;
+	struct dst_entry *dst;
+
+	dst = inet6_csk_route_socket(sk, fl6);
+
+	if (IS_ERR(dst))
+		return NULL;
+	dst->ops->update_pmtu(dst, sk, NULL, mtu, true);
+
+	dst = inet6_csk_route_socket(sk, fl6);
+	return IS_ERR(dst) ? NULL : dst;
+}
+
 static void tcp_v6_mtu_reduced(struct sock *sk)
 {
 	struct dst_entry *dst;
@@ -1581,7 +1596,7 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 
 		sock_rps_save_rxhash(sk, skb);
 		sk_mark_napi_id(sk, skb);
-		if (dst) {
+		if (dst && unlikely(dst != skb_dst(skb))) {
 			if (sk->sk_rx_dst_ifindex != skb->skb_iif ||
 			    INDIRECT_CALL_1(dst->ops->check, ip6_dst_check,
 					    dst, sk->sk_rx_dst_cookie) == NULL) {
@@ -1779,7 +1794,8 @@ lookup:
 		}
 		refcounted = true;
 		nsk = NULL;
-		if (!tcp_filter(sk, skb, &drop_reason)) {
+		drop_reason = tcp_filter(sk, skb);
+		if (!drop_reason) {
 			th = (const struct tcphdr *)skb->data;
 			hdr = ipv6_hdr(skb);
 			tcp_v6_fill_cb(skb, hdr, th);
@@ -1840,7 +1856,8 @@ process:
 
 	nf_reset_ct(skb);
 
-	if (tcp_filter(sk, skb, &drop_reason))
+	drop_reason = tcp_filter(sk, skb);
+	if (drop_reason)
 		goto discard_and_relse;
 
 	th = (const struct tcphdr *)skb->data;
@@ -1862,7 +1879,8 @@ process:
 	if (!sock_owned_by_user(sk)) {
 		ret = tcp_v6_do_rcv(sk, skb);
 	} else {
-		if (tcp_add_backlog(sk, skb, &drop_reason))
+		drop_reason = tcp_add_backlog(sk, skb);
+		if (drop_reason)
 			goto discard_and_relse;
 	}
 	bh_unlock_sock(sk);
@@ -1957,56 +1975,12 @@ do_time_wait:
 	goto discard_it;
 }
 
-void tcp_v6_early_demux(struct sk_buff *skb)
-{
-	struct net *net = dev_net_rcu(skb->dev);
-	const struct ipv6hdr *hdr;
-	const struct tcphdr *th;
-	struct sock *sk;
-
-	if (skb->pkt_type != PACKET_HOST)
-		return;
-
-	if (!pskb_may_pull(skb, skb_transport_offset(skb) + sizeof(struct tcphdr)))
-		return;
-
-	hdr = ipv6_hdr(skb);
-	th = tcp_hdr(skb);
-
-	if (th->doff < sizeof(struct tcphdr) / 4)
-		return;
-
-	/* Note : We use inet6_iif() here, not tcp_v6_iif() */
-	sk = __inet6_lookup_established(net, &hdr->saddr, th->source,
-					&hdr->daddr, ntohs(th->dest),
-					inet6_iif(skb), inet6_sdif(skb));
-	if (sk) {
-		skb->sk = sk;
-		skb->destructor = sock_edemux;
-		if (sk_fullsock(sk)) {
-			struct dst_entry *dst = rcu_dereference(sk->sk_rx_dst);
-
-			if (dst)
-				dst = dst_check(dst, sk->sk_rx_dst_cookie);
-			if (dst &&
-			    sk->sk_rx_dst_ifindex == skb->skb_iif)
-				skb_dst_set_noref(skb, dst);
-		}
-	}
-}
-
 static struct timewait_sock_ops tcp6_timewait_sock_ops = {
 	.twsk_obj_size = sizeof(struct tcp6_timewait_sock),
 };
 
-INDIRECT_CALLABLE_SCOPE void tcp_v6_send_check(struct sock *sk, struct sk_buff *skb)
-{
-	__tcp_v6_send_check(skb, &sk->sk_v6_rcv_saddr, &sk->sk_v6_daddr);
-}
-
 const struct inet_connection_sock_af_ops ipv6_specific = {
 	.queue_xmit = inet6_csk_xmit,
-	.send_check = tcp_v6_send_check,
 	.rebuild_header = inet6_sk_rebuild_header,
 	.sk_rx_dst_set = inet6_sk_rx_dst_set,
 	.conn_request = tcp_v6_conn_request,
@@ -2038,7 +2012,6 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_specific = {
  */
 static const struct inet_connection_sock_af_ops ipv6_mapped = {
 	.queue_xmit = ip_queue_xmit,
-	.send_check = tcp_v4_send_check,
 	.rebuild_header = inet_sk_rebuild_header,
 	.sk_rx_dst_set = inet_sk_rx_dst_set,
 	.conn_request = tcp_v6_conn_request,
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 010b909275dd..15e032194ecc 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -37,6 +37,7 @@
 #include <trace/events/udp.h>
 
 #include <net/addrconf.h>
+#include <net/aligned_data.h>
 #include <net/ndisc.h>
 #include <net/protocol.h>
 #include <net/transp_v6.h>
@@ -57,7 +58,6 @@
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include <trace/events/skb.h>
-#include "udp_impl.h"
 
 static void udpv6_destruct_sock(struct sock *sk)
 {
@@ -65,7 +65,7 @@ static void udpv6_destruct_sock(struct sock *sk)
 	inet6_sock_destruct(sk);
 }
 
-int udpv6_init_sock(struct sock *sk)
+static int udpv6_init_sock(struct sock *sk)
 {
 	int res = udp_lib_init_sock(sk);
 
@@ -95,7 +95,7 @@ u32 udp6_ehashfn(const struct net *net,
 			       udp6_ehash_secret + net_hash_mix(net));
 }
 
-int udp_v6_get_port(struct sock *sk, unsigned short snum)
+static int udp_v6_get_port(struct sock *sk, unsigned short snum)
 {
 	unsigned int hash2_nulladdr =
 		ipv6_portaddr_hash(sock_net(sk), &in6addr_any, snum);
@@ -107,7 +107,7 @@ int udp_v6_get_port(struct sock *sk, unsigned short snum)
 	return udp_lib_get_port(sk, snum, hash2_nulladdr);
 }
 
-void udp_v6_rehash(struct sock *sk)
+static void udp_v6_rehash(struct sock *sk)
 {
 	u16 new_hash = ipv6_portaddr_hash(sock_net(sk),
 					  &sk->sk_v6_rcv_saddr,
@@ -127,10 +127,11 @@ void udp_v6_rehash(struct sock *sk)
 	udp_lib_rehash(sk, new_hash, new_hash4);
 }
 
-static int compute_score(struct sock *sk, const struct net *net,
-			 const struct in6_addr *saddr, __be16 sport,
-			 const struct in6_addr *daddr, unsigned short hnum,
-			 int dif, int sdif)
+static __always_inline int
+compute_score(struct sock *sk, const struct net *net,
+	      const struct in6_addr *saddr, __be16 sport,
+	      const struct in6_addr *daddr, unsigned short hnum,
+	      int dif, int sdif)
 {
 	int bound_dev_if, score;
 	struct inet_sock *inet;
@@ -260,8 +261,8 @@ rescore:
 				continue;
 
 			/* compute_score is too long of a function to be
-			 * inlined, and calling it again here yields
-			 * measurable overhead for some
+			 * inlined twice here, and calling it uninlined
+			 * here yields measurable overhead for some
 			 * workloads. Work around it by jumping
 			 * backwards to rescore 'result'.
 			 */
@@ -344,9 +345,9 @@ static void udp6_hash4(struct sock *sk)
 struct sock *__udp6_lib_lookup(const struct net *net,
 			       const struct in6_addr *saddr, __be16 sport,
 			       const struct in6_addr *daddr, __be16 dport,
-			       int dif, int sdif, struct udp_table *udptable,
-			       struct sk_buff *skb)
+			       int dif, int sdif, struct sk_buff *skb)
 {
+	struct udp_table *udptable = net->ipv4.udp_table;
 	unsigned short hnum = ntohs(dport);
 	struct udp_hslot *hslot2;
 	struct sock *result, *sk;
@@ -370,8 +371,7 @@ struct sock *__udp6_lib_lookup(const struct net *net,
 		goto done;
 
 	/* Lookup redirect from BPF */
-	if (static_branch_unlikely(&bpf_sk_lookup_enabled) &&
-	    udptable == net->ipv4.udp_table) {
+	if (static_branch_unlikely(&bpf_sk_lookup_enabled)) {
 		sk = inet6_lookup_run_sk_lookup(net, IPPROTO_UDP, skb, sizeof(struct udphdr),
 						saddr, sport, daddr, hnum, dif,
 						udp6_ehashfn);
@@ -407,14 +407,13 @@ done:
 EXPORT_SYMBOL_GPL(__udp6_lib_lookup);
 
 static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
-					  __be16 sport, __be16 dport,
-					  struct udp_table *udptable)
+					  __be16 sport, __be16 dport)
 {
 	const struct ipv6hdr *iph = ipv6_hdr(skb);
 
 	return __udp6_lib_lookup(dev_net(skb->dev), &iph->saddr, sport,
 				 &iph->daddr, dport, inet6_iif(skb),
-				 inet6_sdif(skb), udptable, skb);
+				 inet6_sdif(skb), skb);
 }
 
 struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
@@ -422,14 +421,12 @@ struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 {
 	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
 	const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
-	struct net *net = dev_net(skb->dev);
 	int iif, sdif;
 
 	inet6_get_iif_sdif(skb, &iif, &sdif);
 
-	return __udp6_lib_lookup(net, &iph->saddr, sport,
-				 &iph->daddr, dport, iif,
-				 sdif, net->ipv4.udp_table, NULL);
+	return __udp6_lib_lookup(dev_net(skb->dev), &iph->saddr, sport,
+				 &iph->daddr, dport, iif, sdif, NULL);
 }
 
 /* Must be called under rcu_read_lock().
@@ -441,8 +438,7 @@ struct sock *udp6_lib_lookup(const struct net *net, const struct in6_addr *saddr
 {
 	struct sock *sk;
 
-	sk = __udp6_lib_lookup(net, saddr, sport, daddr, dport,
-				dif, 0, net->ipv4.udp_table, NULL);
+	sk = __udp6_lib_lookup(net, saddr, sport, daddr, dport, dif, 0, NULL);
 	if (sk && !refcount_inc_not_zero(&sk->sk_refcnt))
 		sk = NULL;
 	return sk;
@@ -464,24 +460,23 @@ static int udp6_skb_len(struct sk_buff *skb)
  * return it, otherwise we block.
  */
 
+INDIRECT_CALLABLE_SCOPE
 int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
-		  int flags, int *addr_len)
+		  int flags)
 {
+	int off, is_udp4, err, peeking = flags & MSG_PEEK;
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct inet_sock *inet = inet_sk(sk);
-	struct sk_buff *skb;
-	unsigned int ulen, copied;
-	int off, err, peeking = flags & MSG_PEEK;
-	int is_udplite = IS_UDPLITE(sk);
 	struct udp_mib __percpu *mib;
 	bool checksum_valid = false;
-	int is_udp4;
+	unsigned int ulen, copied;
+	struct sk_buff *skb;
 
 	if (flags & MSG_ERRQUEUE)
-		return ipv6_recv_error(sk, msg, len, addr_len);
+		return ipv6_recv_error(sk, msg, len);
 
 	if (np->rxopt.bits.rxpmtu && READ_ONCE(np->rxpmtu))
-		return ipv6_recv_rxpmtu(sk, msg, len, addr_len);
+		return ipv6_recv_rxpmtu(sk, msg, len);
 
 try_again:
 	off = sk_peek_offset(sk, flags);
@@ -499,14 +494,10 @@ try_again:
 	is_udp4 = (skb->protocol == htons(ETH_P_IP));
 	mib = __UDPX_MIB(sk, is_udp4);
 
-	/*
-	 * If checksum is needed at all, try to do it while copying the
-	 * data. If the data is truncated, or if we only want a partial
-	 * coverage checksum (UDP-Lite), do it before the copy.
+	/* If checksum is needed at all, try to do it while copying the
+	 * data. If the data is truncated, do it before the copy.
 	 */
-
-	if (copied < ulen || peeking ||
-	    (is_udplite && UDP_SKB_CB(skb)->partial_cov)) {
+	if (copied < ulen || peeking) {
 		checksum_valid = udp_skb_csum_unnecessary(skb) ||
 				!__udp_lib_checksum_complete(skb);
 		if (!checksum_valid)
@@ -553,11 +544,11 @@ try_again:
 					ipv6_iface_scope_id(&sin6->sin6_addr,
 							    inet6_iif(skb));
 		}
-		*addr_len = sizeof(*sin6);
+		msg->msg_namelen = sizeof(*sin6);
 
 		BPF_CGROUP_RUN_PROG_UDP6_RECVMSG_LOCK(sk,
 						      (struct sockaddr *)sin6,
-						      addr_len);
+						      &msg->msg_namelen);
 	}
 
 	if (udp_test_bit(GRO_ENABLED, sk))
@@ -648,7 +639,6 @@ static int __udp6_lib_err_encap_no_sk(struct sk_buff *skb,
 static struct sock *__udp6_lib_err_encap(struct net *net,
 					 const struct ipv6hdr *hdr, int offset,
 					 struct udphdr *uh,
-					 struct udp_table *udptable,
 					 struct sock *sk,
 					 struct sk_buff *skb,
 					 struct inet6_skb_parm *opt,
@@ -679,7 +669,7 @@ static struct sock *__udp6_lib_err_encap(struct net *net,
 
 	sk = __udp6_lib_lookup(net, &hdr->daddr, uh->source,
 			       &hdr->saddr, uh->dest,
-			       inet6_iif(skb), 0, udptable, skb);
+			       inet6_iif(skb), 0, skb);
 	if (sk) {
 		up = udp_sk(sk);
 
@@ -700,29 +690,28 @@ out:
 	return sk;
 }
 
-int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
-		   u8 type, u8 code, int offset, __be32 info,
-		   struct udp_table *udptable)
+static int udpv6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
+		     u8 type, u8 code, int offset, __be32 info)
 {
-	struct ipv6_pinfo *np;
 	const struct ipv6hdr *hdr = (const struct ipv6hdr *)skb->data;
-	const struct in6_addr *saddr = &hdr->saddr;
-	const struct in6_addr *daddr = seg6_get_daddr(skb, opt) ? : &hdr->daddr;
-	struct udphdr *uh = (struct udphdr *)(skb->data+offset);
+	struct udphdr *uh = (struct udphdr *)(skb->data + offset);
+	const struct in6_addr *saddr, *daddr;
+	struct net *net = dev_net(skb->dev);
+	struct ipv6_pinfo *np;
 	bool tunnel = false;
 	struct sock *sk;
 	int harderr;
 	int err;
-	struct net *net = dev_net(skb->dev);
 
+	daddr = seg6_get_daddr(skb, opt) ? : &hdr->daddr;
+	saddr = &hdr->saddr;
 	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
-			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
+			       inet6_iif(skb), inet6_sdif(skb), NULL);
 
 	if (!sk || READ_ONCE(udp_sk(sk)->encap_type)) {
 		/* No socket for error: try tunnels before discarding */
 		if (static_branch_unlikely(&udpv6_encap_needed_key)) {
-			sk = __udp6_lib_err_encap(net, hdr, offset, uh,
-						  udptable, sk, skb,
+			sk = __udp6_lib_err_encap(net, hdr, offset, uh, sk, skb,
 						  opt, type, code, info);
 			if (!sk)
 				return 0;
@@ -794,20 +783,18 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 
 	rc = __udp_enqueue_schedule_skb(sk, skb);
 	if (rc < 0) {
-		int is_udplite = IS_UDPLITE(sk);
 		enum skb_drop_reason drop_reason;
+		struct net *net = sock_net(sk);
 
 		/* Note that an ENOMEM error is charged twice */
 		if (rc == -ENOMEM) {
-			UDP6_INC_STATS(sock_net(sk),
-				       UDP_MIB_RCVBUFERRORS, is_udplite);
+			UDP6_INC_STATS(net, UDP_MIB_RCVBUFERRORS);
 			drop_reason = SKB_DROP_REASON_SOCKET_RCVBUFF;
 		} else {
-			UDP6_INC_STATS(sock_net(sk),
-				       UDP_MIB_MEMERRORS, is_udplite);
+			UDP6_INC_STATS(net, UDP_MIB_MEMERRORS);
 			drop_reason = SKB_DROP_REASON_PROTO_MEM;
 		}
-		UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
+		UDP6_INC_STATS(net, UDP_MIB_INERRORS);
 		trace_udp_fail_queue_rcv_skb(rc, sk, skb);
 		sk_skb_reason_drop(sk, skb, drop_reason);
 		return -1;
@@ -816,19 +803,11 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	return 0;
 }
 
-static __inline__ int udpv6_err(struct sk_buff *skb,
-				struct inet6_skb_parm *opt, u8 type,
-				u8 code, int offset, __be32 info)
-{
-	return __udp6_lib_err(skb, opt, type, code, offset, info,
-			      dev_net(skb->dev)->ipv4.udp_table);
-}
-
 static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
 {
 	enum skb_drop_reason drop_reason = SKB_DROP_REASON_NOT_SPECIFIED;
 	struct udp_sock *up = udp_sk(sk);
-	int is_udplite = IS_UDPLITE(sk);
+	struct net *net = sock_net(sk);
 
 	if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) {
 		drop_reason = SKB_DROP_REASON_XFRM_POLICY;
@@ -862,9 +841,7 @@ static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
 
 			ret = encap_rcv(sk, skb);
 			if (ret <= 0) {
-				__UDP6_INC_STATS(sock_net(sk),
-						 UDP_MIB_INDATAGRAMS,
-						 is_udplite);
+				__UDP6_INC_STATS(net, UDP_MIB_INDATAGRAMS);
 				return -ret;
 			}
 		}
@@ -872,31 +849,13 @@ static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
 		/* FALLTHROUGH -- it's a UDP Packet */
 	}
 
-	/*
-	 * UDP-Lite specific tests, ignored on UDP sockets (see net/ipv4/udp.c).
-	 */
-	if (unlikely(udp_test_bit(UDPLITE_RECV_CC, sk) &&
-		     UDP_SKB_CB(skb)->partial_cov)) {
-		u16 pcrlen = READ_ONCE(up->pcrlen);
-
-		if (pcrlen == 0) { /* full coverage was set */
-			net_dbg_ratelimited("UDPLITE6: partial coverage %d while full coverage %d requested\n",
-					    UDP_SKB_CB(skb)->cscov, skb->len);
-			goto drop;
-		}
-		if (UDP_SKB_CB(skb)->cscov < pcrlen) {
-			net_dbg_ratelimited("UDPLITE6: coverage %d too small, need min %d\n",
-					    UDP_SKB_CB(skb)->cscov, pcrlen);
-			goto drop;
-		}
-	}
-
 	prefetch(&sk->sk_rmem_alloc);
 	if (rcu_access_pointer(sk->sk_filter) &&
 	    udp_lib_checksum_complete(skb))
 		goto csum_error;
 
-	if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr), &drop_reason))
+	drop_reason = sk_filter_trim_cap(sk, skb, sizeof(struct udphdr));
+	if (drop_reason)
 		goto drop;
 
 	udp_csum_pull_header(skb);
@@ -907,9 +866,9 @@ static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
 
 csum_error:
 	drop_reason = SKB_DROP_REASON_UDP_CSUM;
-	__UDP6_INC_STATS(sock_net(sk), UDP_MIB_CSUMERRORS, is_udplite);
+	__UDP6_INC_STATS(net, UDP_MIB_CSUMERRORS);
 drop:
-	__UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
+	__UDP6_INC_STATS(net, UDP_MIB_INERRORS);
 	udp_drops_inc(sk);
 	sk_skb_reason_drop(sk, skb, drop_reason);
 	return -1;
@@ -976,19 +935,26 @@ static void udp6_csum_zero_error(struct sk_buff *skb)
  * so we don't need to lock the hashes.
  */
 static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
-		const struct in6_addr *saddr, const struct in6_addr *daddr,
-		struct udp_table *udptable, int proto)
+				    const struct in6_addr *saddr,
+				    const struct in6_addr *daddr)
 {
-	struct sock *sk, *first = NULL;
+	struct udp_table *udptable = net->ipv4.udp_table;
 	const struct udphdr *uh = udp_hdr(skb);
+	unsigned int hash2, hash2_any, offset;
 	unsigned short hnum = ntohs(uh->dest);
-	struct udp_hslot *hslot = udp_hashslot(udptable, net, hnum);
-	unsigned int offset = offsetof(typeof(*sk), sk_node);
-	unsigned int hash2 = 0, hash2_any = 0, use_hash2 = (hslot->count > 10);
-	int dif = inet6_iif(skb);
+	struct sock *sk, *first = NULL;
 	int sdif = inet6_sdif(skb);
+	int dif = inet6_iif(skb);
 	struct hlist_node *node;
+	struct udp_hslot *hslot;
 	struct sk_buff *nskb;
+	bool use_hash2;
+
+	hash2_any = 0;
+	hash2 = 0;
+	hslot = udp_hashslot(udptable, net, hnum);
+	use_hash2 = hslot->count > 10;
+	offset = offsetof(typeof(*sk), sk_node);
 
 	if (use_hash2) {
 		hash2_any = ipv6_portaddr_hash(net, &in6addr_any, hnum) &
@@ -1016,10 +982,8 @@ start_lookup:
 		nskb = skb_clone(skb, GFP_ATOMIC);
 		if (unlikely(!nskb)) {
 			udp_drops_inc(sk);
-			__UDP6_INC_STATS(net, UDP_MIB_RCVBUFERRORS,
-					 IS_UDPLITE(sk));
-			__UDP6_INC_STATS(net, UDP_MIB_INERRORS,
-					 IS_UDPLITE(sk));
+			__UDP6_INC_STATS(net, UDP_MIB_RCVBUFERRORS);
+			__UDP6_INC_STATS(net, UDP_MIB_INERRORS);
 			continue;
 		}
 
@@ -1038,8 +1002,7 @@ start_lookup:
 			consume_skb(skb);
 	} else {
 		kfree_skb(skb);
-		__UDP6_INC_STATS(net, UDP_MIB_IGNOREDMULTI,
-				 proto == IPPROTO_UDPLITE);
+		__UDP6_INC_STATS(net, UDP_MIB_IGNOREDMULTI);
 	}
 	return 0;
 }
@@ -1058,7 +1021,7 @@ static int udp6_unicast_rcv_skb(struct sock *sk, struct sk_buff *skb,
 {
 	int ret;
 
-	if (inet_get_convert_csum(sk) && uh->check && !IS_UDPLITE(sk))
+	if (inet_get_convert_csum(sk) && uh->check)
 		skb_checksum_try_convert(skb, IPPROTO_UDP, ip6_compute_pseudo);
 
 	ret = udpv6_queue_rcv_skb(sk, skb);
@@ -1069,8 +1032,39 @@ static int udp6_unicast_rcv_skb(struct sock *sk, struct sk_buff *skb,
 	return 0;
 }
 
-int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
-		   int proto)
+static int udp6_csum_init(struct sk_buff *skb, struct udphdr *uh)
+{
+	int err;
+
+	/* To support RFC 6936 (allow zero checksum in UDP/IPV6 for tunnels)
+	 * we accept a checksum of zero here. When we find the socket
+	 * for the UDP packet we'll check if that socket allows zero checksum
+	 * for IPv6 (set by socket option).
+	 *
+	 * Note, we are only interested in != 0 or == 0, thus the
+	 * force to int.
+	 */
+	err = (__force int)skb_checksum_init_zero_check(skb, IPPROTO_UDP, uh->check,
+							ip6_compute_pseudo);
+	if (err)
+		return err;
+
+	if (skb->ip_summed == CHECKSUM_COMPLETE && !skb->csum_valid) {
+		/* If SW calculated the value, we know it's bad */
+		if (skb->csum_complete_sw)
+			return 1;
+
+		/* HW says the value is bad. Let's validate that.
+		 * skb->csum is no longer the full packet checksum,
+		 * so don't treat is as such.
+		 */
+		skb_checksum_complete_unset(skb);
+	}
+
+	return 0;
+}
+
+INDIRECT_CALLABLE_SCOPE int udpv6_rcv(struct sk_buff *skb)
 {
 	enum skb_drop_reason reason = SKB_DROP_REASON_NOT_SPECIFIED;
 	const struct in6_addr *saddr, *daddr;
@@ -1091,26 +1085,23 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 	if (ulen > skb->len)
 		goto short_packet;
 
-	if (proto == IPPROTO_UDP) {
-		/* UDP validates ulen. */
+	/* Check for jumbo payload */
+	if (ulen == 0)
+		ulen = skb->len;
 
-		/* Check for jumbo payload */
-		if (ulen == 0)
-			ulen = skb->len;
+	if (ulen < sizeof(*uh))
+		goto short_packet;
 
-		if (ulen < sizeof(*uh))
+	if (ulen < skb->len) {
+		if (pskb_trim_rcsum(skb, ulen))
 			goto short_packet;
 
-		if (ulen < skb->len) {
-			if (pskb_trim_rcsum(skb, ulen))
-				goto short_packet;
-			saddr = &ipv6_hdr(skb)->saddr;
-			daddr = &ipv6_hdr(skb)->daddr;
-			uh = udp_hdr(skb);
-		}
+		saddr = &ipv6_hdr(skb)->saddr;
+		daddr = &ipv6_hdr(skb)->daddr;
+		uh = udp_hdr(skb);
 	}
 
-	if (udp6_csum_init(skb, uh, proto))
+	if (udp6_csum_init(skb, uh))
 		goto csum_error;
 
 	/* Check if the socket is already available, e.g. due to early demux */
@@ -1142,11 +1133,10 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 	 * Multicast receive code
 	 */
 	if (ipv6_addr_is_multicast(daddr))
-		return __udp6_lib_mcast_deliver(net, skb,
-				saddr, daddr, udptable, proto);
+		return __udp6_lib_mcast_deliver(net, skb, saddr, daddr);
 
 	/* Unicast */
-	sk = __udp6_lib_lookup_skb(skb, uh->source, uh->dest, udptable);
+	sk = __udp6_lib_lookup_skb(skb, uh->source, uh->dest);
 	if (sk) {
 		if (!uh->check && !udp_get_no_check6_rx(sk))
 			goto report_csum_error;
@@ -1165,7 +1155,7 @@ no_sk:
 	if (udp_lib_checksum_complete(skb))
 		goto csum_error;
 
-	__UDP6_INC_STATS(net, UDP_MIB_NOPORTS, proto == IPPROTO_UDPLITE);
+	__UDP6_INC_STATS(net, UDP_MIB_NOPORTS);
 	icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_PORT_UNREACH, 0);
 
 	sk_skb_reason_drop(sk, skb, reason);
@@ -1174,8 +1164,7 @@ no_sk:
 short_packet:
 	if (reason == SKB_DROP_REASON_NOT_SPECIFIED)
 		reason = SKB_DROP_REASON_PKT_TOO_SMALL;
-	net_dbg_ratelimited("UDP%sv6: short packet: From [%pI6c]:%u %d/%d to [%pI6c]:%u\n",
-			    proto == IPPROTO_UDPLITE ? "-Lite" : "",
+	net_dbg_ratelimited("UDPv6: short packet: From [%pI6c]:%u %d/%d to [%pI6c]:%u\n",
 			    saddr, ntohs(uh->source),
 			    ulen, skb->len,
 			    daddr, ntohs(uh->dest));
@@ -1186,9 +1175,9 @@ report_csum_error:
 csum_error:
 	if (reason == SKB_DROP_REASON_NOT_SPECIFIED)
 		reason = SKB_DROP_REASON_UDP_CSUM;
-	__UDP6_INC_STATS(net, UDP_MIB_CSUMERRORS, proto == IPPROTO_UDPLITE);
+	__UDP6_INC_STATS(net, UDP_MIB_CSUMERRORS);
 discard:
-	__UDP6_INC_STATS(net, UDP_MIB_INERRORS, proto == IPPROTO_UDPLITE);
+	__UDP6_INC_STATS(net, UDP_MIB_INERRORS);
 	sk_skb_reason_drop(sk, skb, reason);
 	return 0;
 }
@@ -1262,11 +1251,6 @@ void udp_v6_early_demux(struct sk_buff *skb)
 	}
 }
 
-INDIRECT_CALLABLE_SCOPE int udpv6_rcv(struct sk_buff *skb)
-{
-	return __udp6_lib_rcv(skb, dev_net(skb->dev)->ipv4.udp_table, IPPROTO_UDP);
-}
-
 /*
  * Throw away all pending data and cancel the corking. Socket is locked.
  */
@@ -1371,13 +1355,13 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
 			   struct inet_cork *cork)
 {
 	struct sock *sk = skb->sk;
+	int offset, len, datalen;
 	struct udphdr *uh;
 	int err = 0;
-	int is_udplite = IS_UDPLITE(sk);
-	__wsum csum = 0;
-	int offset = skb_transport_offset(skb);
-	int len = skb->len - offset;
-	int datalen = len - sizeof(*uh);
+
+	offset = skb_transport_offset(skb);
+	len = skb->len - offset;
+	datalen = len - sizeof(*uh);
 
 	/*
 	 * Create a UDP header
@@ -1404,7 +1388,7 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
 			kfree_skb(skb);
 			return -EINVAL;
 		}
-		if (is_udplite || dst_xfrm(skb_dst(skb))) {
+		if (dst_xfrm(skb_dst(skb))) {
 			kfree_skb(skb);
 			return -EIO;
 		}
@@ -1420,21 +1404,18 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
 		}
 	}
 
-	if (is_udplite)
-		csum = udplite_csum(skb);
-	else if (udp_get_no_check6_tx(sk)) { /* UDP csum disabled */
+	if (udp_get_no_check6_tx(sk)) { /* UDP csum disabled */
 		skb->ip_summed = CHECKSUM_NONE;
 		goto send;
 	} else if (skb->ip_summed == CHECKSUM_PARTIAL) { /* UDP hardware csum */
 csum_partial:
 		udp6_hwcsum_outgoing(sk, skb, &fl6->saddr, &fl6->daddr, len);
 		goto send;
-	} else
-		csum = udp_csum(skb);
+	}
 
 	/* add protocol-dependent pseudo-header */
 	uh->check = csum_ipv6_magic(&fl6->saddr, &fl6->daddr,
-				    len, fl6->flowi6_proto, csum);
+				    len, IPPROTO_UDP, udp_csum(skb));
 	if (uh->check == 0)
 		uh->check = CSUM_MANGLED_0;
 
@@ -1442,13 +1423,11 @@ send:
 	err = ip6_send_skb(skb);
 	if (unlikely(err)) {
 		if (err == -ENOBUFS && !inet6_test_bit(RECVERR6, sk)) {
-			UDP6_INC_STATS(sock_net(sk),
-				       UDP_MIB_SNDBUFERRORS, is_udplite);
+			UDP6_INC_STATS(sock_net(sk), UDP_MIB_SNDBUFERRORS);
 			err = 0;
 		}
 	} else {
-		UDP6_INC_STATS(sock_net(sk),
-			       UDP_MIB_OUTDATAGRAMS, is_udplite);
+		UDP6_INC_STATS(sock_net(sk), UDP_MIB_OUTDATAGRAMS);
 	}
 	return err;
 }
@@ -1476,27 +1455,26 @@ out:
 
 int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 {
-	struct ipv6_txoptions opt_space;
-	struct udp_sock *up = udp_sk(sk);
-	struct inet_sock *inet = inet_sk(sk);
-	struct ipv6_pinfo *np = inet6_sk(sk);
+	int corkreq = udp_test_bit(CORK, sk) || msg->msg_flags & MSG_MORE;
 	DECLARE_SOCKADDR(struct sockaddr_in6 *, sin6, msg->msg_name);
-	struct in6_addr *daddr, *final_p, final;
-	struct ipv6_txoptions *opt = NULL;
 	struct ipv6_txoptions *opt_to_free = NULL;
+	struct in6_addr *daddr, *final_p, final;
 	struct ip6_flowlabel *flowlabel = NULL;
+	struct inet_sock *inet = inet_sk(sk);
+	struct ipv6_pinfo *np = inet6_sk(sk);
+	struct ipv6_txoptions *opt = NULL;
+	struct udp_sock *up = udp_sk(sk);
+	struct ipv6_txoptions opt_space;
+	int addr_len = msg->msg_namelen;
 	struct inet_cork_full cork;
-	struct flowi6 *fl6 = &cork.fl.u.ip6;
-	struct dst_entry *dst;
 	struct ipcm6_cookie ipc6;
-	int addr_len = msg->msg_namelen;
 	bool connected = false;
+	struct dst_entry *dst;
+	struct flowi6 *fl6;
 	int ulen = len;
-	int corkreq = udp_test_bit(CORK, sk) || msg->msg_flags & MSG_MORE;
 	int err;
-	int is_udplite = IS_UDPLITE(sk);
-	int (*getfrag)(void *, char *, int, int, int, struct sk_buff *);
 
+	fl6 = &cork.fl.u.ip6;
 	ipcm6_init_sk(&ipc6, sk);
 	ipc6.gso_size = READ_ONCE(up->gso_size);
 
@@ -1555,7 +1533,6 @@ do_udp_sendmsg:
 	if (len > INT_MAX - sizeof(struct udphdr))
 		return -EMSGSIZE;
 
-	getfrag = is_udplite ? udplite_getfrag : ip_generic_getfrag;
 	if (READ_ONCE(up->pending)) {
 		if (READ_ONCE(up->pending) == AF_INET)
 			return udp_sendmsg(sk, msg, len);
@@ -1657,7 +1634,7 @@ do_udp_sendmsg:
 	opt = ipv6_fixup_options(&opt_space, opt);
 	ipc6.opt = opt;
 
-	fl6->flowi6_proto = sk->sk_protocol;
+	fl6->flowi6_proto = IPPROTO_UDP;
 	fl6->flowi6_mark = ipc6.sockc.mark;
 	fl6->daddr = *daddr;
 	if (ipv6_addr_any(&fl6->saddr) && !ipv6_addr_any(&np->saddr))
@@ -1724,7 +1701,7 @@ back_from_confirm:
 	if (!corkreq) {
 		struct sk_buff *skb;
 
-		skb = ip6_make_skb(sk, getfrag, msg, ulen,
+		skb = ip6_make_skb(sk, ip_generic_getfrag, msg, ulen,
 				   sizeof(struct udphdr), &ipc6,
 				   dst_rt6_info(dst),
 				   msg->msg_flags, &cork);
@@ -1750,8 +1727,9 @@ back_from_confirm:
 
 do_append_data:
 	up->len += ulen;
-	err = ip6_append_data(sk, getfrag, msg, ulen, sizeof(struct udphdr),
-			      &ipc6, fl6, dst_rt6_info(dst),
+	err = ip6_append_data(sk, ip_generic_getfrag, msg, ulen,
+			      sizeof(struct udphdr), &ipc6, fl6,
+			      dst_rt6_info(dst),
 			      corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags);
 	if (err)
 		udp_v6_flush_pending_frames(sk);
@@ -1778,10 +1756,9 @@ out_no_dst:
 	 * things). We could add another new stat but at least for now that
 	 * seems like overkill.
 	 */
-	if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
-		UDP6_INC_STATS(sock_net(sk),
-			       UDP_MIB_SNDBUFERRORS, is_udplite);
-	}
+	if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags))
+		UDP6_INC_STATS(sock_net(sk), UDP_MIB_SNDBUFERRORS);
+
 	return err;
 
 do_confirm:
@@ -1808,7 +1785,7 @@ static void udpv6_splice_eof(struct socket *sock)
 	release_sock(sk);
 }
 
-void udpv6_destroy_sock(struct sock *sk)
+static void udpv6_destroy_sock(struct sock *sk)
 {
 	struct udp_sock *up = udp_sk(sk);
 	lock_sock(sk);
@@ -1836,20 +1813,20 @@ void udpv6_destroy_sock(struct sock *sk)
 /*
  * Socket option code for UDP
  */
-int udpv6_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
-		     unsigned int optlen)
+static int udpv6_setsockopt(struct sock *sk, int level, int optname,
+			    sockptr_t optval, unsigned int optlen)
 {
-	if (level == SOL_UDP || level == SOL_UDPLITE || level == SOL_SOCKET)
+	if (level == SOL_UDP || level == SOL_SOCKET)
 		return udp_lib_setsockopt(sk, level, optname,
 					  optval, optlen,
 					  udp_v6_push_pending_frames);
 	return ipv6_setsockopt(sk, level, optname, optval, optlen);
 }
 
-int udpv6_getsockopt(struct sock *sk, int level, int optname,
-		     char __user *optval, int __user *optlen)
+static int udpv6_getsockopt(struct sock *sk, int level, int optname,
+			    char __user *optval, int __user *optlen)
 {
-	if (level == SOL_UDP || level == SOL_UDPLITE)
+	if (level == SOL_UDP)
 		return udp_lib_getsockopt(sk, level, optname, optval, optlen);
 	return ipv6_getsockopt(sk, level, optname, optval, optlen);
 }
@@ -1857,7 +1834,7 @@ int udpv6_getsockopt(struct sock *sk, int level, int optname,
 /* ------------------------------------------------------------------------ */
 #ifdef CONFIG_PROC_FS
 
-int udp6_seq_show(struct seq_file *seq, void *v)
+static int udp6_seq_show(struct seq_file *seq, void *v)
 {
 	if (v == SEQ_START_TOKEN) {
 		seq_puts(seq, IPV6_SEQ_DGRAM_HEADER);
@@ -1872,17 +1849,15 @@ int udp6_seq_show(struct seq_file *seq, void *v)
 	return 0;
 }
 
-const struct seq_operations udp6_seq_ops = {
+static const struct seq_operations udp6_seq_ops = {
 	.start = udp_seq_start,
 	.next = udp_seq_next,
 	.stop = udp_seq_stop,
 	.show = udp6_seq_show,
 };
-EXPORT_SYMBOL(udp6_seq_ops);
 
 static struct udp_seq_afinfo udp6_seq_afinfo = {
 	.family = AF_INET6,
-	.udp_table = NULL,
 };
 
 int __net_init udp6_proc_init(struct net *net)
@@ -1934,7 +1909,6 @@ struct proto udpv6_prot = {
 	.sysctl_rmem_offset = offsetof(struct net, ipv4.sysctl_udp_rmem_min),
 	.obj_size = sizeof(struct udp6_sock),
 	.ipv6_pinfo_offset = offsetof(struct udp6_sock, inet6),
-	.h.udp_table = NULL,
 	.diag_destroy = udp_abort,
 };
 
diff --git a/net/ipv6/udp_impl.h b/net/ipv6/udp_impl.h
deleted file mode 100644
index 8a406be25a3a..000000000000
--- a/net/ipv6/udp_impl.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _UDP6_IMPL_H
-#define _UDP6_IMPL_H
-#include <net/aligned_data.h>
-#include <net/udp.h>
-#include <net/udplite.h>
-#include <net/protocol.h>
-#include <net/addrconf.h>
-#include <net/inet_common.h>
-#include <net/transp_v6.h>
-
-int __udp6_lib_rcv(struct sk_buff *, struct udp_table *, int);
-int __udp6_lib_err(struct sk_buff *, struct inet6_skb_parm *, u8, u8, int,
-		   __be32, struct udp_table *);
-
-int udpv6_init_sock(struct sock *sk);
-int udp_v6_get_port(struct sock *sk, unsigned short snum);
-void udp_v6_rehash(struct sock *sk);
-
-int udpv6_getsockopt(struct sock *sk, int level, int optname,
-		     char __user *optval, int __user *optlen);
-int udpv6_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
-		     unsigned int optlen);
-int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
-int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
-		  int *addr_len);
-void udpv6_destroy_sock(struct sock *sk);
-
-#ifdef CONFIG_PROC_FS
-int udp6_seq_show(struct seq_file *seq, void *v);
-#endif
-#endif /* _UDP6_IMPL_H */
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index e003b8494dc0..778afc7453ce 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -128,8 +128,7 @@ static struct sock *udp6_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
 	inet6_get_iif_sdif(skb, &iif, &sdif);
 
 	return __udp6_lib_lookup(net, &iph->saddr, sport,
-				 &iph->daddr, dport, iif,
-				 sdif, net->ipv4.udp_table, NULL);
+				 &iph->daddr, dport, iif, sdif, NULL);
 }
 
 struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
diff --git a/net/ipv6/udplite.c b/net/ipv6/udplite.c
deleted file mode 100644
index e867721cda4d..000000000000
--- a/net/ipv6/udplite.c
+++ /dev/null
@@ -1,139 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * UDPLITEv6 An implementation of the UDP-Lite protocol over IPv6.
- * See also net/ipv4/udplite.c
- *
- * Authors: Gerrit Renker <gerrit@erg.abdn.ac.uk>
- *
- * Changes:
- * Fixes:
- */
-#define pr_fmt(fmt) "UDPLite6: " fmt
-
-#include <linux/export.h>
-#include <linux/proc_fs.h>
-#include "udp_impl.h"
-
-static int udplitev6_sk_init(struct sock *sk)
-{
-	pr_warn_once("UDP-Lite is deprecated and scheduled to be removed in 2025, "
-		     "please contact the netdev mailing list\n");
-	return udpv6_init_sock(sk);
-}
-
-static int udplitev6_rcv(struct sk_buff *skb)
-{
-	return __udp6_lib_rcv(skb, &udplite_table, IPPROTO_UDPLITE);
-}
-
-static int udplitev6_err(struct sk_buff *skb,
-			 struct inet6_skb_parm *opt,
-			 u8 type, u8 code, int offset, __be32 info)
-{
-	return __udp6_lib_err(skb, opt, type, code, offset, info,
-			      &udplite_table);
-}
-
-static const struct inet6_protocol udplitev6_protocol = {
-	.handler = udplitev6_rcv,
-	.err_handler = udplitev6_err,
-	.flags = INET6_PROTO_NOPOLICY|INET6_PROTO_FINAL,
-};
-
-struct proto udplitev6_prot = {
-	.name = "UDPLITEv6",
-	.owner = THIS_MODULE,
-	.close = udp_lib_close,
-	.connect = ip6_datagram_connect,
-	.disconnect = udp_disconnect,
-	.ioctl = udp_ioctl,
-	.init = udplitev6_sk_init,
-	.destroy = udpv6_destroy_sock,
-	.setsockopt = udpv6_setsockopt,
-	.getsockopt = udpv6_getsockopt,
-	.sendmsg = udpv6_sendmsg,
-	.recvmsg = udpv6_recvmsg,
-	.hash = udp_lib_hash,
-	.unhash = udp_lib_unhash,
-	.rehash = udp_v6_rehash,
-	.get_port = udp_v6_get_port,
-
-	.memory_allocated = &net_aligned_data.udp_memory_allocated,
-	.per_cpu_fw_alloc = &udp_memory_per_cpu_fw_alloc,
-
-	.sysctl_mem = sysctl_udp_mem,
-	.sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_udp_wmem_min),
-	.sysctl_rmem_offset = offsetof(struct net, ipv4.sysctl_udp_rmem_min),
-	.obj_size = sizeof(struct udp6_sock),
-	.ipv6_pinfo_offset = offsetof(struct udp6_sock, inet6),
-	.h.udp_table = &udplite_table,
-};
-
-static struct inet_protosw udplite6_protosw = {
-	.type = SOCK_DGRAM,
-	.protocol = IPPROTO_UDPLITE,
-	.prot = &udplitev6_prot,
-	.ops = &inet6_dgram_ops,
-	.flags = INET_PROTOSW_PERMANENT,
-};
-
-int __init udplitev6_init(void)
-{
-	int ret;
-
-	ret = inet6_add_protocol(&udplitev6_protocol, IPPROTO_UDPLITE);
-	if (ret)
-		goto out;
-
-	ret = inet6_register_protosw(&udplite6_protosw);
-	if (ret)
-		goto out_udplitev6_protocol;
-out:
-	return ret;
-
-out_udplitev6_protocol:
-	inet6_del_protocol(&udplitev6_protocol, IPPROTO_UDPLITE);
-	goto out;
-}
-
-void udplitev6_exit(void)
-{
-	inet6_unregister_protosw(&udplite6_protosw);
-	inet6_del_protocol(&udplitev6_protocol, IPPROTO_UDPLITE);
-}
-
-#ifdef CONFIG_PROC_FS
-static struct udp_seq_afinfo udplite6_seq_afinfo = {
-	.family = AF_INET6,
-	.udp_table = &udplite_table,
-};
-
-static int __net_init udplite6_proc_init_net(struct net *net)
-{
-	if (!proc_create_net_data("udplite6", 0444, net->proc_net,
-				  &udp6_seq_ops, sizeof(struct udp_iter_state),
-				  &udplite6_seq_afinfo))
-		return -ENOMEM;
-	return 0;
-}
-
-static void __net_exit udplite6_proc_exit_net(struct net *net)
-{
-	remove_proc_entry("udplite6", net->proc_net);
-}
-
-static struct pernet_operations udplite6_net_ops = {
-	.init = udplite6_proc_init_net,
-	.exit = udplite6_proc_exit_net,
-};
-
-int __init udplite6_proc_init(void)
-{
-	return
register_pernet_subsys(&udplite6_net_ops); -} - -void udplite6_proc_exit(void) -{ - unregister_pernet_subsys(&udplite6_net_ops); -} -#endif |
