Age | Commit message (Collapse) | Author |
|
Use the UDP encap_rcv hook to cut the bit out of the rxrpc packet reception
in which a packet is placed onto the UDP receive queue and then immediately
removed again by rxrpc. Going via the queue in this manner seems like it
should be unnecessary.
This does, however, require the invention of a value to place in encap_type
as that's one of the conditions to switch packets out to the encap_rcv
hook. Possibly the value doesn't actually matter for anything other than
sockopts on the UDP socket, which aren't accessible outside of rxrpc
anyway.
This seems to cut a bit of time out of the time elapsed between each
sk_buff being timestamped and turning up in rxrpc (the final number in the
following trace excerpts). I measured this by making the rxrpc_rx_packet
trace point print the time elapsed between the skb being timestamped and
the current time (in ns), e.g.:
... 424.278721: rxrpc_rx_packet: ... ACK 25026
So doing a 512MiB DIO read from my test server, with an unmodified kernel:
N min max sum mean stddev
27605 2626 7581 7.83992e+07 2840.04 181.029
and with the patch applied:
N min max sum mean stddev
27547 1895 12165 6.77461e+07 2459.29 255.02
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix the rxrpc_data_ready() function to pick up all packets and to not miss
any. There are two problems:
(1) The sk_data_ready pointer on the UDP socket is set *after* it is
bound. This means that it's open for business before we're ready to
dequeue packets and there's a tiny window exists in which a packet can
sneak onto the receive queue, but we never know about it.
Fix this by setting the pointers on the socket prior to binding it.
(2) skb_recv_udp() will return an error (such as ENETUNREACH) if there was
an error on the transmission side, even though we set the
sk_error_report hook. Because rxrpc_data_ready() returns immediately
in such a case, it never actually removes its packet from the receive
queue.
Fix this by abstracting out the UDP dequeuing and checksumming into a
separate function that keeps hammering on skb_recv_udp() until it
returns -EAGAIN, passing the packets extracted to the remainder of the
function.
and two potential problems:
(3) It might be possible in some circumstances or in the future for
packets to be being added to the UDP receive queue whilst rxrpc is
running consuming them, so the data_ready() handler might get called
less often than once per packet.
Allow for this by fully draining the queue on each call as (2).
(4) If a packet fails the checksum check, the code currently returns after
discarding the packet without checking for more.
Allow for this by fully draining the queue on each call as (2).
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fix some refs to init_net that should've been changed to the appropriate
network namespace.
Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
|
|
When connecting SFP PHY to phylink use the detected interface.
Otherwise, the link fails to come up when the configured 'phy-mode'
differs from the SFP detected mode.
Move most of phylink_connect_phy() into __phylink_connect_phy(), and
leave phylink_connect_phy() as a wrapper. phylink_sfp_connect_phy() can
now pass the SFP detected PHY interface to __phylink_connect_phy().
This fixes 1GB SFP module link up on eth3 of the Macchiatobin board that
is configured in the DT to "2500base-x" phy-mode.
Fixes: 9525ae83959b6 ("phylink: add phylink infrastructure")
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
the be2net implementation of .ndo_tunnel_{add,del}() changes the value of
NETIF_F_GSO_UDP_TUNNEL bit in 'features' and 'hw_features', but it forgets
to call netdev_features_change(). Moreover, ethtool setting for that bit
can potentially be reverted after a tunnel is added or removed.
GSO already does software segmentation when 'hw_enc_features' is 0, even
if VXLAN offload is turned on. In addition, commit 096de2f83ebc ("benet:
stricter vxlan offloading check in be_features_check") avoids hardware
segmentation of non-VXLAN tunneled packets, or VXLAN packets having wrong
destination port. So, it's safe to avoid flipping the above feature on
addition/deletion of VXLAN tunnels.
Fixes: 630f4b70567f ("be2net: Export tunnel offloads only when a VxLAN tunnel is created")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we use raw socket as the vhost backend, a packet from virito with
gso offloading information, cannot be sent out in later validaton at
xmit path, as we did not set correct skb->protocol which is further used
for looking up the gso function.
To fix this, we set this field according to virito hdr information.
Fixes: e858fae2b0b8f4 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
Signed-off-by: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit c499696e7901 ("net: dsa: b53: Stop using dev->cpu_port
incorrectly") was a bit too trigger happy in removing the CPU port from
the VLAN membership because we rely on DSA to program the CPU port VLAN,
which it does, except it does not bother itself with tagged/untagged and
just usese untagged.
Having the CPU port "follow" the user ports tagged/untagged is not great
and does not allow for properly differentiating, so keep the CPU port
tagged in all VLANs.
Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
Fixes: c499696e7901 ("net: dsa: b53: Stop using dev->cpu_port incorrectly")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Load the respective NAT helper module if the flow uses it.
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Michael Chan says:
====================
bnxt_en: Misc. bug fixes.
4 small bug fixes related to setting firmware message enables bits, possible
memory leak when probe fails, and ring accouting when RDMA driver is loaded.
Please queue these for -stable as well. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When getting the max rings supported, get the reduced max_irqs
by the ones used by RDMA.
If the number MSIX is the limiting factor, this bug may cause the
max ring count to be higher than it should be when RDMA driver is
loaded and may result in ring allocation failures.
Fixes: 30f529473ec9 ("bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the driver probe fails, all the resources that were allocated prior
to the failure must be freed. However, hwrm dma response memory is not
getting freed.
This patch fixes the problem described above.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In HWRM_QUEUE_COS2BW_CFG request, enables field should have the bits
set only for the queue ids which are having the valid parameters.
This causes firmware to return error when the TC to hardware CoS queue
mapping is not 1:1 during DCBNL ETS setup.
Fixes: 2e8ef77ee0ff ("bnxt_en: Add TC to hardware QoS queue mapping logic.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The enables bit for VNIC was set wrong when calling the HWRM_FUNC_CFG
firmware call to reserve VNICs. This has the effect that the firmware
will keep a large number of VNICs for the PF, and having very few for
VFs. DPDK driver running on the VFs, which requires more VNICs, may not
work properly as a result.
Fixes: 674f50a5b026 ("bnxt_en: Implement new method to reserve rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
team's ndo_add_slave() acquires 'team->lock' and later tries to open the
newly enslaved device via dev_open(). This emits a 'NETDEV_UP' event
that causes the VLAN driver to add VLAN 0 on the team device. team's
ndo_vlan_rx_add_vid() will also try to acquire 'team->lock' and
deadlock.
Fix this by checking early at the enslavement function that a team
device is not being enslaved to itself.
A similar check was added to the bond driver in commit 09a89c219baf
("bonding: disallow enslaving a bond to itself").
WARNING: possible recursive locking detected
4.18.0-rc7+ #176 Not tainted
--------------------------------------------
syz-executor4/6391 is trying to acquire lock:
(____ptrval____) (&team->lock){+.+.}, at: team_vlan_rx_add_vid+0x3b/0x1e0 drivers/net/team/team.c:1868
but task is already holding lock:
(____ptrval____) (&team->lock){+.+.}, at: team_add_slave+0xdb/0x1c30 drivers/net/team/team.c:1947
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&team->lock);
lock(&team->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by syz-executor4/6391:
#0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline]
#0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4662
#1: (____ptrval____) (&team->lock){+.+.}, at: team_add_slave+0xdb/0x1c30 drivers/net/team/team.c:1947
stack backtrace:
CPU: 1 PID: 6391 Comm: syz-executor4 Not tainted 4.18.0-rc7+ #176
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
print_deadlock_bug kernel/locking/lockdep.c:1765 [inline]
check_deadlock kernel/locking/lockdep.c:1809 [inline]
validate_chain kernel/locking/lockdep.c:2405 [inline]
__lock_acquire.cold.64+0x1fb/0x486 kernel/locking/lockdep.c:3435
lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
__mutex_lock_common kernel/locking/mutex.c:757 [inline]
__mutex_lock+0x176/0x1820 kernel/locking/mutex.c:894
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:909
team_vlan_rx_add_vid+0x3b/0x1e0 drivers/net/team/team.c:1868
vlan_add_rx_filter_info+0x14a/0x1d0 net/8021q/vlan_core.c:210
__vlan_vid_add net/8021q/vlan_core.c:278 [inline]
vlan_vid_add+0x63e/0x9d0 net/8021q/vlan_core.c:308
vlan_device_event.cold.12+0x2a/0x2f net/8021q/vlan.c:381
notifier_call_chain+0x180/0x390 kernel/notifier.c:93
__raw_notifier_call_chain kernel/notifier.c:394 [inline]
raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1735
call_netdevice_notifiers net/core/dev.c:1753 [inline]
dev_open+0x173/0x1b0 net/core/dev.c:1433
team_port_add drivers/net/team/team.c:1219 [inline]
team_add_slave+0xa8b/0x1c30 drivers/net/team/team.c:1948
do_set_master+0x1c9/0x220 net/core/rtnetlink.c:2248
do_setlink+0xba4/0x3e10 net/core/rtnetlink.c:2382
rtnl_setlink+0x2a9/0x400 net/core/rtnetlink.c:2636
rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4665
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2455
rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4683
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
netlink_sendmsg+0xa18/0xfd0 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:642 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:652
___sys_sendmsg+0x7fd/0x930 net/socket.c:2126
__sys_sendmsg+0x11d/0x290 net/socket.c:2164
__do_sys_sendmsg net/socket.c:2173 [inline]
__se_sys_sendmsg net/socket.c:2171 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2171
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x456b29
Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9706bf8c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f9706bf96d4 RCX: 0000000000456b29
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000004
RBP: 00000000009300a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000004d3548 R14: 00000000004c8227 R15: 0000000000000000
Fixes: 87002b03baab ("net: introduce vlan_vid_[add/del] and use them instead of direct [add/kill]_vid ndo calls")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-and-tested-by: syzbot+bd051aba086537515cdb@syzkaller.appspotmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cancel pending work before freeing smsc75xx private data structure
during binding. This fixes the following crash in the driver:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
IP: mutex_lock+0x2b/0x3f
<snipped>
Workqueue: events smsc75xx_deferred_multicast_write [smsc75xx]
task: ffff8caa83e85700 task.stack: ffff948b80518000
RIP: 0010:mutex_lock+0x2b/0x3f
<snipped>
Call Trace:
smsc75xx_deferred_multicast_write+0x40/0x1af [smsc75xx]
process_one_work+0x18d/0x2fc
worker_thread+0x1a2/0x269
? pr_cont_work+0x58/0x58
kthread+0xfa/0x10a
? pr_cont_work+0x58/0x58
? rcu_read_unlock_sched_notrace+0x48/0x48
ret_from_fork+0x22/0x40
Signed-off-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Just three small fixes:
* fix use-after-free in regulatory code
* fix rx-mgmt key flag in AP mode (mac80211)
* fix wireless extensions compat code memory leak
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ido Schimmel says:
====================
mlxsw: Couple of fixes
First patch works around an hardware issue in Spectrum-2 where a field
indicating the event type is always set to the same value. Since there
are only two event types and they are reported using different queues,
we can use the queue number to derive the event type.
Second patch prevents a router interface (RIF) leakage when a VLAN
device is deleted from on top a bridge device.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit 602b74eda813 ("mlxsw: spectrum_switchdev: Do not leak RIFs
when removing bridge") I handled the case where RIFs created for VLAN
devices were not properly cleaned up when their real device (a bridge)
was removed.
However, I forgot to handle the case of the VLAN device itself being
removed. Do so now when the VLAN device is being unlinked from its real
device.
Fixes: 99f44bb3527b ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reported-by: Artem Shvorin <art@qrator.net>
Tested-by: Artem Shvorin <art@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Due to a hardware issue in Spectrum-2, the field event_type of the event
queue element (EQE) has become reserved. It was used to distinguish between
command interface completion events and completion events.
Use queue number to determine event type, as command interface completion
events are always received on EQ0 and mlxsw driver maps completion events
to EQ1.
Fixes: c3ab435466d5 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
David writes:
"Networking fixes:
1) Prefix length validation in xfrm layer, from Steffen Klassert.
2) TX status reporting fix in mac80211, from Andrei Otcheretianski.
3) Fix hangs due to TX_DROP in mac80211, from Bob Copeland.
4) Fix DMA error regression in b43, from Larry Finger.
5) Add input validation to xenvif_set_hash_mapping(), from Jan Beulich.
6) SMMU unmapping fix in hns driver, from Yunsheng Lin.
7) Bluetooh crash in unpairing on SMP, from Matias Karhumaa.
8) WoL handling fixes in the phy layer, from Heiner Kallweit.
9) Fix deadlock in bonding, from Mahesh Bandewar.
10) Fill ttl inherit infor in vxlan driver, from Hangbin Liu.
11) Fix TX timeouts during netpoll, from Michael Chan.
12) RXRPC layer fixes from David Howells.
13) Another batch of ndo_poll_controller() removals to deal with
excessive resource consumption during load. From Eric Dumazet.
14) Fix a specific TIPC failure secnario, from LUU Duc Canh.
15) Really disable clocks in r8169 during suspend so that low
power states can actually be reached.
16) Fix SYN backlog lockdep issue in tcp and dccp, from Eric Dumazet.
17) Fix RCU locking in netpoll SKB send, which shows up in bonding,
from Dave Jones.
18) Fix TX stalls in r8169, from Heiner Kallweit.
19) Fix locksup in nfp due to control message storms, from Jakub
Kicinski.
20) Various rmnet bug fixes from Subash Abhinov Kasiviswanathan and
Sean Tranchetti.
21) Fix use after free in ip_cmsg_recv_dstaddr(), from Eric Dumazet."
* gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (122 commits)
ixgbe: check return value of napi_complete_done()
sctp: fix fall-through annotation
r8169: always autoneg on resume
ipv4: fix use-after-free in ip_cmsg_recv_dstaddr()
net: qualcomm: rmnet: Fix incorrect allocation flag in receive path
net: qualcomm: rmnet: Fix incorrect allocation flag in transmit
net: qualcomm: rmnet: Skip processing loopback packets
net: systemport: Fix wake-up interrupt race during resume
rtnl: limit IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES to 4096
bonding: fix warning message
inet: make sure to grab rcu_read_lock before using ireq->ireq_opt
nfp: avoid soft lockups under control message storm
declance: Fix continuation with the adapter identification message
net: fec: fix rare tx timeout
r8169: fix network stalls due to missing bit TXCFG_AUTO_FIFO
tun: napi flags belong to tfile
tun: initialize napi_mutex unconditionally
tun: remove unused parameters
bond: take rcu lock in netpoll_send_skb_on_dev
rtnetlink: Fail dump if target netnsid is invalid
...
|
|
The NIC driver should only enable interrupts when napi_complete_done()
returns true. This patch adds the check for ixgbe.
Cc: stable@vger.kernel.org # 4.10+
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Shuah writes:
"kselftest fixes for 4.19-rc7
This fixes update for 4.19-rc7 consists one fix to rseq test to
prevent it from seg-faulting when compiled with -fpie."
* tag 'linux-kselftest-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
rseq/selftests: fix parametrized test with -fpie
|
|
Replace "fallthru" with a proper "fall through" annotation.
This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Mauro writes:
"media fixes for v4.19-rc6"
* tag 'media/v4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: v4l: event: Prevent freeing event subscriptions while accessed
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Jiri writes:
"HID fixes:
- hantick touchpad fix from Anisse Astier
- device ID addition for Ice Lake mobile from Srinivas Pandruvada
- touchscreen resume fix for certain i2c-hid driven devices from Hans
de Goede"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: intel-ish-hid: Enable Ice Lake mobile
HID: i2c-hid: Remove RESEND_REPORT_DESCR quirk and its handling
HID: i2c-hid: disable runtime PM operations on hantick touchpad
|
|
Al writes:
"xattrs regression fix from Andreas; sat in -next for quite a while."
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sysfs: Do not return POSIX ACL xattrs via listxattr
|
|
The event subscriptions are added to the subscribed event list while
holding a spinlock, but that lock is subsequently released while still
accessing the subscription object. This makes it possible to unsubscribe
the event --- and freeing the subscription object's memory --- while
the subscription object is simultaneously accessed.
Prevent this by adding a mutex to serialise the event subscription and
unsubscription. This also gives a guarantee to the callback ops that the
add op has returned before the del op is called.
This change also results in making the elems field less special:
subscriptions are only added to the event list once they are fully
initialised.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: stable@vger.kernel.org # for 4.14 and up
Fixes: c3b5b0241f62 ("V4L/DVB: V4L: Events: Add backend")
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
This affects at least versions 25 and 33, so assume all cards are broken
and just renegotiate by default.
Fixes: 10bc6a6042c9 ("r8169: fix autoneg issue on resume with RTL8168E")
Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Caching ip_hdr(skb) before a call to pskb_may_pull() is buggy,
do not do it.
Fixes: 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-10-01
This pull request includes some fixes to mlx5 driver,
Please pull and let me know if there's any problem.
For -stable v4.11:
"6e0a4a23c59a ('net/mlx5: E-Switch, Fix out of bound access when setting vport rate')"
For -stable v4.18:
"98d6627c372a ('net/mlx5e: Set vlan masks for all offloaded TC rules')"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Subash Abhinov Kasiviswanathan says:
====================
net: qualcomm: rmnet: Updates 2018-10-02
This series is a set of small fixes for rmnet driver
Patch 1 is a fix for a scenario reported by syzkaller
Patch 2 & 3 are fixes for incorrect allocation flags
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The incoming skb needs to be reallocated in case the headroom
is not sufficient to adjust the ethernet header. This allocation
needs to be atomic otherwise it results in this splat
[<600601bb>] ___might_sleep+0x185/0x1a3
[<603f6314>] ? _raw_spin_unlock_irqrestore+0x0/0x27
[<60069bb0>] ? __wake_up_common_lock+0x95/0xd1
[<600602b0>] __might_sleep+0xd7/0xe2
[<60065598>] ? enqueue_task_fair+0x112/0x209
[<600eea13>] __kmalloc_track_caller+0x5d/0x124
[<600ee9b6>] ? __kmalloc_track_caller+0x0/0x124
[<602696d5>] __kmalloc_reserve.isra.34+0x30/0x7e
[<603f629b>] ? _raw_spin_lock_irqsave+0x0/0x3d
[<6026b744>] pskb_expand_head+0xbf/0x310
[<6025ca6a>] rmnet_rx_handler+0x7e/0x16b
[<6025c9ec>] ? rmnet_rx_handler+0x0/0x16b
[<6027ad0c>] __netif_receive_skb_core+0x301/0x96f
[<60033c17>] ? set_signals+0x0/0x40
[<6027bbcb>] __netif_receive_skb+0x24/0x8e
Fixes: 74692caf1b0b ("net: qualcomm: rmnet: Process packets over ethernet")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The incoming skb needs to be reallocated in case the headroom
is not sufficient to add the MAP header. This allocation needs to
be atomic otherwise it results in the following splat
[32805.801456] BUG: sleeping function called from invalid context
[32805.841141] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[32805.904773] task: ffffffd7c5f62280 task.stack: ffffff80464a8000
[32805.910851] pc : ___might_sleep+0x180/0x188
[32805.915143] lr : ___might_sleep+0x180/0x188
[32806.131520] Call trace:
[32806.134041] ___might_sleep+0x180/0x188
[32806.137980] __might_sleep+0x50/0x84
[32806.141653] __kmalloc_track_caller+0x80/0x3bc
[32806.146215] __kmalloc_reserve+0x3c/0x88
[32806.150241] pskb_expand_head+0x74/0x288
[32806.154269] rmnet_egress_handler+0xb0/0x1d8
[32806.162239] rmnet_vnd_start_xmit+0xc8/0x13c
[32806.166627] dev_hard_start_xmit+0x148/0x280
[32806.181181] sch_direct_xmit+0xa4/0x198
[32806.185125] __qdisc_run+0x1f8/0x310
[32806.188803] net_tx_action+0x23c/0x26c
[32806.192655] __do_softirq+0x220/0x408
[32806.196420] do_softirq+0x4c/0x70
Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RMNET RX handler was processing invalid packets that were
originally sent on the real device and were looped back via
dev_loopback_xmit(). This was detected using syzkaller.
Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The AON_PM_L2 is normally used to trigger and identify the source of a
wake-up event. Since the RX_SYS clock is no longer turned off, we also
have an interrupt being sent to the SYSTEMPORT INTRL_2_0 controller, and
that interrupt remains active up until the magic packet detector is
disabled which happens much later during the driver resumption.
The race happens if we have a CPU that is entering the SYSTEMPORT
INTRL2_0 handler during resume, and another CPU has managed to clear the
wake-up interrupt during bcm_sysport_resume_from_wol(). In that case, we
have the first CPU stuck in the interrupt handler with an interrupt
cause that has been cleared under its feet, and so we keep returning
IRQ_NONE and we never make any progress.
This was not a problem before because we would always turn off the
RX_SYS clock during WoL, so the SYSTEMPORT INTRL2_0 would also be turned
off as well, thus not latching the interrupt.
The fix is to make sure we do not enable either the MPD or
BRCM_TAG_MATCH interrupts since those are redundant with what the
AON_PM_L2 interrupt controller already processes and they would cause
such a race to occur.
Fixes: bb9051a2b230 ("net: systemport: Add support for WAKE_FILTER")
Fixes: 83e82f4c706b ("net: systemport: add Wake-on-LAN support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.19
First, and also hopefully the last, set of fixes for 4.19. All small
but still important fixes
mt76x0
* fix a bug when a virtual interface is removed multiple times
b43
* fix DMA error related regression with proprietary firmware
iwlwifi
* fix an oops which was a regression in v4.19-rc1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We have an impressive number of syzkaller bugs that are linked
to the fact that syzbot was able to create a networking device
with millions of TX (or RX) queues.
Let's limit the number of RX/TX queues to 4096, this really should
cover all known cases.
A separate patch will add various cond_resched() in the loops
handling sysfs entries at device creation and dismantle.
Tested:
lpaa6:~# ip link add gre-4097 numtxqueues 4097 numrxqueues 4097 type ip6gretap
RTNETLINK answers: Invalid argument
lpaa6:~# time ip link add gre-4096 numtxqueues 4096 numrxqueues 4096 type ip6gretap
real 0m0.180s
user 0m0.000s
sys 0m0.107s
Fixes: 76ff5cc91935 ("rtnl: allow to specify number of rx and tx queues on device creation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RX queue config for bonding master could be different from its slave
device(s). With the commit 6a9e461f6fe4 ("bonding: pass link-local
packets to bonding master also."), the packet is reinjected into stack
with skb->dev as bonding master. This potentially triggers the
message:
"bondX received packet on queue Y, but number of RX queues is Z"
whenever the queue that packet is received on is higher than the
numrxqueues on bonding master (Y > Z).
Fixes: 6a9e461f6fe4 ("bonding: pass link-local packets to bonding master also.")
Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Timer handlers do not imply rcu_read_lock(), so my recent fix
triggered a LOCKDEP warning when SYNACK is retransmit.
Lets add rcu_read_lock()/rcu_read_unlock() pairs around ireq->ireq_opt
usages instead of guessing what is done by callers, since it is
not worth the pain.
Get rid of ireq_opt_deref() helper since it hides the logic
without real benefit, since it is now a standard rcu_dereference().
Fixes: 1ad98e9d1bdf ("tcp/dccp: fix lockdep issue when SYN is backlogged")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When FW floods the driver with control messages try to exit the cmsg
processing loop every now and then to avoid soft lockups. Cmsg
processing is generally very lightweight so 512 seems like a reasonable
budget, which should not be exceeded under normal conditions.
Fixes: 77ece8d5f196 ("nfp: add control vNIC datapath")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix a commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
continuation lines") regression with the `declance' driver, which caused
the adapter identification message to be split between two lines, e.g.:
declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA
, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.
Address that properly, by printing identification with a single call,
making the messages now look like:
declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines")
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During certain heavy network loads TX could time out
with TX ring dump.
TX is sometimes never restarted after reaching
"tx_stop_threshold" because function "fec_enet_tx_queue"
only tests the first queue.
In addition the TX timeout callback function failed to
recover because it also operated only on the first queue.
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Bartlomiej writes:
"fbdev fixes for v4.19-rc7:
- fix OMAPFB_MEMORY_READ ioctl to not leak kernel memory in omapfb driver
(Tomi Valkeinen)
- add missing prepare/unprepare clock operations in pxa168fb driver
(Lubomir Rintel)
- add nobgrt option in efifb driver to disable ACPI BGRT logo restore
(Hans de Goede)
- fix spelling mistake in fall-through annotation in stifb driver
(Gustavo A. R. Silva)
- fix URL for uvesafb repository in the documentation (Adam Jackson)"
* tag 'fbdev-v4.19-rc7' of https://github.com/bzolnier/linux:
video/fbdev/stifb: Fix spelling mistake in fall-through annotation
uvesafb: Fix URLs in the documentation
efifb: BGRT: Add nobgrt option
fbdev/omapfb: fix omapfb_memory_read infoleak
pxa168fb: prepare the clock
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Ulf writes:
"MMC core:
- Fixup conversion of debounce time to/from ms/us
MMC host:
- sdhi: Fixup whitelisting for Gen3 types"
* tag 'mmc-v4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: slot-gpio: Fix debounce time to use miliseconds again
mmc: core: Fix debounce time to use microseconds
mmc: sdhi: sys_dmac: check for all Gen3 types when whitelisting
|
|
Some of the chip-specific hw_start functions set bit TXCFG_AUTO_FIFO
in register TxConfig. The original patch changed the order of some
calls resulting in these changes being overwritten by
rtl_set_tx_config_registers() in rtl_hw_start(). This eventually
resulted in network stalls especially under high load.
Analyzing the chip-specific hw_start functions all chip version from
34, with the exception of version 39, need this bit set.
This patch moves setting this bit to rtl_set_tx_config_registers().
Fixes: 4fd48c4ac0a0 ("r8169: move common initializations to tp->hw_start")
Reported-by: Ortwin Glück <odi@odi.ch>
Reported-by: David Arendt <admin@prnet.org>
Root-caused-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Tested-by: Tony Atkinson <tatkinson@linux.com>
Tested-by: David Arendt <admin@prnet.org>
Tested-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Eric Dumazet says:
====================
tun: address two syzbot reports
Small changes addressing races discovered by syzbot.
First patch is a cleanup.
Second patch moves a mutex init sooner.
Third patch makes sure each tfile gets its own napi enable flags.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since tun->flags might be shared by multiple tfile structures,
it is better to make sure tun_get_user() is using the flags
for the current tfile.
Presence of the READ_ONCE() in tun_napi_frags_enabled() gave a hint
of what could happen, but we need something stronger to please
syzbot.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 13647 Comm: syz-executor5 Not tainted 4.19.0-rc5+ #59
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:dev_gro_receive+0x132/0x2720 net/core/dev.c:5427
Code: 48 c1 ea 03 80 3c 02 00 0f 85 6e 20 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 6e 10 49 8d bd d0 00 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 59 20 00 00 4d 8b a5 d0 00 00 00 31 ff 41 81 e4
RSP: 0018:ffff8801c400f410 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8618d325
RDX: 000000000000001a RSI: ffffffff86189f97 RDI: 00000000000000d0
RBP: ffff8801c400f608 R08: ffff8801c8fb4300 R09: 0000000000000000
R10: ffffed0038801ed7 R11: 0000000000000003 R12: ffff8801d327d358
R13: 0000000000000000 R14: ffff8801c16dd8c0 R15: 0000000000000004
FS: 00007fe003615700(0000) GS:ffff8801dac00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe1f3c43db8 CR3: 00000001bebb2000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
napi_gro_frags+0x3f4/0xc90 net/core/dev.c:5715
tun_get_user+0x31d5/0x42a0 drivers/net/tun.c:1922
tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1967
call_write_iter include/linux/fs.h:1808 [inline]
new_sync_write fs/read_write.c:474 [inline]
__vfs_write+0x6b8/0x9f0 fs/read_write.c:487
vfs_write+0x1fc/0x560 fs/read_write.c:549
ksys_write+0x101/0x260 fs/read_write.c:598
__do_sys_write fs/read_write.c:610 [inline]
__se_sys_write fs/read_write.c:607 [inline]
__x64_sys_write+0x73/0xb0 fs/read_write.c:607
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fe003614c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457579
RDX: 0000000000000012 RSI: 0000000020000000 RDI: 000000000000000a
RBP: 000000000072c040 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe0036156d4
R13: 00000000004c5574 R14: 00000000004d8e98 R15: 00000000ffffffff
Modules linked in:
RIP: 0010:dev_gro_receive+0x132/0x2720 net/core/dev.c:5427
Code: 48 c1 ea 03 80 3c 02 00 0f 85 6e 20 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 6e 10 49 8d bd d0 00 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 59 20 00 00 4d 8b a5 d0 00 00 00 31 ff 41 81 e4
RSP: 0018:ffff8801c400f410 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8618d325
RDX: 000000000000001a RSI: ffffffff86189f97 RDI: 00000000000000d0
RBP: ffff8801c400f608 R08: ffff8801c8fb4300 R09: 0000000000000000
R10: ffffed0038801ed7 R11: 0000000000000003 R12: ffff8801d327d358
R13: 0000000000000000 R14: ffff8801c16dd8c0 R15: 0000000000000004
FS: 00007fe003615700(0000) GS:ffff8801dac00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe1f3c43db8 CR3: 00000001bebb2000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is the first part to fix following syzbot report :
console output: https://syzkaller.appspot.com/x/log.txt?x=145378e6400000
kernel config: https://syzkaller.appspot.com/x/.config?x=443816db871edd66
dashboard link: https://syzkaller.appspot.com/bug?extid=e662df0ac1d753b57e80
Following patch is fixing the race condition, but it seems safer
to initialize this mutex at tfile creation anyway.
Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+e662df0ac1d753b57e80@syzkaller.appspotmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tun_napi_disable() and tun_napi_del() do not need
a pointer to the tun_struct
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The bonding driver lacks the rcu lock when it calls down into
netdev_lower_get_next_private_rcu from bond_poll_controller, which
results in a trace like:
WARNING: CPU: 2 PID: 179 at net/core/dev.c:6567 netdev_lower_get_next_private_rcu+0x34/0x40
CPU: 2 PID: 179 Comm: kworker/u16:15 Not tainted 4.19.0-rc5-backup+ #1
Workqueue: bond0 bond_mii_monitor
RIP: 0010:netdev_lower_get_next_private_rcu+0x34/0x40
Code: 48 89 fb e8 fe 29 63 ff 85 c0 74 1e 48 8b 45 00 48 81 c3 c0 00 00 00 48 8b 00 48 39 d8 74 0f 48 89 45 00 48 8b 40 f8 5b 5d c3 <0f> 0b eb de 31 c0 eb f5 0f 1f 40 00 0f 1f 44 00 00 48 8>
RSP: 0018:ffffc9000087fa68 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff880429614560 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 00000000ffffffff RDI: ffffffffa184ada0
RBP: ffffc9000087fa80 R08: 0000000000000001 R09: 0000000000000000
R10: ffffc9000087f9f0 R11: ffff880429798040 R12: ffff8804289d5980
R13: ffffffffa1511f60 R14: 00000000000000c8 R15: 00000000ffffffff
FS: 0000000000000000(0000) GS:ffff88042f880000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4b78fce180 CR3: 000000018180f006 CR4: 00000000001606e0
Call Trace:
bond_poll_controller+0x52/0x170
netpoll_poll_dev+0x79/0x290
netpoll_send_skb_on_dev+0x158/0x2c0
netpoll_send_udp+0x2d5/0x430
write_ext_msg+0x1e0/0x210
console_unlock+0x3c4/0x630
vprintk_emit+0xfa/0x2f0
printk+0x52/0x6e
? __netdev_printk+0x12b/0x220
netdev_info+0x64/0x80
? bond_3ad_set_carrier+0xe9/0x180
bond_select_active_slave+0x1fc/0x310
bond_mii_monitor+0x709/0x9b0
process_one_work+0x221/0x5e0
worker_thread+0x4f/0x3b0
kthread+0x100/0x140
? process_one_work+0x5e0/0x5e0
? kthread_delayed_work_timer_fn+0x90/0x90
ret_from_fork+0x24/0x30
We're also doing rcu dereferences a layer up in netpoll_send_skb_on_dev
before we call down into netpoll_poll_dev, so just take the lock there.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|