summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
15 hoursMerge tag 'mm-hotfixes-stable-2026-06-08-20-51' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "11 hotfixes. 9 are for MM. 8 are cc:stable and the remaining 3 address post-7.1 issues or aren't considered suitable for backporting. Thre's a two-patch series "mm/damon/{reclaim,lru_sort}: handle ctx allocation failures" from SeongJae Park which fixes a couple of DAMON -ENOMEM bloopers. The rest are singletons - please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2026-06-08-20-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/mincore: handle non-swap entries before !CONFIG_SWAP guard arm64: mm: call pagetable dtor when freeing hot-removed page tables mm/list_lru: drain before clearing xarray entry on reparent mm/huge_memory: use correct flags for device private PMD entry mm/damon/lru_sort: handle ctx allocation failure mm/damon/reclaim: handle ctx allocation failure zram: fix use-after-free in zram_bvec_write_partial() MAINTAINERS: update Baoquan He's email address tools headers UAPI: sync linux/taskstats.h for procacct.c mm/cma_sysfs: skip inactive CMA areas in sysfs ipc/shm: serialize orphan cleanup with shm_nattch updates
15 hoursMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Several significant bug fixes of pre-existing issues: - Missing validation on ucap fd types passed from userspace - Missing validation of HW DMA space vs userpace expected sizes in EFA queue setup - DMA corruption when using DMA block sizes >= 4G when setting up MRs in all drivers - Missing validation of CPU IDs when setting up dma handles - Missing validation of IB_MR_REREG_ACCESS when changing writability of a MR - Missing validation of received message/packet size in ISER and SRP" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/srp: bound SRP_RSP sense copy by the received length IB/isert: Reject login PDUs shorter than ISER_HEADERS_LEN RDMA: During rereg_mr ensure that REREG_ACCESS is compatible RDMA/core: Validate cpu_id against nr_cpu_ids in DMAH alloc RDMA/umem: Fix truncation for block sizes >= 4G RDMA/efa: Validate SQ ring size against max LLQ size RDMA/core: Validate the passed in fops for ib_get_ucaps()
38 hoursRDMA/srp: bound SRP_RSP sense copy by the received lengthMichael Bommarito
srp_process_rsp() copies sense data from rsp->data + resp_data_len, where resp_data_len is the full 32-bit value supplied by the SRP target and is never checked against the number of bytes actually received (wc->byte_len). The copy length is bounded to SCSI_SENSE_BUFFERSIZE, so at most 96 bytes are copied, but the source offset is not bounded. A malicious or compromised SRP target on the InfiniBand/RoCE fabric that the initiator has logged into can return an SRP_RSP with SRP_RSP_FLAG_SNSVALID set and a large resp_data_len. The receive buffer is allocated at the target-chosen max_ti_iu_len, so the source of the sense copy lands past the bytes actually received; with resp_data_len near 0xFFFFFFFF it is gigabytes past the buffer and the read faults. Copy the sense data only if it has not been truncated, that is, only if the response header, the response data, and the sense region fit within the bytes actually received; otherwise drop the sense and log. The in-tree iSER and NVMe-RDMA receive paths already bound their parse by wc->byte_len; this brings ib_srp into line with them. Fixes: aef9ec39c47f ("IB: Add SCSI RDMA Protocol (SRP) initiator") Link: https://patch.msgid.link/r/20260602220457.2542840-1-michael.bommarito@gmail.com Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
38 hoursIB/isert: Reject login PDUs shorter than ISER_HEADERS_LENMichael Bommarito
In drivers/infiniband/ulp/isert/ib_isert.c, isert_login_recv_done() computes the login request payload length as wc->byte_len minus ISER_HEADERS_LEN with no lower bound, and login_req_len is a signed int. A remote iSER initiator can post a login Send work request carrying fewer than ISER_HEADERS_LEN (76) bytes, so the subtraction underflows and login_req_len becomes negative. isert_rx_login_req() then reads that negative length back into a signed int, takes size = min(rx_buflen, MAX_KEY_VALUE_PAIRS), and because the min() is signed it keeps the negative value; the value is then passed as the memcpy() length and sign-extended to a multi-gigabyte size_t. The copy into the 8192-byte login->req_buf runs far out of bounds and faults, crashing the target node. The login phase precedes iSCSI authentication, so no credentials are required to reach this path. Reject any login PDU shorter than ISER_HEADERS_LEN before the subtraction, mirroring the existing early return on a failed work completion, so login_req_len can never go negative. The upper bound was already safe: a posted login buffer cannot deliver more than ISER_RX_PAYLOAD_SIZE, so the difference stays at or below MAX_KEY_VALUE_PAIRS and the existing min() clamps it; only the missing lower bound needs to be added. Fixes: b8d26b3be8b3 ("iser-target: Add iSCSI Extensions for RDMA (iSER) target driver") Link: https://patch.msgid.link/r/20260602194642.2273217-1-michael.bommarito@gmail.com Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
38 hoursRDMA: During rereg_mr ensure that REREG_ACCESS is compatibleJason Gunthorpe
If IB_MR_REREG_ACCESS changes from RO to RW then the umem has to be re-evaluated to ensure it is properly pinned as RW. Since the umem is hidden inside each driver's mr struct add a ib_umem_check_rereg() function that each driver has to call before processing IB_MR_REREG_ACCESS. mlx4 has to retain its duplicate ib_access_writable check because it implements IB_MR_REREG_ACCESS | IB_MR_REREG_TRANS by changing both items in place sequentially while the MR is live, so it will continue to not support this combination. Cc: stable@vger.kernel.org Fixes: b40656aa7d55 ("RDMA/umem: remove FOLL_FORCE usage") Link: https://patch.msgid.link/r/0-v1-06fb1a2d6cf5+107-rereg_access_jgg@nvidia.com Reported-by: Philip Tsukerman <philiptsukerman@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
40 hoursMerge tag 'hyperv-fixes-signed-20260607' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - MSHV driver fixes from various people (Anirudh Rayabharam, Can Peng, Dexuan Cui, Michael Kelley, Jork Loeser, Wei Liu) - Hyper-V user space tools fixes (Thorsten Blum) - Allow VMBus to be unloaded after frame buffer is flushed (Michael Kelley) * tag 'hyperv-fixes-signed-20260607' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: mshv: support 1G hugepages by passing them as 2M-aligned chunks Drivers: hv: vmbus: Improve the logic of reserving fb_mmio on Gen2 VMs mshv: use kmalloc_array in mshv_root_scheduler_init mshv: Add conditional VMBus dependency hyperv: Clean up and fix the guest ID comment in hvgdk.h drm/hyperv: During panic do VMBus unload after frame buffer is flushed Drivers: hv: vmbus: Provide option to skip VMBus unload on panic mshv: unmap debugfs stats pages on kexec mshv: clean up SynIC state on kexec for L1VH mshv: limit SynIC management to MSHV-owned resources hv: utils: replace deprecated strcpy with strscpy in kvp_register hv: utils: handle and propagate errors in kvp_register mshv: add a missing padding field
2 daysMerge tag 'regulator-fix-v7.1-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "Arnd's randconfig testing turned up a missing selection of CONFIG_IRQ_DOMAIN which was causing build breaks" * tag 'regulator-fix-v7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: mt6363: select CONFIG_IRQ_DOMAIN
3 daysMerge tag 'input-for-v7.1-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - two quirks for atkbd to deal with laptops that can not handle "deactivate" command on the keyboard PS/2 port * tag 'input-for-v7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: atkbd - skip deactivate for HONOR BCC-N's internal keyboard Input: atkbd - add DMI quirk for Lenovo Yoga Air 14 (83QK)
3 daysInput: atkbd - skip deactivate for HONOR BCC-N's internal keyboardCryolitia PukNgae
After commit 9cf6e24c9fbf17e52de9fff07f12be7565ea6d61 ("Input: atkbd - do not skip atkbd_deactivate() when skipping ATKBD_CMD_GETID"), HONOR BCC-N, aka HONOR MagicBook 14 2026's internal keyboard stops working. Adding the atkbd_deactivate_fixup quirk fixes it. DMI: HONOR BCC-N/BCC-N-PCB, BIOS 1.04 04/07/2026 Fixes: 9cf6e24c9fbf17e52de9fff07f12be7565ea6d61 ("Input: atkbd - do not skip atkbd_deactivate() when skipping ATKBD_CMD_GETID") Reported-by: Hongfei Ren <lcrhf@outlook.com> Link: https://github.com/colorcube/Linux-on-Honor-Magicbook-14-Pro/issues/1#issuecomment-4562679891 Tested-by: Hongfei Ren <lcrhf@outlook.com> Cc: stable@kernel.org Signed-off-by: Cryolitia PukNgae <cryolitia.pukngae@linux.dev> Link: https://patch.msgid.link/20260605-honor-v1-1-78e05e491193@linux.dev Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
4 daysMerge tag 'drm-fixes-2026-06-06' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Weekly drm fixes, not contributing to things settling down unfortunately. Lots of driver fixes for various bounds checks, leaks and UAF type things, i915/xe probably the most sane, amdgpu has a mix of fixes all over, then ethosu has lots of small fixes. The problem of fixing thing in private has really hit us with the change handle ioctl, and "Sima was right" and we should have disabled the ioctl, since it was only introduced a couple of kernels ago and failed to upstream it's tests in time. The patch here fixes the problems Sima identified, but disables the ioctl as well, with a list of known problems in it and a request for proper tests to be written and upstreamed. It's a niche user ioctl designed for CRIU with AMD ROCm, so I think it's fine to just disable it. Maybe this week will settle down. core: - disable the gem change handle ioctl for security reasons (plan to fix it on list later with proper test coverage) dumb-buffer: - remove strict limits on buffer geometry amdgpu: - BT.2020 fix for DCE - DC bounds checking fixes - SDMA 7.1 fix - UserQ fixes - SI fix - SMU 13 fixes - SMU 14 fixes - GC 12.1 fix - Userptr fix - GC 10.1 fix - GART fix for non-4K pages amdkfd: - UAF race fix - Fix a potential NULL pointer dereference - GC 11 buffer overflow fix for SDMA xe: - Revert removing support for unpublished NVL-S GuC - Suspend fixes related to multi-queue i915: - Fix color blob reference handling in intel_plane_state - Revert "drm/i915/backlight: Remove try_vesa_interface" ethosu: - reject unsupported NPU_OP_RESIZE - fix index of IFM region - fix weight index - fix overflows in DMA-size calculations - reject DMA commands with uninitialized length - fix OOB write in ethosu_gem_cmdstream_copy_and_validate imx: - fix kernel-doc warnings ivpu: - add overflow checks in firmware handling and get_info_ioctl v3d: - wait for pending L2T flush before cleaning caches - fix leak of vaddr - skip CSD when it has zeroed workgroups - fix ref counting in performance monitoring" * tag 'drm-fixes-2026-06-06' of https://gitlab.freedesktop.org/drm/kernel: (50 commits) drm/gem: Try to fix change_handle ioctl, attempt 4 Revert "drm/i915/backlight: Remove try_vesa_interface" accel/ethosu: fix OOB write in ethosu_gem_cmdstream_copy_and_validate() accel/ethosu: reject DMA commands with uninitialized length accel/ethosu: fix arithmetic issues in dma_length() accel/ethosu: fix wrong weight index in NPU_SET_SCALE1_LENGTH on U85 accel/ethosu: reject NPU_OP_RESIZE commands from userspace accel/ethosu: fix IFM region index out-of-bounds in command stream parser drm/v3d: Fix global performance monitor reference counting drm/xe/multi_queue: skip submit when primary queue is suspended drm/xe: Clear pending_disable before signaling suspend fence Revert "drm/xe: Skip exec queue schedule toggle if queue is idle during suspend" drm/amd/pm: smu_v14_0_0: use SoftMin for gfxclk in set_soft_freq_limited_range drm/amdgpu: Fix incorrect VRAM GART mappings on non-4K page size systems drm/amdgpu/userq: move wptr_obj cleanup in mqd_destroy drm/amdgpu: improve the userq seq BO free bit lookup drm/amdgpu/userq: remove the vital queue unmap logging drm/amdkfd: Fix buffer overflow in SDMA queue checkpoint/restore on GFX11 drm/amdkfd: fix NULL dereference in get_queue_ids() drm/amdgpu: set noretry=1 as default for GFX 10.1.x (Navi10/12/14) ...
4 daysdrm/gem: Try to fix change_handle ioctl, attempt 4Simona Vetter
[airlied: just added some comments on how to reenable] On-list because the cat is out of the bag and we're clearly not good enough to figure this out in private. The story thus far: 5e28b7b94408 ("drm: Set old handle to NULL before prime swap in change_handle") tried to fix a race condition between the gem_close and gem_change_handle ioctls, but got a few things wrong: - There's a confusion with the local variable handle, which is actually the new handle, and so the two-stage trick was actually applied to the wrong idr slot. 7164d78559b0 ("drm/gem: fix race between change_handle and handle_delete") tried to fix that by adding yet another code block, but forgot to add the error handling. Which meant we now have two paths, both kinda wrong. - dc366607c41c ("drm: Replace old pointer to new idr") tried to apply another fix, but inconsistently, again because of the handle confusion - this would be the right fix (kinda, somewhat, it's a mess) if we'd do the two-stage approach for the new handle. Except that wasn't the intent of the original fix. We also didn't have an igt merged for the original ioctl, which is a big no-go. This was attempted to address off-list in the original bugfix, and amd QA people claimed the bug was fixed now. Very clearly that's not the case. Here's my attempt to sort this out: - Rename the local variable to new_handle, the old aliasing with args->handle is just too dangerously confusing. - Merge the gem obj lookup with the two-stage idr_replace so that we avoid getting ourselves confused there. - This means we don't have a surplus temporary reference anymore, only an inherited from the idr. A concurrent gem_close on the new_handle could steal that. Fix that with the same two-stage approach create_tail uses. This is a bit overkill as documented in the comment, but I also don't trust my ability to understand this all correctly, so go with the established pattern we have from other ioctls instead for maximum paranoia. - Adjust error paths. I've tried to make the error and success paths common, because they are identical except for which handle is removed and on which we call idr_replace to (re)install the object again. But that made things messier to read, so I've left it at the more verbose version, which unfortunately hides the symmetry in the entire code flow a bit. - While at it, also replace the 7 space indent with 1 tab. And finally, because I flat out don't trust my abilities here at all anymore: - Disable the ioctl until we have the igt situation and everything else sorted out on-list and with full consensus. v2: Sashiko noticed that I didn't handle the error path for idr_replace correctly, it must be checked with IS_ERR_OR_NULL like in gem_handle_delete. So yeah, definitely should just the existing paths 1:1 because this is endless amounts of tricky. Also add the Fixes: line for the original ioctl, I forgot that too. Reported-by: DARKNAVY (@DarkNavyOrg) <vr@darknavy.com> Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> Fixes: dc366607c41c ("drm: Replace old pointer to new idr") Cc: syzbot+d7c9eed171647e421013@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Cc: Edward Adam Davis <eadavis@qq.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 5e28b7b94408 ("drm: Set old handle to NULL before prime swap in change_handle") Cc: David Francis <David.Francis@amd.com> Cc: Puttimet Thammasaeng <pwn8official@gmail.com> Cc: Christian Koenig <Christian.Koenig@amd.com> Fixes: 7164d78559b0 ("drm/gem: fix race between change_handle and handle_delete") Cc: Zhenghang Xiao <kipreyyy@gmail.com> Fixes: 5e28b7b94408 ("drm: Set old handle to NULL before prime swap in change_handle") Reviewed-by: David Francis <David.Francis@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patch.msgid.link/20260604194437.1725314-1-simona.vetter@ffwll.ch
4 daysMerge tag 'drm-intel-fixes-2026-06-05' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix color blob reference handling in intel_plane_state (Chaitanya Kumar Borah) - Revert "drm/i915/backlight: Remove try_vesa_interface" [backlight] (Suraj Kandpal) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patch.msgid.link/aiKgmwz7VGOaFXIv@linux
4 daysMerge tag 'drm-misc-fixes-2026-06-05' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: dumb-buffer: - remove strict limits on buffer geometry ethosu: - reject unsupported NPU_OP_RESIZE - fix index of IFM region - fix weight index - fix overflows in DMA-size calculations - reject DMA commands with uninitialized length - fix OOB write in ethosu_gem_cmdstream_copy_and_validate imx: - fix kernel-doc warnings ivpu: - add overflow checks in firmware handling and get_info_ioctl v3d: - wait for pending L2T flush before cleaning caches - fix leak of vaddr - skip CSD when it has zeroed workgroups - fix ref counting in performance monitoring Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260605072602.GA268798@linux.fritz.box
5 daysRevert "drm/i915/backlight: Remove try_vesa_interface"Suraj Kandpal
This reverts commit 40d2f5820951dee818d05c14677277048bd85f9f. Removing the try_vesa_interface gate caused a backlight regression on panels whose VBT correctly reports INTEL_BACKLIGHT_DISPLAY_DDI and whose PWM path is the actual backlight control, but whose DPCD optimistically advertises DP_EDP_BACKLIGHT_AUX_ENABLE_CAP / _BRIGHTNESS_AUX_SET_CAP. After the commit such panels silently bind to the VESA AUX backlight funcs; AUX writes complete but the panel ignores them, leaving brightness stuck (no-op backlight). Observed on at least KBL and TGL eDP setups. Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://patch.msgid.link/20260517024709.1016121-1-suraj.kandpal@intel.com (cherry picked from commit f30fddb4402313aa5301a74d721638d343395269) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
5 daysaccel/ethosu: fix OOB write in ethosu_gem_cmdstream_copy_and_validate()Muhammad Bilal
The command stream parsing loop increments the index variable a second time when a 64-bit command word is encountered (bit 14 set), but does not re-check the loop bound before writing the second word: for (i = 0; i < size / 4; i++) { bocmds[i] = cmds[0]; if (cmd & 0x4000) { i++; bocmds[i] = cmds[1]; /* unchecked */ } } The buffer bocmds is backed by a DMA allocation of exactly size bytes from drm_gem_dma_create(ddev, size), giving valid indices [0, size/4-1]. When i == size/4 - 1 on entry to an iteration and bit 14 of cmds[0] is set, bocmds[size/4-1] is written in bounds, i is then incremented to size/4, and bocmds[size/4] writes four bytes past the end of the allocation. Userspace controls both the buffer contents and the size argument via the ioctl, making this a userspace-triggerable heap out-of-bounds write. Fix by checking the incremented index against the buffer bound before the second write and returning -EINVAL if the buffer is too small to contain the extended command. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Link: https://patch.msgid.link/20260523190843.33977-1-meatuni001@gmail.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
5 daysaccel/ethosu: reject DMA commands with uninitialized lengthMuhammad Bilal
cmd_state_init() initializes the command state with memset(0xff), leaving dma->len at U64_MAX to signal missing setup. The only setter is NPU_SET_DMA0_LEN; if userspace omits this command and issues NPU_OP_DMA_START, dma->len remains U64_MAX. In dma_length(), a positive stride added to U64_MAX wraps to a small value. With size0 == 1, check_mul_overflow() does not trigger and dma_length() returns 0 instead of U64_MAX. The caller's U64_MAX check then passes, region_size[] stays 0, and the bounds check in ethosu_job.c is bypassed, allowing hardware to execute DMA with stale physical addresses. Fix by checking for U64_MAX at the start of dma_length() before any arithmetic, consistent with the sentinel value used throughout the driver to detect uninitialized fields. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Link: https://patch.msgid.link/20260524130319.12747-1-meatuni001@gmail.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
5 daysaccel/ethosu: fix arithmetic issues in dma_length()Muhammad Bilal
dma_length() derives DMA region usage from command stream values and updates region_size[]: len = ((len + stride[0]) * size0 + stride[1]) * size1 region_size[region] = max(..., len + dma->offset) Several arithmetic issues can corrupt the derived region size: - signed stride values may underflow when added to len - intermediate multiplications may overflow - len + dma->offset may overflow during region_size updates - dma_length() error returns were not validated by the caller region_size[] is later used by ethosu_job.c to validate command stream accesses against GEM buffer sizes. Arithmetic wraparound can therefore under-report region usage and bypass the bounds validation. Fix by validating signed additions, using overflow helpers for multiplications and offset updates, and propagating dma_length() failures to the caller. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Link: https://patch.msgid.link/20260524103710.47397-1-meatuni001@gmail.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
5 daysaccel/ethosu: fix wrong weight index in NPU_SET_SCALE1_LENGTH on U85Muhammad Bilal
On non-U65 hardware (e.g. U85), opcode 0x4093 is NPU_SET_WEIGHT2_LENGTH. The BASE handler for the same opcode correctly assigns to st.weight[2].base, but the LENGTH handler mistakenly assigns cmds[1] to st.weight[1].length instead of st.weight[2].length. This leaves weight[2].length at its initialised sentinel value of 0xffffffff and corrupts weight[1].length with the user-supplied value, breaking the software bounds-check state for both weight buffers on U85. Fix the index to match the BASE handler. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Link: https://patch.msgid.link/20260523210840.92039-3-meatuni001@gmail.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
5 daysaccel/ethosu: reject NPU_OP_RESIZE commands from userspaceMuhammad Bilal
NPU_OP_RESIZE is a U85-only command that the driver does not yet implement. The existing WARN_ON(1) placeholder fires unconditionally whenever userspace submits this command via DRM_IOCTL_ETHOSU_GEM_CREATE, causing unbounded kernel log spam. If panic_on_warn is set the kernel panics, giving any unprivileged user with access to the DRM device a trivial denial-of-service primitive. Replace the WARN_ON(1) with an explicit -EINVAL return so the ioctl rejects the command before it reaches hardware. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Link: https://patch.msgid.link/20260523210840.92039-2-meatuni001@gmail.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
5 daysaccel/ethosu: fix IFM region index out-of-bounds in command stream parserMuhammad Bilal
NPU_SET_IFM_REGION extracts the region index with param & 0x7f, giving a maximum value of 127. However region_size[] and output_region[] in struct ethosu_validated_cmdstream_info are both sized to NPU_BASEP_REGION_MAX (8), giving valid indices [0..7]. Every other region assignment in the same switch uses param & 0x7: NPU_SET_OFM_REGION: st.ofm.region = param & 0x7; NPU_SET_IFM2_REGION: st.ifm2.region = param & 0x7; NPU_SET_WEIGHT_REGION: st.weight[0].region = param & 0x7; NPU_SET_SCALE_REGION: st.scale[0].region = param & 0x7; The 0x7f mask on IFM is inconsistent and appears to be a typo. feat_matrix_length() and calc_sizes() use the region index directly as an array subscript into the kzalloc'd info struct: info->region_size[fm->region] = max(...); A userspace caller supplying NPU_SET_IFM_REGION with param > 7 causes a write up to 127*8 = 1016 bytes past the start of region_size[], corrupting adjacent kernel heap data. Fix by applying the same & 0x7 mask used by all other region assignments. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Link: https://patch.msgid.link/20260523195159.55801-1-meatuni001@gmail.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
5 daysMerge tag 'net-7.1-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Netfilter, wireless and Bluetooth. Current release - fix to a fix: - Bluetooth: MGMT: fix backward compatibility with bluetoothd which adds stray bytes to MGMT_OP_ADD_EXT_ADV_DATA Previous releases - regressions: - af_unix: fix inq_len update inaccuracy on partial read - eth: fec: fix pinctrl default state restore order on resume - wifi: iwlwifi: - mvm: don't support the reset handshake for old firmwares - pcie: simplify the resume flow if fast resume is not used, work around NIC access failures Previous releases - always broken: - Bluetooth: L2CAP: reject BR/EDR signaling packets over MTUsig - sctp: fix a couple of bugs in COOKIE_ECHO processing - sched: fix pedit partial COW leading to page cache corruption - wifi: nl80211: reject oversized EMA RNR lists - netfilter: - conntrack_irc: fix possible out-of-bounds read - bridge: make ebt_snat ARP rewrite writable - appletalk: zero-initialize aarp_entry to prevent heap info leak - ipv4: restrict IPOPT_SSRR and IPOPT_LSRR options - mptcp: fix number of bugs reported by AI scans and discovered during NVMe over MPTCP testing" * tag 'net-7.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits) Reapply "bnxt_en: bring back rtnl_lock() in the bnxt_open() path" udp: clear skb->dev before running a sockmap verdict sctp: purge outqueue on stale COOKIE-ECHO handling bonding: annotate data-races arcound churn variables net/802/mrp: fix vector attribute parsing in mrp_pdu_parse_vecattr rtase: Avoid sleeping in get_stats64() ieee802154: 6lowpan: only accept IPv6 packets in lowpan_xmit() ipv6: mcast: Fix use-after-free when processing MLD queries selftests: net: add vxlan vnifilter notification test vxlan: vnifilter: fix spurious notification on VNI update vxlan: vnifilter: send notification on VNI add rtase: Reset TX subqueue when clearing TX ring octeontx2-af: npc: Fix CPT channel mask in npc_install_flow dt-bindings: ethernet: eswin: fix hsp-sp-csr backward compatibility sctp: validate cached peer INIT chunk length in COOKIE_ECHO processing net/sched: fix pedit partial COW leading to page cache corruption vsock/vmci: fix sk_ack_backlog leak on failed handshake net: bonding: fix NULL pointer dereference in bond_do_ioctl() geneve: fix length used in GRO hint UDP checksum adjustment net: ethernet: mtk_eth_soc: Fix use-after-free in metadata dst teardown ...
5 daysMerge tag 'drm-xe-fixes-2026-06-04' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Revert removing support for unpublished NVL-S GuC (Daniele) - Suspend fixes related to multi-queue (Niranjana) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/aiHPGiPrAyHgwBZl@intel.com
6 daysReapply "bnxt_en: bring back rtnl_lock() in the bnxt_open() path"Jakub Kicinski
This reverts commit 850d9248d2eac662f869c766a598c877690c74e5. This reapplies commit 325eb217e41f ("bnxt_en: bring back rtnl_lock() in the bnxt_open() path"). Breno reports a lockdep warning in bnxt. During FW reset the driver may end up calling netif_set_real_num_tx_queues() (if queue count changes), so calls to bnxt_open() still require rtnl_lock. net/sched/sch_generic.c:1416 suspicious rcu_dereference_protected() usage! dev_qdisc_change_real_num_tx+0x54/0xe0 netif_set_real_num_tx_queues+0x4ed/0xa80 __bnxt_open_nic+0x9cb/0x3490 bnxt_open+0x1cb/0x370 bnxt_fw_reset_task+0x80d/0x1e80 process_scheduled_works+0x9c1/0x13b0 The reverted commit was just an optimization / experiment so let's go back to taking the lock. Reported-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/ah726OtFX-Qw3U-R@gmail.com Fixes: 850d9248d2ea ("Revert "bnxt_en: bring back rtnl_lock() in the bnxt_open() path"") Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260603195845.2574426-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysbonding: annotate data-races arcound churn variablesEric Dumazet
These fields are updated asynchronously by the bonding state machine in ad_churn_machine() while holding bond->mode_lock. bond_info_show_slave() and bond_fill_slave_info() read them without bond->mode_lock being held, we need to add READ_ONCE() and WRITE_ONCE() annotations. Note that AD_CHURN_MONITOR, AD_CHURN, and AD_NO_CHURN are defined exclusively in (kernel private) include/net/bond_3ad.h header. They should be moved to include/uapi/linux/if_bonding.h or userspace tools will have to hardcode their values. Fixes: 4916f2e2f3fc ("bonding: print churn state via netlink") Fixes: 14c9551a32eb ("bonding: Implement port churn-machine (AD standard 43.4.17).") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260603123514.388226-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysrtase: Avoid sleeping in get_stats64()Justin Lai
The .ndo_get_stats64 callback must not sleep because it can be called when reading /proc/net/dev. rtase_get_stats64() calls rtase_dump_tally_counter(), which polls the tally counter dump bit with read_poll_timeout(). This may sleep while waiting for the hardware counter dump to complete. Use read_poll_timeout_atomic() instead to avoid sleeping in the get_stats64() path. Fixes: 079600489960 ("rtase: Implement net_device_ops") Cc: stable@vger.kernel.org Signed-off-by: Justin Lai <justinlai0215@realtek.com> Link: https://patch.msgid.link/20260603061816.31356-1-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysvxlan: vnifilter: fix spurious notification on VNI updateAndy Roulin
When a VNI is re-added with the same attributes (e.g. same group or no group), vxlan_vni_update() sends a spurious RTM_NEWTUNNEL notification even though nothing changed. The bug is that 'if (changed)' tests whether the pointer is non-NULL, not the bool value it points to. Since every caller passes a valid pointer, the condition is always true and the notification fires unconditionally. Fix by dereferencing the pointer: 'if (*changed)'. Reproducer: # ip link add vxlan100 type vxlan dstport 4789 local 10.0.0.1 \ nolearning external vnifilter # ip link set vxlan100 up # bridge monitor vni & # bridge vni add vni 1000 dev vxlan100 # bridge vni add vni 1000 dev vxlan100 # spurious notification Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Andy Roulin <aroulin@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20260602185138.253265-3-aroulin@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysvxlan: vnifilter: send notification on VNI addAndy Roulin
When a new VNI is added to a vxlan device with vnifilter enabled, no RTM_NEWTUNNEL notification is sent to userspace. This means 'bridge monitor vni' never shows VNI add events, even though VNI delete events are reported correctly. The bug is in vxlan_vni_add(), where the notification is guarded by 'if (changed)'. The 'changed' flag is set by vxlan_vni_update_group() only when the multicast group or remote IP is modified, but for a new VNI added without a group (e.g. in L3 VxLAN interface scenarios), the function returns early without setting changed=true. Since this is a new VNI, the notification should be sent unconditionally. The notification is not guarded by the return value of vxlan_vni_update_group() because, at this point, the VNI has already been inserted into the hash table and list with no rollback on error. The VNI will be visible in 'bridge vni show' regardless, so userspace should be informed. This is consistent with vxlan_vni_del() which also notifies unconditionally. The 'if (changed)' guard remains correct in vxlan_vni_update(), which handles the case where a VNI already exists and is being re-added -- there, we only want to notify if the group/remote actually changed. Reproducer: # ip link add vxlan100 type vxlan dstport 4789 local 10.0.0.1 \ nolearning external vnifilter # ip link set vxlan100 up # bridge monitor vni & # bridge vni add vni 1000 dev vxlan100 # no notification # bridge vni delete vni 1000 dev vxlan100 # notification received Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Reported-by: Chirag Shah <chirag@nvidia.com> Signed-off-by: Andy Roulin <aroulin@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20260602185138.253265-2-aroulin@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysrtase: Reset TX subqueue when clearing TX ringJustin Lai
rtase_tx_clear() clears the TX ring and resets the ring indexes. However, the TX queue state and BQL accounting are not reset at the same time. This may leave __QUEUE_STATE_STACK_XOFF asserted after rtase_sw_reset(), preventing new TX packets from being scheduled. Reset the TX subqueue when clearing the TX ring so the TX queue state and BQL accounting are restored together. Fixes: 5a2a2f15244c ("rtase: Implement the rtase_down function") Cc: stable@vger.kernel.org Signed-off-by: Justin Lai <justinlai0215@realtek.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20260602114659.12335-1-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysocteontx2-af: npc: Fix CPT channel mask in npc_install_flowNithin Dabilpuram
Use the CPT-aware NIX channel mask in the npc_install_flow path so that when the host PF installs steering rules in kernel for a VF used from userspace (e.g. DPDK), MCAM entries see the same channel mask semantics as other RX paths. Fixes: 56bcef528bd8 ("octeontx2-af: Use npc_install_flow API for promisc and broadcast entries") Cc: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260602045853.1558530-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysdrm/v3d: Fix global performance monitor reference countingMaíra Canal
In the SET_GLOBAL ioctl, v3d_perfmon_find() bumps the reference count on the perfmon it returns, but v3d_perfmon_set_global_ioctl() and v3d_perfmon_delete() fail to release that reference on several paths: 1. v3d_perfmon_set_global_ioctl() leaks the reference on its error paths. 2. CLEAR_GLOBAL leaks both the find reference and the reference previously stashed in v3d->global_perfmon by the SET_GLOBAL ioctl that configured it. 3. Destroying a perfmon that is the current global perfmon leaks the reference stashed by the SET_GLOBAL ioctl. Release each of these references explicitly. Cc: stable@vger.kernel.org Fixes: c6eabbab359c ("drm/v3d: Add DRM_IOCTL_V3D_PERFMON_SET_GLOBAL") Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patch.msgid.link/20260531-v3d-perfmon-lifetime-v2-1-60ed4485a203@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
6 daysdrm/xe/multi_queue: skip submit when primary queue is suspendedNiranjana Vishwanathapura
Return early in submit path when the multi-queue primary exec queue is suspended to avoid submitting while suspended. v2: Remove idle_skip_suspend fix as that feature is being reverted here https://patchwork.freedesktop.org/series/167262/ Fixes: bc5775c59258 ("drm/xe/multi_queue: Add GuC interface for multi queue support") Cc: stable@vger.kernel.org # v7.0+ Assisted-by: GitHub-Copilot:claude-sonnet-4.6 Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20260603233946.863663-2-niranjana.vishwanathapura@intel.com (cherry picked from commit b7fb55cc3364ca128cfff9d50649ffd4327cd01e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
6 daysdrm/xe: Clear pending_disable before signaling suspend fenceTangudu Tilak Tirumalesh
In the schedule-disable done path for suspend, we signal the suspend fence before clearing pending_disable. That wakeup can let suspend_wait complete and resume be queued immediately. The resume path may then reach enable_scheduling() while pending_disable is still set and hit the !exec_queue_pending_disable(q) assertion. Fix this by clearing pending_disable before signaling the suspend fence, so any resumed transition observes a consistent state. Fixes: 87651f31ae4e ("drm/xe/guc_submit: fix race around suspend_pending") Cc: stable@vger.kernel.org # v7.0+ Signed-off-by: Tangudu Tilak Tirumalesh <tilak.tirumalesh.tangudu@intel.com> Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patch.msgid.link/20260603065217.3131066-3-tilak.tirumalesh.tangudu@intel.com (cherry picked from commit 4b1ae138b0e103d753773956a84eebc2edbf62c4) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
6 daysRevert "drm/xe: Skip exec queue schedule toggle if queue is idle during suspend"Tangudu Tilak Tirumalesh
This reverts commit 8533051ce92015e9cc6f75e0d52119b9d91610b6. The idle-skip optimization bypasses GuC suspend, so the GPU may not perform the context switch that flushes TLB entries for invalidated userptr VMAs. In LR/preempt-fence VM mode, this can lead to missed TLB invalidation and page faults during userptr invalidation tests. Restore unconditional schedule toggling on suspend so the context-switch TLB flush is always performed. This optimization will be reintroduced with a fix that does not skip suspend in LR/preempt-fence VM mode. Fixes: 8533051ce920 ("drm/xe: Skip exec queue schedule toggle if queue is idle during suspend") Cc: stable@vger.kernel.org # v7.0+ Suggested-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Signed-off-by: Tangudu Tilak Tirumalesh <tilak.tirumalesh.tangudu@intel.com> Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patch.msgid.link/20260603065217.3131066-2-tilak.tirumalesh.tangudu@intel.com (cherry picked from commit 6a1e7934d9a6cf46aecae00a99c2603d1295e170) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
6 daysnet: bonding: fix NULL pointer dereference in bond_do_ioctl()ZhaoJinming
In bond_do_ioctl(), slave_dev is obtained via __dev_get_by_name() which can return NULL if the requested interface name does not exist. However, the subsequent slave_dbg() call is placed before the NULL check: slave_dev = __dev_get_by_name(net, ifr->ifr_slave); slave_dbg(bond_dev, slave_dev, "slave_dev=%p:\n", slave_dev); //here if (!slave_dev) return -ENODEV; The slave_dbg() macro expands to netdev_dbg(bond_dev, "(slave %s): " fmt, (slave_dev)->name, ...) which unconditionally dereferences slave_dev->name before the NULL check is performed. This results in a NULL pointer dereference kernel oops when a user calls bonding ioctl (e.g. SIOCBONDENSLAVE, SIOCBONDRELEASE, etc.) with a non-existent slave interface name. This is reachable from userspace via the bonding ioctl interface with CAP_NET_ADMIN capability, making it a potential local denial-of-service vector. Fix by moving the slave_dbg() call after the NULL check. Fixes: e2a7420df2e0 ("bonding/main: convert to using slave printk macros") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: ZhaoJinming <zhaojinming@uniontech.com> Link: https://patch.msgid.link/20260601085649.4029067-1-zhaojinming@uniontech.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
6 daysgeneve: fix length used in GRO hint UDP checksum adjustmentAntoine Tenart
In geneve_post_decap_hint the length used for adjusting the UDP checksum should be 'skb->len - gro_hint->nested_tp_offset' (UDP length) instead of 'skb->len - gro_hint->nested_nh_offset' (IP length). Fixes: fd0dd796576e ("geneve: use GRO hint option in the RX path") Cc: Paolo Abeni <pabeni@redhat.com> Reported-by: Sashiko <sashiko-bot@kernel.org> Closes: https://sashiko.dev/#/patchset/20260521131436.748832-1-jhs%40mojatatu.com Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260529144713.780938-1-atenart@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
6 daysnet: ethernet: mtk_eth_soc: Fix use-after-free in metadata dst teardownLorenzo Bianconi
mtk_free_dev() calls metadata_dst_free() which frees the metadata_dst with kfree() immediately, bypassing the RCU grace period. In the RX path, skb_dst_set_noref() sets a non-refcounted pointer from the skb to the metadata_dst. This function requires RCU read-side protection and the dst must remain valid until all RCU readers complete. Since metadata_dst_free() calls kfree() directly, a use-after-free can occur if any skb still holds a noref pointer to the dst when the driver tears it down. Replace metadata_dst_free() with dst_release() which properly goes through the refcount path: when the refcount drops to zero, it schedules the actual free via call_rcu_hurry(), ensuring all RCU readers have completed before the memory is freed. Fixes: 2d7605a72906 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260602-airoha-mtk-metadata-uaf-fix-v1-2-3aaa99d83351@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysnet: airoha: Fix use-after-free in metadata dst teardownLorenzo Bianconi
airoha_metadata_dst_free() runs metadata_dst_free() which frees the metadata_dst with kfree() immediately, bypassing the RCU grace period. In the RX path, skb_dst_set_noref() sets a non-refcounted pointer from the skb to the metadata_dst. This function requires RCU read-side protection and the dst must remain valid until all RCU readers complete. Since metadata_dst_free() calls kfree() directly, an use-after-free can occur if any skb still holds a noref pointer to the dst when the driver tears it down. Replace metadata_dst_free() with dst_release() which properly goes through the refcount path: when the refcount drops to zero, it schedules the actual free via call_rcu_hurry(), ensuring all RCU readers have completed before the memory is freed. Fixes: af3cf757d5c9 ("net: airoha: Move DSA tag in DMA descriptor") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260602-airoha-mtk-metadata-uaf-fix-v1-1-3aaa99d83351@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysMerge tag 'wireless-2026-06-03' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Things are finally quieting down: - iwlwifi: - FW reset handshake removal for older devices - NIC access fix in fast resume - avoid too large command for some BIOSes - fix TX power constraints in AP mode - cfg80211: - fix netlink parse overflow - fix potential 6 GHz scan memory leak - enforce HE/EHT consistency to avoid mac80211 crash - mac80211: guard radiotap antenna parsing * tag 'wireless-2026-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211: enforce HE/EHT cap/oper consistency wifi: fix leak if split 6 GHz scanning fails wifi: mac80211: limit injected antenna index in ieee80211_parse_tx_radiotap wifi: nl80211: reject oversized EMA RNR lists wifi: iwlwifi: pcie: simplify the resume flow if fast resume is not used wifi: iwlwifi: mvm: avoid oversized UATS command copy wifi: iwlwifi: mld: send tx power constraints before link activation wifi: iwlwifi: mvm: don't support the reset handshake for old firmwares ==================== Link: https://patch.msgid.link/20260603113208.171874-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysptp: vclock: Switch from RCU to SRCUKurt Kanzenbach
The usage of PTP vClocks leads immediately to the following issues with ptp4l with LOCKDEP and DEBUG_ATOMIC_SLEEP enabled: "BUG: sleeping function called from invalid context". ptp_convert_timestamp() acquires a mutex_t within a RCU read section. This is illegal, because acquiring a mutex_t can result in voluntary scheduling request which is not allowed within a RCU read section. Replace the RCU usage with SRCU where sleeping is allowed. Reported-by: Florian Zeitz <florian.zeitz@schettke.com> Closes: https://lore.kernel.org/all/00a8cce8-410e-4038-98af-49be6d93d7bd@schettke.com/ Fixes: 67d93ffc0f3c ("ptp: vclock: use mutex to fix "sleep on atomic" bug") Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20260529-vclock_rcu-v2-1-02a5531fab92@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysocteontx2-af: Fix initialization of mcam's entry2target_pffunc fieldSuman Ghosh
NPC mcam entry stores a mapping between mcam entry and target pcifunc. During initialization of this field, API kmalloc_array has been used which caused some junk values to array. Whereas, the array is expected to be initialized by 0. This patch fixes the same by using kcalloc instead of kmalloc_array. Fixes: 55307fcb9258 ("octeontx2-af: Add mbox messages to install and delete MCAM rules") Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1780054625-17090-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysocteontx2-pf: Fix NDC sync operation errorsGeetha sowjanya
On system reboot "rvu_nicpf 0002:03:00.0: NDC sync operation failed" error messages are shown, even if the operations is successful. This is due to wrong if error check in ndc_syc() function. Fixes: 42c45ac1419c ("octeontx2-af: Sync NIX and NPA contexts from NDC to LLC/DRAM") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1780054677-17249-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysnet: sfp: initialize i2c_block_size at adapter configure timeJonas Jelonek
sfp->i2c_block_size is only assigned in sfp_sm_mod_probe(), which runs from the state machine timer after SFP_F_PRESENT has been set. Between those two points, sfp_module_eeprom() (the ethtool -m callback) gates only on SFP_F_PRESENT and can be entered with i2c_block_size still at its kzalloc'd value of 0. On a pure-I2C adapter, sfp_i2c_read() then issues an i2c_transfer() with msgs[1].len = 0 inside a loop that subtracts this_len from len each iteration; on adapters that succeed a zero-length read the loop never advances, spinning while holding rtnl_lock. This was previously addressed by initializing i2c_block_size in sfp_alloc() (commit 813c2dd78618), but the initialization was dropped when i2c_block_size was split from i2c_max_block_size. Initialize sfp->i2c_block_size from sfp->i2c_max_block_size in sfp_i2c_configure(), so the field is valid as soon as the adapter is known. sfp_sm_mod_probe() still reassigns it on each module insertion to recover from a per-module clamp to 1 (sfp_id_needs_byte_io). Fixes: 7662abf4db94 ("net: phy: sfp: Add support for SMBus module access") Cc: stable@vger.kernel.org Signed-off-by: Jonas Jelonek <jelonek.jonas@gmail.com> Link: https://patch.msgid.link/20260528205242.971410-2-jelonek.jonas@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 dayszram: fix use-after-free in zram_bvec_write_partial()Cunlong Li
zram_read_page() picks the sync or async backing device read path based on whether the parent bio is NULL. zram_bvec_write_partial() passes its parent bio down, so for ZRAM_WB slots the read is dispatched asynchronously and zram_read_page() returns 0 while the bio is still in flight. The caller then runs memcpy_from_bvec(), zram_write_page() and __free_page() on the buffer, leaving the async read to write into a freed page. zram_bvec_read_partial() was switched to NULL in commit 4e3c87b9421d ("zram: fix synchronous reads") for the same reason; the write_partial counterpart was missed. Link: https://lore.kernel.org/20260528-zram-v3-1-cab86eef8764@gmail.com Fixes: 8e654f8fbff5 ("zram: read page from backing device") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Cunlong Li <shenxiaogll@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Minchan Kim <minchan@kernel.org> Cc: Yisheng Xie <xieyisheng1@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
6 daysdrm/amd/pm: smu_v14_0_0: use SoftMin for gfxclk in set_soft_freq_limited_rangePriya Hosur
In smu_v14_0_0_set_soft_freq_limited_range(), the gfxclk floor is programmed via SetHardMinGfxClk together with SetSoftMaxGfxClk. Under power_dpm_force_performance_level=high this pins HardMin to peak gfxclk. In PMFW arbitration HardMin has higher priority than SoftMax, so the firmware thermal/PPT throttler cannot clamp gfxclk via SoftMax once HardMin is set to peak. Replace SetHardMinGfxClk with SetSoftMinGfxclk so the driver still requests peak performance but the firmware throttler retains the ability to clamp gfxclk under thermal/PPT pressure. SoftMax handling is unchanged and no other clock domains are affected. Signed-off-by: Priya Hosur <Priya.Hosur@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3ea273267fd29cbf6d83ee72329f59eb5042605b) Cc: stable@vger.kernel.org
6 daysdrm/amdgpu: Fix incorrect VRAM GART mappings on non-4K page size systemsDonet Tom
When mapping VRAM pages into the GART page table, amdgpu_gart_map_vram_range() assumes that the system page size is the same as the GPU page size. On systems with non-4K page sizes, multiple GPU pages can exist within a single CPU page. As a result, the mappings are created incorrectly because fewer page table entries are programmed than required. Fix this by programming the mappings correctly for non-4K page size systems. Fixes: 237d623ae659 ("drm/amdgpu/gart: Add helper to bind VRAM pages (v2)") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a8f0bc22388f74e0cf4ed8b7d1846c580eaf44cc) Cc: stable@vger.kernel.org
6 daysdrm/amdgpu/userq: move wptr_obj cleanup in mqd_destroySunil Khatri
In case when queue_create fails and mqd has already been allocated and hence wptr_obj is not cleaned up. So moving that cleanup part to mqd_destroy so it takes care of all the cases of clean up and during tear down of the queue. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 43355f62cd2ef5386c2693df537c232ea0f2ce6c)
6 daysdrm/amdgpu: improve the userq seq BO free bit lookupPrike Liang
Use find_next_zero_bit() to locate the next free seq slot bit instead of the current walk, for more efficient bitmap scanning. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ff905a9b6228de9eedd0db71ecb1bdde91fb898d)
6 daysdrm/amdgpu/userq: remove the vital queue unmap loggingSunil Khatri
Mesa userqueues free does not wait for the free to complete and go ahead in unmapping the vital bos while kernel is still in queue free and corresponding cleanup. So ideally we don't need the logging for that and hence remove the warn message as this is expected behaviour and functionally, we are making sure to wait for the required fences before unmap. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 758a868043dcb07eca923bc451c16da3e73dc47c)
6 daysdrm/amdkfd: Fix buffer overflow in SDMA queue checkpoint/restore on GFX11Andrew Martin
The v11 MQD manager incorrectly assigned the CP-compute variants of checkpoint_mqd/restore_mqd for KFD_MQD_TYPE_SDMA queues. These functions use sizeof(struct v11_compute_mqd) (2048 bytes) instead of sizeof(struct v11_sdma_mqd) (512 bytes), causing a 1536-byte overflow. During CRIU checkpoint of an SDMA queue on Navi3x: - checkpoint_mqd() reads 2048 bytes from a 512-byte SDMA MQD buffer, leaking 1536 bytes of adjacent GTT memory to userspace During CRIU restore: - restore_mqd() writes 2048 bytes into a 512-byte SDMA MQD buffer, corrupting 1536 bytes of adjacent GTT memory (often the ring buffer or neighboring MQDs) This is a copy-paste regression unique to v11. All other ASIC backends (cik, vi, v9, v10, v12) correctly use the SDMA-specific variants. Add checkpoint_mqd_sdma() and restore_mqd_sdma() functions that properly handle the smaller v11_sdma_mqd structure, matching the pattern used in other MQD managers. Fixes: cc009e613de6 ("drm/amdkfd: Add KFD support for soc21 v3") Assisted-by: Claude:Sonnet 4-5 Signed-off-by: Andrew Martin <andrew.martin@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6fa41db7ffdec97d62433adf03b7b9b759af8c2c) Cc: stable@vger.kernel.org
6 daysdrm/amdkfd: fix NULL dereference in get_queue_ids()Muhammad Bilal
When usr_queue_id_array is NULL and num_queues is non-zero, get_queue_ids() returns NULL. The callers check only IS_ERR() on the return value; since IS_ERR(NULL) == false the check passes, and suspend_queues() calls q_array_invalidate() which immediately dereferences NULL while iterating num_queues times. Userspace can trigger this via kfd_ioctl_set_debug_trap() by supplying num_queues > 0 with a zero queue_array_ptr, causing a kernel panic. A NULL usr_queue_id_array with num_queues == 0 is a legitimate no-op (q_array_invalidate never executes, and resume_queues already guards all queue_ids dereferences behind a NULL check). Return ERR_PTR(-EINVAL) only when num_queues is non-zero and the pointer is absent; both callers already propagate IS_ERR() returns correctly to userspace. Fixes: a70a93fa568b ("drm/amdkfd: add debug suspend and resume process queues operation") Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f165a82cdf503884bb1797771c61b2fcc72113d4) Cc: stable@vger.kernel.org