linux-next.git/kernel/workqueue.c, branch master

Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git

2026-07-03T15:21:33+00:00

workqueue: annotate racy PWQ_STAT_CPU_TIME update in wq_worker_tick()

2026-07-02T17:54:10+00:00

wq_worker_tick() bumps pwq->stats[PWQ_STAT_CPU_TIME] on every scheduler tick before pool->lock is taken. For unbound workqueues the pool_workqueue is shared by all workers of the pool across CPUs, so concurrent ticks on different CPUs perform an unsynchronized 64-bit read-modify-write on the same counter. KCSAN reports this as a data-race:

  BUG: KCSAN: data-race in wq_worker_tick / wq_worker_tick

  read-write to 0xffff0004d6989500 of 8 bytes by interrupt on cpu 29:
   wq_worker_tick+0x70/0x418
   sched_tick+0x248/0x3a0
   update_process_times+0x200/0x260
   tick_nohz_handler+0x230/0x2f8
   __hrtimer_run_queues+0x1ec/0x6c8
   hrtimer_interrupt+0x174/0x4b8
   ...

  read-write to 0xffff0004d6989500 of 8 bytes by interrupt on cpu 24:
   wq_worker_tick+0x70/0x418
   sched_tick+0x248/0x3a0
   ...

  value changed: 0x000000000010a1d0 -> 0x000000000010a9a0

The counter is purely advisory, so an occasional lost update is harmless, and every other stats[] update already runs under pool->lock. Annotate the update with data_race().

Signed-off-by: Breno Leitao
Signed-off-by: Tejun Heo

workqueue: dump the last woken worker for stalled pools

2026-07-01T17:54:06+00:00

To identify the task most likely responsible for a stall, add last_woken_worker (L: pool->lock) to worker_pool and record it in kick_pool() just before wake_up_process(). This captures the idle worker that was kicked to take over when the last running worker went to sleep; if the pool is now stuck with no running worker, that task is the prime suspect and its backtrace is dumped by show_pool_no_running_worker().

Using struct worker * rather than struct task_struct * avoids any lifetime concern: workers are only destroyed via set_worker_dying() which requires pool->lock, and set_worker_dying() clears last_woken_worker when the dying worker matches. show_cpu_pool_busy_workers() holds pool->lock while calling sched_show_task(), so last_woken_worker is either NULL or points to a live worker with a valid task.

More precisely, set_worker_dying() clears last_woken_worker before setting WORKER_DIE, so a non-NULL last_woken_worker means the kthread has not yet exited and worker->task is still alive.

Suggested-by: Petr Mladek
Reviewed-by: Petr Mladek
Signed-off-by: Breno Leitao
Signed-off-by: Tejun Heo
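The lifetime argument above can be sketched in user space roughly as follows. This is an illustration only: a pthread mutex stands in for pool->lock, and `struct worker_sketch`, `struct pool_sketch`, `note_woken()`, `mark_dying()` and `stalled_suspect_id()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct worker_sketch {
	int id;
	bool dying;
};

struct pool_sketch {
	pthread_mutex_t lock;
	struct worker_sketch *last_woken_worker;	/* protected by lock */
};

/* Record the worker about to be woken; caller holds pool->lock. */
static void note_woken(struct pool_sketch *pool, struct worker_sketch *w)
{
	pool->last_woken_worker = w;
}

/*
 * Clear the back-pointer first, then mark the worker dying; caller holds
 * pool->lock. A non-NULL pointer therefore never refers to a dying worker.
 */
static void mark_dying(struct pool_sketch *pool, struct worker_sketch *w)
{
	if (pool->last_woken_worker == w)
		pool->last_woken_worker = NULL;
	w->dying = true;
}

/* Stall report: returns the suspect's id, or -1 if none is recorded. */
static int stalled_suspect_id(struct pool_sketch *pool)
{
	int id = -1;

	pthread_mutex_lock(&pool->lock);
	if (pool->last_woken_worker) {
		assert(!pool->last_woken_worker->dying);
		id = pool->last_woken_worker->id;
	}
	pthread_mutex_unlock(&pool->lock);
	return id;
}
```

Because both the clearing and the reporting happen under the same lock, the reporter needs no reference counting on the worker it inspects.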

workqueue: trigger a single-CPU backtrace for stalled pools

2026-06-30T16:36:53+00:00

When a CPU pool is stalled with no running worker, the task occupying the CPU may not be a workqueue worker at all. Trigger a single-CPU backtrace for the stalled CPU to capture what it is currently executing. The CPU is snapshotted under pool->lock and the backtrace is triggered after releasing the lock to avoid any potential issues with NMI delivery.

Skip the backtrace when the CPU is offline. A pool disassociated by CPU hotplug keeps its pool->cpu, and an NMI to an offline CPU is never acked, so nmi_trigger_cpumask_backtrace() would busy-wait for its full timeout in the watchdog's timer context.

Suggested-by: Petr Mladek
Reviewed-by: Petr Mladek
Signed-off-by: Breno Leitao
Signed-off-by: Tejun Heo
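The snapshot-then-trigger ordering and the offline check can be sketched in user space as below. Everything here is an assumption made for illustration: the online map, the `nr_running` field, and `trigger_backtrace_sketch()` (which merely counts requests) are hypothetical stand-ins for the kernel facilities named in the commit message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical stand-ins for the CPU online mask and the NMI backtrace. */
static bool cpu_online_sketch[NR_CPUS_SKETCH];
static int backtraces_sent[NR_CPUS_SKETCH];

struct pool_sketch {
	pthread_mutex_t lock;
	int cpu;
	int nr_running;		/* 0 models "no running worker" */
};

static void trigger_backtrace_sketch(int cpu)
{
	backtraces_sent[cpu]++;
}

/* Returns true if a backtrace was requested for the pool's CPU. */
static bool report_stalled_pool(struct pool_sketch *pool)
{
	int cpu = -1;

	pthread_mutex_lock(&pool->lock);
	if (pool->nr_running == 0)
		cpu = pool->cpu;	/* snapshot under the lock */
	pthread_mutex_unlock(&pool->lock);

	if (cpu < 0)
		return false;
	assert(cpu < NR_CPUS_SKETCH);

	/* An offline CPU would never answer, so skip it entirely. */
	if (!cpu_online_sketch[cpu])
		return false;

	/* Triggered only after the lock has been released. */
	trigger_backtrace_sketch(cpu);
	return true;
}
```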

workqueue: only show running workers in stall diagnostics

2026-06-30T16:36:40+00:00

show_cpu_pool_busy_workers() dumps every in-flight worker in the pool's busy_hash, including workers that are not currently running on the CPU. Restore the task_is_running() filter so only running workers are dumped.

When no running worker is found the pool may be stuck, unable to wake an idle worker to process pending work, and the watchdog would otherwise give no feedback. Add show_pool_no_running_worker() to report the pool id, CPU, idle state, and worker counts in that case.

The pool info message is printed inside pool->lock using printk_deferred_enter/exit, the same pattern used by the existing busy-worker loop, to avoid deadlocks with console drivers that queue work while holding locks also taken in their write paths.

This has been running on the Meta fleet for a while and caught some real issues, for instance EFI stalls stalling the workqueue [1].

Link: https://lore.kernel.org/all/20260616-efi_timeout-v3-0-76dd1d26657b@debian.org/ [1]
Suggested-by: Petr Mladek
Fixes: 8823eaef45da7 ("workqueue: Show all busy workers in stall diagnostics")
Reviewed-by: Petr Mladek
Signed-off-by: Breno Leitao
Signed-off-by: Tejun Heo
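The filtering and the fallback report could look roughly like this in a user-space sketch. The types, field names and message formats are hypothetical; `printf()` stands in for the kernel's deferred printk, and a plain array stands in for busy_hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct worker_sketch {
	int id;
	bool running;	/* models the task_is_running() result */
};

struct pool_sketch {
	int id;
	int cpu;
	int nr_workers;
	int nr_idle;
};

/*
 * Dump only the busy workers that are actually running. If none are,
 * print a one-line pool summary so the watchdog still gives feedback.
 * Returns the number of workers shown.
 */
static size_t show_busy_workers(const struct pool_sketch *pool,
				const struct worker_sketch *busy,
				size_t nr_busy)
{
	size_t shown = 0;

	assert(pool != NULL);
	for (size_t i = 0; i < nr_busy; i++) {
		if (!busy[i].running)
			continue;	/* in flight but sleeping: skip */
		printf("pool %d: running worker %d\n", pool->id, busy[i].id);
		shown++;
	}

	if (shown == 0)
		printf("pool %d: no running worker (cpu=%d idle=%d workers=%d)\n",
		       pool->id, pool->cpu, pool->nr_idle, pool->nr_workers);
	return shown;
}
```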

workqueue: defer the worker wakeup outside pool->lock in process_one_work()

2026-06-29T18:07:10+00:00

Use kick_pool_pick() to select and claim the worker under pool->lock and issue the wakeup with wake_up_process() after the lock is dropped. Unlike __queue_work(), this path has no surrounding RCU section, so take rcu_read_lock() before dropping pool->lock to keep the picked worker's task_struct valid across the wakeup.

Signed-off-by: Breno Leitao
Tested-by: Krishna Magar
Signed-off-by: Tejun Heo

workqueue: defer the worker wakeup outside pool->lock in __queue_work()

2026-06-29T18:07:10+00:00

__queue_work() is the enqueue hot path: it inserts the work item and calls kick_pool() while holding pool->lock. kick_pool() ends in a wakeup, which takes the target task's rq->lock, so rq->lock nests under pool->lock on every enqueue that wakes a worker on a contended unbound pool.

Use kick_pool_pick() to select and claim the worker under pool->lock and issue the wakeup with wake_up_process() right after dropping the lock.

Signed-off-by: Breno Leitao
Signed-off-by: Tejun Heo

workqueue: split kick_pool() into kick_pool_pick()

2026-06-29T18:07:10+00:00

Factor the worker selection out of kick_pool() into kick_pool_pick(), which picks and claims the worker under pool->lock but, instead of waking it, returns the worker's task via an out-param so the caller can issue the wakeup after dropping pool->lock. BH kicks and wake_cpu setup still happen under the lock.

kick_pool() becomes a thin wrapper that wakes the returned task, so all existing callers keep waking under pool->lock. Pure refactor, no functional change.

Signed-off-by: Breno Leitao
Signed-off-by: Tejun Heo
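The shape of this split can be sketched in user space with pthreads. This is an illustration under stated assumptions, not the kernel code: `pick_sketch()`, `kick_sketch()`, `queue_sketch()` and `wake_sketch()` are hypothetical names, the pool holds at most one idle worker for brevity, and the worker's own mutex stands in for the lock that a wakeup has to take.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct worker_sketch {
	pthread_mutex_t lock;	/* stands in for the wakeup-side lock */
	pthread_cond_t cond;
	bool woken;
};

struct pool_sketch {
	pthread_mutex_t lock;
	struct worker_sketch *idle;	/* at most one idle worker */
	int nr_pending;
};

/* The wakeup itself takes the worker's lock. */
static void wake_sketch(struct worker_sketch *w)
{
	pthread_mutex_lock(&w->lock);
	w->woken = true;
	pthread_cond_signal(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/* Pick and claim an idle worker without waking it; caller holds pool->lock. */
static bool pick_sketch(struct pool_sketch *pool, struct worker_sketch **out)
{
	assert(out != NULL);
	*out = NULL;
	if (!pool->idle || pool->nr_pending == 0)
		return false;
	*out = pool->idle;
	pool->idle = NULL;	/* claimed: nobody else can pick it */
	return true;
}

/* Thin wrapper: existing callers keep waking under pool->lock. */
static bool kick_sketch(struct pool_sketch *pool)
{
	struct worker_sketch *w;

	if (!pick_sketch(pool, &w))
		return false;
	wake_sketch(w);
	return true;
}

/* Enqueue path: pick under the lock, wake after dropping it. */
static void queue_sketch(struct pool_sketch *pool)
{
	struct worker_sketch *w;

	pthread_mutex_lock(&pool->lock);
	pool->nr_pending++;
	pick_sketch(pool, &w);
	pthread_mutex_unlock(&pool->lock);

	if (w)
		wake_sketch(w);	/* worker lock no longer nests under pool lock */
}
```

In `queue_sketch()` the two locks are never held together, which is the property the follow-up commits rely on. The sketch leaves out the object-lifetime side that the process_one_work() change handles with rcu_read_lock().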

net: update dev_put()/dev_hold() debugging

2026-06-29T11:26:36+00:00

This change is not for upstream. This change is for linux-next only.

syzbot is still reporting the "unregister_netdevice: waiting for DEV to become free" problem. Since commit 4c6c11ea0f7b ("net: refine dev_put()/dev_hold() debugging") is not sufficient for me, let's try to report all locations that called dev_put()/dev_hold(), in the hope of finding hints about where a dev_put() is missing.

Signed-off-by: Tetsuo Handa

Merge tag 'wq-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

2026-06-17T10:57:44+00:00

Pull workqueue updates from Tejun Heo:

 - Continued progress toward making alloc_workqueue() unbound by default: more callers converted to WQ_PERCPU / system_percpu_wq / system_dfl_wq, and new warnings for queues that use neither WQ_PERCPU nor WQ_UNBOUND or the legacy system_wq / system_unbound_wq.

 - Misc: drop the now-trivial apply_wqattrs_lock()/unlock() wrappers, forbid the TEST_WORKQUEUE benchmark from being built-in, and fix a spurious pointer level in the worker debug-dump path.

* tag 'wq-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  drm/bridge: anx7625: Add WQ_PERCPU add to alloc_workqueue
  wifi: ath6kl: fix invalid workqueue flags in ath6kl_usb_create()
  btrfs: Drop WQ_PERCPU from ordered_flags in btrfs_init_workqueues()
  workqueue: Add warnings and ensure one among WQ_PERCPU or WQ_UNBOUND is present
  workqueue: Add warnings and fallback if system_{unbound}_wq is used
  workqueue: drop spurious '*' from print_worker_info() fn declaration
  workqueue: forbid TEST_WORKQUEUE from being built-in
  workqueue: drop apply_wqattrs_lock()/unlock() wrappers
  umh: replace use of system_unbound_wq with system_dfl_wq
  rapidio: rio: add WQ_PERCPU to alloc_workqueue users
  media: ddbridge: add WQ_PERCPU to alloc_workqueue users
  platform: cznic: turris-omnia-mcu: replace use of system_wq with system_percpu_wq
  media: synopsys: hdmirx: replace use of system_unbound_wq with system_dfl_wq
  virt: acrn: Add WQ_PERCPU to alloc_workqueue users