From 1e1857230c4879ae6ec37e1dbcbf6ce6217056b5 Mon Sep 17 00:00:00 2001 From: Zijun Hu Date: Thu, 17 Oct 2024 23:34:49 +0800 Subject: kernel/resource: simplify API __devm_release_region() implementation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Simplify __devm_release_region() implementation by dedicated API devres_release() which have below advantages than current __release_region() + devres_destroy(): It is simpler if __devm_release_region() is undoing what __devm_request_region() did, otherwise, it can avoid wrong and undesired __release_region(). Link: https://lkml.kernel.org/r/20241017-release_region_fix-v1-1-84a3e8441284@quicinc.com Signed-off-by: Zijun Hu Cc: Andy Shevchenko Cc: Bjorn Helgaas Cc: Ilpo Järvinen Cc: Mika Westerberg Signed-off-by: Andrew Morton --- kernel/resource.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) (limited to 'kernel') diff --git a/kernel/resource.c b/kernel/resource.c index b7c0e24d9398..12004452d999 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -1683,8 +1683,7 @@ void __devm_release_region(struct device *dev, struct resource *parent, { struct region_devres match_data = { parent, start, n }; - __release_region(parent, start, n); - WARN_ON(devres_destroy(dev, devm_region_release, devm_region_match, + WARN_ON(devres_release(dev, devm_region_release, devm_region_match, &match_data)); } EXPORT_SYMBOL(__devm_release_region); -- cgit v1.2.3 From 658eb5ab916ddc92f294dbce8e3d449470be9f86 Mon Sep 17 00:00:00 2001 From: Wang Yaxin Date: Tue, 3 Dec 2024 16:48:48 +0800 Subject: delayacct: add delay max to record delay peak Introduce the use cases of delay max, which can help quickly detect potential abnormal delays in the system and record the types and specific details of delay spikes. Problem ======== Delay accounting can track the average delay of processes to show system workload. However, when a process experiences a significant delay, maybe a delay spike, which adversely affects performance, getdelays can only display the average system delay over a period of time. Yet, average delay is unhelpful for diagnosing delay peak. It is not even possible to determine which type of delay has spiked, as this information might be masked by the average delay. Solution ========= the 'delay max' can display delay peak since the system's startup, which can record potential abnormal delays over time, including the type of delay and the maximum delay. This is helpful for quickly identifying crash caused by delay. Use case ========= bash# ./getdelays -d -p 244 print delayacct stats ON PID 244 CPU count real total virtual total delay total delay average delay max 68 192000000 213676651 705643 0.010ms 0.306381ms IO count delay total delay average delay max 0 0 0.000ms 0.000000ms SWAP count delay total delay average delay max 0 0 0.000ms 0.000000ms RECLAIM count delay total delay average delay max 0 0 0.000ms 0.000000ms THRASHING count delay total delay average delay max 0 0 0.000ms 0.000000ms COMPACT count delay total delay average delay max 0 0 0.000ms 0.000000ms WPCOPY count delay total delay average delay max 235 15648284 0.067ms 0.263842ms IRQ count delay total delay average delay max 0 0 0.000ms 0.000000ms [wang.yaxin@zte.com.cn: update docs and fix some spelling errors] Link: https://lkml.kernel.org/r/20241213192700771XKZ8H30OtHSeziGqRVMs0@zte.com.cn Link: https://lkml.kernel.org/r/20241203164848805CS62CQPQWG9GLdQj2_BxS@zte.com.cn Co-developed-by: Wang Yong Signed-off-by: Wang Yong Co-developed-by: xu xin Signed-off-by: xu xin Co-developed-by: Wang Yaxin Signed-off-by: Wang Yaxin Signed-off-by: Kun Jiang Cc: Balbir Singh Cc: David Hildenbrand Cc: Fan Yu Cc: Peilin He Cc: tuqiang Cc: Yang Yang Cc: ye xingchen Cc: Yunkai Zhang Signed-off-by: Andrew Morton --- Documentation/accounting/delay-accounting.rst | 42 +++++++++---------- include/linux/delayacct.h | 7 ++++ include/linux/sched.h | 3 ++ include/uapi/linux/taskstats.h | 9 ++++ kernel/delayacct.c | 37 ++++++++++++----- kernel/sched/stats.h | 5 ++- tools/accounting/getdelays.c | 59 +++++++++++++++------------ 7 files changed, 105 insertions(+), 57 deletions(-) (limited to 'kernel') diff --git a/Documentation/accounting/delay-accounting.rst b/Documentation/accounting/delay-accounting.rst index f61c01fc376e..8a0277428ccf 100644 --- a/Documentation/accounting/delay-accounting.rst +++ b/Documentation/accounting/delay-accounting.rst @@ -100,29 +100,29 @@ Get delays, since system boot, for pid 10:: # ./getdelays -d -p 10 (output similar to next case) -Get sum of delays, since system boot, for all pids with tgid 5:: +Get sum and peak of delays, since system boot, for all pids with tgid 242:: - # ./getdelays -d -t 5 + bash-4.4# ./getdelays -d -t 242 print delayacct stats ON - TGID 5 - - - CPU count real total virtual total delay total delay average - 8 7000000 6872122 3382277 0.423ms - IO count delay total delay average - 0 0 0.000ms - SWAP count delay total delay average - 0 0 0.000ms - RECLAIM count delay total delay average - 0 0 0.000ms - THRASHING count delay total delay average - 0 0 0.000ms - COMPACT count delay total delay average - 0 0 0.000ms - WPCOPY count delay total delay average - 0 0 0.000ms - IRQ count delay total delay average - 0 0 0.000ms + TGID 242 + + + CPU count real total virtual total delay total delay average delay max + 239 296000000 307724885 1127792 0.005ms 0.238382ms + IO count delay total delay average delay max + 0 0 0.000ms 0.000000ms + SWAP count delay total delay average delay max + 0 0 0.000ms 0.000000ms + RECLAIM count delay total delay average delay max + 0 0 0.000ms 0.000000ms + THRASHING count delay total delay average delay max + 0 0 0.000ms 0.000000ms + COMPACT count delay total delay average delay max + 0 0 0.000ms 0.000000ms + WPCOPY count delay total delay average delay max + 230 19100476 0.083ms 0.383822ms + IRQ count delay total delay average delay max + 0 0 0.000ms 0.000000ms Get IO accounting for pid 1, it works only with -p:: diff --git a/include/linux/delayacct.h b/include/linux/delayacct.h index 6639f48dac36..56fbfa2c2ac5 100644 --- a/include/linux/delayacct.h +++ b/include/linux/delayacct.h @@ -29,25 +29,32 @@ struct task_delay_info { * XXX_delay contains the accumulated delay time in nanoseconds. */ u64 blkio_start; + u64 blkio_delay_max; u64 blkio_delay; /* wait for sync block io completion */ u64 swapin_start; + u64 swapin_delay_max; u64 swapin_delay; /* wait for swapin */ u32 blkio_count; /* total count of the number of sync block */ /* io operations performed */ u32 swapin_count; /* total count of swapin */ u64 freepages_start; + u64 freepages_delay_max; u64 freepages_delay; /* wait for memory reclaim */ u64 thrashing_start; + u64 thrashing_delay_max; u64 thrashing_delay; /* wait for thrashing page */ u64 compact_start; + u64 compact_delay_max; u64 compact_delay; /* wait for memory compact */ u64 wpcopy_start; + u64 wpcopy_delay_max; u64 wpcopy_delay; /* wait for write-protect copy */ + u64 irq_delay_max; u64 irq_delay; /* wait for IRQ/SOFTIRQ */ u32 freepages_count; /* total count of memory reclaim */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 64934e0830af..a0ae3923b41d 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -398,6 +398,9 @@ struct sched_info { /* Time spent waiting on a runqueue: */ unsigned long long run_delay; + /* Max time spent waiting on a runqueue: */ + unsigned long long max_run_delay; + /* Timestamps: */ /* When did we last run on a CPU? */ diff --git a/include/uapi/linux/taskstats.h b/include/uapi/linux/taskstats.h index b50b2eb257a0..e0d1c6fc9f3b 100644 --- a/include/uapi/linux/taskstats.h +++ b/include/uapi/linux/taskstats.h @@ -72,6 +72,7 @@ struct taskstats { */ __u64 cpu_count __attribute__((aligned(8))); __u64 cpu_delay_total; + __u64 cpu_delay_max; /* Following four fields atomically updated using task->delays->lock */ @@ -80,10 +81,12 @@ struct taskstats { */ __u64 blkio_count; __u64 blkio_delay_total; + __u64 blkio_delay_max; /* Delay waiting for page fault I/O (swap in only) */ __u64 swapin_count; __u64 swapin_delay_total; + __u64 swapin_delay_max; /* cpu "wall-clock" running time * On some architectures, value will adjust for cpu time stolen @@ -166,10 +169,12 @@ struct taskstats { /* Delay waiting for memory reclaim */ __u64 freepages_count; __u64 freepages_delay_total; + __u64 freepages_delay_max; /* Delay waiting for thrashing page */ __u64 thrashing_count; __u64 thrashing_delay_total; + __u64 thrashing_delay_max; /* v10: 64-bit btime to avoid overflow */ __u64 ac_btime64; /* 64-bit begin time */ @@ -177,6 +182,7 @@ struct taskstats { /* v11: Delay waiting for memory compact */ __u64 compact_count; __u64 compact_delay_total; + __u64 compact_delay_max; /* v12 begin */ __u32 ac_tgid; /* thread group ID */ @@ -198,10 +204,13 @@ struct taskstats { /* v13: Delay waiting for write-protect copy */ __u64 wpcopy_count; __u64 wpcopy_delay_total; + __u64 wpcopy_delay_max; /* v14: Delay waiting for IRQ/SOFTIRQ */ __u64 irq_count; __u64 irq_delay_total; + __u64 irq_delay_max; + /* v15: add Delay max */ }; diff --git a/kernel/delayacct.c b/kernel/delayacct.c index dead51de8eb5..23212a0c88e4 100644 --- a/kernel/delayacct.c +++ b/kernel/delayacct.c @@ -93,9 +93,9 @@ void __delayacct_tsk_init(struct task_struct *tsk) /* * Finish delay accounting for a statistic using its timestamps (@start), - * accumalator (@total) and @count + * accumulator (@total) and @count */ -static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count) +static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count, u64 *max) { s64 ns = local_clock() - *start; unsigned long flags; @@ -104,6 +104,8 @@ static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *cou raw_spin_lock_irqsave(lock, flags); *total += ns; (*count)++; + if (ns > *max) + *max = ns; raw_spin_unlock_irqrestore(lock, flags); } } @@ -122,7 +124,8 @@ void __delayacct_blkio_end(struct task_struct *p) delayacct_end(&p->delays->lock, &p->delays->blkio_start, &p->delays->blkio_delay, - &p->delays->blkio_count); + &p->delays->blkio_count, + &p->delays->blkio_delay_max); } int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk) @@ -153,10 +156,11 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk) d->cpu_count += t1; + d->cpu_delay_max = tsk->sched_info.max_run_delay; tmp = (s64)d->cpu_delay_total + t2; d->cpu_delay_total = (tmp < (s64)d->cpu_delay_total) ? 0 : tmp; - tmp = (s64)d->cpu_run_virtual_total + t3; + d->cpu_run_virtual_total = (tmp < (s64)d->cpu_run_virtual_total) ? 0 : tmp; @@ -164,20 +168,26 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk) return 0; /* zero XXX_total, non-zero XXX_count implies XXX stat overflowed */ - raw_spin_lock_irqsave(&tsk->delays->lock, flags); + d->blkio_delay_max = tsk->delays->blkio_delay_max; tmp = d->blkio_delay_total + tsk->delays->blkio_delay; d->blkio_delay_total = (tmp < d->blkio_delay_total) ? 0 : tmp; + d->swapin_delay_max = tsk->delays->swapin_delay_max; tmp = d->swapin_delay_total + tsk->delays->swapin_delay; d->swapin_delay_total = (tmp < d->swapin_delay_total) ? 0 : tmp; + d->freepages_delay_max = tsk->delays->freepages_delay_max; tmp = d->freepages_delay_total + tsk->delays->freepages_delay; d->freepages_delay_total = (tmp < d->freepages_delay_total) ? 0 : tmp; + d->thrashing_delay_max = tsk->delays->thrashing_delay_max; tmp = d->thrashing_delay_total + tsk->delays->thrashing_delay; d->thrashing_delay_total = (tmp < d->thrashing_delay_total) ? 0 : tmp; + d->compact_delay_max = tsk->delays->compact_delay_max; tmp = d->compact_delay_total + tsk->delays->compact_delay; d->compact_delay_total = (tmp < d->compact_delay_total) ? 0 : tmp; + d->wpcopy_delay_max = tsk->delays->wpcopy_delay_max; tmp = d->wpcopy_delay_total + tsk->delays->wpcopy_delay; d->wpcopy_delay_total = (tmp < d->wpcopy_delay_total) ? 0 : tmp; + d->irq_delay_max = tsk->delays->irq_delay_max; tmp = d->irq_delay_total + tsk->delays->irq_delay; d->irq_delay_total = (tmp < d->irq_delay_total) ? 0 : tmp; d->blkio_count += tsk->delays->blkio_count; @@ -213,7 +223,8 @@ void __delayacct_freepages_end(void) delayacct_end(¤t->delays->lock, ¤t->delays->freepages_start, ¤t->delays->freepages_delay, - ¤t->delays->freepages_count); + ¤t->delays->freepages_count, + ¤t->delays->freepages_delay_max); } void __delayacct_thrashing_start(bool *in_thrashing) @@ -235,7 +246,8 @@ void __delayacct_thrashing_end(bool *in_thrashing) delayacct_end(¤t->delays->lock, ¤t->delays->thrashing_start, ¤t->delays->thrashing_delay, - ¤t->delays->thrashing_count); + ¤t->delays->thrashing_count, + ¤t->delays->thrashing_delay_max); } void __delayacct_swapin_start(void) @@ -248,7 +260,8 @@ void __delayacct_swapin_end(void) delayacct_end(¤t->delays->lock, ¤t->delays->swapin_start, ¤t->delays->swapin_delay, - ¤t->delays->swapin_count); + ¤t->delays->swapin_count, + ¤t->delays->swapin_delay_max); } void __delayacct_compact_start(void) @@ -261,7 +274,8 @@ void __delayacct_compact_end(void) delayacct_end(¤t->delays->lock, ¤t->delays->compact_start, ¤t->delays->compact_delay, - ¤t->delays->compact_count); + ¤t->delays->compact_count, + ¤t->delays->compact_delay_max); } void __delayacct_wpcopy_start(void) @@ -274,7 +288,8 @@ void __delayacct_wpcopy_end(void) delayacct_end(¤t->delays->lock, ¤t->delays->wpcopy_start, ¤t->delays->wpcopy_delay, - ¤t->delays->wpcopy_count); + ¤t->delays->wpcopy_count, + ¤t->delays->wpcopy_delay_max); } void __delayacct_irq(struct task_struct *task, u32 delta) @@ -284,6 +299,8 @@ void __delayacct_irq(struct task_struct *task, u32 delta) raw_spin_lock_irqsave(&task->delays->lock, flags); task->delays->irq_delay += delta; task->delays->irq_count++; + if (delta > task->delays->irq_delay_max) + task->delays->irq_delay_max = delta; raw_spin_unlock_irqrestore(&task->delays->lock, flags); } diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h index 8ee0add5a48a..ed72435aef51 100644 --- a/kernel/sched/stats.h +++ b/kernel/sched/stats.h @@ -244,7 +244,8 @@ static inline void sched_info_dequeue(struct rq *rq, struct task_struct *t) delta = rq_clock(rq) - t->sched_info.last_queued; t->sched_info.last_queued = 0; t->sched_info.run_delay += delta; - + if (delta > t->sched_info.max_run_delay) + t->sched_info.max_run_delay = delta; rq_sched_info_dequeue(rq, delta); } @@ -266,6 +267,8 @@ static void sched_info_arrive(struct rq *rq, struct task_struct *t) t->sched_info.run_delay += delta; t->sched_info.last_arrival = now; t->sched_info.pcount++; + if (delta > t->sched_info.max_run_delay) + t->sched_info.max_run_delay = delta; rq_sched_info_arrive(rq, delta); } diff --git a/tools/accounting/getdelays.c b/tools/accounting/getdelays.c index 1334214546d7..e570bcad185d 100644 --- a/tools/accounting/getdelays.c +++ b/tools/accounting/getdelays.c @@ -192,60 +192,69 @@ static int get_family_id(int sd) } #define average_ms(t, c) (t / 1000000ULL / (c ? c : 1)) +#define delay_max_ms(t) (t / 1000000ULL) static void print_delayacct(struct taskstats *t) { - printf("\n\nCPU %15s%15s%15s%15s%15s\n" - " %15llu%15llu%15llu%15llu%15.3fms\n" - "IO %15s%15s%15s\n" - " %15llu%15llu%15.3fms\n" - "SWAP %15s%15s%15s\n" - " %15llu%15llu%15.3fms\n" - "RECLAIM %12s%15s%15s\n" - " %15llu%15llu%15.3fms\n" - "THRASHING%12s%15s%15s\n" - " %15llu%15llu%15.3fms\n" - "COMPACT %12s%15s%15s\n" - " %15llu%15llu%15.3fms\n" - "WPCOPY %12s%15s%15s\n" - " %15llu%15llu%15.3fms\n" - "IRQ %15s%15s%15s\n" - " %15llu%15llu%15.3fms\n", + printf("\n\nCPU %15s%15s%15s%15s%15s%15s\n" + " %15llu%15llu%15llu%15llu%15.3fms%13.6fms\n" + "IO %15s%15s%15s%15s\n" + " %15llu%15llu%15.3fms%13.6fms\n" + "SWAP %15s%15s%15s%15s\n" + " %15llu%15llu%15.3fms%13.6fms\n" + "RECLAIM %12s%15s%15s%15s\n" + " %15llu%15llu%15.3fms%13.6fms\n" + "THRASHING%12s%15s%15s%15s\n" + " %15llu%15llu%15.3fms%13.6fms\n" + "COMPACT %12s%15s%15s%15s\n" + " %15llu%15llu%15.3fms%13.6fms\n" + "WPCOPY %12s%15s%15s%15s\n" + " %15llu%15llu%15.3fms%13.6fms\n" + "IRQ %15s%15s%15s%15s\n" + " %15llu%15llu%15.3fms%13.6fms\n", "count", "real total", "virtual total", - "delay total", "delay average", + "delay total", "delay average", "delay max", (unsigned long long)t->cpu_count, (unsigned long long)t->cpu_run_real_total, (unsigned long long)t->cpu_run_virtual_total, (unsigned long long)t->cpu_delay_total, average_ms((double)t->cpu_delay_total, t->cpu_count), - "count", "delay total", "delay average", + delay_max_ms((double)t->cpu_delay_max), + "count", "delay total", "delay average", "delay max", (unsigned long long)t->blkio_count, (unsigned long long)t->blkio_delay_total, average_ms((double)t->blkio_delay_total, t->blkio_count), - "count", "delay total", "delay average", + delay_max_ms((double)t->blkio_delay_max), + "count", "delay total", "delay average", "delay max", (unsigned long long)t->swapin_count, (unsigned long long)t->swapin_delay_total, average_ms((double)t->swapin_delay_total, t->swapin_count), - "count", "delay total", "delay average", + delay_max_ms((double)t->swapin_delay_max), + "count", "delay total", "delay average", "delay max", (unsigned long long)t->freepages_count, (unsigned long long)t->freepages_delay_total, average_ms((double)t->freepages_delay_total, t->freepages_count), - "count", "delay total", "delay average", + delay_max_ms((double)t->freepages_delay_max), + "count", "delay total", "delay average", "delay max", (unsigned long long)t->thrashing_count, (unsigned long long)t->thrashing_delay_total, average_ms((double)t->thrashing_delay_total, t->thrashing_count), - "count", "delay total", "delay average", + delay_max_ms((double)t->thrashing_delay_max), + "count", "delay total", "delay average", "delay max", (unsigned long long)t->compact_count, (unsigned long long)t->compact_delay_total, average_ms((double)t->compact_delay_total, t->compact_count), - "count", "delay total", "delay average", + delay_max_ms((double)t->compact_delay_max), + "count", "delay total", "delay average", "delay max", (unsigned long long)t->wpcopy_count, (unsigned long long)t->wpcopy_delay_total, average_ms((double)t->wpcopy_delay_total, t->wpcopy_count), - "count", "delay total", "delay average", + delay_max_ms((double)t->wpcopy_delay_max), + "count", "delay total", "delay average", "delay max", (unsigned long long)t->irq_count, (unsigned long long)t->irq_delay_total, - average_ms((double)t->irq_delay_total, t->irq_count)); + average_ms((double)t->irq_delay_total, t->irq_count), + delay_max_ms((double)t->irq_delay_max)); } static void task_context_switch_counts(struct taskstats *t) -- cgit v1.2.3 From f49b42d415a32faee6bc08923821f432f64a4e90 Mon Sep 17 00:00:00 2001 From: MengEn Sun Date: Fri, 6 Dec 2024 12:13:47 +0800 Subject: ucounts: move kfree() out of critical zone protected by ucounts_lock MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Although kfree is a non-sleep function, it is possible to enter a long chain of calls probabilistically, so it looks better to move kfree from alloc_ucounts() out of the critical zone of ucounts_lock. Link: https://lkml.kernel.org/r/1733458427-11794-1-git-send-email-mengensun@tencent.com Signed-off-by: MengEn Sun Reviewed-by: YueHong Wu Reviewed-by: Andrew Morton Cc: Andrei Vagin Cc: Joel Granados Cc: Thomas Weißschuh Signed-off-by: Andrew Morton --- kernel/ucount.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'kernel') diff --git a/kernel/ucount.c b/kernel/ucount.c index f950b5e59d63..86c5f1c0bad9 100644 --- a/kernel/ucount.c +++ b/kernel/ucount.c @@ -164,8 +164,8 @@ struct ucounts *get_ucounts(struct ucounts *ucounts) struct ucounts *alloc_ucounts(struct user_namespace *ns, kuid_t uid) { struct hlist_head *hashent = ucounts_hashentry(ns, uid); - struct ucounts *ucounts, *new; bool wrapped; + struct ucounts *ucounts, *new = NULL; spin_lock_irq(&ucounts_lock); ucounts = find_ucounts(ns, uid, hashent); @@ -182,17 +182,17 @@ struct ucounts *alloc_ucounts(struct user_namespace *ns, kuid_t uid) spin_lock_irq(&ucounts_lock); ucounts = find_ucounts(ns, uid, hashent); - if (ucounts) { - kfree(new); - } else { + if (!ucounts) { hlist_add_head(&new->node, hashent); get_user_ns(new->ns); spin_unlock_irq(&ucounts_lock); return new; } } + wrapped = !get_ucounts_or_wrap(ucounts); spin_unlock_irq(&ucounts_lock); + kfree(new); if (wrapped) { put_ucounts(ucounts); return NULL; -- cgit v1.2.3 From 3c16fc0c913a62d4ed52be1b31519f8837086187 Mon Sep 17 00:00:00 2001 From: Yunhui Cui Date: Tue, 10 Dec 2024 17:52:38 +0800 Subject: watchdog: output this_cpu when printing hard LOCKUP MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When printing "Watchdog detected hard LOCKUP on cpu", also output the detecting CPU. It's more intuitive. Link: https://lkml.kernel.org/r/20241210095238.63444-1-cuiyunhui@bytedance.com Signed-off-by: Yunhui Cui Reviewed-by: Douglas Anderson Cc: Bitao Hu Cc: Joel Granados Cc: John Ogness Cc: Liu Song Cc: Song Liu Cc: Thomas Weißschuh Signed-off-by: Andrew Morton --- kernel/watchdog.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 41e0f7e9fa35..177abb7d0d4e 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -190,7 +190,7 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) * with printk_cpu_sync_get_irqsave() that we can still at least * get the message about the lockup out. */ - pr_emerg("Watchdog detected hard LOCKUP on cpu %d\n", cpu); + pr_emerg("CPU%u: Watchdog detected hard LOCKUP on cpu %u\n", this_cpu, cpu); printk_cpu_sync_get_irqsave(flags); print_modules(); -- cgit v1.2.3 From 51f8bd6db591689fa1c67628b4cfe9778e76be6d Mon Sep 17 00:00:00 2001 From: Mateusz Guzik Date: Tue, 19 Nov 2024 15:35:26 +0100 Subject: get_task_exe_file: check PF_KTHREAD locklessly Same thing as 8ac5dc66599c ("get_task_mm: check PF_KTHREAD lockless") Nowadays PF_KTHREAD is sticky and it was never protected by ->alloc_lock. Move the PF_KTHREAD check outside of task_lock() section to make this code more understandable. Link: https://lkml.kernel.org/r/20241119143526.704986-1-mjguzik@gmail.com Signed-off-by: Mateusz Guzik Acked-by: Oleg Nesterov Signed-off-by: Andrew Morton --- kernel/fork.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) (limited to 'kernel') diff --git a/kernel/fork.c b/kernel/fork.c index 9b301180fd41..19d7fe31869b 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1514,12 +1514,13 @@ struct file *get_task_exe_file(struct task_struct *task) struct file *exe_file = NULL; struct mm_struct *mm; + if (task->flags & PF_KTHREAD) + return NULL; + task_lock(task); mm = task->mm; - if (mm) { - if (!(task->flags & PF_KTHREAD)) - exe_file = get_mm_exe_file(mm); - } + if (mm) + exe_file = get_mm_exe_file(mm); task_unlock(task); return exe_file; } -- cgit v1.2.3 From b6dcdb06c064b520a3f406bfdc4d64510cf84108 Mon Sep 17 00:00:00 2001 From: Yafang Shao Date: Thu, 19 Dec 2024 10:34:48 +0800 Subject: kernel: remove get_task_comm() and print task comm directly MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Patch series "Remove get_task_comm() and print task comm directly", v2. Since task->comm is guaranteed to be NUL-terminated, we can print it directly without the need to copy it into a separate buffer. This simplifies the code and avoids unnecessary operations. This patch (of 5): Since task->comm is guaranteed to be NUL-terminated, we can print it directly without the need to copy it into a separate buffer. This simplifies the code and avoids unnecessary operations. Link: https://lkml.kernel.org/r/20241219023452.69907-1-laoar.shao@gmail.com Link: https://lkml.kernel.org/r/20241219023452.69907-2-laoar.shao@gmail.com Signed-off-by: Yafang Shao Cc: Serge Hallyn Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Darren Hart Cc: Davidlohr Bueso Cc: "André Almeida" Cc: Andy Shevchenko Cc: Borislav Petkov (AMD) Cc: Kalle Valo Cc: Linus Torvalds Cc: Petr Mladek Cc: Danilo Krummrich Cc: Dave Hansen Cc: David Airlie Cc: Greg Kroah-Hartman Cc: "H. Peter Anvin" Cc: James Morris Cc: Jani Nikula Cc: Jiri Slaby Cc: Johannes Berg Cc: Joonas Lahtinen Cc: Karol Herbst Cc: Kees Cook Cc: Lyude Paul Cc: Oded Gabbay Cc: Paul Moore Cc: Rodrigo Vivi Cc: Simona Vetter Cc: Tvrtko Ursulin Cc: Vineet Gupta Signed-off-by: Andrew Morton --- kernel/capability.c | 8 ++------ kernel/futex/waitwake.c | 3 +-- 2 files changed, 3 insertions(+), 8 deletions(-) (limited to 'kernel') diff --git a/kernel/capability.c b/kernel/capability.c index dac4df77e376..e089d2628c29 100644 --- a/kernel/capability.c +++ b/kernel/capability.c @@ -38,10 +38,8 @@ __setup("no_file_caps", file_caps_disable); static void warn_legacy_capability_use(void) { - char name[sizeof(current->comm)]; - pr_info_once("warning: `%s' uses 32-bit capabilities (legacy support in use)\n", - get_task_comm(name, current)); + current->comm); } /* @@ -62,10 +60,8 @@ static void warn_legacy_capability_use(void) static void warn_deprecated_v2(void) { - char name[sizeof(current->comm)]; - pr_info_once("warning: `%s' uses deprecated v2 capabilities in a way that may be insecure\n", - get_task_comm(name, current)); + current->comm); } /* diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index 3a10375d9521..eb86a7ade06a 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -210,13 +210,12 @@ static int futex_atomic_op_inuser(unsigned int encoded_op, u32 __user *uaddr) if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28)) { if (oparg < 0 || oparg > 31) { - char comm[sizeof(current->comm)]; /* * kill this print and return -EINVAL when userspace * is sane again */ pr_info_ratelimited("futex_wake_op: %s tries to shift op by %d; fix this program\n", - get_task_comm(comm, current), oparg); + current->comm, oparg); oparg &= 31; } oparg = 1 << oparg; -- cgit v1.2.3 From f65c64f311ee2f1ddc1eb395ed8b20e6b9d14e85 Mon Sep 17 00:00:00 2001 From: Wang Yaxin Date: Fri, 20 Dec 2024 17:31:05 +0800 Subject: delayacct: add delay min to record delay peak Delay accounting can now calculate the average delay of processes, detect the overall system load, and also record the 'delay max' to identify potential abnormal delays. However, 'delay min' can help us identify another useful delay peak. By comparing the difference between 'delay max' and 'delay min', we can understand the optimization space for latency, providing a reference for the optimization of latency performance. Use case ========= bash-4.4# ./getdelays -d -t 242 print delayacct stats ON TGID 242 CPU count real total virtual total delay total delay average delay max delay min 39 156000000 156576579 2111069 0.054ms 0.212296ms 0.031307ms IO count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms SWAP count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms RECLAIM count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms THRASHING count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms COMPACT count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms WPCOPY count delay total delay average delay max delay min 156 11215873 0.072ms 0.207403ms 0.033913ms IRQ count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms Link: https://lkml.kernel.org/r/20241220173105906EOdsPhzjMLYNJJBqgz1ga@zte.com.cn Co-developed-by: Wang Yong Signed-off-by: Wang Yong Co-developed-by: xu xin Signed-off-by: xu xin Signed-off-by: Wang Yaxin Co-developed-by: Kun Jiang Signed-off-by: Kun Jiang Cc: Balbir Singh Cc: David Hildenbrand Cc: Fan Yu Cc: Peilin He Cc: tuqiang Cc: ye xingchen Cc: Yunkai Zhang Signed-off-by: Andrew Morton --- Documentation/accounting/delay-accounting.rst | 32 ++++++++++---------- include/linux/delayacct.h | 7 +++++ include/linux/sched.h | 3 ++ include/uapi/linux/taskstats.h | 8 +++++ kernel/delayacct.c | 32 +++++++++++++++----- kernel/sched/stats.h | 4 +++ tools/accounting/getdelays.c | 42 ++++++++++++++++----------- 7 files changed, 88 insertions(+), 40 deletions(-) (limited to 'kernel') diff --git a/Documentation/accounting/delay-accounting.rst b/Documentation/accounting/delay-accounting.rst index 8a0277428ccf..210c194d4a7b 100644 --- a/Documentation/accounting/delay-accounting.rst +++ b/Documentation/accounting/delay-accounting.rst @@ -107,22 +107,22 @@ Get sum and peak of delays, since system boot, for all pids with tgid 242:: TGID 242 - CPU count real total virtual total delay total delay average delay max - 239 296000000 307724885 1127792 0.005ms 0.238382ms - IO count delay total delay average delay max - 0 0 0.000ms 0.000000ms - SWAP count delay total delay average delay max - 0 0 0.000ms 0.000000ms - RECLAIM count delay total delay average delay max - 0 0 0.000ms 0.000000ms - THRASHING count delay total delay average delay max - 0 0 0.000ms 0.000000ms - COMPACT count delay total delay average delay max - 0 0 0.000ms 0.000000ms - WPCOPY count delay total delay average delay max - 230 19100476 0.083ms 0.383822ms - IRQ count delay total delay average delay max - 0 0 0.000ms 0.000000ms + CPU count real total virtual total delay total delay average delay max delay min + 39 156000000 156576579 2111069 0.054ms 0.212296ms 0.031307ms + IO count delay total delay average delay max delay min + 0 0 0.000ms 0.000000ms 0.000000ms + SWAP count delay total delay average delay max delay min + 0 0 0.000ms 0.000000ms 0.000000ms + RECLAIM count delay total delay average delay max delay min + 0 0 0.000ms 0.000000ms 0.000000ms + THRASHING count delay total delay average delay max delay min + 0 0 0.000ms 0.000000ms 0.000000ms + COMPACT count delay total delay average delay max delay min + 0 0 0.000ms 0.000000ms 0.000000ms + WPCOPY count delay total delay average delay max delay min + 156 11215873 0.072ms 0.207403ms 0.033913ms + IRQ count delay total delay average delay max delay min + 0 0 0.000ms 0.000000ms 0.000000ms Get IO accounting for pid 1, it works only with -p:: diff --git a/include/linux/delayacct.h b/include/linux/delayacct.h index 56fbfa2c2ac5..800dcc360db2 100644 --- a/include/linux/delayacct.h +++ b/include/linux/delayacct.h @@ -30,9 +30,11 @@ struct task_delay_info { */ u64 blkio_start; u64 blkio_delay_max; + u64 blkio_delay_min; u64 blkio_delay; /* wait for sync block io completion */ u64 swapin_start; u64 swapin_delay_max; + u64 swapin_delay_min; u64 swapin_delay; /* wait for swapin */ u32 blkio_count; /* total count of the number of sync block */ /* io operations performed */ @@ -40,21 +42,26 @@ struct task_delay_info { u64 freepages_start; u64 freepages_delay_max; + u64 freepages_delay_min; u64 freepages_delay; /* wait for memory reclaim */ u64 thrashing_start; u64 thrashing_delay_max; + u64 thrashing_delay_min; u64 thrashing_delay; /* wait for thrashing page */ u64 compact_start; u64 compact_delay_max; + u64 compact_delay_min; u64 compact_delay; /* wait for memory compact */ u64 wpcopy_start; u64 wpcopy_delay_max; + u64 wpcopy_delay_min; u64 wpcopy_delay; /* wait for write-protect copy */ u64 irq_delay_max; + u64 irq_delay_min; u64 irq_delay; /* wait for IRQ/SOFTIRQ */ u32 freepages_count; /* total count of memory reclaim */ diff --git a/include/linux/sched.h b/include/linux/sched.h index a0ae3923b41d..155012467b21 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -401,6 +401,9 @@ struct sched_info { /* Max time spent waiting on a runqueue: */ unsigned long long max_run_delay; + /* Min time spent waiting on a runqueue: */ + unsigned long long min_run_delay; + /* Timestamps: */ /* When did we last run on a CPU? */ diff --git a/include/uapi/linux/taskstats.h b/include/uapi/linux/taskstats.h index e0d1c6fc9f3b..934e20ef7f79 100644 --- a/include/uapi/linux/taskstats.h +++ b/include/uapi/linux/taskstats.h @@ -73,6 +73,7 @@ struct taskstats { __u64 cpu_count __attribute__((aligned(8))); __u64 cpu_delay_total; __u64 cpu_delay_max; + __u64 cpu_delay_min; /* Following four fields atomically updated using task->delays->lock */ @@ -82,11 +83,13 @@ struct taskstats { __u64 blkio_count; __u64 blkio_delay_total; __u64 blkio_delay_max; + __u64 blkio_delay_min; /* Delay waiting for page fault I/O (swap in only) */ __u64 swapin_count; __u64 swapin_delay_total; __u64 swapin_delay_max; + __u64 swapin_delay_min; /* cpu "wall-clock" running time * On some architectures, value will adjust for cpu time stolen @@ -170,11 +173,13 @@ struct taskstats { __u64 freepages_count; __u64 freepages_delay_total; __u64 freepages_delay_max; + __u64 freepages_delay_min; /* Delay waiting for thrashing page */ __u64 thrashing_count; __u64 thrashing_delay_total; __u64 thrashing_delay_max; + __u64 thrashing_delay_min; /* v10: 64-bit btime to avoid overflow */ __u64 ac_btime64; /* 64-bit begin time */ @@ -183,6 +188,7 @@ struct taskstats { __u64 compact_count; __u64 compact_delay_total; __u64 compact_delay_max; + __u64 compact_delay_min; /* v12 begin */ __u32 ac_tgid; /* thread group ID */ @@ -205,11 +211,13 @@ struct taskstats { __u64 wpcopy_count; __u64 wpcopy_delay_total; __u64 wpcopy_delay_max; + __u64 wpcopy_delay_min; /* v14: Delay waiting for IRQ/SOFTIRQ */ __u64 irq_count; __u64 irq_delay_total; __u64 irq_delay_max; + __u64 irq_delay_min; /* v15: add Delay max */ }; diff --git a/kernel/delayacct.c b/kernel/delayacct.c index 23212a0c88e4..b238eb8c6573 100644 --- a/kernel/delayacct.c +++ b/kernel/delayacct.c @@ -95,7 +95,7 @@ void __delayacct_tsk_init(struct task_struct *tsk) * Finish delay accounting for a statistic using its timestamps (@start), * accumulator (@total) and @count */ -static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count, u64 *max) +static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count, u64 *max, u64 *min) { s64 ns = local_clock() - *start; unsigned long flags; @@ -106,6 +106,8 @@ static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *cou (*count)++; if (ns > *max) *max = ns; + if (*min == 0 || ns < *min) + *min = ns; raw_spin_unlock_irqrestore(lock, flags); } } @@ -125,7 +127,8 @@ void __delayacct_blkio_end(struct task_struct *p) &p->delays->blkio_start, &p->delays->blkio_delay, &p->delays->blkio_count, - &p->delays->blkio_delay_max); + &p->delays->blkio_delay_max, + &p->delays->blkio_delay_min); } int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk) @@ -157,6 +160,7 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk) d->cpu_count += t1; d->cpu_delay_max = tsk->sched_info.max_run_delay; + d->cpu_delay_min = tsk->sched_info.min_run_delay; tmp = (s64)d->cpu_delay_total + t2; d->cpu_delay_total = (tmp < (s64)d->cpu_delay_total) ? 0 : tmp; tmp = (s64)d->cpu_run_virtual_total + t3; @@ -170,24 +174,31 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk) /* zero XXX_total, non-zero XXX_count implies XXX stat overflowed */ raw_spin_lock_irqsave(&tsk->delays->lock, flags); d->blkio_delay_max = tsk->delays->blkio_delay_max; + d->blkio_delay_min = tsk->delays->blkio_delay_min; tmp = d->blkio_delay_total + tsk->delays->blkio_delay; d->blkio_delay_total = (tmp < d->blkio_delay_total) ? 0 : tmp; d->swapin_delay_max = tsk->delays->swapin_delay_max; + d->swapin_delay_min = tsk->delays->swapin_delay_min; tmp = d->swapin_delay_total + tsk->delays->swapin_delay; d->swapin_delay_total = (tmp < d->swapin_delay_total) ? 0 : tmp; d->freepages_delay_max = tsk->delays->freepages_delay_max; + d->freepages_delay_min = tsk->delays->freepages_delay_min; tmp = d->freepages_delay_total + tsk->delays->freepages_delay; d->freepages_delay_total = (tmp < d->freepages_delay_total) ? 0 : tmp; d->thrashing_delay_max = tsk->delays->thrashing_delay_max; + d->thrashing_delay_min = tsk->delays->thrashing_delay_min; tmp = d->thrashing_delay_total + tsk->delays->thrashing_delay; d->thrashing_delay_total = (tmp < d->thrashing_delay_total) ? 0 : tmp; d->compact_delay_max = tsk->delays->compact_delay_max; + d->compact_delay_min = tsk->delays->compact_delay_min; tmp = d->compact_delay_total + tsk->delays->compact_delay; d->compact_delay_total = (tmp < d->compact_delay_total) ? 0 : tmp; d->wpcopy_delay_max = tsk->delays->wpcopy_delay_max; + d->wpcopy_delay_min = tsk->delays->wpcopy_delay_min; tmp = d->wpcopy_delay_total + tsk->delays->wpcopy_delay; d->wpcopy_delay_total = (tmp < d->wpcopy_delay_total) ? 0 : tmp; d->irq_delay_max = tsk->delays->irq_delay_max; + d->irq_delay_min = tsk->delays->irq_delay_min; tmp = d->irq_delay_total + tsk->delays->irq_delay; d->irq_delay_total = (tmp < d->irq_delay_total) ? 0 : tmp; d->blkio_count += tsk->delays->blkio_count; @@ -224,7 +235,8 @@ void __delayacct_freepages_end(void) ¤t->delays->freepages_start, ¤t->delays->freepages_delay, ¤t->delays->freepages_count, - ¤t->delays->freepages_delay_max); + ¤t->delays->freepages_delay_max, + ¤t->delays->freepages_delay_min); } void __delayacct_thrashing_start(bool *in_thrashing) @@ -247,7 +259,8 @@ void __delayacct_thrashing_end(bool *in_thrashing) ¤t->delays->thrashing_start, ¤t->delays->thrashing_delay, ¤t->delays->thrashing_count, - ¤t->delays->thrashing_delay_max); + ¤t->delays->thrashing_delay_max, + ¤t->delays->thrashing_delay_min); } void __delayacct_swapin_start(void) @@ -261,7 +274,8 @@ void __delayacct_swapin_end(void) ¤t->delays->swapin_start, ¤t->delays->swapin_delay, ¤t->delays->swapin_count, - ¤t->delays->swapin_delay_max); + ¤t->delays->swapin_delay_max, + ¤t->delays->swapin_delay_min); } void __delayacct_compact_start(void) @@ -275,7 +289,8 @@ void __delayacct_compact_end(void) ¤t->delays->compact_start, ¤t->delays->compact_delay, ¤t->delays->compact_count, - ¤t->delays->compact_delay_max); + ¤t->delays->compact_delay_max, + ¤t->delays->compact_delay_min); } void __delayacct_wpcopy_start(void) @@ -289,7 +304,8 @@ void __delayacct_wpcopy_end(void) ¤t->delays->wpcopy_start, ¤t->delays->wpcopy_delay, ¤t->delays->wpcopy_count, - ¤t->delays->wpcopy_delay_max); + ¤t->delays->wpcopy_delay_max, + ¤t->delays->wpcopy_delay_min); } void __delayacct_irq(struct task_struct *task, u32 delta) @@ -301,6 +317,8 @@ void __delayacct_irq(struct task_struct *task, u32 delta) task->delays->irq_count++; if (delta > task->delays->irq_delay_max) task->delays->irq_delay_max = delta; + if (delta && (!task->delays->irq_delay_min || delta < task->delays->irq_delay_min)) + task->delays->irq_delay_min = delta; raw_spin_unlock_irqrestore(&task->delays->lock, flags); } diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h index ed72435aef51..693537b908a1 100644 --- a/kernel/sched/stats.h +++ b/kernel/sched/stats.h @@ -246,6 +246,8 @@ static inline void sched_info_dequeue(struct rq *rq, struct task_struct *t) t->sched_info.run_delay += delta; if (delta > t->sched_info.max_run_delay) t->sched_info.max_run_delay = delta; + if (delta && (!t->sched_info.min_run_delay || delta < t->sched_info.min_run_delay)) + t->sched_info.min_run_delay = delta; rq_sched_info_dequeue(rq, delta); } @@ -269,6 +271,8 @@ static void sched_info_arrive(struct rq *rq, struct task_struct *t) t->sched_info.pcount++; if (delta > t->sched_info.max_run_delay) t->sched_info.max_run_delay = delta; + if (delta && (!t->sched_info.min_run_delay || delta < t->sched_info.min_run_delay)) + t->sched_info.min_run_delay = delta; rq_sched_info_arrive(rq, delta); } diff --git a/tools/accounting/getdelays.c b/tools/accounting/getdelays.c index e570bcad185d..100ad3dc091a 100644 --- a/tools/accounting/getdelays.c +++ b/tools/accounting/getdelays.c @@ -192,7 +192,7 @@ static int get_family_id(int sd) } #define average_ms(t, c) (t / 1000000ULL / (c ? c : 1)) -#define delay_max_ms(t) (t / 1000000ULL) +#define delay_ms(t) (t / 1000000ULL) static void print_delayacct(struct taskstats *t) { @@ -213,48 +213,56 @@ static void print_delayacct(struct taskstats *t) "IRQ %15s%15s%15s%15s\n" " %15llu%15llu%15.3fms%13.6fms\n", "count", "real total", "virtual total", - "delay total", "delay average", "delay max", + "delay total", "delay average", "delay max", "delay min", (unsigned long long)t->cpu_count, (unsigned long long)t->cpu_run_real_total, (unsigned long long)t->cpu_run_virtual_total, (unsigned long long)t->cpu_delay_total, average_ms((double)t->cpu_delay_total, t->cpu_count), - delay_max_ms((double)t->cpu_delay_max), - "count", "delay total", "delay average", "delay max", + delay_ms((double)t->cpu_delay_max), + delay_ms((double)t->cpu_delay_min), + "count", "delay total", "delay average", "delay max", "delay min", (unsigned long long)t->blkio_count, (unsigned long long)t->blkio_delay_total, average_ms((double)t->blkio_delay_total, t->blkio_count), - delay_max_ms((double)t->blkio_delay_max), - "count", "delay total", "delay average", "delay max", + delay_ms((double)t->blkio_delay_max), + delay_ms((double)t->blkio_delay_min), + "count", "delay total", "delay average", "delay max", "delay min", (unsigned long long)t->swapin_count, (unsigned long long)t->swapin_delay_total, average_ms((double)t->swapin_delay_total, t->swapin_count), - delay_max_ms((double)t->swapin_delay_max), - "count", "delay total", "delay average", "delay max", + delay_ms((double)t->swapin_delay_max), + delay_ms((double)t->swapin_delay_min), + "count", "delay total", "delay average", "delay max", "delay min", (unsigned long long)t->freepages_count, (unsigned long long)t->freepages_delay_total, average_ms((double)t->freepages_delay_total, t->freepages_count), - delay_max_ms((double)t->freepages_delay_max), - "count", "delay total", "delay average", "delay max", + delay_ms((double)t->freepages_delay_max), + delay_ms((double)t->freepages_delay_min), + "count", "delay total", "delay average", "delay max", "delay min", (unsigned long long)t->thrashing_count, (unsigned long long)t->thrashing_delay_total, average_ms((double)t->thrashing_delay_total, t->thrashing_count), - delay_max_ms((double)t->thrashing_delay_max), - "count", "delay total", "delay average", "delay max", + delay_ms((double)t->thrashing_delay_max), + delay_ms((double)t->thrashing_delay_min), + "count", "delay total", "delay average", "delay max", "delay min", (unsigned long long)t->compact_count, (unsigned long long)t->compact_delay_total, average_ms((double)t->compact_delay_total, t->compact_count), - delay_max_ms((double)t->compact_delay_max), - "count", "delay total", "delay average", "delay max", + delay_ms((double)t->compact_delay_max), + delay_ms((double)t->compact_delay_min), + "count", "delay total", "delay average", "delay max", "delay min", (unsigned long long)t->wpcopy_count, (unsigned long long)t->wpcopy_delay_total, average_ms((double)t->wpcopy_delay_total, t->wpcopy_count), - delay_max_ms((double)t->wpcopy_delay_max), - "count", "delay total", "delay average", "delay max", + delay_ms((double)t->wpcopy_delay_max), + delay_ms((double)t->wpcopy_delay_min), + "count", "delay total", "delay average", "delay max", "delay min", (unsigned long long)t->irq_count, (unsigned long long)t->irq_delay_total, average_ms((double)t->irq_delay_total, t->irq_count), - delay_max_ms((double)t->irq_delay_max)); + delay_ms((double)t->irq_delay_max), + delay_ms((double)t->irq_delay_min)); } static void task_context_switch_counts(struct taskstats *t) -- cgit v1.2.3 From 97549ce6848e5e30b667378b28c3cff937d167d2 Mon Sep 17 00:00:00 2001 From: Tio Zhang Date: Tue, 24 Dec 2024 17:53:44 +0800 Subject: kthread: correct comments before kthread_queue_work() s/kthread_worker_create/kthread_create_worker/ to avoid confusion when reading comments before kthread_queue_work(). Link: https://lkml.kernel.org/r/20241224095344.GA7587@didi-ThinkCentre-M930t-N000 Signed-off-by: Tio Zhang Signed-off-by: Andrew Morton --- kernel/kthread.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/kthread.c b/kernel/kthread.c index a5ac612b1609..2fd0daa6b3b6 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -1015,7 +1015,7 @@ static void kthread_insert_work(struct kthread_worker *worker, * @work: kthread_work to queue * * Queue @work to work processor @task for async execution. @task - * must have been created with kthread_worker_create(). Returns %true + * must have been created with kthread_create_worker(). Returns %true * if @work was successfully queued, %false if it was already pending. * * Reinitialize the work if it needs to be used by another worker. -- cgit v1.2.3 From 65ef17aa07115d1a8d21d77f1e7cef3b639a396f Mon Sep 17 00:00:00 2001 From: Oxana Kharitonova Date: Fri, 10 Jan 2025 16:03:28 +0000 Subject: hung_task: add task->flags, blocked by coredump to log Resending this patch as I haven't received feedback on my initial submission https://lore.kernel.org/all/20241204182953.10854-1-oxana@cloudflare.com/ For the processes which are terminated abnormally the kernel can provide a coredump if enabled. When the coredump is performed, the process and all its threads are put into the D state (TASK_UNINTERRUPTIBLE | TASK_FREEZABLE). On the other hand, we have kernel thread khungtaskd which monitors the processes in the D state. If the task stuck in the D state more than kernel.hung_task_timeout_secs, the hung_task alert appears in the kernel log. The higher memory usage of a process, the longer it takes to create coredump, the longer tasks are in the D state. We have hung_task alerts for the processes with memory usage above 10Gb. Although, our kernel.hung_task_timeout_secs is 10 sec when the default is 120 sec. Adding additional information to the log that the task is blocked by coredump will help with monitoring. Another approach might be to completely filter out alerts for such tasks, but in that case we would lose transparency about what is putting pressure on some system resources, e.g. we saw an increase in I/O when coredump occurs due its writing to disk. Additionally, it would be helpful to have task_struct->flags in the log from the function sched_show_task(). Currently it prints task_struct->thread_info->flags, this seems misleading as the line starts with "task:xxxx". [akpm@linux-foundation.org: fix printk control string] Link: https://lkml.kernel.org/r/20250110160328.64947-1-oxana@cloudflare.com Signed-off-by: Oxana Kharitonova Cc: Al Viro Cc: Ben Segall Cc: Christian Brauner Cc: Dietmar Eggemann Cc: Ingo Molnar Cc: Jan Kara Cc: Juri Lelli Cc: Mel Gorman Cc: Peter Zijlstra (Intel) Cc: Steven Rostedt Cc: Valentin Schneider Cc: Vincent Guittot Signed-off-by: Andrew Morton --- kernel/hung_task.c | 2 ++ kernel/sched/core.c | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-) (limited to 'kernel') diff --git a/kernel/hung_task.c b/kernel/hung_task.c index c18717189f32..953169893a95 100644 --- a/kernel/hung_task.c +++ b/kernel/hung_task.c @@ -147,6 +147,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout) print_tainted(), init_utsname()->release, (int)strcspn(init_utsname()->version, " "), init_utsname()->version); + if (t->flags & PF_POSTCOREDUMP) + pr_err(" Blocked by coredump.\n"); pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\"" " disables this message.\n"); sched_show_task(t); diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 3e5a6bf587f9..109b5df48e8b 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7701,9 +7701,9 @@ void sched_show_task(struct task_struct *p) if (pid_alive(p)) ppid = task_pid_nr(rcu_dereference(p->real_parent)); rcu_read_unlock(); - pr_cont(" stack:%-5lu pid:%-5d tgid:%-5d ppid:%-6d flags:0x%08lx\n", + pr_cont(" stack:%-5lu pid:%-5d tgid:%-5d ppid:%-6d task_flags:0x%04x flags:0x%08lx\n", free, task_pid_nr(p), task_tgid_nr(p), - ppid, read_task_thread_flags(p)); + ppid, p->flags, read_task_thread_flags(p)); print_worker_info(KERN_INFO, p); print_stop_info(KERN_INFO, p); -- cgit v1.2.3 From 690794430afa75028a7daed65ba5c9f590d057ff Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 10 Jan 2025 22:30:19 -0800 Subject: latencytop: use correct kernel-doc format for func params Use a ':' instead of a '-' after function parameters to eliminate kernel-doc warnings. kernel/latencytop.c:177: warning: Function parameter or struct member 'tsk' not described in '__account_scheduler_latency' ../kernel/latencytop.c:177: warning: Function parameter or struct member 'usecs' not described in '__account_scheduler_latency' ../kernel/latencytop.c:177: warning: Function parameter or struct member 'inter' not described in '__account_scheduler_latency' Link: https://lkml.kernel.org/r/20250111063019.910730-1-rdunlap@infradead.org Fixes: ad0b0fd554df ("sched, latencytop: incorporate review feedback from Andrew Morton") Signed-off-by: Randy Dunlap Signed-off-by: Andrew Morton --- kernel/latencytop.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'kernel') diff --git a/kernel/latencytop.c b/kernel/latencytop.c index 7a75eab9c179..77ee3ea8a573 100644 --- a/kernel/latencytop.c +++ b/kernel/latencytop.c @@ -158,9 +158,9 @@ account_global_scheduler_latency(struct task_struct *tsk, /** * __account_scheduler_latency - record an occurred latency - * @tsk - the task struct of the task hitting the latency - * @usecs - the duration of the latency in microseconds - * @inter - 1 if the sleep was interruptible, 0 if uninterruptible + * @tsk: the task struct of the task hitting the latency + * @usecs: the duration of the latency in microseconds + * @inter: 1 if the sleep was interruptible, 0 if uninterruptible * * This function is the main entry point for recording latency entries * as called by the scheduler. -- cgit v1.2.3 From 9c9ce355b1013a7ef37c06007cb8d714eaf4c303 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 10 Jan 2025 22:29:44 -0800 Subject: gcov: clang: use correct function param names Fix the function parameter names to match the function so that the kernel-doc warnings disappear. clang.c:273: warning: Function parameter or struct member 'dst' not described in 'gcov_info_add' clang.c:273: warning: Function parameter or struct member 'src' not described in 'gcov_info_add' clang.c:273: warning: Excess function parameter 'dest' description in 'gcov_info_add' clang.c:273: warning: Excess function parameter 'source' description in 'gcov_info_add' Link: https://lkml.kernel.org/r/20250111062944.910638-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap Cc: Peter Oberparleiter Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Bill Wendling Cc: Justin Stitt Signed-off-by: Andrew Morton --- kernel/gcov/clang.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'kernel') diff --git a/kernel/gcov/clang.c b/kernel/gcov/clang.c index 7670a811a565..8b888a6193cc 100644 --- a/kernel/gcov/clang.c +++ b/kernel/gcov/clang.c @@ -264,10 +264,10 @@ int gcov_info_is_compatible(struct gcov_info *info1, struct gcov_info *info2) /** * gcov_info_add - add up profiling data - * @dest: profiling data set to which data is added - * @source: profiling data set which is added + * @dst: profiling data set to which data is added + * @src: profiling data set which is added * - * Adds profiling counts of @source to @dest. + * Adds profiling counts of @src to @dst. */ void gcov_info_add(struct gcov_info *dst, struct gcov_info *src) { -- cgit v1.2.3