Merge tag 'erofs-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"There are some new features available for this cycle. Firstly, EROFS
LZMA algorithm support, specifically called MicroLZMA, is available as
an option for embedded devices, LiveCDs and/or as the secondary
auxiliary compression algorithm besides the primary algorithm in one
file.
In order to better support LZMA fixed-sized output compression,
especially for the 4KiB pcluster size (which has the lowest memory
pressure and is thus useful for memory-sensitive scenarios), Lasse
introduced a new LZMA header/container format called MicroLZMA to
minimize the original LZMA1 header (for example, we don't need to
waste a 4-byte dictionary size and another 8-byte uncompressed size
per pcluster, since the filesystem can calculate these directly) and
enable EROFS fixed-sized output compression.
Note that MicroLZMA can later be used by consumers other than EROFS
where wasting a minimal amount of space on headers is important; it is
only compiled when XZ_DEC_MICROLZMA is enabled.
MicroLZMA is supported by the latest upstream XZ Embedded [1] and XZ
Utils [2]; the latest related XZ Embedded upstream patches by the XZ
author Lasse are applied here.
Secondly, multiple-device support is added in this cycle, designed
for multi-layer container images. By working together with inter-layer
data deduplication and compression, we can achieve the next
high-performance container image solution. Our team will soon announce
the new Nydus container image service [3] implementation with the new
RAFS v6 (EROFS-compatible) format at Open Source Summit 2021 China
[4].
Besides, the secondary compression head support and readmore
decompression strategy are also included in this cycle. There are also
some minor bugfixes and cleanups, as always.
Summary:
- support multiple devices for multi-layer container images;
- support the secondary compression head;
- support readmore decompression strategy;
- support new LZMA algorithm (specifically called MicroLZMA);
- some bugfixes & cleanups"
* tag 'erofs-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: don't trigger WARN() when decompression fails
erofs: get rid of ->lru usage
erofs: lzma compression support
erofs: rename some generic methods in decompressor
lib/xz, lib/decompress_unxz.c: Fix spelling in comments
lib/xz: Add MicroLZMA decoder
lib/xz: Move s->lzma.len = 0 initialization to lzma_reset()
lib/xz: Validate the value before assigning it to an enum variable
lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
erofs: introduce readmore decompression strategy
erofs: introduce the secondary compression head
erofs: get compression algorithms directly on mapping
erofs: add multiple device support
erofs: decouple basic mount options from fs_context
erofs: remove the fast path of per-CPU buffer decompression
|
|
Pull block updates from Jens Axboe:
- mq-deadline accounting improvements (Bart)
- blk-wbt timer fix (Andrea)
- Untangle the block layer includes (Christoph)
- Rework the poll support to be bio based, which will enable adding
support for polling for bio based drivers (Christoph)
- Block layer core support for multi-actuator drives (Damien)
- blk-crypto improvements (Eric)
- Batched tag allocation support (me)
- Request completion batching support (me)
- Plugging improvements (me)
- Shared tag set improvements (John)
- Concurrent queue quiesce support (Ming)
- Cache bdev in ->private_data for block devices (Pavel)
- bdev dio improvements (Pavel)
- Block device invalidation and block size improvements (Xie)
- Various cleanups, fixes, and improvements (Christoph, Jackie,
Masahiro, Tejun, Yu, Pavel, Zheng, me)
* tag 'for-5.16/block-2021-10-29' of git://git.kernel.dk/linux-block: (174 commits)
blk-mq-debugfs: Show active requests per queue for shared tags
block: improve readability of blk_mq_end_request_batch()
virtio-blk: Use blk_validate_block_size() to validate block size
loop: Use blk_validate_block_size() to validate block size
nbd: Use blk_validate_block_size() to validate block size
block: Add a helper to validate the block size
block: re-flow blk_mq_rq_ctx_init()
block: prefetch request to be initialized
block: pass in blk_mq_tags to blk_mq_rq_ctx_init()
block: add rq_flags to struct blk_mq_alloc_data
block: add async version of bio_set_polled
block: kill DIO_MULTI_BIO
block: kill unused polling bits in __blkdev_direct_IO()
block: avoid extra iter advance with async iocb
block: Add independent access ranges support
blk-mq: don't issue request directly in case that current is to be blocked
sbitmap: silence data race warning
blk-cgroup: synchronize blkg creation against policy deactivation
block: refactor bio_iov_bvec_set()
block: add single bio async direct IO helper
...
|
|
Pull memory folios from Matthew Wilcox:
"Add memory folios, a new type to represent either order-0 pages or the
head page of a compound page. This should be enough infrastructure to
support filesystems converting from pages to folios.
The point of all this churn is to allow filesystems and the page cache
to manage memory in larger chunks than PAGE_SIZE. The original plan
was to use compound pages like THP does, but I ran into problems with
some functions expecting only a head page while others expect the
precise page containing a particular byte.
The folio type allows a function to declare that it's expecting only a
head page. Almost incidentally, this allows us to remove various calls
to VM_BUG_ON(PageTail(page)) and compound_head().
This converts just parts of the core MM and the page cache. For 5.17,
we intend to convert various filesystems (XFS and AFS are ready; other
filesystems may make it) and also convert more of the MM and page
cache to folios. For 5.18, multi-page folios should be ready.
The multi-page folios offer some improvement to some workloads. The
80% win is real, but appears to be an artificial benchmark (postgres
startup, which isn't a serious workload). Real workloads (eg building
the kernel, running postgres in a steady state, etc) seem to benefit
between 0-10%. I haven't heard of any performance losses as a result
of this series. Nobody has done any serious performance tuning; I
imagine that tweaking the readahead algorithm could provide some more
interesting wins. There are also other places where we could choose to
create large folios and currently do not, such as writes that are
larger than PAGE_SIZE.
I'd like to thank all my reviewers who've offered review/ack tags:
Christoph Hellwig, David Howells, Jan Kara, Jeff Layton, Johannes
Weiner, Kirill A. Shutemov, Michal Hocko, Mike Rapoport, Vlastimil
Babka, William Kucharski, Yu Zhao and Zi Yan.
I'd also like to thank those who gave feedback I incorporated but
haven't offered up review tags for this part of the series: Nick
Piggin, Mel Gorman, Ming Lei, Darrick Wong, Ted Ts'o, John Hubbard,
Hugh Dickins, and probably a few others who I forget"
* tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecache: (90 commits)
mm/writeback: Add folio_write_one
mm/filemap: Add FGP_STABLE
mm/filemap: Add filemap_get_folio
mm/filemap: Convert mapping_get_entry to return a folio
mm/filemap: Add filemap_add_folio()
mm/filemap: Add filemap_alloc_folio
mm/page_alloc: Add folio allocation functions
mm/lru: Add folio_add_lru()
mm/lru: Convert __pagevec_lru_add_fn to take a folio
mm: Add folio_evictable()
mm/workingset: Convert workingset_refault() to take a folio
mm/filemap: Add readahead_folio()
mm/filemap: Add folio_mkwrite_check_truncate()
mm/filemap: Add i_blocks_per_folio()
mm/writeback: Add folio_redirty_for_writepage()
mm/writeback: Add folio_account_redirty()
mm/writeback: Add folio_clear_dirty_for_io()
mm/writeback: Add folio_cancel_dirty()
mm/writeback: Add folio_account_cleaned()
mm/writeback: Add filemap_dirty_folio()
...
|
|
After commit 9298e63eafea ("bpf/tests: Add exhaustive tests of ALU
operand magnitudes"), modprobing test_bpf.ko with JIT on mips64
produces a segmentation fault:
[...]
ALU64_MOV_X: all register value magnitudes jited:1
Break instruction in kernel code[#1]
[...]
It seems that the related JIT implementations of some test cases in
test_bpf() have problems. For the moment I do not care about the
segmentation fault; I just want to verify the tail call test cases.
Based on the above background and motivation, add the following
module parameter test_suite to test_bpf.ko:
test_suite=<string>: only the specified test suite will be run; the
string can be "test_bpf", "test_tail_calls" or "test_skb_segment".
If test_suite is not specified but test_id, test_name or test_range
is, 'test_bpf' is set as the default test suite. This makes it
possible to run only the corresponding test suite by specifying a
valid test_suite string.
Any invalid test suite results in -EINVAL being returned and no tests
being run. If test_suite is not specified or is an empty string, the
current logic is unchanged and all of the test cases will be run.
Here are some test results:
# dmesg -c
# modprobe test_bpf
# dmesg | grep Summary
test_bpf: Summary: 1009 PASSED, 0 FAILED, [0/997 JIT'ed]
test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED
# rmmod test_bpf
# dmesg -c
# modprobe test_bpf test_suite=test_bpf
# dmesg | tail -1
test_bpf: Summary: 1009 PASSED, 0 FAILED, [0/997 JIT'ed]
# rmmod test_bpf
# dmesg -c
# modprobe test_bpf test_suite=test_tail_calls
# dmesg
test_bpf: #0 Tail call leaf jited:0 21 PASS
[...]
test_bpf: #7 Tail call error path, index out of range jited:0 32 PASS
test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
# rmmod test_bpf
# dmesg -c
# modprobe test_bpf test_suite=test_skb_segment
# dmesg
test_bpf: #0 gso_with_rx_frags PASS
test_bpf: #1 gso_linear_no_head_frag PASS
test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED
# rmmod test_bpf
# dmesg -c
# modprobe test_bpf test_id=1
# dmesg
test_bpf: test_bpf: set 'test_bpf' as the default test_suite.
test_bpf: #1 TXA jited:0 54 51 50 PASS
test_bpf: Summary: 1 PASSED, 0 FAILED, [0/1 JIT'ed]
# rmmod test_bpf
# dmesg -c
# modprobe test_bpf test_suite=test_bpf test_name=TXA
# dmesg
test_bpf: #1 TXA jited:0 54 50 51 PASS
test_bpf: Summary: 1 PASSED, 0 FAILED, [0/1 JIT'ed]
# rmmod test_bpf
# dmesg -c
# modprobe test_bpf test_suite=test_tail_calls test_range=6,7
# dmesg
test_bpf: #6 Tail call error path, NULL target jited:0 41 PASS
test_bpf: #7 Tail call error path, index out of range jited:0 32 PASS
test_bpf: test_tail_calls: Summary: 2 PASSED, 0 FAILED, [0/2 JIT'ed]
# rmmod test_bpf
# dmesg -c
# modprobe test_bpf test_suite=test_skb_segment test_id=1
# dmesg
test_bpf: #1 gso_linear_no_head_frag PASS
test_bpf: test_skb_segment: Summary: 1 PASSED, 0 FAILED
By the way, the above segmentation fault has been fixed in the latest
bpf-next tree, which contains the mips64 JIT rework.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Link: https://lore.kernel.org/bpf/1635384321-28128-1-git-send-email-yangtiezhu@loongson.cn
|
|
The msm next tree is based on rc3, so let's just backmerge rc7 before pulling it in.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
The do-while loop continues while ret is zero, but ret is never
initialized. Normally ret is set by the check at the while, but if an
empty string were passed in, q would be NULL and p would be '\0', and
the code would break out of the loop without ever setting ret.
Initialize ret to zero; xbc_verify_tree() is then called, catches the
empty tree, and reports the proper error.
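As a minimal user-space sketch of the hazard (only the loop shape is
borrowed; names in the real parser, presumably xbc_parse_tree(), differ):
#include <stdio.h>
#include <string.h>
static int parse(const char *buf)
{
        int ret = 0;    /* previously uninitialized */
        const char *p = buf;
        do {
                const char *q = strpbrk(p, "{}=+;:\n#");
                if (!q)
                        break;  /* empty input exits here, ret untouched */
                p = q + 1;
        } while (!ret && *p != '\0');
        return ret;
}
int main(void)
{
        printf("%d\n", parse(""));      /* now well-defined: prints 0 */
        return 0;
}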
Link: https://lkml.kernel.org/r/20211027105753.6ab9da5f@gandalf.local.home
Fixes: bdac5c2b243f ("bootconfig: Allocate xbc_data inside xbc_init()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
There are a couple of improvements to the static asserts against the
format specifier flags:
- a new static assert for SIGN
- a fixed static assert for SMALL
SMALL is not the ASCII code of the space character; it is the bit
difference between uppercase and lowercase letters (the value happens
to be the same, but semantically the expressions mean different
things).
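A self-contained sketch of the two asserts; the flag values are
assumptions mirroring lib/vsprintf.c:
#include <assert.h>     /* C11 static_assert */
#define SIGN    1       /* assumed: the signed/unsigned flag bit */
#define SMALL   32      /* assumed: the lowercase-hex flag */
static_assert(SIGN == 1, "new assert for SIGN");
static_assert(SMALL == ('a' ^ 'A'), "the case bit, not the ' ' character");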
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211026140356.45610-1-andriy.shevchenko@linux.intel.com
|
|
All existing users of %pGp want the hex value as well as the decoded
flag names. This looks awkward (passing the same parameter to printf
twice), so move that functionality into the core. If we want, we
can make that optional with flag arguments to %pGp in the future.
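A hedged before/after sketch (the call site and message are
illustrative, not taken from the patch):
/* before: callers passed the same flags word twice */
pr_alert("page flags: %#lx(%pGp)\n", page->flags, &page->flags);
/* after: %pGp emits the hex value itself ahead of the decoded names */
pr_alert("page flags: %pGp\n", &page->flags);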
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211019142621.2810043-6-willy@infradead.org
|
|
Use scnprintf instead of snprintf + strlen.
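A sketch of the idiom being replaced (buf, end and name are
illustrative):
/* before: snprintf() returns the would-be length, so strlen() re-measures */
snprintf(buf, end - buf, "%s", name);
buf += strlen(buf);
/* after: scnprintf() returns the number of characters actually written */
buf += scnprintf(buf, end - buf, "%s", name);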
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211019142621.2810043-5-willy@infradead.org
|
|
Instead of having an ifdef to decide whether to print a |, use the
'append' functionality of the main loop to print it.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211019142621.2810043-4-willy@infradead.org
|
|
Keep flags intact so that we also test what happens when unknown flags
are passed to %pGp.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211019142621.2810043-3-willy@infradead.org
|
|
Instead of assigning ptf[i].value, leave the values in the on-stack
array and then we can make the array const.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211019142621.2810043-2-willy@infradead.org
|
|
Expose new node-aware API for bitmap allocation:
bitmap_alloc_node() / bitmap_zalloc_node().
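A sketch of the pair, assuming they mirror the existing bitmap_alloc()
and bitmap_zalloc() implementations in lib/bitmap.c:
unsigned long *bitmap_alloc_node(unsigned int nbits, gfp_t flags, int node)
{
        return kmalloc_array_node(BITS_TO_LONGS(nbits),
                                  sizeof(unsigned long), flags, node);
}
unsigned long *bitmap_zalloc_node(unsigned int nbits, gfp_t flags, int node)
{
        return bitmap_alloc_node(nbits, flags | __GFP_ZERO, node);
}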
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Since config KPROBES_SANITY_TEST is in lib/Kconfig.debug, it is better
to place test_kprobes.c in lib/, just like other similar tests found in
lib/.
Link: https://lkml.kernel.org/r/1635213091-24387-4-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Fix the kernel doc of xbc_get_info() to add '@' to the parameters.
Link: https://lkml.kernel.org/r/163525086738.676803.15352231787913236933.stgit@devnote2
Fixes: e306220cb7b7 ("bootconfig: Add xbc_get_info() for the node information")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Since xbc_alloc_mem() and xbc_free_mem() are used from __init
functions and memblock_alloc() is an __init function, make them
__init functions too.
Link: https://lkml.kernel.org/r/163515075747.547467.5746167540626712819.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 4ee1b4cac236 ("bootconfig: Cleanup dummy headers in tools/bootconfig")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
KCSAN complains about the sbitmap hint update:
==================================================================
BUG: KCSAN: data-race in sbitmap_queue_clear / sbitmap_queue_clear
write to 0xffffe8ffffd145b8 of 4 bytes by interrupt on cpu 1:
sbitmap_queue_clear+0xca/0xf0 lib/sbitmap.c:606
blk_mq_put_tag+0x82/0x90
__blk_mq_free_request+0x114/0x180 block/blk-mq.c:507
blk_mq_free_request+0x2c8/0x340 block/blk-mq.c:541
__blk_mq_end_request+0x214/0x230 block/blk-mq.c:565
blk_mq_end_request+0x37/0x50 block/blk-mq.c:574
lo_complete_rq+0xca/0x170 drivers/block/loop.c:541
blk_complete_reqs block/blk-mq.c:584 [inline]
blk_done_softirq+0x69/0x90 block/blk-mq.c:589
__do_softirq+0x12c/0x26e kernel/softirq.c:558
run_ksoftirqd+0x13/0x20 kernel/softirq.c:920
smpboot_thread_fn+0x22f/0x330 kernel/smpboot.c:164
kthread+0x262/0x280 kernel/kthread.c:319
ret_from_fork+0x1f/0x30
write to 0xffffe8ffffd145b8 of 4 bytes by interrupt on cpu 0:
sbitmap_queue_clear+0xca/0xf0 lib/sbitmap.c:606
blk_mq_put_tag+0x82/0x90
__blk_mq_free_request+0x114/0x180 block/blk-mq.c:507
blk_mq_free_request+0x2c8/0x340 block/blk-mq.c:541
__blk_mq_end_request+0x214/0x230 block/blk-mq.c:565
blk_mq_end_request+0x37/0x50 block/blk-mq.c:574
lo_complete_rq+0xca/0x170 drivers/block/loop.c:541
blk_complete_reqs block/blk-mq.c:584 [inline]
blk_done_softirq+0x69/0x90 block/blk-mq.c:589
__do_softirq+0x12c/0x26e kernel/softirq.c:558
run_ksoftirqd+0x13/0x20 kernel/softirq.c:920
smpboot_thread_fn+0x22f/0x330 kernel/smpboot.c:164
kthread+0x262/0x280 kernel/kthread.c:319
ret_from_fork+0x1f/0x30
value changed: 0x00000035 -> 0x00000044
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 5.15.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
==================================================================
which is a data race, but not an important one. This is just updating the
percpu alloc hint, and the reader of that hint doesn't ever require it to
be valid.
Just annotate it with data_race() to silence this one.
Reported-by: syzbot+4f8bfd804b4a1f95b8f6@syzkaller.appspotmail.com
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Introduce a new nofault flag to indicate to iov_iter_get_pages not to
fault in user pages.
This is implemented by passing the FOLL_NOFAULT flag to get_user_pages,
which causes get_user_pages to fail when it would otherwise fault in a
page. We'll use the ->nofault flag to prevent iomap_dio_rw from faulting
in pages when page faults are not allowed.
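A sketch of the plumbing inside iov_iter_get_pages (variable names are
assumptions):
unsigned int gup_flags = 0;
if (i->nofault)
        gup_flags |= FOLL_NOFAULT;      /* fail instead of faulting in */
res = get_user_pages_fast(addr, n, gup_flags, pages);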
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
This converts the kprobes testcases to use the kunit framework.
It adds a dependency on CONFIG_KUNIT, and the output will change
to TAP:
TAP version 14
1..1
# Subtest: kprobes_test
1..4
random: crng init done
ok 1 - test_kprobe
ok 2 - test_kprobes
ok 3 - test_kretprobe
ok 4 - test_kretprobes
ok 1 - kprobes_test
Note that the kprobes testcases are no longer run immediately after
kprobes initialization, but as a late initcall when kunit is
initialized. kprobes itself is initialized with an early initcall,
so the order is still correct.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Adjust the current v*pr_info() calls to fit an overview..detail scheme:
1- module level activity: add/remove, etc.
2- command ingest, splitting, summary of effects, per >control write
3- command parsing: op, flags, search terms
4- per-site change msgs, which can yield ~3k x 2 logs per
echo "+p;-p" > command.
Summarize these 4 levels in MODULE_PARM_DESC, and update verbose=3 in
the docs.
2- is new, to isolate a problem where a stress-test script (which
feeds ~4kb multi-command strings) would produce short writes,
truncating the last command and causing parsing errors, which confused
test results. The script fix was to use syswrite, to deliver full
proper commands.
4- gets per-callsite "changed:" pr-infos, which are very noisy during
stress tests; they formerly obscured the v1-3 messages and overwhelmed
the static-key workload being tested.
The verbose parameter has previously seen adjustment:
commit 481c0e33f1e7 ("dyndbg: refine debug verbosity; 1 is basic, 2 more chatty")
The script driving these adjustments is:
#!/usr/bin/perl -w
=for Doc
1st purpose was to benchmark the effect of wildcard queries on query
performance; if wildcards are risk free cheap enough, we can deploy
them in the (floating) format search. 1st finding: wildcards take 2x
as long to process.
2nd purpose was to benchmark real static-key changes VS simple flag
changes. Found ~100x decrease for the hard work.
The script maximizes workload per >control write by packing a ~4kb
string of "+p; -p;" commands; this uncovered some broken stuff.
The 85th query failed, and appears to be truncated, so it is
grammatically incorrect. It's either an error here or in the kernel.
It's not happening at the moment; retest.
Plot thickens: fail only happens doing +-p, not +-mf, likely load
dependent. Error remains consistent. Looks like a short write,
longer on writer than kernel-reader. Try syswrite on handle to
control this. That fixed short write.
=cut
use Getopt::Std;
getopts('vN:k:', \my %opts) or die <<EOH;
$0 options:
-v verbose
-k=n kernel dyndbg verbosity
-N=n number of loops.. tbrc
EOH
$opts{N} //= 10; # !undef, 0 tests too long.
my $ctrl = '/proc/dynamic_debug/control';
vx($opts{k}) if defined $opts{k}; # works on -k0
open(my $CTL, '>', $ctrl) or die "can't open $ctrl for writing: $!\n";
sub vx {
my $arg = shift;
my $cmd = "echo $arg > /sys/module/dynamic_debug/parameters/verbose";
system($cmd);
warn("vx problem: rc:$? err:$! qry: $cmd\n") if ($?);
}
sub qryOK {
my $qry = shift;
print "syntax test: <\n$qry>\n" if $opts{v};
my $bytes = syswrite $CTL, $qry;
printf "short read: $bytes / %d\n", length $qry if $bytes < length $qry;
if ($?) {
warn "rc:$? err:$! qry: $qry\n";
return 0;
}
return 1;
}
sub build_queries {
my ($cmd, $flags, $ct) = @_;
# build experiment and reference queries
my $cycle = " $cmd +$flags # on ; $cmd -$flags # off \n";
my $ref = " +$flags ; -$flags \n";
my $len = length $cycle;
my $max = int(4096 / $len); # break/fit to buffer size
$ct ||= $max;  # default count to as many command pairs as fit the buffer
print "qry: ct:$max x << \n$cycle >>\n";
return unless qryOK($ref);
return unless qryOK($cycle);
my $wild = $cycle x $ct;
my $empty = $ref x $ct;
printf "len: %d, %d\n", length $wild, length $empty;
return { trial => $wild,
ref => $empty,
probe => $cycle,
zero => $ref,
count => $ct,
max => $max
};
}
my $query_set = build_queries(' file "*" module "*" func "*" ', "mf");
qryOK($query_set->{zero});
qryOK($query_set->{probe});
qryOK($query_set->{ref});
qryOK($query_set->{trial});
use Benchmark;
sub dobatch {
my ($cmd, $flags, $reps, $ct) = @_;
$reps ||= $opts{N};
my $qs = build_queries($cmd, $flags, $ct);
timethese($reps,
{
wildcards => sub {
syswrite $CTL, $qs->{trial};
},
no_search => sub {
syswrite $CTL, $qs->{ref};
}
}
);
}
sub bench_static_key_toggle {
vx 0;
dobatch(' file "*" module "*" func "*" ', "mf");
dobatch(' file "*" module "*" func "*" ', "p");
}
sub bench_verbose_levels {
for my $i (0..4) {
vx $i;
dobatch(' file "*" module "*" func "*" ', "mf");
}
}
bench_static_key_toggle();
__END__
Here's how the test script runs:
:: verbose=3 parsing info
[ 48.401646] dyndbg: query 95: "file "*" module "*" func "*" -mf # off " mod:*
[ 48.402040] dyndbg: split into words: "file" "*" "module" "*" "func" "*" "-mf"
[ 48.402456] dyndbg: op='-'
[ 48.402615] dyndbg: flags=0x6
[ 48.402779] dyndbg: *flagsp=0x0 *maskp=0xfffffff9
[ 48.403033] dyndbg: parsed: func="*" file="*" module="*" format="" lineno=0-0
[ 48.403674] dyndbg: applied: func="*" file="*" module="*" format="" lineno=0-0
:: verbose=2 >control summary.
~300k site matches/changes per 4kb command
[ 48.404063] dyndbg: processed 96 queries, with 296160 matches, 0 errs
:: 2 queries against each other, no-search vs all-wildcard-search
qry: ct:48 x <<
file "*" module "*" func "*" +mf # on ; file "*" module "*" func "*" -mf # off
>>
len: 4080, 576
Benchmark: timing 10 iterations of no_search, wildcards...
no_search: 0 wallclock secs ( 0.00 usr + 0.03 sys = 0.03 CPU) @ 333.33/s (n=10)
(warning: too few iterations for a reliable count)
wildcards: 0 wallclock secs ( 0.00 usr + 0.09 sys = 0.09 CPU) @ 111.11/s (n=10)
(warning: too few iterations for a reliable count)
:: 2 queries, both doing real work / changing static-key states.
qry: ct:49 x <<
file "*" module "*" func "*" +p # on ; file "*" module "*" func "*" -p # off
>>
len: 4067, 490
Benchmark: timing 10 iterations of no_search, wildcards...
no_search: 20 wallclock secs ( 0.00 usr + 20.36 sys = 20.36 CPU) @ 0.49/s (n=10)
wildcards: 21 wallclock secs ( 0.00 usr + 21.08 sys = 21.08 CPU) @ 0.47/s (n=10)
bash-5.1#
That's 150k static-key toggles / sec,
~600x slower than simple flags,
on a qemu --smp 3 run
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20211019210746.185307-1-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Introduce a new fault_in_iov_iter_writeable helper for safely faulting
in an iterator for writing. It uses get_user_pages() to fault in the
pages without actually writing to them, which would be destructive.
We'll use fault_in_iov_iter_writeable in gfs2 once we've determined that
the iterator passed to .read_iter isn't in memory.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
There are some KUnit tests (KFENCE, Thunderbolt) which, for various
reasons, do not use the kunit_test_suite() macro and end up running
before the KUnit executor runs its tests. This means that their results
are printed separately, and they aren't included in the suite count used
by the executor.
This causes the executor output to be invalid TAP, however, as the
suite numbers used are no longer 1-based and don't match the test
plan. kunit_tool therefore prints a large number of warnings.
While it'd be nice to fix the tests to run in the executor, in the
meantime, reset the suite counter to 1 in __kunit_test_suites_exit.
Not only does this fix the executor, it means that if there are multiple
calls to __kunit_test_suites_init() across different tests, they'll each
get their own numbering.
kunit_tool likes this better: even if it's lacking the results for those
tests which don't use the executor (due to the lack of TAP header), the
output for the other tests is valid.
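Roughly, with the existing teardown loop abbreviated from memory:
void __kunit_test_suites_exit(struct kunit_suite **suites)
{
        unsigned int i;
        for (i = 0; suites[i] != NULL; i++)
                kunit_exit_suite(suites[i]);
        kunit_suite_counter = 1;        /* restart numbering next init */
}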
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Context:
It's difficult to map a given .kunitconfig => set of enabled tests.
Letting kunit.py figure that out would be useful.
This patch:
* is intended to be an implementation detail used only by kunit.py
* adds a kunit.action module param with one valid non-null value, "list"
* for the "list" action, it simply prints out "<suite>.<test>"
* leaves the kunit.py changes to make use of this for another patch.
Note: kunit.filter_glob is respected for this and all future actions.
Hack: we print a TAP header (but no test plan) to allow kunit.py to
use the same code to pick up KUnit output that it does for normal tests.
Since this is intended to be an implementation detail, it seems fine for
now. Maybe in the future we output each test as SKIPPED or the like.
Go with a more generic "action" param, since it seems like we might
eventually have more modes besides just running or listing tests, e.g.
* perhaps a benchmark mode that reruns test cases and reports timing
* perhaps a deflake mode that reruns test cases that failed
* perhaps a mode where we randomize test order to try and catch
hermeticity bugs like "test a only passes if run after test b"
Tested:
$ ./tools/testing/kunit/kunit.py run --kernel_arg=kunit.action=list --raw_output=kunit
...
TAP version 14
1..1
example.example_simple_test
example.example_skip_test
example.example_mark_skipped_test
reboot: System halted
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
When a user filters by a suite and not a test, e.g.
$ ./tools/testing/kunit/kunit.py run 'suite_name'
it hits this code
const int len = strlen(filter_glob);
...
parsed->suite_glob = kmalloc(len, GFP_KERNEL);
which fails to allocate space for the terminating NULL.
Somehow, it seems like we can't easily reproduce this under UML, so the
existing `parse_filter_test()` didn't catch this.
Fix this by allocating `len + 1` bytes and switching to kzalloc(),
just to be a bit more defensive. We're only going to run this code
once per kernel boot, and the string should never be very long.
Also update the unit tests to be a bit more cautious.
This bug showed up as a NULL pointer dereference here:
> KUNIT_EXPECT_STREQ(test, (const char *)filtered.start[0][0]->name, "suite0");
`filtered.start[0][0]` was NULL, and `name` is at offset 0 in the struct,
so `...->name` was also NULL.
Fixes: 3b29021ddd10 ("kunit: tool: allow filtering test cases via glob")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Commit 1d71307a6f94 ("kunit: add unit test for filtering suites by
names") introduced the ability to filter which suites we run via glob.
This change extends it so we can also filter individual test cases
inside of suites as well.
This is quite useful when, e.g.
* trying to run just the test cases you've just added or are working on
* trying to debug issues with test hermeticity
Examples:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit '*exec*.parse*'
...
============================================================
======== [PASSED] kunit_executor_test ========
[PASSED] parse_filter_test
============================================================
Testing complete. 1 tests run. 0 failed. 0 crashed.
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit '*.no_matching_tests'
...
[ERROR] no tests run!
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
This test assumes that the declared kunit_suite object is the exact one
which is being executed, which KUnit will not guarantee [1].
Specifically, `suite->log` is not initialized until a suite object is
executed. So if KUnit makes a copy of the suite and runs that instead,
this test dereferences an invalid pointer and (hopefully) segfaults.
N.B. since we no longer assume this, we can no longer verify that
`suite->log` is *not* allocated during normal execution.
An alternative to this patch that would allow us to test that would
require exposing an API for the current test to get its current suite.
Exposing that for one internal kunit test seems like overkill, and
grants users more footguns (e.g. reusing a test case in multiple suites
and changing behavior based on the suite name, dynamically modifying the
setup/cleanup funcs, storing/reading stuff out of the suite->log, etc.).
[1] In a subsequent patch, KUnit will allow running subsets of test
cases within a suite by making a copy of the suite with the filtered
test list. But there are other reasons KUnit might execute a copy,
e.g. if it ever wants to support parallel execution of different
suites, or recovering from errors and restarting suites.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
uncompressible -> incompressible
non-splitted -> non-split
Link: https://lore.kernel.org/r/20211010213145.17462-6-xiang@kernel.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
MicroLZMA is yet another header format variant, where the first byte
of a raw LZMA stream (without the end-of-stream marker) has been
replaced with a bitwise negation of the lc/lp/pb properties byte.
MicroLZMA was created for use in EROFS but can also be used elsewhere
where wasting a minimal amount of space on headers is important.
This is implemented using most of the LZMA2 code as-is, so the
amount of new code is small. The API has a few extra features
compared to the XZ decoder. On the other hand, the API lacks
XZ_BUF_ERROR support, which is important to take into account
when using this API.
MicroLZMA doesn't support BCJ filters. In theory they could be added
later, as there are many unused/reserved values for the first byte of
the compressed stream, but in practice it is somewhat unlikely to
happen due to a few implementation reasons.
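An illustrative decode of that first byte (standard LZMA1 property
packing; a sketch, not the exact xz_dec_lzma2.c code):
uint32_t lc, lp, pb;
uint8_t props = (uint8_t)~in[0];        /* undo the bitwise negation */
if (props > (4 * 5 + 4) * 9 + 8)        /* 224 = max valid combination */
        return XZ_DATA_ERROR;
lc = props % 9;                         /* literal context bits */
props /= 9;
lp = props % 5;                         /* literal position bits */
pb = props / 5;                         /* position bits */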
Link: https://lore.kernel.org/r/20211010213145.17462-5-xiang@kernel.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
It's a more logical place even though the resetting needs to be done
only once per LZMA2 stream (if lzma_reset() is called in the middle of
an LZMA2 stream, .len will already be 0).
Link: https://lore.kernel.org/r/20211010213145.17462-4-xiang@kernel.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
This might matter, for example, if the underlying type of enum xz_check
were a signed char. In such a case the validation wouldn't have caught an
unsupported header. I don't know if this problem can occur in the kernel
on any arch, but it's still good to fix because some people might copy
the XZ code into their own projects from Linux instead of from the
upstream XZ Embedded repository.
This change may increase the code size by a few bytes. An alternative
would have been to use an unsigned int instead of enum xz_check but
using an enumeration looks cleaner.
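A sketch of the reordering (field and macro names assumed from
xz_dec_stream.c):
/* before: assign to the enum first, then range-check it */
/* after: validate while the value is still a plain byte */
if (s->temp.buf[HEADER_MAGIC_SIZE + 1] > XZ_CHECK_MAX)
        return XZ_OPTIONS_ERROR;
s->check_type = s->temp.buf[HEADER_MAGIC_SIZE + 1];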
Link: https://lore.kernel.org/r/20211010213145.17462-3-xiang@kernel.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
With valid files, the safety margin described in lib/decompress_unxz.c
ensures that these buffers cannot overlap. But if the uncompressed size
of the input is larger than the caller thought, which is possible when
the input file is invalid/corrupt, the buffers can overlap. Obviously
the result will then be garbage (and usually the decoder will return
an error too), but no other harm will happen when such an overrun occurs.
This change only affects uncompressed LZMA2 chunks and so this
should have no effect on performance.
Link: https://lore.kernel.org/r/20211010213145.17462-2-xiang@kernel.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
sbitmap currently only supports clearing tags one-by-one, add a helper
that allows the caller to pass in an array of tags to clear.
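A hypothetical caller, assuming the helper's signature is
sbitmap_queue_clear_batch(sbq, offset, tags, nr_tags):
int tags[16];
int nr = gather_completed_tags(tags, 16);       /* hypothetical helper */
/* one call instead of nr separate sbitmap_queue_clear() round-trips */
sbitmap_queue_clear_batch(sbq, tag_offset, tags, nr);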
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A common idiom in kernel code is to wipe the contents of a structure
starting from a given member. These open-coded cases are usually difficult
to read and very sensitive to struct layout changes. Like memset_after(),
introduce a new helper, memset_startat() that takes the target struct
instance, the byte to write, and the member name where zeroing should
start.
Note that this doesn't zero padding preceding the target member. For
those cases, memset_after() should be used on the preceding member.
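A sketch of the helper, following the memset_after() pattern described
below:
#define memset_startat(obj, v, member)                                  \
({                                                                      \
        u8 *__ptr = (u8 *)(obj);                                        \
        typeof(v) __val = (v);                                          \
        memset(__ptr + offsetof(typeof(*(obj)), member), __val,         \
               sizeof(*(obj)) - offsetof(typeof(*(obj)), member));      \
})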
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Francis Laniel <laniel_francis@privacyrequired.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
A common idiom in kernel code is to wipe the contents of a structure
after a given member. This is especially useful in places where there is
trailing padding. These open-coded cases are usually difficult to read
and very sensitive to struct layout changes. Introduce a new helper,
memset_after() that takes the target struct instance, the byte to write,
and the member name after which the zeroing should start.
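An illustrative use with a hypothetical struct, where the helper wipes
both the padding and everything laid out after 'flags':
struct packet {                 /* hypothetical layout */
        u32     id;
        u8      flags;
        /* 3 bytes of compiler padding here */
        u64     payload[4];
};
struct packet p;
p.id = 1;
p.flags = 0xff;
memset_after(&p, 0, flags);     /* zeroes padding + payload[] */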
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Francis Laniel <laniel_francis@privacyrequired.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Before changing anything about memcpy(), memmove(), and memset(), add
run-time tests to check basic behaviors for any regressions.
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
While the run-time testing of FORTIFY_SOURCE is already present in
LKDTM, there is no testing of the expected compile-time detections. In
preparation for correctly supporting FORTIFY_SOURCE under Clang, adding
additional FORTIFY_SOURCE defenses, and making sure FORTIFY_SOURCE
doesn't silently regress with GCC, introduce a build-time test suite that
checks each expected compile-time failure condition.
As this is relatively backwards from standard build rules in the
sense that a successful test is actually a compile _failure_, create
a wrapper script to check for the correct errors, and wire it up as
a dummy dependency to lib/string.o, collecting the results into a log
file artifact.
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Turn iov_iter_fault_in_readable into a function that returns the number
of bytes not faulted in, similar to copy_to_user, instead of returning a
non-zero value when any of the requested pages couldn't be faulted in.
This supports the existing users that require all pages to be faulted in
as well as new users that are happy if any pages can be faulted in.
Rename iov_iter_fault_in_readable to fault_in_iov_iter_readable to make
sure this change doesn't silently break things.
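A sketch of the new calling convention in a write loop (status is an
illustrative local):
if (unlikely(fault_in_iov_iter_readable(i, bytes) == bytes)) {
        status = -EFAULT;       /* not a single byte could be faulted in */
        break;
}
/* otherwise continue: at least a prefix of the range is accessible */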
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Turn fault_in_pages_{readable,writeable} into versions that return the
number of bytes not faulted in, similar to copy_to_user, instead of
returning a non-zero value when any of the requested pages couldn't be
faulted in. This supports the existing users that require all pages to
be faulted in as well as new users that are happy if any pages can be
faulted in.
Rename the functions to fault_in_{readable,writeable} to make sure
this change doesn't silently break things.
Neither of these functions is entirely trivial and it doesn't seem
useful to inline them, so move them to mm/gup.c.
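A sketch of a caller taking advantage of partial success (ubuf and len
are illustrative):
size_t remainder = fault_in_readable(ubuf, len);  /* bytes NOT faulted in */
if (remainder == len)
        return -EFAULT;         /* complete failure, as before */
len -= remainder;               /* shrink to the usable prefix */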
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The block layer tag allocation batching still calls into sbitmap to get
each tag, but we can improve on that. Add __sbitmap_queue_get_batch(),
which returns a mask of tags all at once, along with an offset for
those tags.
An example return would be 0xff, where bits 0..7 are set, with
tag_offset == 128. The valid tags in this case would be 128..135.
A batch is specific to an individual sbitmap_map, hence it cannot be
larger than that. The requested number of tags is automatically reduced
to the max that can be satisfied with a single map.
On failure, 0 is returned. The caller should fall back to single tag
allocation at that point.
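A sketch of consuming the returned mask (the signature is assumed to
return the mask and fill in the offset):
unsigned int tag_offset;
unsigned long mask = __sbitmap_queue_get_batch(sbq, nr_tags, &tag_offset);
if (!mask)
        return -1;                      /* fall back to single tag get */
do {
        unsigned int bit = __ffs(mask);         /* lowest set bit */
        start_request(tag_offset + bit);        /* hypothetical consumer */
        mask &= mask - 1;                       /* clear that bit */
} while (mask);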
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
blk-cgroup.h pulls in blkdev.h and thus pretty much all the block
headers. Break this dependency chain by turning wbc_blkcg_css into a
macro and dropping the blk-cgroup.h include.
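Presumably along these lines in <linux/writeback.h> (the
CONFIG_CGROUP_WRITEBACK case):
#define wbc_blkcg_css(wbc) \
        ((wbc)->wb ? (wbc)->wb->blkcg_css : blkcg_root_css)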
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When batching events (such as writing back N pages in a single I/O), it
is better to do one flex_proportion operation instead of N. There is
only one caller of __fprop_inc_percpu_max(), and it's the one we're
going to change in the next patch, so rename it instead of adding a
compatibility wrapper.
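The renamed helper would then carry an explicit event count (signature
assumed):
void __fprop_add_percpu_max(struct fprop_global *p,
                            struct fprop_local_percpu *pl,
                            int max_frac, long nr);
A batched writeback of N pages can then account all N events in one
call rather than looping.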
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
|
|
We need the driver-core fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The cited commit inadvertently altered the verbose level of a
vpr_info; restore it to the original level.
Fixes: 216a0fc40897 ("dyndbg: show module in vpr-info in dd-exec-queries")
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20211014223614.1952171-1-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
tools/testing/selftests/net/ioam6.sh
7b1700e009cc ("selftests: net: modify IOAM tests for undef bits")
bf77b1400a56 ("selftests: net: Test for the IOAM encapsulation with IPv6")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When `echo $cmd > control` contains multiple queries, extra query
separators (;\n) can parse as empty statements. This is normal, and a
vpr-info on an empty command is just noise.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20211013220726.1280565-4-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
On qemu --smp 3 runs, remove-module can get called 3 times. So don't
print on entry; instead print "removed" once the entry is found and
removed, so the message appears just once.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20211013220726.1280565-3-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This param has been deprecated for a very long time now, let's rip it
out.
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Link: https://lore.kernel.org/r/1634139622-20667-3-git-send-email-jbaron@akamai.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Right now dyndbg shows up as an unknown parameter if used on boot:
Unknown command line parameters: dyndbg=+p
That's because it is unknown: it doesn't sit in the __param section,
so the processing done to warn users supplying an unknown parameter
doesn't think it is legitimate.
Install a dummy handler to register it. Dynamic debug needs to search
the whole command line for listed modules that are currently builtin,
so there's no real work to be done in this callback.
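A minimal sketch of such a dummy handler:
static __init int dyndbg_setup(char *str)
{
        return 1;       /* consume the param; dynamic_debug_init() does
                         * the real command-line scan */
}
__setup("dyndbg=", dyndbg_setup);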
Fixes: 86d1919a4fb0 ("init: print out unknown kernel parameters")
Tested-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Link: https://lore.kernel.org/r/1634139622-20667-2-git-send-email-jbaron@akamai.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.
So, use the struct_size() helper to do the arithmetic instead of the
argument "size + count * size" in the kmalloc() and kzalloc() functions.
Also, take the opportunity to refactor the memcpy() calls to use the
struct_size() and flex_array_size() helpers.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
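An illustrative before/after with a hypothetical flexible-array
struct (table, entry and src are not from the patch):
struct table {
        size_t count;
        struct entry entry[];   /* flexible array member */
};
/* before: t = kzalloc(sizeof(*t) + count * sizeof(*t->entry), ...) */
/* after: struct_size() saturates instead of wrapping on overflow */
t = kzalloc(struct_size(t, entry, count), GFP_KERNEL);
/* flex_array_size() keeps the memcpy() bound consistent with it */
memcpy(t->entry, src, flex_array_size(t, entry, count));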
Signed-off-by: Len Baker <len.baker@gmx.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
|
dynamic_debug_exec_queries() accepts a separate module arg (so it can
support the $module.dyndbg boot arg); display that in the vpr-info for
a more useful user-debug context.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20211012183310.1016678-2-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|