path: root/kernel
Age  Commit message  Author
2019-06-14  dma-remap: Avoid de-referencing NULL atomic_pool  Florian Fainelli
With architectures allowing the kernel to be placed almost arbitrarily in memory (e.g.: ARM64), it is possible to have the kernel resides at physical addresses above 4GB, resulting in neither the default CMA area, nor the atomic pool from successfully allocating. This does not prevent specific peripherals from working though, one example is XHCI, which still operates correctly. Trouble comes when the XHCI driver gets suspended and resumed, since we can now trigger the following NPD: [ 12.664170] usb usb1: root hub lost power or was reset [ 12.669387] usb usb2: root hub lost power or was reset [ 12.674662] Unable to handle kernel NULL pointer dereference at virtual address 00000008 [ 12.682896] pgd = ffffffc1365a7000 [ 12.686386] [00000008] *pgd=0000000136500003, *pud=0000000136500003, *pmd=0000000000000000 [ 12.694897] Internal error: Oops: 96000006 [#1] SMP [ 12.699843] Modules linked in: [ 12.702980] CPU: 0 PID: 1499 Comm: pml Not tainted 4.9.135-1.13pre #51 [ 12.709577] Hardware name: BCM97268DV (DT) [ 12.713736] task: ffffffc136bb6540 task.stack: ffffffc1366cc000 [ 12.719740] PC is at addr_in_gen_pool+0x4/0x48 [ 12.724253] LR is at __dma_free+0x64/0xbc [ 12.728325] pc : [<ffffff80083c0df8>] lr : [<ffffff80080979e0>] pstate: 60000145 [ 12.735825] sp : ffffffc1366cf990 [ 12.739196] x29: ffffffc1366cf990 x28: ffffffc1366cc000 [ 12.744608] x27: 0000000000000000 x26: ffffffc13a8568c8 [ 12.750020] x25: 0000000000000000 x24: ffffff80098f9000 [ 12.755433] x23: 000000013a5ff000 x22: ffffff8009c57000 [ 12.760844] x21: ffffffc13a856810 x20: 0000000000000000 [ 12.766255] x19: 0000000000001000 x18: 000000000000000a [ 12.771667] x17: 0000007f917553e0 x16: 0000000000001002 [ 12.777078] x15: 00000000000a36cb x14: ffffff80898feb77 [ 12.782490] x13: ffffffffffffffff x12: 0000000000000030 [ 12.787899] x11: 00000000fffffffe x10: ffffff80098feb7f [ 12.793311] x9 : 0000000005f5e0ff x8 : 65776f702074736f [ 12.798723] x7 : 6c2062756820746f x6 : ffffff80098febb1 [ 12.804134] x5 : 
ffffff800809797c x4 : 0000000000000000 [ 12.809545] x3 : 000000013a5ff000 x2 : 0000000000000fff [ 12.814955] x1 : ffffff8009c57000 x0 : 0000000000000000 [ 12.820363] [ 12.821907] Process pml (pid: 1499, stack limit = 0xffffffc1366cc020) [ 12.828421] Stack: (0xffffffc1366cf990 to 0xffffffc1366d0000) [ 12.834240] f980: ffffffc1366cf9e0 ffffff80086004d0 [ 12.842186] f9a0: ffffffc13ab08238 0000000000000010 ffffff80097c2218 ffffffc13a856810 [ 12.850131] f9c0: ffffff8009c57000 000000013a5ff000 0000000000000008 000000013a5ff000 [ 12.858076] f9e0: ffffffc1366cfa50 ffffff80085f9250 ffffffc13ab08238 0000000000000004 [ 12.866021] fa00: ffffffc13ab08000 ffffff80097b6000 ffffffc13ab08130 0000000000000001 [ 12.873966] fa20: 0000000000000008 ffffffc13a8568c8 0000000000000000 ffffffc1366cc000 [ 12.881911] fa40: ffffffc13ab08130 0000000000000001 ffffffc1366cfa90 ffffff80085e3de8 [ 12.889856] fa60: ffffffc13ab08238 0000000000000000 ffffffc136b75b00 0000000000000000 [ 12.897801] fa80: 0000000000000010 ffffff80089ccb92 ffffffc1366cfac0 ffffff80084ad040 [ 12.905746] faa0: ffffffc13a856810 0000000000000000 ffffff80084ad004 ffffff80084b91a8 [ 12.913691] fac0: ffffffc1366cfae0 ffffff80084b91b4 ffffffc13a856810 ffffff80080db5cc [ 12.921636] fae0: ffffffc1366cfb20 ffffff80084b96bc ffffffc13a856810 0000000000000010 [ 12.929581] fb00: ffffffc13a856870 0000000000000000 ffffffc13a856810 ffffff800984d2b8 [ 12.937526] fb20: ffffffc1366cfb50 ffffff80084baa70 ffffff8009932ad0 ffffff800984d260 [ 12.945471] fb40: 0000000000000010 00000002eff0a065 ffffffc1366cfbb0 ffffff80084bafbc [ 12.953415] fb60: 0000000000000010 0000000000000003 ffffff80098fe000 0000000000000000 [ 12.961360] fb80: ffffff80097b6000 ffffff80097b6dc8 ffffff80098c12b8 ffffff80098c12f8 [ 12.969306] fba0: ffffff8008842000 ffffff80097b6dc8 ffffffc1366cfbd0 ffffff80080e0d88 [ 12.977251] fbc0: 00000000fffffffb ffffff80080e10bc ffffffc1366cfc60 ffffff80080e16a8 [ 12.985196] fbe0: 0000000000000000 0000000000000003 ffffff80097b6000 
ffffff80098fe9f0 [ 12.993140] fc00: ffffff80097d4000 ffffff8008983802 0000000000000123 0000000000000040 [ 13.001085] fc20: ffffff8008842000 ffffffc1366cc000 ffffff80089803c2 00000000ffffffff [ 13.009029] fc40: 0000000000000000 0000000000000000 ffffffc1366cfc60 0000000000040987 [ 13.016974] fc60: ffffffc1366cfcc0 ffffff80080dfd08 0000000000000003 0000000000000004 [ 13.024919] fc80: 0000000000000003 ffffff80098fea08 ffffffc136577ec0 ffffff80089803c2 [ 13.032864] fca0: 0000000000000123 0000000000000001 0000000500000002 0000000000040987 [ 13.040809] fcc0: ffffffc1366cfd00 ffffff80083a89d4 0000000000000004 ffffffc136577ec0 [ 13.048754] fce0: ffffffc136610cc0 ffffffffffffffea ffffffc1366cfeb0 ffffffc136610cd8 [ 13.056700] fd00: ffffffc1366cfd10 ffffff800822a614 ffffffc1366cfd40 ffffff80082295d4 [ 13.064645] fd20: 0000000000000004 ffffffc136577ec0 ffffffc136610cc0 0000000021670570 [ 13.072590] fd40: ffffffc1366cfd80 ffffff80081b5d10 ffffff80097b6000 ffffffc13aae4200 [ 13.080536] fd60: ffffffc1366cfeb0 0000000000000004 0000000021670570 0000000000000004 [ 13.088481] fd80: ffffffc1366cfe30 ffffff80081b6b20 ffffffc13aae4200 0000000000000000 [ 13.096427] fda0: 0000000000000004 0000000021670570 ffffffc1366cfeb0 ffffffc13a838200 [ 13.104371] fdc0: 0000000000000000 000000000000000a ffffff80097b6000 0000000000040987 [ 13.112316] fde0: ffffffc1366cfe20 ffffff80081b3af0 ffffffc13a838200 0000000000000000 [ 13.120261] fe00: ffffffc1366cfe30 ffffff80081b6b0c ffffffc13aae4200 0000000000000000 [ 13.128206] fe20: 0000000000000004 0000000000040987 ffffffc1366cfe70 ffffff80081b7dd8 [ 13.136151] fe40: ffffff80097b6000 ffffffc13aae4200 ffffffc13aae4200 fffffffffffffff7 [ 13.144096] fe60: 0000000021670570 ffffffc13a8c63c0 0000000000000000 ffffff8008083180 [ 13.152042] fe80: ffffffffffffff1d 0000000021670570 ffffffffffffffff 0000007f917ad9b8 [ 13.159986] fea0: 0000000020000000 0000000000000015 0000000000000000 0000000000040987 [ 13.167930] fec0: 0000000000000001 0000000021670570 
0000000000000004 0000000000000000 [ 13.175874] fee0: 0000000000000888 0000440110000000 000000000000006d 0000000000000003 [ 13.183819] ff00: 0000000000000040 ffffff80ffffffc8 0000000000000000 0000000000000020 [ 13.191762] ff20: 0000000000000000 0000000000000000 0000000000000001 0000000000000000 [ 13.199707] ff40: 0000000000000000 0000007f917553e0 0000000000000000 0000000000000004 [ 13.207651] ff60: 0000000021670570 0000007f91835480 0000000000000004 0000007f91831638 [ 13.215595] ff80: 0000000000000004 00000000004b0de0 00000000004b0000 0000000000000000 [ 13.223539] ffa0: 0000000000000000 0000007fc92ac8c0 0000007f9175d178 0000007fc92ac8c0 [ 13.231483] ffc0: 0000007f917ad9b8 0000000020000000 0000000000000001 0000000000000040 [ 13.239427] ffe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 13.247360] Call trace: [ 13.249866] Exception stack(0xffffffc1366cf7a0 to 0xffffffc1366cf8d0) [ 13.256386] f7a0: 0000000000001000 0000007fffffffff ffffffc1366cf990 ffffff80083c0df8 [ 13.264331] f7c0: 0000000060000145 ffffff80089b5001 ffffffc13ab08130 0000000000000001 [ 13.272275] f7e0: 0000000000000008 ffffffc13a8568c8 0000000000000000 0000000000000000 [ 13.280220] f800: ffffffc1366cf960 ffffffc1366cf960 ffffffc1366cf930 00000000ffffffd8 [ 13.288165] f820: ffffff8009931ac0 4554535953425553 4544006273753d4d 3831633d45434956 [ 13.296110] f840: ffff003832313a39 ffffff800845926c ffffffc1366cf880 0000000000040987 [ 13.304054] f860: 0000000000000000 ffffff8009c57000 0000000000000fff 000000013a5ff000 [ 13.311999] f880: 0000000000000000 ffffff800809797c ffffff80098febb1 6c2062756820746f [ 13.319944] f8a0: 65776f702074736f 0000000005f5e0ff ffffff80098feb7f 00000000fffffffe [ 13.327884] f8c0: 0000000000000030 ffffffffffffffff [ 13.332835] [<ffffff80083c0df8>] addr_in_gen_pool+0x4/0x48 [ 13.338398] [<ffffff80086004d0>] xhci_mem_cleanup+0xc8/0x51c [ 13.344137] [<ffffff80085f9250>] xhci_resume+0x308/0x65c [ 13.349524] [<ffffff80085e3de8>] xhci_brcm_resume+0x84/0x8c [ 
13.355174] [<ffffff80084ad040>] platform_pm_resume+0x3c/0x64 [ 13.360997] [<ffffff80084b91b4>] dpm_run_callback+0x5c/0x15c [ 13.366732] [<ffffff80084b96bc>] device_resume+0xc0/0x190 [ 13.372205] [<ffffff80084baa70>] dpm_resume+0x144/0x2cc [ 13.377504] [<ffffff80084bafbc>] dpm_resume_end+0x20/0x34 [ 13.382980] [<ffffff80080e0d88>] suspend_devices_and_enter+0x104/0x704 [ 13.389585] [<ffffff80080e16a8>] pm_suspend+0x320/0x53c [ 13.394881] [<ffffff80080dfd08>] state_store+0xbc/0xe0 [ 13.400094] [<ffffff80083a89d4>] kobj_attr_store+0x14/0x24 [ 13.405655] [<ffffff800822a614>] sysfs_kf_write+0x60/0x70 [ 13.411128] [<ffffff80082295d4>] kernfs_fop_write+0x130/0x194 [ 13.416954] [<ffffff80081b5d10>] __vfs_write+0x60/0x150 [ 13.422254] [<ffffff80081b6b20>] vfs_write+0xc8/0x164 [ 13.427376] [<ffffff80081b7dd8>] SyS_write+0x70/0xc8 [ 13.432412] [<ffffff8008083180>] el0_svc_naked+0x34/0x38 [ 13.437800] Code: 92800173 97f6fb9e 17fffff5 d1000442 (f8408c03) [ 13.444033] ---[ end trace 2effe12f909ce205 ]--- The call path leading to this problem is xhci_mem_cleanup() -> dma_free_coherent() -> dma_free_from_pool() -> addr_in_gen_pool. If the atomic_pool is NULL, we can't possibly have the address in the atomic pool anyway, so guard against that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
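The guard described in the last paragraph of this message can be sketched in plain userspace C. This is only a model under stated assumptions: `pool_model`, `atomic_pool_model`, `addr_in_pool_model()` and `in_atomic_pool_model()` are hypothetical stand-ins, not the kernel's `struct gen_pool` or the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a gen_pool: a single address range. */
struct pool_model {
	uintptr_t start;
	size_t size;
};

/* Models the global atomic pool; stays NULL if it could not be allocated. */
static struct pool_model *atomic_pool_model;

/* Models addr_in_gen_pool(): dereferences pool, so pool must not be NULL. */
static bool addr_in_pool_model(const struct pool_model *pool,
			       uintptr_t addr, size_t size)
{
	return addr >= pool->start && size <= pool->size &&
	       addr - pool->start <= pool->size - size;
}

/* The guard: a pool that does not exist cannot contain any address. */
static bool in_atomic_pool_model(uintptr_t addr, size_t size)
{
	if (!atomic_pool_model)
		return false;

	return addr_in_pool_model(atomic_pool_model, addr, size);
}
```

With the early return in place, a free path that asks "did this buffer come from the atomic pool?" gets a plain "no" on systems where the pool was never set up, and can fall through to the regular release path.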
2019-06-03  dma-direct: provide generic support for uncached kernel segments  Christoph Hellwig
A few architectures support uncached kernel segments. In that case we get an uncached mapping for a given physical address by using an offset in the uncached segment. Implement support for this scheme in the generic dma-direct code instead of duplicating it in arch hooks.
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-06-03  dma-contiguous: use fallback alloc_pages for single pages  Nicolin Chen
The addresses within a single page are always contiguous, so it is not strictly necessary to allocate a single page from the CMA area. Since the CMA area has a limited, predefined size, it may run out of space in heavy use cases where many CMA pages are allocated for single-page requests.

However, there is also a concern that a device might care where a page comes from -- it might expect the page to come from the CMA area and act differently if it does not.

This patch uses the fallback alloc_pages path instead of one-page allocations from the global CMA area when a device does not have its own CMA area. This saves space in the global CMA area for larger CMA allocations and also reduces the CMA fragmentation that results from trivial allocations.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
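The policy in this message can be written down as a small decision function. The sketch below is a userspace illustration only; the enum, `pick_source_model()` and the 4096-byte page size are assumptions made for the example and do not appear in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE_MODEL 4096u	/* assumed page size for the example */

enum alloc_source_model {
	FROM_DEVICE_CMA,	/* the device has its own CMA area */
	FROM_GLOBAL_CMA,	/* multi-page request, global CMA area */
	FROM_PAGE_ALLOCATOR,	/* single page: fallback alloc_pages path */
};

/* Decide where a request of 'size' bytes should be served from. */
static enum alloc_source_model pick_source_model(bool dev_has_own_cma,
						 size_t size)
{
	size_t count = (size + PAGE_SIZE_MODEL - 1) / PAGE_SIZE_MODEL;

	/* A device-specific area is still used, even for one page. */
	if (dev_has_own_cma)
		return FROM_DEVICE_CMA;

	/* One page is contiguous by definition; spare the global area. */
	if (count <= 1)
		return FROM_PAGE_ALLOCATOR;

	return FROM_GLOBAL_CMA;
}
```

Keeping the device-specific case first reflects the concern raised in the message: a device with its own area may depend on its pages coming from there.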
2019-06-03  dma-contiguous: add dma_{alloc,free}_contiguous() helpers  Nicolin Chen
Both dma_alloc_from_contiguous() and dma_release_from_contiguous() are very simply implemented, but they require callers to pass parameters such as count and align, and they take a boolean parameter to check __GFP_NOWARN in the allocation flags. So every call site duplicates similar work:

    unsigned long order = get_order(size);
    size_t count = size >> PAGE_SHIFT;
    page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
    [...]
    dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);

Additionally, as CMA can be used only in a context that permits sleeping, most callers do a gfpflags_allow_blocking() check and a corresponding fallback allocation of normal pages upon any false result:

    if (gfpflags_allow_blocking(flag))
        page = dma_alloc_from_contiguous();
    if (!page)
        page = alloc_pages();
    [...]
    if (!dma_release_from_contiguous(dev, page, count))
        __free_pages(page, get_order(size));

So this patch simplifies those call sites by abstracting these operations into two new functions: dma_{alloc,free}_contiguous.

As some callers of dma_{alloc,release}_from_contiguous() might be complicated, this patch applies the two new functions to kernel/dma/direct.c only, as an initial step.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
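The pattern being factored out here (try the contiguous allocator only when blocking is allowed, fall back to normal pages, and mirror that on release) can be modelled in userspace. Everything below is a hypothetical stand-in: a one-slot static buffer plays the CMA area and `malloc()`/`free()` play `alloc_pages()`/`__free_pages()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy CMA area: one fixed buffer that is either free or in use. */
static unsigned char cma_area_model[8192];
static bool cma_busy_model;

static void *cma_alloc_model(size_t size)
{
	if (cma_busy_model || size > sizeof(cma_area_model))
		return NULL;
	cma_busy_model = true;
	return cma_area_model;
}

/* Returns false when ptr did not come from the toy CMA area. */
static bool cma_release_model(void *ptr)
{
	if (ptr != (void *)cma_area_model)
		return false;
	cma_busy_model = false;
	return true;
}

/* One helper hides the blocking check and the fallback allocation. */
static void *alloc_contiguous_model(size_t size, bool may_block)
{
	void *p = NULL;

	if (may_block)
		p = cma_alloc_model(size);
	if (!p)
		p = malloc(size);	/* stands in for alloc_pages() */
	return p;
}

/* The matching helper hides the "release or free" decision. */
static void free_contiguous_model(void *ptr)
{
	if (!cma_release_model(ptr))
		free(ptr);		/* stands in for __free_pages() */
}
```

A caller then needs only the pair `alloc_contiguous_model()` / `free_contiguous_model()` and no longer repeats the count, order and fallback logic at each call site.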
2019-05-26  Merge tag 'trace-v5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace  Linus Torvalds
Pull tracing warning fix from Steven Rostedt:
"Make the GCC 9 warning for sub struct memset go away. GCC 9 now warns about calling memset() on partial structures when it goes across multiple fields. This adds a helper for the place in tracing that does this type of clearing of a structure"

* tag 'trace-v5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Silence GCC 9 array bounds warning
2019-05-25  tracing: Silence GCC 9 array bounds warning  Miguel Ojeda
Starting with GCC 9, -Warray-bounds detects cases when memset is called starting on a member of a struct but the size to be cleared ends up writing over further members.

Such a call happens in the trace code to clear, at once, all members after and including `seq` on struct trace_iterator:

    In function 'memset',
        inlined from 'ftrace_dump' at kernel/trace/trace.c:8914:3:
    ./include/linux/string.h:344:9: warning: '__builtin_memset' offset [8505, 8560] from the object at 'iter' is out of the bounds of referenced subobject 'seq' with type 'struct trace_seq' at offset 4368 [-Warray-bounds]
      344 |  return __builtin_memset(p, c, size);
          |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

In order to avoid GCC complaining about it, we compute the address ourselves by adding the offsetof distance instead of referring directly to the member.

Since there are two places doing this clear (trace.c and trace_kdb.c), take the chance to move the workaround into a single place in the internal header.

Link: http://lkml.kernel.org/r/20190523124535.GA12931@gmail.com
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
[ Removed unnecessary parenthesis around "iter" ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
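The offsetof technique described above can be shown on a small struct. This is a sketch under assumptions: `iter_model` and `iter_reset_model()` are made-up names, and the fields are illustrative rather than those of struct trace_iterator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: clear everything from 'seq' to the end. */
struct iter_model {
	int keep_a;	/* must survive the reset */
	int keep_b;	/* must survive the reset */
	int seq;	/* first member to clear */
	long pos;
	int idx;
};

static void iter_reset_model(struct iter_model *iter)
{
	/* Byte distance from the start of the struct to 'seq'. */
	const size_t offset = offsetof(struct iter_model, seq);

	/*
	 * Address the whole object plus an offset instead of &iter->seq,
	 * so the compiler sees a write within 'iter' rather than one that
	 * runs past the bounds of the 'seq' subobject.
	 */
	memset((char *)iter + offset, 0, sizeof(*iter) - offset);
}
```

The bytes written are the same as with `memset(&iter->seq, 0, ...)`; only the way the start address is formed changes, which is why a single helper can serve every place that performs this clear.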
2019-05-25  Merge tag 'trace-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace  Linus Torvalds
Pull tracing fixes from Steven Rostedt:
"Tom Zanussi sent me some small fixes and cleanups to the histogram code and I forgot to incorporate them. I also added a small clean up patch that was sent to me a while ago and I just noticed it"

* tag 'trace-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  kernel/trace/trace.h: Remove duplicate header of trace_seq.h
  tracing: Add a check_val() check before updating cond_snapshot() track_val
  tracing: Check keys for variable references in expressions too
  tracing: Prevent hist_field_var_ref() from accessing NULL tracing_map_elts
2019-05-24  Merge tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core  Linus Torvalds
Pull more SPDX updates from Greg KH:
"Here is another set of reviewed patches that adds SPDX tags to different kernel files, based on a set of rules that are being used to parse the comments to try to determine that the license of the file is "GPL-2.0-or-later". Only the "obvious" versions of these matches are included here, a number of "non-obvious" variants of text have been found but those have been postponed for later review and analysis.

These patches have been out for review on the linux-spdx@vger mailing list, and while they were created by automatic tools, they were hand-verified by a bunch of different people, all of whose names are on the patches as reviewers"

* tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (85 commits)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 125
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 123
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 122
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 121
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 120
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 119
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 116
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 114
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 113
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 112
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 111
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 110
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 106
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 105
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 104
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 103
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 102
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 101
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 98
  ...
2019-05-24  locking/lock_events: Use this_cpu_add() when necessary  Waiman Long
The kernel test robot has reported that the use of __this_cpu_add() causes bug messages like:

    BUG: using __this_cpu_add() in preemptible [00000000] code: ...

Given the imprecise nature of the count and the possibility of resetting the count and doing the measurement again, it is not really a big problem to use the unprotected __this_cpu_*() functions.

To make the preemption checking code happy, the this_cpu_*() functions will be used if CONFIG_DEBUG_PREEMPT is defined.

The imprecise nature of the locking counts is also documented, with the suggestion that we should run the measurement a few times with the counts reset in between to get a better picture of what is going on under the hood.

Fixes: a8654596f0371 ("locking/rwsem: Enable lock event counting")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-24  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 38  Thomas Gleixner
Based on 1 normalized pattern(s): this file is released under the gplv2 and any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170857.732920462@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36  Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public licence as published by the free software foundation either version 2 of the licence or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 114 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170857.552531963@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-22  kernel/trace/trace.h: Remove duplicate header of trace_seq.h  Jagadeesh Pagadala
Remove duplicate header which is included twice. Link: http://lkml.kernel.org/r/1553725186-41442-1-git-send-email-jagdsh.linux@gmail.com Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Jagadeesh Pagadala <jagdsh.linux@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-05-21  Merge tag 'spdx-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core  Linus Torvalds
Pull SPDX update from Greg KH:
"Here is a series of patches that add SPDX tags to different kernel files, based on two different things:

- SPDX entries are added to a bunch of files that we missed a year ago that do not have any license information at all. These were either missed because the tool saw the MODULE_LICENSE() tag, or some EXPORT_SYMBOL tags, and got confused and thought the file had a real license, or the files have been added since the last big sweep, or they were Makefile/Kconfig files, which we didn't touch last time.

- Add GPL-2.0-only or GPL-2.0-or-later tags to files where our scan tools can determine the license text in the file itself. Where this happens, the license text is removed, in order to cut down on the 700+ different ways we have in the kernel today, in a quest to get rid of all of these.

These patches have been out for review on the linux-spdx@vger mailing list, and while they were created by automatic tools, they were hand-verified by a bunch of different people, all of whose names are on the patches as reviewers.

The reason for these "large" patches is if we were to continue to progress at the current rate of change in the kernel, adding license tags to individual files in different subsystems, we would be finished in about 10 years at the earliest.

There will be more series of these types of patches coming over the next few weeks as the tools and reviewers crunch through the more "odd" variants of how to say "GPLv2" that developers have come up with over the years, combined with other fun oddities (GPL + a BSD disclaimer?) that are being unearthed, with the goal for the whole kernel to be cleaned up.
These diffstats are not small, 3840 files are touched, over 10k lines removed in just 24 patches" * tag 'spdx-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (24 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 24 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 23 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 22 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 21 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 20 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 19 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 18 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 17 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 15 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 14 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 12 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 11 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 10 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 9 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 7 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 5 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 4 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 3 ...
2019-05-21  tracing: Add a check_val() check before updating cond_snapshot() track_val  Tom Zanussi
Without this check a snapshot is taken whenever a bucket's max is hit, rather than only when the global max is hit, as it should be. Before: In this example, we do a first run of the workload (cyclictest), examine the output, note the max ('triggering value') (347), then do a second run and note the max again. In this case, the max in the second run (39) is below the max in the first run, but since we haven't cleared the histogram, the first max is still in the histogram and is higher than any other max, so it should still be the max for the snapshot. It isn't however - the value should still be 347 after the second run. # echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="cyclictest"' >> /sys/kernel/debug/tracing/events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio,prev_comm):onmax($wakeup_lat).snapshot() if next_comm=="cyclictest"' >> /sys/kernel/debug/tracing/events/sched/sched_switch/trigger # cyclictest -p 80 -n -s -t 2 -D 2 # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist { next_pid: 2143 } hitcount: 199 max: 44 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/4 { next_pid: 2145 } hitcount: 1325 max: 38 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/2 { next_pid: 2144 } hitcount: 1982 max: 347 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 Snapshot taken (see tracing/snapshot). 
Details: triggering value { onmax($wakeup_lat) }: 347 triggered by event with key: { next_pid: 2144 } # cyclictest -p 80 -n -s -t 2 -D 2 # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist { next_pid: 2143 } hitcount: 199 max: 44 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/4 { next_pid: 2148 } hitcount: 199 max: 16 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/1 { next_pid: 2145 } hitcount: 1325 max: 38 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/2 { next_pid: 2150 } hitcount: 1326 max: 39 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/4 { next_pid: 2144 } hitcount: 1982 max: 347 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 { next_pid: 2149 } hitcount: 1983 max: 130 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/0 Snapshot taken (see tracing/snapshot). Details: triggering value { onmax($wakeup_lat) }: 39 triggered by event with key: { next_pid: 2150 } After: In this example, we do a first run of the workload (cyclictest), examine the output, note the max ('triggering value') (375), then do a second run and note the max again. In this case, the max in the second run is still 375, the highest in any bucket, as it should be. # cyclictest -p 80 -n -s -t 2 -D 2 # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist { next_pid: 2072 } hitcount: 200 max: 28 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 { next_pid: 2074 } hitcount: 1323 max: 375 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/2 { next_pid: 2073 } hitcount: 1980 max: 153 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 Snapshot taken (see tracing/snapshot). 
Details: triggering value { onmax($wakeup_lat) }: 375 triggered by event with key: { next_pid: 2074 } # cyclictest -p 80 -n -s -t 2 -D 2 # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist { next_pid: 2101 } hitcount: 199 max: 49 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 { next_pid: 2072 } hitcount: 200 max: 28 next_prio: 120 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 { next_pid: 2074 } hitcount: 1323 max: 375 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/2 { next_pid: 2103 } hitcount: 1325 max: 74 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/4 { next_pid: 2073 } hitcount: 1980 max: 153 next_prio: 19 next_comm: cyclictest prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 { next_pid: 2102 } hitcount: 1981 max: 84 next_prio: 19 next_comm: cyclictest prev_pid: 12 prev_prio: 120 prev_comm: kworker/0:1 Snapshot taken (see tracing/snapshot). Details: triggering value { onmax($wakeup_lat) }: 375 triggered by event with key: { next_pid: 2074 } Link: http://lkml.kernel.org/r/95958351329f129c07504b4d1769c47a97b70d65.1555597045.git.tom.zanussi@linux.intel.com Cc: stable@vger.kernel.org Fixes: a3785b7eca8fd ("tracing: Add hist trigger snapshot() action") Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
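The behaviour this fix restores (only a new global maximum may replace the tracked value and trigger a snapshot) can be sketched as follows. The names are hypothetical stand-ins chosen for the example, and the model ignores everything about the real histogram code except the comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical global state shared by all histogram buckets. */
struct snapshot_track_model {
	uint64_t track_val;		/* largest value seen so far */
	unsigned int snapshots_taken;
};

/* Plays the role of check_val() for an onmax() action. */
static bool check_val_max_model(uint64_t track_val, uint64_t var_val)
{
	return var_val > track_val;
}

/* Called whenever some bucket reaches a new per-bucket maximum. */
static bool on_bucket_max_model(struct snapshot_track_model *t,
				uint64_t bucket_max)
{
	/* Without this check, any bucket maximum would overwrite track_val. */
	if (!check_val_max_model(t->track_val, bucket_max))
		return false;

	t->track_val = bucket_max;
	t->snapshots_taken++;
	return true;
}
```

Feeding in the numbers from the "Before" example, a first-run maximum of 347 followed by a second-run bucket maximum of 39 leaves the tracked value at 347, which is the result the message says should have been reported.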
2019-05-21  tracing: Check keys for variable references in expressions too  Tom Zanussi
There's an existing check for variable references in keys, but it doesn't go far enough. It checks whether a key field is a variable reference but doesn't check whether it's an expression containing variable references, which can cause the same problems for callers. Use the existing field_has_hist_vars() function rather than a direct top-level flag check to catch all possible variable references. Link: http://lkml.kernel.org/r/e8c3d3d53db5ca90ceea5a46e5413103a6902fc7.1555597045.git.tom.zanussi@linux.intel.com Cc: stable@vger.kernel.org Fixes: 067fe038e70f6 ("tracing: Add variable reference handling to hist triggers") Reported-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
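The difference between a top-level flag check and a check that also descends into expression operands can be illustrated with a small tree walk. This is a sketch only: `field_model`, `FLAG_VAR_REF_MODEL` and `field_has_vars_model()` are invented for the example and are not the kernel's hist_field structure or field_has_hist_vars().

```c
#include <stdbool.h>
#include <stddef.h>

#define FLAG_VAR_REF_MODEL 0x1u	/* hypothetical "is a variable reference" flag */

/* Hypothetical field: either a leaf or an expression with two operands. */
struct field_model {
	unsigned int flags;
	const struct field_model *operands[2];
};

/* True if the field, or anything nested inside it, is a variable reference. */
static bool field_has_vars_model(const struct field_model *field)
{
	if (!field)
		return false;

	if (field->flags & FLAG_VAR_REF_MODEL)
		return true;

	for (size_t i = 0; i < 2; i++) {
		if (field_has_vars_model(field->operands[i]))
			return true;
	}

	return false;
}
```

An expression whose own flags are clear but whose operand is a variable reference passes a top-level flag test and is caught only by the recursive walk, which is the gap the message describes.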
2019-05-21  tracing: Prevent hist_field_var_ref() from accessing NULL tracing_map_elts  Tom Zanussi
hist_field_var_ref() is an implementation of hist_field_fn_t(), which can be called with a null tracing_map_elt elt param when assembling a key in event_hist_trigger(). In the case of hist_field_var_ref() this doesn't make sense, because a variable can only be resolved by looking it up using an already assembled key i.e. a variable can't be used to assemble a key since the key is required in order to access the variable. Upper layers should prevent the user from constructing a key using a variable in the first place, but in case one slips through, it shouldn't cause a NULL pointer dereference. Also if one does slip through, we want to know about it, so emit a one-time warning in that case. Link: http://lkml.kernel.org/r/64ec8dc15c14d305295b64cdfcc6b2b9dd14753f.1555597045.git.tom.zanussi@linux.intel.com Reported-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
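A defensive lookup of this shape (tolerate a NULL element, warn once, return a harmless value) can be sketched in userspace C. The names below are hypothetical, the one-time message on stderr merely stands in for a kernel one-time warning, and returning 0 for the NULL case is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NR_VARS_MODEL 4

/* Hypothetical stand-in for a map element holding variable values. */
struct elt_model {
	uint64_t vars[NR_VARS_MODEL];
};

/* Counts how many warnings were emitted, so the behaviour is observable. */
static unsigned int warn_count_model;

static uint64_t var_ref_model(const struct elt_model *elt, unsigned int idx)
{
	static int warned;

	if (!elt) {
		/* Report the unexpected call once, then stay quiet. */
		if (!warned) {
			warned = 1;
			warn_count_model++;
			fprintf(stderr, "var_ref_model: called without an element\n");
		}
		return 0;	/* assumed fallback value */
	}

	if (idx >= NR_VARS_MODEL)
		return 0;

	return elt->vars[idx];
}
```

The early return keeps a key that slipped past the upper-layer checks from crashing the lookup, while the single warning still makes the slip visible.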
2019-05-21  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25  Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it would be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 6 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190519154043.007767574@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13  Thomas Gleixner
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details [based] [from] [clk] [highbank] [c] you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 355 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 9Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not you can access it online at http www gnu org licenses gpl 2 0 html extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190519154041.430943677@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21treewide: Add SPDX license identifier - Makefile/KconfigThomas Gleixner
Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21treewide: Add SPDX license identifier for missed filesThomas Gleixner
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Use after free in __dev_map_entry_free(), from Eric Dumazet. 2) Fix TCP retransmission timestamps on passive Fast Open, from Yuchung Cheng. 3) Orphan NFC, we'll take the patches directly into my tree. From Johannes Berg. 4) We can't recycle cloned TCP skbs, from Eric Dumazet. 5) Some flow dissector bpf test fixes, from Stanislav Fomichev. 6) Fix RCU marking and warnings in rhashtable, from Herbert Xu. 7) Fix some potential fib6 leaks, from Eric Dumazet. 8) Fix a _decode_session4 uninitialized memory read bug fix that got lost in a merge. From Florian Westphal. 9) Fix ipv6 source address routing wrt. exception route entries, from Wei Wang. 10) The netdev_xmit_more() conversion was not done 100% properly in mlx5 driver, fix from Tariq Toukan. 11) Clean up botched merge on netfilter kselftest, from Florian Westphal. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (74 commits) of_net: fix of_get_mac_address retval if compiled without CONFIG_OF net: fix kernel-doc warnings for socket.c net: Treat sock->sk_drops as an unsigned int when printing kselftests: netfilter: fix leftover net/net-next merge conflict mlxsw: core: Prevent reading unsupported slave address from SFP EEPROM mlxsw: core: Prevent QSFP module initialization for old hardware vsock/virtio: Initialize core virtio vsock before registering the driver net/mlx5e: Fix possible modify header actions memory leak net/mlx5e: Fix no rewrite fields with the same match net/mlx5e: Additional check for flow destination comparison net/mlx5e: Add missing ethtool driver info for representors net/mlx5e: Fix number of vports for ingress ACL configuration net/mlx5e: Fix ethtool rxfh commands when CONFIG_MLX5_EN_RXNFC is disabled net/mlx5e: Fix wrong xmit_more application net/mlx5: Fix peer pf disable hca command net/mlx5: E-Switch, Correct type to u16 for vport_num and int for vport_index net/mlx5: Add meaningful return codes to status_to_err function net/mlx5: Imply MLXFW in mlx5_core Revert "tipc: fix modprobe tipc failed after switch order of device registration" vsock/virtio: free packets during the socket release ...
2019-05-19Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge yet more updates from Andrew Morton: "A few final bits: - large changes to vmalloc, yielding large performance benefits - tweak the console-flush-on-panic code - a few fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: panic: add an option to replay all the printk message in buffer initramfs: don't free a non-existent initrd fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount mm/compaction.c: correct zone boundary handling when isolating pages from a pageblock mm/vmap: add DEBUG_AUGMENT_LOWEST_MATCH_CHECK macro mm/vmap: add DEBUG_AUGMENT_PROPAGATE_CHECK macro mm/vmalloc.c: keep track of free blocks for vmap allocation
2019-05-19Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull IRQ chip updates from Ingo Molnar: "A late irqchips update: - New TI INTR/INTA set of drivers - Rewrite of the stm32mp1-exti driver as a platform driver - Update the IOMMU MSI mapping API to be RT friendly - A number of cleanups and other low impact fixes" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) iommu/dma-iommu: Remove iommu_dma_map_msi_msg() irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg() irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg() irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg() irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg() iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts genirq/msi: Add a new field in msi_desc to store an IOMMU cookie arm64: arch_k3: Enable interrupt controller drivers irqchip/ti-sci-inta: Add msi domain support soc: ti: Add MSI domain bus support for Interrupt Aggregator irqchip/ti-sci-inta: Add support for Interrupt Aggregator driver dt-bindings: irqchip: Introduce TISCI Interrupt Aggregator bindings irqchip/ti-sci-intr: Add support for Interrupt Router driver dt-bindings: irqchip: Introduce TISCI Interrupt router bindings gpio: thunderx: Use the default parent apis for {request,release}_resources genirq: Introduce irq_chip_{request,release}_resource_parent() apis firmware: ti_sci: Add helper apis to manage resources firmware: ti_sci: Add RM mapping table for am654 firmware: ti_sci: Add support for IRQ management firmware: ti_sci: Add support for RM core ops ...
2019-05-18panic: add an option to replay all the printk message in bufferFeng Tang
Currently on panic, the kernel will lower the loglevel and print out only the pending printk messages with console_flush_on_panic(). Add an option for users to configure "panic_print" to replay all dmesg in the buffer, some of which they may have never seen due to the loglevel setting, which will help panic debugging. [feng.tang@intel.com: keep the original console_flush_on_panic() inside panic()] Link: http://lkml.kernel.org/r/1556199137-14163-1-git-send-email-feng.tang@intel.com [feng.tang@intel.com: use logbuf lock to protect the console log index] Link: http://lkml.kernel.org/r/1556269868-22654-1-git-send-email-feng.tang@intel.com Link: http://lkml.kernel.org/r/1556095872-36838-1-git-send-email-feng.tang@intel.com Signed-off-by: Feng Tang <feng.tang@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Aaro Koskinen <aaro.koskinen@nokia.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-16Merge branch 'for-5.2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "The cgroup2 freezer pulled in this cycle broke strace. This pull request includes a workaround for the problem. It's not a complete fix in that it may cause spurious frozen state flip-flops which is fairly minor. Will push a full fix once it's ready" * 'for-5.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: signal: unconditionally leave the frozen state in ptrace_stop()
2019-05-16bpf: relax inode permission check for retrieving bpf programChenbo Feng
For the iptables module to load a bpf program from a pinned location, it only retrieves a loaded program and cannot change the program content, so requiring write permission for it might not be necessary. Also, when adding or removing an unrelated iptables rule, it might need to flush and reload the xt_bpf related rules as well, which triggers the inode permission check. It might be better to remove the write permission check for the inode so we won't need to grant write access to all the processes that flush and restore iptables rules. Signed-off-by: Chenbo Feng <fengc@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-16Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull time fixes from Ingo Molnar: "A TAI adjtimex interface extension, and a POSIX compliance ABI fix for timespec64 users" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ntp: Allow TAI-UTC offset to be set to zero y2038: Make CONFIG_64BIT_TIME unconditional
2019-05-16Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "A single rwsem fix" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Prevent decrement of reader count before increment
2019-05-16signal: unconditionally leave the frozen state in ptrace_stop()Roman Gushchin
Alex Xu reported a regression in strace, caused by the introduction of the cgroup v2 freezer. The regression can be reproduced by stracing the following simple program: #include <unistd.h> int main() { write(1, "a", 1); return 0; } An attempt to run strace ./a.out leads to the infinite loop: [ pre-main omitted ] write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) [ repeats forever ] The problem occurs because the traced task leaves ptrace_stop() (and the signal handling loop) with the frozen bit set. So let's call cgroup_leave_frozen(true) unconditionally after sleeping in ptrace_stop(). With this patch applied, strace works as expected: [ pre-main omitted ] write(1, "a", 1) = 1 exit_group(0) = ? +++ exited with 0 +++ Reported-by: Alex Xu <alex_y_xu@yahoo.ca> Fixes: 76f969e8948d ("cgroup: cgroup v2 freezer") Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2019-05-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2019-05-16 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a use after free in __dev_map_entry_free(), from Eric. 2) Several sockmap related bug fixes: a splat in strparser if it was never initialized, remove duplicate ingress msg list purging which can race, fix msg->sg.size accounting upon skb to msg conversion, and last but not least fix a timeout bug in tcp_bpf_wait_data(), from John. 3) Fix LRU map to avoid messing with eviction heuristics upon syscall lookup, e.g. map walks from user space side will then lead to eviction of just recently created entries on updates as it would mark all map entries, from Daniel. 4) Don't bail out when libbpf feature probing fails. Also various smaller fixes to flow_dissector test, from Stanislav. 5) Fix missing brackets for BTF_INT_OFFSET() in UAPI, from Gary. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-15Merge tag 'trace-v5.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The major changes in this tracing update includes: - Removal of non-DYNAMIC_FTRACE from 32bit x86 - Removal of mcount support from x86 - Emulating a call from int3 on x86_64, fixes live kernel patching - Consolidated Tracing Error logs file Minor updates: - Removal of klp_check_compiler_support() - kdb ftrace dumping output changes - Accessing and creating ftrace instances from inside the kernel - Clean up of #define if macro - Introduction of TRACE_EVENT_NOP() to disable trace events based on config options And other minor fixes and clean ups" * tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) x86: Hide the int3_emulate_call/jmp functions from UML livepatch: Remove klp_check_compiler_support() ftrace/x86: Remove mcount support ftrace/x86_32: Remove support for non DYNAMIC_FTRACE tracing: Simplify "if" macro code tracing: Fix documentation about disabling options using trace_options tracing: Replace kzalloc with kcalloc tracing: Fix partial reading of trace event's id file tracing: Allow RCU to run between postponed startup tests tracing: Fix white space issues in parse_pred() function tracing: Eliminate const char[] auto variables ring-buffer: Fix mispelling of Calculate tracing: probeevent: Fix to make the type of $comm string tracing: probeevent: Do not accumulate on ret variable tracing: uprobes: Re-enable $comm support for uprobe events ftrace/x86_64: Emulate call function while updating in breakpoint handler x86_64: Allow breakpoints to emulate call instructions x86_64: Add gap to int3 to allow for call emulation tracing: kdb: Allow ftdump to skip all but the last few entries tracing: Add trace_total_entries() / trace_total_entries_cpu() ...
2019-05-15kernel/compat.c: mark expected switch fall-throughsStephen Rothwell
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch aims to suppress 3 missing-break-in-switch false positives on some architectures. Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Deepa Dinamani <deepa.kernel@gmail.com> Cc: Gustavo A. R. Silva <gustavo@embeddedor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jann Horn <jannh@google.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
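As an illustration of what "marking an expected fall-through" looks like, here is a minimal sketch. The function `count_nonzero_bytes()` is hypothetical and is not taken from kernel/compat.c; it only shows the comment-style annotation that, as far as I know, GCC's -Wimplicit-fallthrough accepts at its default level.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not from kernel/compat.c: count the nonzero bytes
 * in a field of 1..4 bytes. The widest case is handled first and each
 * case deliberately falls through to the narrower ones.
 */
int count_nonzero_bytes(const unsigned char *p, size_t len)
{
	int n = 0;

	switch (len) {
	case 4:
		n += p[3] != 0;
		/* fall through */
	case 3:
		n += p[2] != 0;
		/* fall through */
	case 2:
		n += p[1] != 0;
		/* fall through */
	case 1:
		n += p[0] != 0;
		break;
	default:
		return -1;	/* unsupported length */
	}
	return n;
}
```

Without the comments, each case that runs into the next one would be reported as a possible missing break; with them, only a genuinely forgotten break would still warn.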
2019-05-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge more updates from Andrew Morton: - a couple of hotfixes - almost all of the rest of MM - lib/ updates - binfmt_elf updates - autofs updates - quite a lot of misc fixes and updates - reiserfs, fatfs - signals - exec - cpumask - rapidio - sysctl - pids - eventfd - gcov - panic - pps - gdb script updates - ipc updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (126 commits) mm: memcontrol: fix NUMA round-robin reclaim at intermediate level mm: memcontrol: fix recursive statistics correctness & scalabilty mm: memcontrol: move stat/event counting functions out-of-line mm: memcontrol: make cgroup stats and events query API explicitly local drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctl drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl mm, memcg: rename ambiguously named memory.stat counters and functions arch: remove <asm/sizes.h> and <asm-generic/sizes.h> treewide: replace #include <asm/sizes.h> with #include <linux/sizes.h> fs/block_dev.c: Remove duplicate header fs/cachefiles/namei.c: remove duplicate header include/linux/sched/signal.h: replace `tsk' with `task' fs/coda/psdev.c: remove duplicate header ipc: do cyclic id allocation for the ipc object. ipc: conserve sequence numbers in ipcmni_extend mode ipc: allow boot time extension of IPCMNI from 32k to 16M ipc/mqueue: optimize msg_get() ipc/mqueue: remove redundant wq task assignment ipc: prevent lockup on alloc_msg and free_msg scripts/gdb: print cached rate in lx-clk-summary ...
2019-05-14panic/reboot: allow specifying reboot_mode for panic onlyAaro Koskinen
Allow specifying reboot_mode for panic only. This is needed on systems where ramoops is used to store panic logs, and user wants to use warm reset to preserve those, while still having cold reset on normal reboots. Link: http://lkml.kernel.org/r/20190322004735.27702-1-aaro.koskinen@iki.fi Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14panic: avoid the extra noise dmesgFeng Tang
When a kernel panic happens, it will first print the panic call stack, then the ending msg like: [ 35.743249] ---[ end Kernel panic - not syncing: Fatal exception [ 35.749975] ------------[ cut here ]------------ The above messages are very useful for debugging. But if the system is configured to not reboot on panic, say the "panic_timeout" parameter equals 0, it will likely print out many noisy messages like a WARN() call stack for each and every CPU except the panic one, messages like below: WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190 Call Trace: <IRQ> try_to_wake_up default_wake_function autoremove_wake_function __wake_up_common __wake_up_common_lock __wake_up wake_up_klogd_work_func irq_work_run_list irq_work_tick update_process_times tick_sched_timer __hrtimer_run_queues hrtimer_interrupt smp_apic_timer_interrupt apic_timer_interrupt For people working in console mode, the screen will first show the panic call stack, but it is immediately overwritten by these noisy extra messages, which makes debugging much more difficult, as the original context gets lost on screen. Also these noisy messages will confuse some users, as I have seen many bug reporters post the noisy messages into bugzilla, instead of the real panic call stack and context. Add a flag "suppress_printk" which gets set in panic() to avoid those noisy messages, without changing the current kernel behavior that both panic blinking and the sysrq magic key can work as is, as suggested by Petr Mladek. To verify this, make sure the kernel is not configured to reboot on panic and in console # echo c > /proc/sysrq-trigger to see if the console only prints out the panic call stack. 
Link: http://lkml.kernel.org/r/1551430186-24169-1-git-send-email-feng.tang@intel.com Signed-off-by: Feng Tang <feng.tang@intel.com> Suggested-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14gcov: clang supportGreg Hackmann
LLVM uses profiling data that's deliberately similar to GCC, but has a very different way of exporting that data. LLVM calls llvm_gcov_init() once per module, and provides a couple of callbacks that we can use to ask for more data. We care about the "writeout" callback, which in turn calls back into compiler-rt/this module to dump all the gathered coverage data to disk: llvm_gcda_start_file() llvm_gcda_emit_function() llvm_gcda_emit_arcs() llvm_gcda_emit_function() llvm_gcda_emit_arcs() [... repeats for each function ...] llvm_gcda_summary_info() llvm_gcda_end_file() This design is much more stateless and unstructured than gcc's, and is intended to run at process exit. This forces us to keep some local state about which module we're dealing with at the moment. On the other hand, it also means we don't depend as much on how LLVM represents profiling data internally. See LLVM's lib/Transforms/Instrumentation/GCOVProfiling.cpp for more details on how this works, particularly GCOVProfiler::emitProfileArcs(), GCOVProfiler::insertCounterWriteout(), and GCOVProfiler::insertFlush(). [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20190417225328.208129-1-trong@android.com Signed-off-by: Greg Hackmann <ghackmann@android.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Tri Vo <trong@android.com> Co-developed-by: Nick Desaulniers <ndesaulniers@google.com> Co-developed-by: Tri Vo <trong@android.com> Tested-by: Trilok Soni <tsoni@quicinc.com> Tested-by: Prasad Sodagudi <psodagud@quicinc.com> Tested-by: Tri Vo <trong@android.com> Tested-by: Daniel Mentz <danielmentz@google.com> Tested-by: Petri Gynther <pgynther@google.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14gcov: clang: move common GCC code into gcc_base.cGreg Hackmann
Patch series "gcov: add Clang support", v4. This patch (of 3): base.c contains a few callbacks specific to GCC's gcov implementation. Move these into their own module in preparation for Clang support. Link: http://lkml.kernel.org/r/20190318025411.98014-2-trong@android.com Signed-off-by: Greg Hackmann <ghackmann@android.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Tri Vo <trong@android.com> Tested-by: Trilok Soni <tsoni@quicinc.com> Tested-by: Prasad Sodagudi <psodagud@quicinc.com> Tested-by: Tri Vo <trong@android.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Daniel Mentz <danielmentz@google.com> Cc: Petri Gynther <pgynther@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14kernel/pid.c: remove unneeded hash header fileTimmy Li
Hash functions are not needed since idr is used now. Let's remove hash header file for cleanup. Link: http://lkml.kernel.org/r/20190430053319.95913-1-scuttimmy@gmail.com Signed-off-by: Timmy Li <scuttimmy@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14kernel/sysctl.c: fix proc_do_large_bitmap for large input buffersEric Sandeen
Today, proc_do_large_bitmap() truncates a large write input buffer to PAGE_SIZE - 1, which may result in misparsed numbers at the (truncated) end of the buffer. Further, it fails to notify the caller that the buffer was truncated, so it doesn't get called iteratively to finish the entire input buffer. Tell the caller if there's more work to do by adding the skipped amount back to left/*lenp before returning. To fix the misparsing, reset the position if we have completely consumed a truncated buffer (or if just one char is left, which may be a "-" in a range), and ask the caller to come back for more. Link: http://lkml.kernel.org/r/20190320222831.8243-7-mcgrof@kernel.org Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14sysctl: return -EINVAL if val violates minmaxChristian Brauner
Currently when userspace gives us values that overflow e.g. file-max and other callers of __do_proc_doulongvec_minmax() we simply ignore the new value and leave the current value untouched. This can be problematic as it gives the illusion that the limit has indeed been bumped when in fact it failed. This commit makes sure to return EINVAL when an overflow is detected. Please note that this is a userspace facing change. Link: http://lkml.kernel.org/r/20190210203943.8227-4-christian@brauner.io Signed-off-by: Christian Brauner <christian@brauner.io> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
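The behavioral change described above (report an error instead of silently keeping the old value) can be sketched in user space as follows. This is not the kernel's __do_proc_doulongvec_minmax(); `set_ulong_minmax()` is a hypothetical stand-in that assumes a single decimal value per write.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a min/max-checked sysctl write. Returns 0 and
 * updates *cur on success. Returns -EINVAL and leaves *cur untouched when
 * the input is malformed, overflows unsigned long, or is outside
 * [min, max], so the caller can no longer mistake a rejected write for a
 * successful one.
 */
int set_ulong_minmax(const char *input, unsigned long min,
		     unsigned long max, unsigned long *cur)
{
	char *end;
	unsigned long val;

	/* Reject empty input, signs and leading whitespace up front. */
	if (input[0] < '0' || input[0] > '9')
		return -EINVAL;

	errno = 0;
	val = strtoul(input, &end, 10);
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	if (errno == ERANGE)
		return -EINVAL;		/* overflowed unsigned long */
	if (val < min || val > max)
		return -EINVAL;		/* violates the allowed range */

	*cur = val;
	return 0;
}
```

In both the old and the new behavior the stored value stays unchanged on bad input; the difference is only that the writer now sees the failure.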
2019-05-14kernel/sysctl.c: switch to bitmap_zalloc()Andy Shevchenko
Switch to bitmap_zalloc() to show clearly what we are allocating. Besides that, it returns a pointer of bitmap type instead of an opaque void *. Link: http://lkml.kernel.org/r/20190304094037.57756-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
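To make the readability argument concrete, here is a user-space analogue of a zeroing bitmap allocator. The names `bitmap_longs()` and `bitmap_zalloc_user()` are hypothetical and this is not the kernel implementation; the point is only that the caller states a size in bits and gets back an `unsigned long *` rather than computing a byte count and receiving `void *`.

```c
#include <limits.h>
#include <stdlib.h>

/* Number of unsigned longs needed to hold nbits bits (rounded up). */
size_t bitmap_longs(unsigned int nbits)
{
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;

	return nbits / bits_per_long + (nbits % bits_per_long != 0);
}

/*
 * Hypothetical user-space analogue of a zeroing bitmap allocator.
 * The caller passes a bit count and receives a typed, zero-filled
 * bitmap, or NULL on allocation failure. Release it with free().
 */
unsigned long *bitmap_zalloc_user(unsigned int nbits)
{
	return calloc(bitmap_longs(nbits), sizeof(unsigned long));
}
```

Compared with an open-coded `calloc(BITS_TO_LONGS(n), sizeof(long))`-style call, the bit-to-word rounding lives in one place and the call site reads as "allocate a bitmap of n bits".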
2019-05-14kernel/signal.c: annotate implicit fall throughMathieu Malaterre
There is a plan to build the kernel with -Wimplicit-fallthrough and this place in the code produced a warning (W=1). This commit removes the following warning: kernel/signal.c:795:13: warning: this statement may fall through [-Wimplicit-fallthrough=] Link: http://lkml.kernel.org/r/20190114203505.17875-1-malat@debian.org Signed-off-by: Mathieu Malaterre <malat@debian.org> Acked-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14kernel/user.c: clean up some leftover codeRasmus Villemoes
The out_unlock label is misleading; no unlocking happens after it, so just return NULL directly. Also, nothing between the kmem_cache_zalloc() that creates new and the two key_put() can initialize new->uid_keyring or new->session_keyring, so those calls are no-ops. Link: http://lkml.kernel.org/r/20190424200404.9114-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14kernel/latencytop.c: rename clear_all_latency_tracing to ↵Lin Feng
clear_tsk_latency_tracing The name clear_all_latency_tracing is misleading: in fact it only clears the per-task latency_record[], and we do have another function named clear_global_latency_tracing which clears the global latency_record[] buffer. Link: http://lkml.kernel.org/r/20190226114602.16902-1-linf@wangsu.com Signed-off-by: Lin Feng <linf@wangsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Fabian Frederick <fabf@skynet.be> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14kernel/latencytop.c: remove unnecessary checks for latencytop_enabledLin Feng
1. In the latencytop source code, we only have this calling chain: account_scheduler_latency(struct task_struct *task, int usecs, int inter) { if (unlikely(latencytop_enabled)) /* the outermost check */ __account_scheduler_latency(task, usecs, inter); } __account_scheduler_latency account_global_scheduler_latency if (!latencytop_enabled) So, the inner check for latencytop_enabled is not necessary at all. 2. In clear_all_latency_tracing, which is now called clear_tsk_latency_tracing, the check for latencytop_enabled is redundant and buggy to some extent. We have no reason to refuse clearing /proc/$pid/latency if latencytop_enabled is set to 0: if we disable latencytop manually by echo 0 > /proc/sys/kernel/latencytop and then want to clear /proc/$pid/latency, the clear fails. Also we don't have such a check in the sibling function clear_global_latency_tracing. Notes: These changes are only visible to users who set CONFIG_LATENCYTOP and won't change user tool latencytop's behavior. Link: http://lkml.kernel.org/r/20190226114602.16902-2-linf@wangsu.com Signed-off-by: Lin Feng <linf@wangsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Fabian Frederick <fabf@skynet.be> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14kernel/notifier.c: double register detectionVasily Averin
By design notifiers can be registered only once; a second register attempt made by mistake silently corrupts the notifier list. A few years ago I investigated the described problem: the host was power cycled because of notifier list corruption. I prepared this patch, applied it to the OpenVZ kernel and sent it, but nobody commented on it. Later it helped us to detect a similar problem in the OpenVZ kernel. Mistakes with notifier registration can happen for example during subsystem initialization from different namespaces, or because of a lost unregister in the roll-back path on initialization failures. The proposed check cannot prevent the described problem, however it allows us to detect its cause quickly without coredump analysis. Link: http://lkml.kernel.org/r/04127e71-4782-9bbb-fe5a-7c01e93a99b0@virtuozzo.com Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
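The failure mode and the check can be sketched with a user-space model of a priority-ordered, singly linked notifier chain. The struct layout and `notifier_chain_register_checked()` are simplified assumptions for illustration, not the kernel code; in particular, returning -1 on a duplicate is a choice of this sketch, since the commit message does not say what the patched function returns.

```c
#include <stdio.h>

/* Simplified model of a notifier: only what the list walk needs. */
struct notifier_block {
	int priority;
	struct notifier_block *next;
};

/*
 * Insert n into the chain at *nl, keeping it sorted by descending
 * priority. If n is met during the walk it is already registered:
 * inserting it again would overwrite n->next and could make the list
 * point back at itself, so report the duplicate and leave the chain
 * unchanged.
 */
int notifier_chain_register_checked(struct notifier_block **nl,
				    struct notifier_block *n)
{
	while (*nl != NULL) {
		if (*nl == n) {
			fprintf(stderr, "double register detected\n");
			return -1;
		}
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}
```

Without the `*nl == n` test, registering the only element of a chain a second time walks past it and then stores `n` into its own `next` pointer, which is the kind of silent corruption the commit describes.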
2019-05-14kernel/sched/psi.c: expose pressure metrics on root cgroupDan Schatzberg
Pressure metrics are already recorded and exposed in procfs for the entire system, but any tool which monitors cgroup pressure has to special case the root cgroup to read from procfs. This patch exposes the already recorded pressure metrics on the root cgroup. Link: http://lkml.kernel.org/r/20190510174938.3361741-1-dschatzberg@fb.com Signed-off-by: Dan Schatzberg <dschatzberg@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14psi: introduce psi monitorSuren Baghdasaryan
Psi monitor aims to provide a low-latency short-term pressure detection mechanism configurable by users. It allows users to monitor psi metrics growth and trigger events whenever a metric rises above a user-defined threshold within a user-defined time window. Time window and threshold are both expressed in usecs. Multiple psi resources with different thresholds and window sizes can be monitored concurrently. Psi monitors activate when the system enters the stall state for the monitored psi metric and deactivate upon exit from the stall state. While the system is in the stall state, psi signal growth is monitored at a rate of 10 times per tracking window. Min window size is 500ms, therefore the min monitoring interval is 50ms. Max window size is 10s with a monitoring interval of 1s. When activated, a psi monitor stays active for at least the duration of one tracking window to avoid repeated activations/deactivations when the psi signal is bouncing. Notifications to the users are rate-limited to one per tracking window. Link: http://lkml.kernel.org/r/20190319235619.260832-8-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Dennis Zhou <dennis@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
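The window arithmetic stated above (10 samples per tracking window, window between 500ms and 10s) works out as in this small sketch. The macro names and `psi_poll_interval_us()` are hypothetical, chosen for this example rather than copied from kernel/sched/psi.c, and rejecting an out-of-range window with -EINVAL is an assumption of the sketch.

```c
#include <errno.h>

/* Limits taken from the commit message; the names are hypothetical. */
#define PSI_WIN_MIN_US		500000ULL	/* 500ms */
#define PSI_WIN_MAX_US		10000000ULL	/* 10s */
#define PSI_UPDATES_PER_WIN	10		/* samples per tracking window */

/*
 * Return the monitoring interval in microseconds for a given tracking
 * window, or -EINVAL if the window is outside the supported range.
 */
long long psi_poll_interval_us(unsigned long long window_us)
{
	if (window_us < PSI_WIN_MIN_US || window_us > PSI_WIN_MAX_US)
		return -EINVAL;

	return (long long)(window_us / PSI_UPDATES_PER_WIN);
}
```

With these limits, the smallest window of 500000us yields a 50000us (50ms) interval and the largest window of 10000000us yields 1000000us (1s), matching the figures in the description.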
2019-05-14include/: refactor headers to allow kthread.h inclusion in psi_types.hSuren Baghdasaryan
kthread.h can't be included in psi_types.h because it creates a circular inclusion, with kthread.h eventually including psi_types.h and complaining about kthread structures not being defined because they are defined further down in kthread.h. Resolve this by removing the psi_types.h inclusion from the headers included from kthread.h. Link: http://lkml.kernel.org/r/20190319235619.260832-7-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Dennis Zhou <dennis@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>