summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-08-27ALSA: ctl: Stop notification after disconnectionTakashi Iwai
commit f388cdcdd160687c6650833f286b9c89c50960ff upstream. snd_ctl_remove() has a notification for the removal event. It's superfluous when done during the device got disconnected. Although the notification itself is mostly harmless, it may potentially be harmful, and should be suppressed. Actually some components PCM may free ctl elements during the disconnect or free callbacks, thus it's no theoretical issue. This patch adds the check of card->shutdown flag for avoiding unnecessary notifications after (or during) the disconnect. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27ALSA: au88x0: Fix calculation in vortex_wtdma_bufshift()Takashi Iwai
commit 62db7152c924e4c060e42b34a69cd39658e8a0dc upstream. vortex_wtdma_bufshift() function does calculate the page index wrongly, first masking then shift, which always results in zero. The proper computation is to first shift, then mask. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27ALSA: dummy: Fix a use-after-free at closingTakashi Iwai
commit d5dbbe6569481bf12dcbe3e12cff72c5f78d272c upstream. syzkaller fuzzer spotted a potential use-after-free case in snd-dummy driver when hrtimer is used as backend: > ================================================================== > BUG: KASAN: use-after-free in rb_erase+0x1b17/0x2010 at addr ffff88005e5b6f68 > Read of size 8 by task syz-executor/8984 > ============================================================================= > BUG kmalloc-192 (Not tainted): kasan: bad access detected > ----------------------------------------------------------------------------- > > Disabling lock debugging due to kernel taint > INFO: Allocated in 0xbbbbbbbbbbbbbbbb age=18446705582212484632 > .... > [< none >] dummy_hrtimer_create+0x49/0x1a0 sound/drivers/dummy.c:464 > .... > INFO: Freed in 0xfffd8e09 age=18446705496313138713 cpu=2164287125 pid=-1 > [< none >] dummy_hrtimer_free+0x68/0x80 sound/drivers/dummy.c:481 > .... > Call Trace: > [<ffffffff8179e59e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:333 > [< inline >] rb_set_parent include/linux/rbtree_augmented.h:111 > [< inline >] __rb_erase_augmented include/linux/rbtree_augmented.h:218 > [<ffffffff82ca5787>] rb_erase+0x1b17/0x2010 lib/rbtree.c:427 > [<ffffffff82cb02e8>] timerqueue_del+0x78/0x170 lib/timerqueue.c:86 > [<ffffffff814d0c80>] __remove_hrtimer+0x90/0x220 kernel/time/hrtimer.c:903 > [< inline >] remove_hrtimer kernel/time/hrtimer.c:945 > [<ffffffff814d23da>] hrtimer_try_to_cancel+0x22a/0x570 kernel/time/hrtimer.c:1046 > [<ffffffff814d2742>] hrtimer_cancel+0x22/0x40 kernel/time/hrtimer.c:1066 > [<ffffffff85420531>] dummy_hrtimer_stop+0x91/0xb0 sound/drivers/dummy.c:417 > [<ffffffff854228bf>] dummy_pcm_trigger+0x17f/0x1e0 sound/drivers/dummy.c:507 > [<ffffffff85392170>] snd_pcm_do_stop+0x160/0x1b0 sound/core/pcm_native.c:1106 > [<ffffffff85391b26>] snd_pcm_action_single+0x76/0x120 sound/core/pcm_native.c:956 > [<ffffffff85391e01>] snd_pcm_action+0x231/0x290 sound/core/pcm_native.c:974 > [< inline >] snd_pcm_stop sound/core/pcm_native.c:1139 > [<ffffffff8539754d>] snd_pcm_drop+0x12d/0x1d0 sound/core/pcm_native.c:1784 > [<ffffffff8539d3be>] snd_pcm_common_ioctl1+0xfae/0x2150 sound/core/pcm_native.c:2805 > [<ffffffff8539ee91>] snd_pcm_capture_ioctl1+0x2a1/0x5e0 sound/core/pcm_native.c:2976 > [<ffffffff8539f2ec>] snd_pcm_kernel_ioctl+0x11c/0x160 sound/core/pcm_native.c:3020 > [<ffffffff853d9a44>] snd_pcm_oss_sync+0x3a4/0xa30 sound/core/oss/pcm_oss.c:1693 > [<ffffffff853da27d>] snd_pcm_oss_release+0x1ad/0x280 sound/core/oss/pcm_oss.c:2483 > ..... A workaround is to call hrtimer_cancel() in dummy_hrtimer_sync() which is called certainly before other blocking ops. Reported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27tty/vt/keyboard: fix OOB access in do_compute_shiftstate()Dmitry Torokhov
commit 510cccb5b0c8868a2b302a0ab524da7912da648b upstream. The size of individual keymap in drivers/tty/vt/keyboard.c is NR_KEYS, which is currently 256, whereas number of keys/buttons in input device (and therefor in key_down) is much larger - KEY_CNT - 768, and that can cause out-of-bound access when we do sym = U(key_maps[0][k]); with large 'k'. To fix it we should not attempt iterating beyond smaller of NR_KEYS and KEY_CNT. Also while at it let's switch to for_each_set_bit() instead of open-coding it. Reported-by: Sasha Levin <sasha.levin@oracle.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27iio:ad7266: Fix probe deferral for vrefMark Brown
commit 68b356eb3d9f5e38910fb62e22a78e2a18d544ae upstream. Currently the ad7266 driver treats any failure to get vref as though the regulator were not present but this means that if probe deferral is triggered the driver will act as though the regulator were not present. Instead only use the internal reference if we explicitly got -ENODEV which is what is returned for absent regulators. Signed-off-by: Mark Brown <broonie@kernel.org> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27iio:ad7266: Fix broken regulator error handlingMark Brown
commit 6b7f4e25f3309f106a5c7ff42c8231494cf285d3 upstream. All regulator_get() variants return either a pointer to a regulator or an ERR_PTR() so testing for NULL makes no sense and may lead to bugs if we use NULL as a valid regulator. Fix this by using IS_ERR() as expected. Signed-off-by: Mark Brown <broonie@kernel.org> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27iio: accel: kxsd9: fix the usage of spi_w8r8()Linus Walleij
commit 0c1f91b98552da49d9d8eed32b3132a58d2f4598 upstream. These two spi_w8r8() calls return a value with is used by the code following the error check. The dubious use was caused by a cleanup patch. Fixes: d34dbee8ac8e ("staging:iio:accel:kxsd9 cleanup and conversion to iio_chan_spec.") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27staging: iio: accel: fix error checkLuis de Bethencourt
commit ef3149eb3ddb7f9125e11c90f8330e371b55cffd upstream. sca3000_read_ctrl_reg() returns a negative number on failure, check for this instead of zero. Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27iio: Fix error handling in iio_trigger_attach_poll_funcCrestez Dan Leonard
commit 99543823357966ac938d9a310947e731b67338e6 upstream. When attaching a pollfunc iio_trigger_attach_poll_func will allocate a virtual irq and call the driver's set_trigger_state function. Fix error handling to undo previous steps if any fails. In particular this fixes handling errors from a driver's set_trigger_state function. When using triggered buffers a failure to enable the trigger used to make the buffer unusable. Signed-off-by: Crestez Dan Leonard <leonard.crestez@intel.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27base: make module_create_drivers_dir race-freeJiri Slaby
commit 7e1b1fc4dabd6ec8e28baa0708866e13fa93c9b3 upstream. Modules which register drivers via standard path (driver_register) in parallel can cause a warning: WARNING: CPU: 2 PID: 3492 at ../fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80 sysfs: cannot create duplicate filename '/module/saa7146/drivers' Modules linked in: hexium_gemini(+) mxb(+) ... ... Call Trace: ... [<ffffffff812e63a2>] sysfs_warn_dup+0x62/0x80 [<ffffffff812e6487>] sysfs_create_dir_ns+0x77/0x90 [<ffffffff8140f2c4>] kobject_add_internal+0xb4/0x340 [<ffffffff8140f5b8>] kobject_add+0x68/0xb0 [<ffffffff8140f631>] kobject_create_and_add+0x31/0x70 [<ffffffff8157a703>] module_add_driver+0xc3/0xd0 [<ffffffff8155e5d4>] bus_add_driver+0x154/0x280 [<ffffffff815604c0>] driver_register+0x60/0xe0 [<ffffffff8145bed0>] __pci_register_driver+0x60/0x70 [<ffffffffa0273e14>] saa7146_register_extension+0x64/0x90 [saa7146] [<ffffffffa0033011>] hexium_init_module+0x11/0x1000 [hexium_gemini] ... As can be (mostly) seen, driver_register causes this call sequence: -> bus_add_driver -> module_add_driver -> module_create_drivers_dir The last one creates "drivers" directory in /sys/module/<...>. When this is done in parallel, the directory is attempted to be created twice at the same time. This can be easily reproduced by loading mxb and hexium_gemini in parallel: while :; do modprobe mxb & modprobe hexium_gemini wait rmmod mxb hexium_gemini saa7146_vv saa7146 done saa7146 calls pci_register_driver for both mxb and hexium_gemini, which means /sys/module/saa7146/drivers is to be created for both of them. Fix this by a new mutex in module_create_drivers_dir which makes the test-and-create "drivers" dir atomic. I inverted the condition and removed 'return' to avoid multiple unlocks or a goto. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Fixes: fe480a2675ed (Modules: only add drivers/ direcory if needed) Cc: v2.6.21+ <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27tracing: Handle NULL formats in hold_module_trace_bprintk_format()Steven Rostedt (Red Hat)
commit 70c8217acd4383e069fe1898bbad36ea4fcdbdcc upstream. If a task uses a non constant string for the format parameter in trace_printk(), then the trace_printk_fmt variable is set to NULL. This variable is then saved in the __trace_printk_fmt section. The function hold_module_trace_bprintk_format() checks to see if duplicate formats are used by modules, and reuses them if so (saves them to the list if it is new). But this function calls lookup_format() that does a strcmp() to the value (which is now NULL) and can cause a kernel oops. This wasn't an issue till 3debb0a9ddb ("tracing: Fix trace_printk() to print when not using bprintk()") which added "__used" to the trace_printk_fmt variable, and before that, the kernel simply optimized it out (no NULL value was saved). The fix is simply to handle the NULL pointer in lookup_format() and have the caller ignore the value if it was NULL. Link: http://lkml.kernel.org/r/1464769870-18344-1-git-send-email-zhengjun.xing@intel.com Reported-by: xingzhen <zhengjun.xing@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Fixes: 3debb0a9ddb ("tracing: Fix trace_printk() to print when not using bprintk()") Cc: stable@vger.kernel.org # v3.5+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27kvm: Fix irq route entries exceeding KVM_MAX_IRQ_ROUTESXiubo Li
commit caf1ff26e1aa178133df68ac3d40815fed2187d9 upstream. These days, we experienced one guest crash with 8 cores and 3 disks, with qemu error logs as bellow: qemu-system-x86_64: /build/qemu-2.0.0/kvm-all.c:984: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. And then we found one patch(bdf026317d) in qemu tree, which said could fix this bug. Execute the following script will reproduce the BUG quickly: irq_affinity.sh ======================================================================== vda_irq_num=25 vdb_irq_num=27 while [ 1 ] do for irq in {1,2,4,8,10,20,40,80} do echo $irq > /proc/irq/$vda_irq_num/smp_affinity echo $irq > /proc/irq/$vdb_irq_num/smp_affinity dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct done done ======================================================================== The following qemu log is added in the qemu code and is displayed when this bug reproduced: kvm_irqchip_commit_routes: max gsi: 1008, nr_allocated_irq_routes: 1024, irq_routes->nr: 1024, gsi_count: 1024. That's to say when irq_routes->nr == 1024, there are 1024 routing entries, but in the kernel code when routes->nr >= 1024, will just return -EINVAL; The nr is the number of the routing entries which is in of [1 ~ KVM_MAX_IRQ_ROUTES], not the index in [0 ~ KVM_MAX_IRQ_ROUTES - 1]. This patch fix the BUG above. Cc: stable@vger.kernel.org Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com> Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27cdc_ncm: workaround for EM7455 "silent" data interfaceBjørn Mork
commit c086e7096170390594c425114d98172bc9aceb8a upstream. Several Lenovo users have reported problems with their Sierra Wireless EM7455 modem. The driver has loaded successfully and the MBIM management channel has appeared to work, including establishing a connection to the mobile network. But no frames have been received over the data interface. The problem affects all EM7455 and MC7455, and is assumed to affect other modems based on the same Qualcomm chipset and baseband firmware. Testing narrowed the problem down to what seems to be a firmware timing bug during initialization. Adding a short sleep while probing is sufficient to make the problem disappear. Experiments have shown that 1-2 ms is too little to have any effect, while 10-20 ms is enough to reliably succeed. Reported-by: Stefan Armbruster <ml001@armbruster-it.de> Reported-by: Ralph Plawetzki <ralph@purejava.org> Reported-by: Andreas Fett <andreas.fett@secunet.com> Reported-by: Rasmus Lerdorf <rasmus@lerdorf.com> Reported-by: Samo Ratnik <samo.ratnik@gmail.com> Reported-and-tested-by: Aleksander Morgado <aleksander@aleksander.es> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27UBIFS: Implement ->migratepage()Kirill A. Shutemov
commit 4ac1c17b2044a1b4b2fbed74451947e905fc2992 upstream. During page migrations UBIFS might get confused and the following assert triggers: [ 213.480000] UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436) [ 213.490000] CPU: 0 PID: 436 Comm: drm-stress-test Not tainted 4.4.4-00176-geaa802524636-dirty #1008 [ 213.490000] Hardware name: Allwinner sun4i/sun5i Families [ 213.490000] [<c0015e70>] (unwind_backtrace) from [<c0012cdc>] (show_stack+0x10/0x14) [ 213.490000] [<c0012cdc>] (show_stack) from [<c02ad834>] (dump_stack+0x8c/0xa0) [ 213.490000] [<c02ad834>] (dump_stack) from [<c0236ee8>] (ubifs_set_page_dirty+0x44/0x50) [ 213.490000] [<c0236ee8>] (ubifs_set_page_dirty) from [<c00fa0bc>] (try_to_unmap_one+0x10c/0x3a8) [ 213.490000] [<c00fa0bc>] (try_to_unmap_one) from [<c00fadb4>] (rmap_walk+0xb4/0x290) [ 213.490000] [<c00fadb4>] (rmap_walk) from [<c00fb1bc>] (try_to_unmap+0x64/0x80) [ 213.490000] [<c00fb1bc>] (try_to_unmap) from [<c010dc28>] (migrate_pages+0x328/0x7a0) [ 213.490000] [<c010dc28>] (migrate_pages) from [<c00d0cb0>] (alloc_contig_range+0x168/0x2f4) [ 213.490000] [<c00d0cb0>] (alloc_contig_range) from [<c010ec00>] (cma_alloc+0x170/0x2c0) [ 213.490000] [<c010ec00>] (cma_alloc) from [<c001a958>] (__alloc_from_contiguous+0x38/0xd8) [ 213.490000] [<c001a958>] (__alloc_from_contiguous) from [<c001ad44>] (__dma_alloc+0x23c/0x274) [ 213.490000] [<c001ad44>] (__dma_alloc) from [<c001ae08>] (arm_dma_alloc+0x54/0x5c) [ 213.490000] [<c001ae08>] (arm_dma_alloc) from [<c035cecc>] (drm_gem_cma_create+0xb8/0xf0) [ 213.490000] [<c035cecc>] (drm_gem_cma_create) from [<c035cf20>] (drm_gem_cma_create_with_handle+0x1c/0xe8) [ 213.490000] [<c035cf20>] (drm_gem_cma_create_with_handle) from [<c035d088>] (drm_gem_cma_dumb_create+0x3c/0x48) [ 213.490000] [<c035d088>] (drm_gem_cma_dumb_create) from [<c0341ed8>] (drm_ioctl+0x12c/0x444) [ 213.490000] [<c0341ed8>] (drm_ioctl) from [<c0121adc>] (do_vfs_ioctl+0x3f4/0x614) [ 213.490000] [<c0121adc>] (do_vfs_ioctl) from [<c0121d30>] (SyS_ioctl+0x34/0x5c) [ 213.490000] [<c0121d30>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34) UBIFS is using PagePrivate() which can have different meanings across filesystems. Therefore the generic page migration code cannot handle this case correctly. We have to implement our own migration function which basically does a plain copy but also duplicates the page private flag. UBIFS is not a block device filesystem and cannot use buffer_migrate_page(). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> [rw: Massaged changelog, build fixes, etc...] Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27mm: Export migrate_page_move_mapping and migrate_page_copyRichard Weinberger
commit 1118dce773d84f39ebd51a9fe7261f9169cb056e upstream. Export these symbols such that UBIFS can implement ->migratepage. Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [wt: also add the prototype to include/linux/migrate.h] Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27NFS: Fix another OPEN_DOWNGRADE bugTrond Myklebust
commit e547f2628327fec6afd2e03b46f113f614cca05b upstream. Olga Kornievskaia reports that the following test fails to trigger an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE. fd0 = open(foo, RDRW) -- should be open on the wire for "both" fd1 = open(foo, RDONLY) -- should be open on the wire for "read" close(fd0) -- should trigger an open_downgrade read(fd1) close(fd1) The issue is that we're missing a check for whether or not the current state transitioned from an O_RDWR state as opposed to having transitioned from a combination of O_RDONLY and O_WRONLY. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: cd9288ffaea4 ("NFSv4: Fix another bug in the close/open_downgrade code") Cc: stable@vger.kernel.org # 2.6.33+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27x86/amd_nb: Fix boot crash on non-AMD systemsBorislav Petkov
commit 1ead852dd88779eda12cb09cc894a03d9abfe1ec upstream. Fix boot crash that triggers if this driver is built into a kernel and run on non-AMD systems. AMD northbridges users call amd_cache_northbridges() and it returns a negative value to signal that we weren't able to cache/detect any northbridges on the system. At least, it should do so as all its callers expect it to do so. But it does return a negative value only when kmalloc() fails. Fix it to return -ENODEV if there are no NBs cached as otherwise, amd_nb users like amd64_edac, for example, which relies on it to know whether it should load or not, gets loaded on systems like Intel Xeons where it shouldn't. Reported-and-tested-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1466097230-5333-2-git-send-email-bp@alien8.de Link: https://lkml.kernel.org/r/5761BEB0.9000807@cybernetics.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27kprobes/x86: Clear TF bit in fault on single-steppingMasami Hiramatsu
commit dcfc47248d3f7d28df6f531e6426b933de94370d upstream. Fix kprobe_fault_handler() to clear the TF (trap flag) bit of the flags register in the case of a fault fixup on single-stepping. If we put a kprobe on the instruction which caused a page fault (e.g. actual mov instructions in copy_user_*), that fault happens on the single-stepping buffer. In this case, kprobes resets running instance so that the CPU can retry execution on the original ip address. However, current code forgets to reset the TF bit. Since this fault happens with TF bit set for enabling single-stepping, when it retries, it causes a debug exception and kprobes can not handle it because it already reset itself. On the most of x86-64 platform, it can be easily reproduced by using kprobe tracer. E.g. # cd /sys/kernel/debug/tracing # echo p copy_user_enhanced_fast_string+5 > kprobe_events # echo 1 > events/kprobes/enable And you'll see a kernel panic on do_debug(), since the debug trap is not handled by kprobes. To fix this problem, we just need to clear the TF bit when resetting running kprobe. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: systemtap@sourceware.org Cc: stable@vger.kernel.org # All the way back to ancient kernels Link: http://lkml.kernel.org/r/20160611140648.25885.37482.stgit@devbox [ Updated the comments. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27x86, build: copy ldlinux.c32 to image.isoH. Peter Anvin
commit 9c77679cadb118c0aa99e6f88533d91765a131ba upstream. For newer versions of Syslinux, we need ldlinux.c32 in addition to isolinux.bin to reside on the boot disk, so if the latter is found, copy it, too, to the isoimage tree. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Linux Stable Tree <stable@vger.kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27IB/mlx4: Fix the SQ size of an RC QPYishai Hadas
commit f2940e2c76bb554a7fbdd28ca5b90904117a9e96 upstream. When calculating the required size of an RC QP send queue, leave enough space for masked atomic operations, which require more space than "regular" atomic operation. Fixes: 6fa8f719844b ("IB/mlx4: Add support for masked atomic operations") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@mellanox.co.il> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27IB/IPoIB: Don't update neigh validity for unresolved entriesErez Shitrit
commit 61c78eea9516a921799c17b4c20558e2aa780fd3 upstream. ipoib_neigh_get unconditionally updates the "alive" variable member on any packet send. This prevents the neighbor garbage collection from cleaning out a dead neighbor entry if we are still queueing packets for it. If the queue for this neighbor is full, then don't update the alive timestamp. That way the neighbor can time out even if packets are still being queued as long as none of them are being sent. Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup in xmit path") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27IB/security: Restrict use of the write() interfaceJason Gunthorpe
commit e6bd18f57aad1a2d1ef40e646d03ed0f2515c9e3 upstream. The drivers/infiniband stack uses write() as a replacement for bi-directional ioctl(). This is not safe. There are ways to trigger write calls that result in the return structure that is normally written to user space being shunted off to user specified kernel memory instead. For the immediate repair, detect and deny suspicious accesses to the write API. For long term, update the user space libraries and the kernel API to something that doesn't present the same security vulnerabilities (likely a structured ioctl() interface). The impacted uAPI interfaces are generally only available if hardware from drivers/infiniband is installed in the system. Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [ Expanded check to all known write() entry points ] Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com> [wt: no hfi1 subdir in 3.10. A minimal rdma/ib.h had to be created from 3.11 sources to keep the code similar to mainline] Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHsJason Gunthorpe
commit 8c5122e45a10a9262f872b53f151a592e870f905 upstream. When this code was reworked for IBoE support the order of assignments for the sl_tclass_flowlabel got flipped around resulting in TClass & FlowLabel being permanently set to 0 in the packet headers. This breaks IB routers that rely on these headers, but only affects kernel users - libmlx4 does this properly for user space. Cc: stable@vger.kernel.org Fixes: fa417f7b520e ("IB/mlx4: Add support for IBoE") Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27mac80211_hwsim: Add missing check for HWSIM_ATTR_SIGNALMartin Willi
commit 62397da50bb20a6b812c949ef465d7e69fe54bb6 upstream. A wmediumd that does not send this attribute causes a NULL pointer dereference, as the attribute is accessed even if it does not exist. The attribute was required but never checked ever since userspace frame forwarding has been introduced. The issue gets more problematic once we allow wmediumd registration from user namespaces. Fixes: 7882513bacb1 ("mac80211_hwsim driver support userspace frame tx/rx") Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27mac80211: mesh: flush mesh paths unconditionallyBob Copeland
commit fe7a7c57629e8dcbc0e297363a9b2366d67a6dc5 upstream. Currently, the mesh paths associated with a nexthop station are cleaned up in the following code path: __sta_info_destroy_part1 synchronize_net() __sta_info_destroy_part2 -> cleanup_single_sta -> mesh_sta_cleanup -> mesh_plink_deactivate -> mesh_path_flush_by_nexthop However, there are a couple of problems here: 1) the paths aren't flushed at all if the MPM is running in userspace (e.g. when using wpa_supplicant or authsae) 2) there is no synchronize_rcu between removing the path and readers accessing the nexthop, which means the following race is possible: CPU0 CPU1 ~~~~ ~~~~ sta_info_destroy_part1() synchronize_net() rcu_read_lock() mesh_nexthop_resolve() mpath = mesh_path_lookup() [...] -> mesh_path_flush_by_nexthop() sta = rcu_dereference( mpath->next_hop) kfree(sta) access sta <-- CRASH Fix both of these by unconditionally flushing paths before destroying the sta, and by adding a synchronize_net() after path flush to ensure no active readers can still dereference the sta. Fixes this crash: [ 348.529295] BUG: unable to handle kernel paging request at 00020040 [ 348.530014] IP: [<f929245d>] ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] [ 348.530014] *pde = 00000000 [ 348.530014] Oops: 0000 [#1] PREEMPT [ 348.530014] Modules linked in: drbg ansi_cprng ctr ccm ppp_generic slhc ipt_MASQUERADE nf_nat_masquerade_ipv4 8021q ] [ 348.530014] CPU: 0 PID: 20597 Comm: wget Tainted: G O 4.6.0-rc5-wt=V1 #1 [ 348.530014] Hardware name: To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080016 11/07/2014 [ 348.530014] task: f64fa280 ti: f4f9c000 task.ti: f4f9c000 [ 348.530014] EIP: 0060:[<f929245d>] EFLAGS: 00010246 CPU: 0 [ 348.530014] EIP is at ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] [ 348.530014] EAX: f4ce63e0 EBX: 00000088 ECX: f3788416 EDX: 00020008 [ 348.530014] ESI: 00000000 EDI: 00000088 EBP: f6409a4c ESP: f6409a40 [ 348.530014] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 [ 348.530014] CR0: 80050033 CR2: 00020040 CR3: 33190000 CR4: 00000690 [ 348.530014] Stack: [ 348.530014] 00000000 f4ce63e0 f5f9bd80 f6409a64 f9291d80 0000ce67 f5d51e00 f4ce63e0 [ 348.530014] f3788416 f6409a80 f9291dc1 f4ce8320 f4ce63e0 f5d51e00 f4ce63e0 f4ce8320 [ 348.530014] f6409a98 f9277f6f 00000000 00000000 0000007c 00000000 f6409b2c f9278dd1 [ 348.530014] Call Trace: [ 348.530014] [<f9291d80>] mesh_nexthop_lookup+0xbb/0xc8 [mac80211] [ 348.530014] [<f9291dc1>] mesh_nexthop_resolve+0x34/0xd8 [mac80211] [ 348.530014] [<f9277f6f>] ieee80211_xmit+0x92/0xc1 [mac80211] [ 348.530014] [<f9278dd1>] __ieee80211_subif_start_xmit+0x807/0x83c [mac80211] [ 348.530014] [<c04df012>] ? sch_direct_xmit+0xd7/0x1b3 [ 348.530014] [<c022a8c6>] ? __local_bh_enable_ip+0x5d/0x7b [ 348.530014] [<f956870c>] ? nf_nat_ipv4_out+0x4c/0xd0 [nf_nat_ipv4] [ 348.530014] [<f957e036>] ? iptable_nat_ipv4_fn+0xf/0xf [iptable_nat] [ 348.530014] [<c04c6f45>] ? netif_skb_features+0x14d/0x30a [ 348.530014] [<f9278e10>] ieee80211_subif_start_xmit+0xa/0xe [mac80211] [ 348.530014] [<c04c769c>] dev_hard_start_xmit+0x1f8/0x267 [ 348.530014] [<c04c7261>] ? validate_xmit_skb.isra.120.part.121+0x10/0x253 [ 348.530014] [<c04defc6>] sch_direct_xmit+0x8b/0x1b3 [ 348.530014] [<c04c7a9c>] __dev_queue_xmit+0x2c8/0x513 [ 348.530014] [<c04c7cfb>] dev_queue_xmit+0xa/0xc [ 348.530014] [<f91bfc7a>] batadv_send_skb_packet+0xd6/0xec [batman_adv] [ 348.530014] [<f91bfdc4>] batadv_send_unicast_skb+0x15/0x4a [batman_adv] [ 348.530014] [<f91b5938>] batadv_dat_send_data+0x27e/0x310 [batman_adv] [ 348.530014] [<f91c30b5>] ? batadv_tt_global_hash_find.isra.11+0x8/0xa [batman_adv] [ 348.530014] [<f91b63f3>] batadv_dat_snoop_outgoing_arp_request+0x208/0x23d [batman_adv] [ 348.530014] [<f91c0cd9>] batadv_interface_tx+0x206/0x385 [batman_adv] [ 348.530014] [<c04c769c>] dev_hard_start_xmit+0x1f8/0x267 [ 348.530014] [<c04c7261>] ? validate_xmit_skb.isra.120.part.121+0x10/0x253 [ 348.530014] [<c04defc6>] sch_direct_xmit+0x8b/0x1b3 [ 348.530014] [<c04c7a9c>] __dev_queue_xmit+0x2c8/0x513 [ 348.530014] [<f80cbd2a>] ? igb_xmit_frame+0x57/0x72 [igb] [ 348.530014] [<c04c7cfb>] dev_queue_xmit+0xa/0xc [ 348.530014] [<f843a326>] br_dev_queue_push_xmit+0xeb/0xfb [bridge] [ 348.530014] [<f843a35f>] br_forward_finish+0x29/0x74 [bridge] [ 348.530014] [<f843a23b>] ? deliver_clone+0x3b/0x3b [bridge] [ 348.530014] [<f843a714>] __br_forward+0x89/0xe7 [bridge] [ 348.530014] [<f843a336>] ? br_dev_queue_push_xmit+0xfb/0xfb [bridge] [ 348.530014] [<f843a234>] deliver_clone+0x34/0x3b [bridge] [ 348.530014] [<f843a68b>] ? br_flood+0x95/0x95 [bridge] [ 348.530014] [<f843a66d>] br_flood+0x77/0x95 [bridge] [ 348.530014] [<f843a809>] br_flood_forward+0x13/0x1a [bridge] [ 348.530014] [<f843a68b>] ? br_flood+0x95/0x95 [bridge] [ 348.530014] [<f843b877>] br_handle_frame_finish+0x392/0x3db [bridge] [ 348.530014] [<c04e9b2b>] ? nf_iterate+0x2b/0x6b [ 348.530014] [<f843baa6>] br_handle_frame+0x1e6/0x240 [bridge] [ 348.530014] [<f843b4e5>] ? br_handle_local_finish+0x6a/0x6a [bridge] [ 348.530014] [<c04c4ba0>] __netif_receive_skb_core+0x43a/0x66b [ 348.530014] [<f843b8c0>] ? br_handle_frame_finish+0x3db/0x3db [bridge] [ 348.530014] [<c023cea4>] ? resched_curr+0x19/0x37 [ 348.530014] [<c0240707>] ? check_preempt_wakeup+0xbf/0xfe [ 348.530014] [<c0255dec>] ? ktime_get_with_offset+0x5c/0xfc [ 348.530014] [<c04c4fc1>] __netif_receive_skb+0x47/0x55 [ 348.530014] [<c04c57ba>] netif_receive_skb_internal+0x40/0x5a [ 348.530014] [<c04c61ef>] napi_gro_receive+0x3a/0x94 [ 348.530014] [<f80ce8d5>] igb_poll+0x6fd/0x9ad [igb] [ 348.530014] [<c0242bd8>] ? swake_up_locked+0x14/0x26 [ 348.530014] [<c04c5d29>] net_rx_action+0xde/0x250 [ 348.530014] [<c022a743>] __do_softirq+0x8a/0x163 [ 348.530014] [<c022a6b9>] ? __hrtimer_tasklet_trampoline+0x19/0x19 [ 348.530014] [<c021100f>] do_softirq_own_stack+0x26/0x2c [ 348.530014] <IRQ> [ 348.530014] [<c022a957>] irq_exit+0x31/0x6f [ 348.530014] [<c0210eb2>] do_IRQ+0x8d/0xa0 [ 348.530014] [<c058152c>] common_interrupt+0x2c/0x40 [ 348.530014] Code: e7 8c 00 66 81 ff 88 00 75 12 85 d2 75 0e b2 c3 b8 83 e9 29 f9 e8 a7 5f f9 c6 eb 74 66 81 e3 8c 005 [ 348.530014] EIP: [<f929245d>] ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] SS:ESP 0068:f6409a40 [ 348.530014] CR2: 0000000000020040 [ 348.530014] ---[ end trace 48556ac26779732e ]--- [ 348.530014] Kernel panic - not syncing: Fatal exception in interrupt [ 348.530014] Kernel Offset: disabled Reported-by: Fred Veldini <fred.veldini@gmail.com> Tested-by: Fred Veldini <fred.veldini@gmail.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27net: alx: Work around the DMA RX overflow issueFeng Tang
commit 881d0327db37ad917a367c77aff1afa1ee41e0a9 upstream. Note: This is a verified backported patch for stable 4.4 kernel, and it could also be applied to 4.3/4.2/4.1/3.18/3.16 There is a problem with alx devices, that the network link will be lost in 1-5 minutes after the device is up. >From debugging without datasheet, we found the error always happen when the DMA RX address is set to 0x....fc0, which is very likely to be a HW/silicon problem. This patch will apply rx skb with 64 bytes longer space, and if the allocated skb has a 0x...fc0 address, it will use skb_resever(skb, 64) to advance the address, so that the RX overflow can be avoided. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=70761 Signed-off-by: Feng Tang <feng.tang@intel.com> Suggested-by: Eric Dumazet <edumazet@google.com> Tested-by: Ole Lukoie <olelukoie@mail.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27ipmr/ip6mr: Initialize the last assert time of mfc entries.Tom Goff
commit 70a0dec45174c976c64b4c8c1d0898581f759948 upstream. This fixes wrong-interface signaling on 32-bit platforms for entries created when jiffies > 2^31 + MFC_ASSERT_THRESH. Signed-off-by: Tom Goff <thomas.goff@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27sit: correct IP protocol used in ipip6_errSimon Horman
commit d5d8760b78d0cfafe292f965f599988138b06a70 upstream. Since 32b8a8e59c9c ("sit: add IPv4 over IPv4 support") ipip6_err() may be called for packets whose IP protocol is IPPROTO_IPIP as well as those whose IP protocol is IPPROTO_IPV6. In the case of IPPROTO_IPIP packets the correct protocol value is not passed to ipv4_update_pmtu() or ipv4_redirect(). This patch resolves this problem by using the IP protocol of the packet rather than a hard-coded value. This appears to be consistent with the usage of the protocol of a packet by icmp_socket_deliver() the caller of ipip6_err(). I was able to exercise the redirect case by using a setup where an ICMP redirect was received for the destination of the encapsulated packet. However, it appears that although incorrect the protocol field is not used in this case and thus no problem manifests. On inspection it does not appear that a problem will manifest in the fragmentation needed/update pmtu case either. In short I believe this is a cosmetic fix. None the less, the use of IPPROTO_IPV6 seems wrong and confusing. Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27crypto: scatterwalk - Fix test in scatterwalk_doneHerbert Xu
commit 5f070e81bee35f1b7bd1477bb223a873ff657803 upstream. When there is more data to be processed, the current test in scatterwalk_done may prevent us from calling pagedone even when we should. In particular, if we're on an SG entry spanning multiple pages where the last page is not a full page, we will incorrectly skip calling pagedone on the second last page. This patch fixes this by adding a separate test for whether we've reached the end of a page. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27crypto: gcm - Filter out async ghash if necessaryHerbert Xu
commit b30bdfa86431afbafe15284a3ad5ac19b49b88e3 upstream. As it is if you ask for a sync gcm you may actually end up with an async one because it does not filter out async implementations of ghash. This patch fixes this by adding the necessary filter when looking for ghash. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27crypto: ux500 - memmove the right sizeLinus Walleij
commit 19ced623db2fe91604d69f7d86b03144c5107739 upstream. The hash buffer is really HASH_BLOCK_SIZE bytes, someone must have thought that memmove takes n*u32 words by mistake. Tests work as good/bad as before after this patch. Cc: Joakim Bech <joakim.bech@linaro.org> Cc: stable@vger.kernel.org Reported-by: David Binderman <linuxdev.baldrick@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27fix d_walk()/non-delayed __d_free() raceAl Viro
commit 3d56c25e3bb0726a5c5e16fc2d9e38f8ed763085 upstream. Ascend-to-parent logics in d_walk() depends on all encountered child dentries not getting freed without an RCU delay. Unfortunately, in quite a few cases it is not true, with hard-to-hit oopsable race as the result. Fortunately, the fix is simiple; right now the rule is "if it ever been hashed, freeing must be delayed" and changing it to "if it ever had a parent, freeing must be delayed" closes that hole and covers all cases the old rule used to cover. Moreover, pipes and sockets remain _not_ covered, so we do not introduce RCU delay in the cases which are the reason for having that delay conditional in the first place. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [wt: add the required change to __d_materialise_dentry() for kernels older than v3.17] Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-22ecryptfs: forbid opening files without mmap handlerJann Horn
commit 2f36db71009304b3f0b95afacd8eba1f9f046b87 upstream. This prevents users from triggering a stack overflow through a recursive invocation of pagefault handling that involves mapping procfs files into virtual memory. Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-22parisc: Fix pagefault crash in unaligned __get_user() callHelge Deller
commit 8b78f260887df532da529f225c49195d18fef36b upstream. One of the debian buildd servers had this crash in the syslog without any other information: Unaligned handler failed, ret = -2 clock_adjtime (pid 22578): Unaligned data reference (code 28) CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G E 4.5.0-2-parisc64-smp #1 Debian 4.5.4-1 task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111100000001111 Tainted: G E r00-03 000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0 r04-07 00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff r08-11 0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4 r12-15 000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b r16-19 0000000000028800 0000000000000001 0000000000000070 00000001bde7c218 r20-23 0000000000000000 00000001bde7c210 0000000000000002 0000000000000000 r24-27 0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0 r28-31 0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218 sr00-03 0000000001200000 0000000001200000 0000000000000000 0000000001200000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88 IIR: 0ca0d089 ISR: 0000000001200000 IOR: 00000000fa6f7fff CPU: 1 CR30: 00000001bde7c000 CR31: ffffffffffffffff ORIG_R28: 00000002369fe628 IAOQ[0]: compat_get_timex+0x2dc/0x3c0 IAOQ[1]: compat_get_timex+0x2e0/0x3c0 RP(r2): compat_get_timex+0x40/0x3c0 Backtrace: [<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0 [<0000000040205024>] syscall_exit+0x0/0x14 This means the userspace program clock_adjtime called the clock_adjtime() syscall and then crashed inside the compat_get_timex() function. Syscalls should never crash programs, but instead return EFAULT. The IIR register contains the executed instruction, which disassebles into "ldw 0(sr3,r5),r9". This load-word instruction is part of __get_user() which tried to read the word at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The unaligned handler is able to emulate all ldw instructions, but it fails if it fails to read the source e.g. because of page fault. The following program reproduces the problem: #define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> #include <sys/mman.h> int main(void) { /* allocate 8k */ char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); /* free second half (upper 4k) and make it invalid. */ munmap(ptr+4096, 4096); /* syscall where first int is unaligned and clobbers into invalid memory region */ /* syscall should return EFAULT */ return syscall(__NR_clock_adjtime, 0, ptr+4095); } To fix this issue we simply need to check if the faulting instruction address is in the exception fixup table when the unaligned handler failed. If it is, call the fixup routine instead of crashing. While looking at the unaligned handler I found another issue as well: The target register should not be modified if the handler was unsuccessful. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-22arm: oabi compat: add missing access checksDave Weinstein
commit 7de249964f5578e67b99699c5f0b405738d820a2 upstream. Add access checks to sys_oabi_epoll_wait() and sys_oabi_semtimedop(). This fixes CVE-2016-3857, a local privilege escalation under CONFIG_OABI_COMPAT. Cc: stable@vger.kernel.org Reported-by: Chiachih Wu <wuchiachih@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Dave Weinstein <olorin@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-22ARM: fix PTRACE_SETVFPREGS on SMP systemsRussell King
commit e2dfb4b880146bfd4b6aa8e138c0205407cebbaf upstream. PTRACE_SETVFPREGS fails to properly mark the VFP register set to be reloaded, because it undoes one of the effects of vfp_flush_hwstate(). Specifically vfp_flush_hwstate() sets thread->vfpstate.hard.cpu to an invalid CPU number, but vfp_set() overwrites this with the original CPU number, thereby rendering the hardware state as apparently "valid", even though the software state is more recent. Fix this by reverting the previous change. Cc: <stable@vger.kernel.org> Fixes: 8130b9d7b9d8 ("ARM: 7308/1: vfp: flush thread hwstate before copying ptrace registers") Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Simon Marchi <simon.marchi@ericsson.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-22KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGSPaolo Bonzini
commit d14bdb553f9196169f003058ae1cdabe514470e6 upstream. MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to any of bits 63:32. However, this is not detected at KVM_SET_DEBUGREGS time, and the next KVM_RUN oopses: general protection fault: 0000 [#1] SMP CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 [...] Call Trace: [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm] [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm] [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480 [<ffffffff812418a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71 Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 RIP [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40 RSP <ffff88005836bd50> Testcase (beautified/reduced from syzkaller output): #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[8]; int main() { struct kvm_debugregs dr = { 0 }; r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); memcpy(&dr, "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72" "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8" "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9" "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb", 48); r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr); r[6] = ioctl(r[4], KVM_RUN, 0); } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-22xfs: skip stale inodes in xfs_iflush_clusterDave Chinner
commit 7d3aa7fe970791f1a674b14572a411accf2f4d4e upstream. We don't write back stale inodes so we should skip them in xfs_iflush_cluster, too. cc: <stable@vger.kernel.org> # 3.10.x- Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-22xfs: fix inode validity check in xfs_iflush_clusterDave Chinner
commit 51b07f30a71c27405259a0248206ed4e22adbee2 upstream. Some careless idiot(*) wrote crap code in commit 1a3e8f3 ("xfs: convert inode cache lookups to use RCU locking") back in late 2010, and so xfs_iflush_cluster checks the wrong inode for whether it is still valid under RCU protection. Fix it to lock and check the correct inode. (*) Careless-idiot: Dave Chinner <dchinner@redhat.com> cc: <stable@vger.kernel.org> # 3.10.x- Discovered-by: Brain Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-22xfs: xfs_iflush_cluster fails to abort on errorDave Chinner
commit b1438f477934f5a4d5a44df26f3079a7575d5946 upstream. When a failure due to an inode buffer occurs, the error handling fails to abort the inode writeback correctly. This can result in the inode being reclaimed whilst still in the AIL, leading to use-after-free situations as well as filesystems that cannot be unmounted as the inode log items left in the AIL never get removed. Fix this by ensuring fatal errors from xfs_imap_to_bp() result in the inode flush being aborted correctly. Reported-by: Shyam Kaushik <shyam@zadarastorage.com> Diagnosed-by: Shyam Kaushik <shyam@zadarastorage.com> Tested-by: Shyam Kaushik <shyam@zadarastorage.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [wt: in kernels < 3.17, the error sign is positive, not negative] Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21dma-debug: avoid spinlock recursion when disabling dma-debugVille Syrjälä
commit 3017cd63f26fc655d56875aaf497153ba60e9edf upstream. With netconsole (at least) the pr_err("... disablingn") call can recurse back into the dma-debug code, where it'll try to grab free_entries_lock again. Avoid the problem by doing the printk after dropping the lock. Link: http://lkml.kernel.org/r/1463678421-18683-1-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21ext4: fix reference counting bug on block allocation errorVegard Nossum
commit 554a5ccc4e4a20c5f3ec859de0842db4b4b9c77e upstream. If we hit this error when mounted with errors=continue or errors=remount-ro: EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to continue. However, ext4_mb_release_context() is the wrong thing to call here since we are still actually using the allocation context. Instead, just error out. We could retry the allocation, but there is a possibility of getting stuck in an infinite loop instead, so this seems safer. [ Fixed up so we don't return EAGAIN to userspace. --tytso ] Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block allocation") Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: stable@vger.kernel.org [wt: 3.10 doesn't have EFSCORRUPTED, but XFS uses EUCLEAN as does 3.14 on this patch so use this instead] Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21ext4: short-cut orphan cleanup on errorVegard Nossum
commit c65d5c6c81a1f27dec5f627f67840726fcd146de upstream. If we encounter a filesystem error during orphan cleanup, we should stop. Otherwise, we may end up in an infinite loop where the same inode is processed again and again. EXT4-fs (loop0): warning: checktime reached, running e2fsck is recommended EXT4-fs error (device loop0): ext4_mb_generate_buddy:758: group 2, block bitmap and bg descriptor inconsistent: 6117 vs 0 free clusters Aborting journal on device loop0-8. EXT4-fs (loop0): Remounting filesystem read-only EXT4-fs error (device loop0) in ext4_free_blocks:4895: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs error (device loop0) in ext4_ext_remove_space:3068: IO failure EXT4-fs error (device loop0) in ext4_ext_truncate:4667: Journal has aborted EXT4-fs error (device loop0) in ext4_orphan_del:2927: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs (loop0): Inode 16 (00000000618192a0): orphan list check failed! [...] EXT4-fs (loop0): Inode 16 (0000000061819748): orphan list check failed! [...] EXT4-fs (loop0): Inode 16 (0000000061819bf0): orphan list check failed! [...] See-also: c9eb13a9105 ("ext4: fix hang when processing corrupted orphaned inode list") Cc: Jan Kara <jack@suse.cz> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21ext4: don't call ext4_should_journal_data() on the journal inodeVegard Nossum
commit 6a7fd522a7c94cdef0a3b08acf8e6702056e635c upstream. If ext4_fill_super() fails early, it's possible for ext4_evict_inode() to call ext4_should_journal_data() before superblock options and flags are fully set up. In that case, the iput() on the journal inode can end up causing a BUG(). Work around this problem by reordering the tests so we only call ext4_should_journal_data() after we know it's not the journal inode. Fixes: 2d859db3e4 ("ext4: fix data corruption in inodes with journalled data") Fixes: 2b405bfa84 ("ext4: fix data=journal fast mount/umount hang") Cc: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21ext4: check for extents that wrap aroundVegard Nossum
commit f70749ca42943faa4d4dcce46dfdcaadb1d0c4b6 upstream. An extent with lblock = 4294967295 and len = 1 will pass the ext4_valid_extent() test: ext4_lblk_t last = lblock + len - 1; if (len == 0 || lblock > last) return 0; since last = 4294967295 + 1 - 1 = 4294967295. This would later trigger the BUG_ON(es->es_lblk + es->es_len < es->es_lblk) in ext4_es_end(). We can simplify it by removing the - 1 altogether and changing the test to use lblock + len <= lblock, since now if len = 0, then lblock + 0 == lblock and it fails, and if len > 0 then lblock + len > lblock in order to pass (i.e. it doesn't overflow). Fixes: 5946d0893 ("ext4: check for overlapping extents in ext4_valid_extent_entries()") Fixes: 2f974865f ("ext4: check for zero length extent explicitly") Cc: Eryu Guan <guaneryu@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21ext4: verify extent header depthVegard Nossum
commit 7bc9491645118c9461bd21099c31755ff6783593 upstream. Although the extent tree depth of 5 should enough be for the worst case of 2*32 extents of length 1, the extent tree code does not currently to merge nodes which are less than half-full with a sibling node, or to shrink the tree depth if possible. So it's possible, at least in theory, for the tree depth to be greater than 5. However, even in the worst case, a tree depth of 32 is highly unlikely, and if the file system is maliciously corrupted, an insanely large eh_depth can cause memory allocation failures that will trigger kernel warnings (here, eh_depth = 65280): JBD2: ext4.exe wants too many credits credits:195849 rsv_credits:0 max:256 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 50 at fs/jbd2/transaction.c:293 start_this_handle+0x569/0x580 CPU: 0 PID: 50 Comm: ext4.exe Not tainted 4.7.0-rc5+ #508 Stack: 604a8947 625badd8 0002fd09 00000000 60078643 00000000 62623910 601bf9bc 62623970 6002fc84 626239b0 900000125 Call Trace: [<6001c2dc>] show_stack+0xdc/0x1a0 [<601bf9bc>] dump_stack+0x2a/0x2e [<6002fc84>] __warn+0x114/0x140 [<6002fdff>] warn_slowpath_null+0x1f/0x30 [<60165829>] start_this_handle+0x569/0x580 [<60165d4e>] jbd2__journal_start+0x11e/0x220 [<60146690>] __ext4_journal_start_sb+0x60/0xa0 [<60120a81>] ext4_truncate+0x131/0x3a0 [<60123677>] ext4_setattr+0x757/0x840 [<600d5d0f>] notify_change+0x16f/0x2a0 [<600b2b16>] do_truncate+0x76/0xc0 [<600c3e56>] path_openat+0x806/0x1300 [<600c55c9>] do_filp_open+0x89/0xf0 [<600b4074>] do_sys_open+0x134/0x1e0 [<600b4140>] SyS_open+0x20/0x30 [<6001ea68>] handle_syscall+0x88/0x90 [<600295fd>] userspace+0x3fd/0x500 [<6001ac55>] fork_handler+0x85/0x90 ---[ end trace 08b0b88b6387a244 ]--- [ Commit message modified and the extent tree depath check changed from 5 to 32 -- tytso ] Cc: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21ext4: silence UBSAN in ext4_mb_init()Nicolai Stange
commit 935244cd54b86ca46e69bc6604d2adfb1aec2d42 upstream. Currently, in ext4_mb_init(), there's a loop like the following: do { ... offset += 1 << (sb->s_blocksize_bits - i); i++; } while (i <= sb->s_blocksize_bits + 1); Note that the updated offset is used in the loop's next iteration only. However, at the last iteration, that is at i == sb->s_blocksize_bits + 1, the shift count becomes equal to (unsigned)-1 > 31 (c.f. C99 6.5.7(3)) and UBSAN reports UBSAN: Undefined behaviour in fs/ext4/mballoc.c:2621:15 shift exponent 4294967295 is too large for 32-bit type 'int' [...] Call Trace: [<ffffffff818c4d25>] dump_stack+0xbc/0x117 [<ffffffff818c4c69>] ? _atomic_dec_and_lock+0x169/0x169 [<ffffffff819411ab>] ubsan_epilogue+0xd/0x4e [<ffffffff81941cac>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254 [<ffffffff81941ab1>] ? __ubsan_handle_load_invalid_value+0x158/0x158 [<ffffffff814b6dc1>] ? kmem_cache_alloc+0x101/0x390 [<ffffffff816fc13b>] ? ext4_mb_init+0x13b/0xfd0 [<ffffffff814293c7>] ? create_cache+0x57/0x1f0 [<ffffffff8142948a>] ? create_cache+0x11a/0x1f0 [<ffffffff821c2168>] ? mutex_lock+0x38/0x60 [<ffffffff821c23ab>] ? mutex_unlock+0x1b/0x50 [<ffffffff814c26ab>] ? put_online_mems+0x5b/0xc0 [<ffffffff81429677>] ? kmem_cache_create+0x117/0x2c0 [<ffffffff816fcc49>] ext4_mb_init+0xc49/0xfd0 [...] Observe that the mentioned shift exponent, 4294967295, equals (unsigned)-1. Unless compilers start to do some fancy transformations (which at least GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the such calculated value of offset is never used again. Silence UBSAN by introducing another variable, offset_incr, holding the next increment to apply to offset and adjust that one by right shifting it by one position per loop iteration. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161 Cc: stable@vger.kernel.org Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21ext4: address UBSAN warning in mb_find_order_for_block()Nicolai Stange
commit b5cb316cdf3a3f5f6125412b0f6065185240cfdc upstream. Currently, in mb_find_order_for_block(), there's a loop like the following: while (order <= e4b->bd_blkbits + 1) { ... bb += 1 << (e4b->bd_blkbits - order); } Note that the updated bb is used in the loop's next iteration only. However, at the last iteration, that is at order == e4b->bd_blkbits + 1, the shift count becomes negative (c.f. C99 6.5.7(3)) and UBSAN reports UBSAN: Undefined behaviour in fs/ext4/mballoc.c:1281:11 shift exponent -1 is negative [...] Call Trace: [<ffffffff818c4d35>] dump_stack+0xbc/0x117 [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169 [<ffffffff819411bb>] ubsan_epilogue+0xd/0x4e [<ffffffff81941cbc>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254 [<ffffffff81941ac1>] ? __ubsan_handle_load_invalid_value+0x158/0x158 [<ffffffff816e93a0>] ? ext4_mb_generate_from_pa+0x590/0x590 [<ffffffff816502c8>] ? ext4_read_block_bitmap_nowait+0x598/0xe80 [<ffffffff816e7b7e>] mb_find_order_for_block+0x1ce/0x240 [...] Unless compilers start to do some fancy transformations (which at least GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the such calculated value of bb is never used again. Silence UBSAN by introducing another variable, bb_incr, holding the next increment to apply to bb and adjust that one by right shifting it by one position per loop iteration. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161 Cc: stable@vger.kernel.org Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21ext4: fix hang when processing corrupted orphaned inode listTheodore Ts'o
commit c9eb13a9105e2e418f72e46a2b6da3f49e696902 upstream. If the orphaned inode list contains inode #5, ext4_iget() returns a bad inode (since the bootloader inode should never be referenced directly). Because of the bad inode, we end up processing the inode repeatedly and this hangs the machine. This can be reproduced via: mke2fs -t ext4 /tmp/foo.img 100 debugfs -w -R "ssv last_orphan 5" /tmp/foo.img mount -o loop /tmp/foo.img /mnt (But don't do this if you are using an unpatched kernel if you care about the system staying functional. :-) This bug was found by the port of American Fuzzy Lop into the kernel to find file system problems[1]. (Since it *only* happens if inode #5 shows up on the orphan list --- 3, 7, 8, etc. won't do it, it's not surprising that AFL needed two hours before it found it.) [1] http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf Cc: stable@vger.kernel.org Reported by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-21drm/radeon: fix firmware info version checksAlex Deucher
commit 3edc38a0facef45ee22af8afdce3737f421f36ab upstream. Some of the checks didn't handle frev 2 tables properly. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Willy Tarreau <w@1wt.eu>