lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2009-02-02	Linux 2.6.27.14v2.6.27.14	Greg Kroah-Hartman

2009-02-02	relay: fix lock imbalance in relay_late_setup_files	Jiri Slaby
	commit b786c6a98ef6fa81114ba7b9fbfc0d67060775e3 upstream. One fail path in relay_late_setup_files() omits mutex_unlock(&relay_channels_mutex); Add it. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	PCI hotplug: fix lock imbalance in pciehp	Jiri Slaby
	commit c2fdd36b550659f5ac2240d1f5a83ffa1a092289 upstream. set_lock_status omits mutex_unlock in fail path. Add the omitted unlock. As a result a lockup caused by this can be triggered from userspace by writing 1 to /sys/bus/pci/slots/.../lock often enough. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	rtl8187: Fix error in setting OFDM power settings for RTL8187L	Larry Finger
	commit eb83bbf57429ab80f49b413e3e44d3b19c3fdc5a upstream. After reports of poor performance, a review of the latest vendor driver (rtl8187_linux_26.1025.0328.2007) for RTL8187L devices was undertaken. A difference was found in the code used to index the OFDM power tables. When the Linux driver was changed, my unit works at a much greater range than before. I think this fixes Bugzilla #12380 and has been tested by at least two other users. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: Martín Ernesto Barreyro <barreyromartin@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	sound: virtuoso: do not overwrite EEPROM on Xonar D2/D2X	Clemens Ladisch
	commit 7e86c0e6850504ec9516b953f316a47277825e33 upstream. On the Asus Xonar D2 and D2X models, the SPI chip select signal for the fourth DAC shares its pin with the serial clock for the EEPROM that contains the PCI subdevice ID values. It appears that when DAC registers are written and some other unknown conditions occur (probably noise on the EEPROM's chip select line), the EEPROM gets overwritten with garbage, which makes it impossible to properly detect the card later. Therefore, we better avoid DAC register writes and make sure that the driver works with the DAC's registers' default values. Consequently, the sample format is now I2S instead of left-justified (no user-visible change), and the DAC's volume/mute registers cannot be used anymore (volume changes are now done by the software volume plugin). Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	sgi-xpc: Remove NULL pointer dereference.	Robin Holt
	commit 17e2161654da4e6bdfd8d53d4f52e820ee93f423 upstream. If the bte copy fails, the attempt to retrieve payloads merely returns a null pointer deref and not NULL as was expected. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	sgi-xpc: ensure flags are updated before bte_copy	Robin Holt
	commit 69b3bb65fa97a1e8563518dbbc35cd57beefb2d4 upstream. The clearing of the msg->flags needs a barrier between it and the notify of the channel threads that the messages are cleaned and ready for use. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	include/linux: Add bsg.h to the Kernel exported headers	Boaz Harrosh
	commit a229fc61ef0ee3c30fd193beee0eeb87410227f1 upstream. bsg.h in current form is perfectly suitable for user-mode consumption. It is needed together with scsi/sg.h for applications that want to interface with the bsg driver. Currently the few projects that use it would copy it over into the projects. But that is not acceptable for projects that need to provide source and devel packages for distros. This should also be submitted to stable 2.6.28 and 2.6.27 since bsg had a stable API since these Kernels and distro users will need the header for these kernels a swell Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	ext3: Add sanity check to make_indexed_dir	Theodore Ts'o
	commit a21102b55c4f8dfd3adb4a15a34cd62237b46039 upstream. Make sure the rec_len field in the '..' entry is sane, lest we overrun the directory block and cause a kernel oops on a purposefully corrupted filesystem. This fixes a bug related to a bug originally reported by Sami Liedes for ext4 at: http://bugzilla.kernel.org/show_bug.cgi?id=12430 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	epoll: drop max_user_instances and rely only on max_user_watches	Davide Libenzi
	commit 9df04e1f25effde823a600e755b51475d438f56b upstream. Linus suggested to put limits where the money is, and max_user_watches already does that w/out the need of max_user_instances. That has the advantage to mitigate the potential DoS while allowing pretty generous default behavior. Allowing top 4% of low memory (per user) to be allocated in epoll watches, we have: LOMEM MAX_WATCHES (per user) 512MB ~178000 1GB ~356000 2GB ~712000 A box with 512MB of lomem, will meet some challenge in hitting 180K watches, socket buffers math teaches us. No more max_user_instances limits then. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Willy Tarreau <w@1wt.eu> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: Bron Gondwana <brong@fastmail.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	crypto: ccm - Fix handling of null assoc data	Jarod Wilson
	commit 516280e735b034216de97eb7ba080ec6acbfc58f upstream. Its a valid use case to have null associated data in a ccm vector, but this case isn't being handled properly right now. The following ccm decryption/verification test vector, using the rfc4309 implementation regularly triggers a panic, as will any other vector with null assoc data: * key: ab2f8a74b71cd2b1ff802e487d82f8b9 * iv: c6fb7d800d13abd8a6b2d8 * Associated Data: [NULL] * Tag Length: 8 * input: d5e8939fc7892e2b The resulting panic looks like so: Unable to handle kernel paging request at ffff810064ddaec0 RIP: [<ffffffff8864c4d7>] :ccm:get_data_to_compute+0x1a6/0x1d6 PGD 8063 PUD 0 Oops: 0002 [1] SMP last sysfs file: /module/libata/version CPU 0 Modules linked in: crypto_tester_kmod(U) seqiv krng ansi_cprng chainiv rng ctr aes_generic aes_x86_64 ccm cryptomgr testmgr_cipher testmgr aead crypto_blkcipher crypto_a lgapi des ipv6 xfrm_nalgo crypto_api autofs4 hidp l2cap bluetooth nfs lockd fscache nfs_acl sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_ tcpudp iptable_filter ip_tables x_tables dm_mirror dm_log dm_multipath scsi_dh dm_mod video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac lp sg snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss joydev snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ide_cd snd_pcm floppy parport_p c shpchp e752x_edac snd_timer e1000 i2c_i801 edac_mc snd soundcore snd_page_alloc i2c_core cdrom parport serio_raw pcspkr ata_piix libata sd_mod scsi_mod ext3 jbd uhci_h cd ohci_hcd ehci_hcd Pid: 12844, comm: crypto-tester Tainted: G 2.6.18-128.el5.fips1 #1 RIP: 0010:[<ffffffff8864c4d7>] [<ffffffff8864c4d7>] :ccm:get_data_to_compute+0x1a6/0x1d6 RSP: 0018:ffff8100134434e8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8100104898b0 RCX: ffffffffab6aea10 RDX: 0000000000000010 RSI: ffff8100104898c0 RDI: ffff810064ddaec0 RBP: 0000000000000000 R08: ffff8100104898b0 R09: 0000000000000000 R10: ffff8100103bac84 R11: ffff8100104898b0 R12: ffff810010489858 R13: ffff8100104898b0 R14: ffff8100103bac00 R15: 0000000000000000 FS: 00002ab881adfd30(0000) GS:ffffffff803ac000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff810064ddaec0 CR3: 0000000012a88000 CR4: 00000000000006e0 Process crypto-tester (pid: 12844, threadinfo ffff810013442000, task ffff81003d165860) Stack: ffff8100103bac00 ffff8100104898e8 ffff8100134436f8 ffffffff00000000 0000000000000000 ffff8100104898b0 0000000000000000 ffff810010489858 0000000000000000 ffff8100103bac00 ffff8100134436f8 ffffffff8864c634 Call Trace: [<ffffffff8864c634>] :ccm:crypto_ccm_auth+0x12d/0x140 [<ffffffff8864cf73>] :ccm:crypto_ccm_decrypt+0x161/0x23a [<ffffffff88633643>] :crypto_tester_kmod:cavs_test_rfc4309_ccm+0x4a5/0x559 [...] The above is from a RHEL5-based kernel, but upstream is susceptible too. The fix is trivial: in crypto/ccm.c:crypto_ccm_auth(), pctx->ilen contains whatever was in memory when pctx was allocated if assoclen is 0. The tested fix is to simply add an else clause setting pctx->ilen to 0 for the assoclen == 0 case, so that get_data_to_compute() doesn't try doing things its not supposed to. Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	crypto: authenc - Fix zero-length IV crash	Herbert Xu
	commit 29b37f42127f7da511560a40ea74f5047da40c13 upstream. As it is if an algorithm with a zero-length IV is used (e.g., NULL encryption) with authenc, authenc may generate an SG entry of length zero, which will trigger a BUG check in the hash layer. This patch fixes it by skipping the IV SG generation if the IV size is zero. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	bnx2x: Block nvram access when the device is inactive	Eilon Greenstein
	commit 2add3acb11a26cc14b54669433ae6ace6406cbf2 upstream. Don't dump eeprom when bnx2x adapter is down. Running ethtool -e causes an eeh without it when the device is down Signed-off-by: Paul Larson <pl@linux.vnet.ibm.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	ALSA: hda - Fix PCM reference NID for STAC/IDT analog outputs	Takashi Iwai
	commit 00a602db1ce9d61319d6f769dee206ec85f19bda upstream. The reference NID for the analog outputs of STAC/IDT codecs is set to a fixed number 0x02. But this isn't always correct and in many codecs it points to a non-existing NID. This patch fixes the initialization of the PCM reference NID taken from the actually probed DAC list. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	ALSA: hda - Add quirk for HP DV6700 laptop	Joerg Schirottke
	commit aa9d823bb347fb66cb07f98c686be8bb85cb6a74 upstream. Added the matching model=laptop for HP DV6700 laptop. Signed-off-by: Joerg Schirottke <master@kanotix.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	ALSA: hda - add another MacBook Pro 4, 1 subsystem ID	Luke Yelavich
	commit 2a88464ceb1bda2571f88902fd8068a6168e3f7b upstream. Add another MacBook Pro 4,1 SSID (106b:3800). It seems that latter revisions, (at least mine), have different IDs to earlier revisions. Signed-off-by: Luke Yelavich <themuso@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	USB: usbmon: Implement compat_ioctl	Pete Zaitcev
	commit 7abce6bedc118eb39fe177c2c26be5d008505c14 upstream. Running a 32-bit usbmon(8) on 2.6.28-rc9 produces the following: ioctl32(usbmon:28563): Unknown cmd fd(3) cmd(400c9206){t:ffffff92;sz:12} arg(ffd3f458) on /dev/usbmon0 It happens because the compatibility mode was implemented for 2.6.18 and not updated for the fsops.compat_ioctl API. This patch relocates the pieces from under #ifdef CONFIG_COMPAT into compat_ioctl with no other changes except one new whitespace. Signed-off-by: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	USB: storage: add unusual devs entry	Oliver Neukum
	commit b90de8aea36ae6fe8050a6e91b031369c4f251b2 upstream. This adds an unusual devs entry for 2116:0320 Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	USB: fix char-device disconnect handling	Alan Stern
	commit 501950d846218ed80a776d2aae5aed9c8b92e778 upstream. This patch (as1198) fixes a conceptual bug: Somewhere along the line we managed to confuse USB class devices with USB char devices. As a result, the code to send a disconnect signal to userspace would not be built if both CONFIG_USB_DEVICE_CLASS and CONFIG_USB_DEVICEFS were disabled. The usb_fs_classdev_common_remove() routine has been renamed to usbdev_remove() and it is now called whenever any USB device is removed, not just when a class device is unregistered. The notifier registration and unregistration calls are no longer conditionally compiled. And since the common removal code will always be called as part of the char device interface, there's no need to call it again as part of the usbfs interface; thus the invocation of usb_fs_classdev_common_remove() has been taken out of usbfs_remove_device(). Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Alon Bar-Lev <alon.barlev@gmail.com> Tested-by: Alon Bar-Lev <alon.barlev@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	SUNRPC: Fix autobind on cloned rpc clients	Trond Myklebust
	commit 9a4bd29fe8f6d3f015fe1c8e5450eb62cfebfcc9 upstream. Despite the fact that cloned rpc clients won't have the cl_autobind flag set, they may still find themselves calling rpcb_getport_async(). For this to happen, it suffices for a _parent_ rpc_clnt to use autobinding, in which case any clone may find itself triggering the !xprt_bound() case in call_bind(). The correct fix for this is to walk back up the tree of cloned rpc clients, in order to find the parent that 'owns' the transport, either because it has clnt->cl_autobind set, or because it originally created the transport... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	SUNRPC: Fix a memory leak in rpcb_getport_async	Trond Myklebust
	commit 96165e2b7c4e2c82a0b60c766d4a2036444c21a0 upstream. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	serial_8250: support for Sealevel Systems Model 7803 COMM+8	Flavio Leitner
	commit e65f0f8271b1b0452334e5da37fd35413a000de4 upstream. Add support for Sealevel Systems Model 7803 COMM+8 Signed-off-by: Flavio Leitner <fleitner@redhat.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	rtl8187: Add termination packet to prevent stall	Larry Finger
	commit 2fcbab044a3faf4d4a6e269148dd1f188303b206 upstream. The RTL8187 and RTL8187B devices can stall unless an explicit termination packet is sent. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	libata: pata_via: support VX855, future chips whose IDE controller use 0x0571	JosephChan@via.com.tw
	commit e4d866cdea24543ee16ce6d07d80c513e86ba983 upstream. It supports VX855 and future chips whose IDE controller uses PCI ID 0x0571. Signed-off-by: Joseph Chan <josephchan@via.com.tw> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	it821x: Add ultra_mask quirk for Vortex86SX	Brandon Philips
	commit b94b898f3107046b5c97c556e23529283ea5eadd upstream. On Vortex86SX with IDE controller revision 0x11 ultra DMA must be disabled. This patch was tested by DMP and seems to work. It is a cleaned up version of their older Kernel patch: http://www.dmp.com.tw/tech/vortex86sx/patch-2.6.24-DMP.gz Tested-by: Shawn Lin <shawn@dmp.com.tw> Signed-off-by: Brandon Philips <bphilips@suse.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	alpha: nautilus - fix compile failure with gcc-4.3	Ivan Kokshaysky
	commit 70b66cbfd3316b792a855cb9a2574e85f1a63d0f upstream. init_srm_irq() deals with irq's #16 and above, but size of irq_desc array on nautilus and some other system types is 16. So gcc-4.3 complains that "array subscript is above array bounds", even though this function is never called on those systems. This adds a check for NR_IRQS <= 16, which effectively optimizes init_srm_irq() code away on problematic platforms. Thanks to Daniel Drake <dsd@gentoo.org> for detailed analysis of the problem. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tobias Klausmann <klausman@schwarzvogel.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	x86, mm: fix pte_free()	Peter Zijlstra
	commit 42ef73fe134732b2e91c0326df5fd568da17c4b2 upstream. On -rt we were seeing spurious bad page states like: Bad page state in process 'firefox' page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0 Trying to fix it up, but a reboot is needed Backtrace: Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3 [<c043d0f3>] ? printk+0x14/0x19 [<c0272d4e>] bad_page+0x4e/0x79 [<c0273831>] free_hot_cold_page+0x5b/0x1d3 [<c02739f6>] free_hot_page+0xf/0x11 [<c0273a18>] __free_pages+0x20/0x2b [<c027d170>] __pte_alloc+0x87/0x91 [<c027d25e>] handle_mm_fault+0xe4/0x733 [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63 [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63 [<c0218875>] do_page_fault+0x36f/0x88a This is the case where a concurrent fault already installed the PTE and we get to free the newly allocated one. This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl) which is overlaid with the {private, mapping} struct. union { struct { unsigned long private; struct address_space mapping; }; spinlock_t ptl; struct kmem_cache slab; struct page *first_page; }; Normally the spinlock is small enough to not stomp on page->mapping, but PREEMPT_RT=y has huge 'spin'locks. But lockdep kernels should also be able to trigger this splat, as the lock tracking code grows the spinlock to cover page->mapping. The obvious fix is calling pgtable_page_dtor() like the regular pte free path __pte_free_tlb() does. It seems all architectures except x86 and nm10300 already do this, and nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it doesn't do SMP or simply doesnt do MMU at all or something. Signed-off-by: Peter Zijlstra <a.p.zijlsta@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	sysfs: fix problems with binary files	Greg Kroah-Hartman
	commit 4503efd0891c40e30928afb4b23dc3f99c62a6b2 upstream. Some sysfs binary files don't like having 0 passed to them as a size. Fix this up at the root by just returning to the vfs if userspace asks us for a zero sized buffer. Thanks to Pavel Roskin for pointing this out. Reported-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	mac80211: decrement ref count to netdev after launching mesh discovery	Brian Cavagnolo
	commit 5dc306f3bd1d4cfdf79df39221b3036eab1ddcf3 upstream. After launching mesh discovery in tx path, reference count was not being decremented. This was preventing module unload. Signed-off-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: Andrey Yurovsky <andrey@cozybit.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	inotify: clean up inotify_read and fix locking problems	Vegard Nossum
	commit 3632dee2f8b8a9720329f29eeaa4ec4669a3aff8 upstream. If userspace supplies an invalid pointer to a read() of an inotify instance, the inotify device's event list mutex is unlocked twice. This causes an unbalance which effectively leaves the data structure unprotected, and we can trigger oopses by accessing the inotify instance from different tasks concurrently. The best fix (contributed largely by Linus) is a total rewrite of the function in question: On Thu, Jan 22, 2009 at 7:05 AM, Linus Torvalds wrote: > The thing to notice is that: > > - locking is done in just one place, and there is no question about it > not having an unlock. > > - that whole double-while(1)-loop thing is gone. > > - use multiple functions to make nesting and error handling sane > > - do error testing after doing the things you always need to do, ie do > this: > > mutex_lock(..) > ret = function_call(); > mutex_unlock(..) > > .. test ret here .. > > instead of doing conditional exits with unlocking or freeing. > > So if the code is written in this way, it may still be buggy, but at least > it's not buggy because of subtle "forgot to unlock" or "forgot to free" > issues. > > This _always_ unlocks if it locked, and it always frees if it got a > non-error kevent. Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Robert Love <rlove@google.com> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	fuse: fix NULL deref in fuse_file_alloc()	Dan Carpenter
	commit bb875b38dc5e343bdb696b2eab8233e4d195e208 upstream. ff is set to NULL and then dereferenced on line 65. Compile tested only. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	fuse: fix missing fput on error	Miklos Szeredi
	commit 3ddf1e7f57237ac7c5d5bfb7058f1ea4f970b661 upstream. Fix the leaking file reference if allocation or initialization of fuse_conn failed. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-02-02	fuse: destroy bdi on umount	Miklos Szeredi
	commit 26c3679101dbccc054dcf370143941844ba70531 upstream. If a fuse filesystem is unmounted but the device file descriptor remains open and a new mount reuses the old device number, then the mount fails with EEXIST and the following warning is printed in the kernel log: WARNING: at fs/sysfs/dir.c:462 sysfs_add_one+0x35/0x3d() sysfs: duplicate filename '0:15' can not be created The cause is that the bdi belonging to the fuse filesystem was destoryed only after the device file was released. Fix this by calling bdi_destroy() from fuse_put_super() instead. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	Linux 2.6.27.13v2.6.27.13	Greg Kroah-Hartman

2009-01-24	fs: sys_sync fix	Nick Piggin
	commit 856bf4d717feb8c55d4e2f817b71ebb70cfbc67b upstream. s_syncing livelock avoidance was breaking data integrity guarantee of sys_sync, by allowing sys_sync to skip writing or waiting for superblocks if there is a concurrent sys_sync happening. This livelock avoidance is much less important now that we don't have the get_super_to_sync() call after every sb that we sync. This was replaced by __put_super_and_need_restart. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	fs: sync_sb_inodes fix	Nick Piggin
	commit 38f21977663126fef53f5585e7f1653d8ebe55c4 upstream. Fix data integrity semantics required by sys_sync, by iterating over all inodes and waiting for any writeback pages after the initial writeout. Comments explain the exact problem. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	fs: remove WB_SYNC_HOLD	Nick Piggin
	commit 4f5a99d64c17470a784a6c68064207d82e3e74a5 upstream. Remove WB_SYNC_HOLD. The primary motiviation is the design of my anti-starvation code for fsync. It requires taking an inode lock over the sync operation, so we could run into lock ordering problems with multiple inodes. It is possible to take a single global lock to solve the ordering problem, but then that would prevent a future nice implementation of "sync multiple inodes" based on lock order via inode address. Seems like a backward step to remove this, but actually it is busted anyway: we can't use the inode lists for data integrity wait: an inode can be taken off the dirty lists but still be under writeback. In order to satisfy data integrity semantics, we should wait for it to finish writeback, but if we only search the dirty lists, we'll miss it. It would be possible to have a "writeback" list, for sys_sync, I suppose. But why complicate things by prematurely optimise? For unmounting, we could avoid the "livelock avoidance" code, which would be easier, but again premature IMO. Fixing the existing data integrity problem will come next. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: direct IO starvation improvement	Nick Piggin
	commit 48b47c561e41525061b5bc0cfd67d6367fd11dc4 upstream. Direct IO can invalidate and sync a lot of pagecache pages in the mapping. A 4K direct IO will actually try to sync and/or invalidate the pagecache of the entire file, for example (which might be many GB or TB large). Improve this by doing range syncs. Also, memory no longer has to be unmapped to catch the dirty bits for syncing, as dirty bits would remain coherent due to dirty mmap accounting. This fixes the immediate DM deadlocks when doing direct IO reads to block device with a mounted filesystem, if only by papering over the problem somewhat rather than addressing the fsync starvation cases. Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: do_sync_mapping_range integrity fix	Nick Piggin
	commit ee53a891f47444c53318b98dac947ede963db400 upstream. Chris Mason notices do_sync_mapping_range didn't actually ask for data integrity writeout. Unfortunately, it is advertised as being usable for data integrity operations. This is a data integrity bug. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: write_cache_pages more terminate quickly	Andrew Morton
	commit 82fd1a9a8ced9607312b54859572bcc6211e8919 upstream. Now that we have the early-termination logic in place, it makes sense to bail out early in all other cases where done is set to 1. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: write_cache_pages terminate quickly	Nick Piggin
	commit d5482cdf8a0aacb1e6468a97d5544f5829c8d8c4 upstream. Terminate the write_cache_pages loop upon encountering the first page past end, without locking the page. Pages cannot have their index change when we have a reference on them (truncate, eg truncate_inode_pages_range performs the same check without the page lock). Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: write_cache_pages optimise page cleaning	Nick Piggin
	commit 515f4a037fb9ab736f8bad733fcd2ffd350cf265 upstream. In write_cache_pages, if we get stuck behind another process that is cleaning pages, we will be forced to wait for them to finish, then perform our own writeout (if it was redirtied during the long wait), then wait for that. If a page under writeout is still clean, we can skip waiting for it (if we're part of a data integrity sync, we'll be waiting for all writeout pages afterwards, so we'll still be waiting for the other guy's write that's cleaned the page). Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: write_cache_pages cleanups	Nick Piggin
	commit 5a3d5c9813db56a75934eb1015367fda23a8b0b4 upstream. Get rid of some complex expressions from flow control statements, add a comment, remove some duplicate code. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: write_cache_pages integrity fix	Nick Piggin
	commit 05fe478dd04e02fa230c305ab9b5616669821dd3 upstream. In write_cache_pages, nr_to_write is heeded even for data-integrity syncs, so the function will return success after writing out nr_to_write pages, even if that was not sufficient to guarantee data integrity. The callers tend to set it to values that could break data interity semantics easily in practice. For example, nr_to_write can be set to mapping->nr_pages * 2, however if a file has a single, dirty page, then fsync is called, subsequent pages might be concurrently added and dirtied, then write_cache_pages might writeout two of these newly dirty pages, while not writing out the old page that should have been written out. Fix this by ignoring nr_to_write if it is a data integrity sync. This is a data integrity bug. The reason this has been done in the past is to avoid stalling sync operations behind page dirtiers. "If a file has one dirty page at offset 1000000000000000 then someone does an fsync() and someone else gets in first and starts madly writing pages at offset 0, we want to write that page at 1000000000000000. Somehow." What we do today is return success after an arbitrary amount of pages are written, whether or not we have provided the data-integrity semantics that the caller has asked for. Even this doesn't actually fix all stall cases completely: in the above situation, if the file has a huge number of pages in pagecache (but not dirty), then mapping->nrpages is going to be huge, even if pages are being dirtied. This change does indeed make the possibility of long stalls lager, and that's not a good thing, but lying about data integrity is even worse. We have to either perform the sync, or return -ELINUXISLAME so at least the caller knows what has happened. There are subsequent competing approaches in the works to solve the stall problems properly, without compromising data integrity. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: write_cache_pages writepage error fix	Nick Piggin
	commit 00266770b8b3a6a77f896ca501a0613739086832 upstream. In write_cache_pages, if ret signals a real error, but we still have some pages left in the pagevec, done would be set to 1, but the remaining pages would continue to be processed and ret will be overwritten in the process. It could easily be overwritten with success, and thus success will be returned even if there is an error. Thus the caller is told all writes succeeded, wheras in reality some did not. Fix this by bailing immediately if there is an error, and retaining the first error code. This is a data integrity bug. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: write_cache_pages early loop termination	Nick Piggin
	commit bd19e012f6fd3b7309689165ea865cbb7bb88c1e upstream. We'd like to break out of the loop early in many situations, however the existing code has been setting mapping->writeback_index past the final page in the pagevec lookup for cyclic writeback. This is a problem if we don't process all pages up to the final page. Currently the code mostly keeps writeback_index reasonable and hacked around this by not breaking out of the loop or writing pages outside the range in these cases. Keep track of a real "done index" that enables us to terminate the loop in a much more flexible manner. Needed by the subsequent patch to preserve writepage errors, and then further patches to break out of the loop early for other reasons. However there are no functional changes with this patch alone. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	mm: write_cache_pages cyclic fix	Nick Piggin
	commit 31a12666d8f0c22235297e1c1575f82061480029 upstream. In write_cache_pages, scanned == 1 is supposed to mean that cyclic writeback has circled through zero, thus we should not circle again. However it gets set to 1 after the first successful pagevec lookup. This leads to cases where not enough data gets written. Counterexample: file with first 10 pages dirty, writeback_index == 5, nr_to_write == 10. Then the 5 last pages will be found, and scanned will be set to 1, after writing those out, we will not cycle back to get the first 5. Rework this logic, now we'll always cycle unless we started off from index 0. When cycling, only write out as far as 1 page before the start page from the first cycle (so we don't write parts of the file twice). Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	powerpc: is_hugepage_only_range() must account for both 4kB and 64kB slices	Dave Kleikamp
	commit 9ba0fdbfaed2e74005d87fab948c5522b86ff733 upstream. powerpc: is_hugepage_only_range() must account for both 4kB and 64kB slices The subpage_prot syscall fails on second and subsequent calls for a given region, because is_hugepage_only_range() is mis-identifying the 4 kB slices when the process has a 64 kB page size. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	hwmon: (abituguru3) Fix CONFIG_DMI=n fallback to probe	Alistair John Strachan
	commit 46a5f173fc88ffc22651162033696d8a9fbcdc5c upstream. When CONFIG_DMI is not enabled, dmi detection should flag that no board could be detected (err=1) rather than another error condition (err<0). This fixes the fallback to manual probing for all motherboards, even those without DMI strings, when CONFIG_DMI=n. Signed-off-by: Alistair John Strachan <alistair@devzero.co.uk> Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-24	dell_rbu: use scnprintf() instead of less secure sprintf()	Pavel Roskin
	commit 81156928f8fe31621e467490b9d441c0285998c3 upstream. Reading 0 bytes from /sys/devices/platform/dell_rbu/image_type or /sys/devices/platform/dell_rbu/packet_size by an ordinary user causes an oops. Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>