<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/include/linux, branch v3.2.17</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=v3.2.17</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=v3.2.17'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2012-05-11T12:14:50+00:00</updated>
<entry>
<title>Fix __read_seqcount_begin() to use ACCESS_ONCE for sequence value read</title>
<updated>2012-05-11T12:14:50+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-05-04T21:46:02+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=32b216a576ef1bf96c1434578c4c978c2b863e92'/>
<id>urn:sha1:32b216a576ef1bf96c1434578c4c978c2b863e92</id>
<content type='text'>
commit 2f624278626677bfaf73fef97f86b37981621f5c upstream.

We really need to use a ACCESS_ONCE() on the sequence value read in
__read_seqcount_begin(), because otherwise the compiler might end up
reloading the value in between the test and the return of it.  As a
result, it might end up returning an odd value (which means that a write
is in progress).

If the reader is then fast enough that that odd value is still the
current one when the read_seqcount_retry() is done, we might end up with
a "successful" read sequence, even despite the concurrent write being
active.

In practice this probably never really happens - there just isn't
anything else going on around the read of the sequence count, and the
common case is that we end up having a read barrier immediately
afterwards.

So the code sequence in which gcc might decide to reaload from memory is
small, and there's no reason to believe it would ever actually do the
reload.  But if the compiler ever were to decide to do so, it would be
incredibly annoying to debug.  Let's just make sure.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>efi: Add new variable attributes</title>
<updated>2012-05-11T12:14:47+00:00</updated>
<author>
<name>Matthew Garrett</name>
<email>mjg@redhat.com</email>
</author>
<published>2012-04-30T20:11:29+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=3a91135a4b1720f83810399e596937598b38b158'/>
<id>urn:sha1:3a91135a4b1720f83810399e596937598b38b158</id>
<content type='text'>
commit 41b3254c93acc56adc3c4477fef7c9512d47659e upstream.

More recent versions of the UEFI spec have added new attributes for
variables. Add them.

Signed-off-by: Matthew Garrett &lt;mjg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>pipes: add a "packetized pipe" mode for writing</title>
<updated>2012-05-11T12:14:43+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-04-29T20:12:42+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=39bcf98adc9432136e3b6a54881dcf703b7fcb63'/>
<id>urn:sha1:39bcf98adc9432136e3b6a54881dcf703b7fcb63</id>
<content type='text'>
commit 9883035ae7edef3ec62ad215611cb8e17d6a1a5d upstream.

The actual internal pipe implementation is already really about
individual packets (called "pipe buffers"), and this simply exposes that
as a special packetized mode.

When we are in the packetized mode (marked by O_DIRECT as suggested by
Alan Cox), a write() on a pipe will not merge the new data with previous
writes, so each write will get a pipe buffer of its own.  The pipe
buffer is then marked with the PIPE_BUF_FLAG_PACKET flag, which in turn
will tell the reader side to break the read at that boundary (and throw
away any partial packet contents that do not fit in the read buffer).

End result: as long as you do writes less than PIPE_BUF in size (so that
the pipe doesn't have to split them up), you can now treat the pipe as a
packet interface, where each read() system call will read one packet at
a time.  You can just use a sufficiently big read buffer (PIPE_BUF is
sufficient, since bigger than that doesn't guarantee atomicity anyway),
and the return value of the read() will naturally give you the size of
the packet.

NOTE! We do not support zero-sized packets, and zero-sized reads and
writes to a pipe continue to be no-ops.  Also note that big packets will
currently be split at write time, but that the size at which that
happens is not really specified (except that it's bigger than PIPE_BUF).
Currently that limit is the system page size, but we might want to
explicitly support bigger packets some day.

The main user for this is going to be the autofs packet interface,
allowing us to stop having to care so deeply about exact packet sizes
(which have had bugs with 32/64-bit compatibility modes).  But user
space can create packetized pipes with "pipe2(fd, O_DIRECT)", which will
fail with an EINVAL on kernels that do not support this interface.

Tested-by: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Cc: Alan Cox &lt;alan@lxorguk.ukuu.org.uk&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Ian Kent &lt;raven@themaw.net&gt;
Cc: Thomas Meyer &lt;thomas@m3y3r.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>USB: EHCI: fix crash during suspend on ASUS computers</title>
<updated>2012-05-11T12:14:42+00:00</updated>
<author>
<name>Alan Stern</name>
<email>stern@rowland.harvard.edu</email>
</author>
<published>2012-04-24T18:07:22+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=f845dfabaf943e9d1bd97c1c023ee5c58ba04352'/>
<id>urn:sha1:f845dfabaf943e9d1bd97c1c023ee5c58ba04352</id>
<content type='text'>
commit 151b61284776be2d6f02d48c23c3625678960b97 upstream.

This patch (as1545) fixes a problem affecting several ASUS computers:
The machine crashes or corrupts memory when going into suspend if the
ehci-hcd driver is bound to any controllers.  Users have been forced
to unbind or unload ehci-hcd before putting their systems to sleep.

After extensive testing, it was determined that the machines don't
like going into suspend when any EHCI controllers are in the PCI D3
power state.  Presumably this is a firmware bug, but there's nothing
we can do about it except to avoid putting the controllers in D3
during system sleep.

The patch adds a new flag to indicate whether the problem is present,
and avoids changing the controller's power state if the flag is set.
Runtime suspend is unaffected; this matters only for system suspend.
However as a side effect, the controller will not respond to remote
wakeup requests while the system is asleep.  Hence USB wakeup is not
functional -- but of course, this is already true in the current state
of affairs.

This fixes Bugzilla #42728.

Signed-off-by: Alan Stern &lt;stern@rowland.harvard.edu&gt;
Tested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Tested-by: Andrey Rahmatullin &lt;wrar@wrar.name&gt;
Tested-by: Oleksij Rempel (fishor) &lt;bug-track@fisher-privat.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: avoid order-1 allocations on wifi and tx path</title>
<updated>2012-05-11T12:14:22+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-04-25T03:01:22+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=8d34d1f1b61169add8bdcad2e8b9ddf4f52889cd'/>
<id>urn:sha1:8d34d1f1b61169add8bdcad2e8b9ddf4f52889cd</id>
<content type='text'>
[ This combines upstream commit
  a21d45726acacc963d8baddf74607d9b74e2b723 and the follow-on bug fix
  commit 22b4a4f22da4b39c6f7f679fd35f3d35c91bf851 ]

Marc Merlin reported many order-1 allocations failures in TX path on its
wireless setup, that dont make any sense with MTU=1500 network, and non
SG capable hardware.

After investigation, it turns out TCP uses sk_stream_alloc_skb() and
used as a convention skb_tailroom(skb) to know how many bytes of data
payload could be put in this skb (for non SG capable devices)

Note : these skb used kmalloc-4096 (MTU=1500 + MAX_HEADER +
sizeof(struct skb_shared_info) being above 2048)

Later, mac80211 layer need to add some bytes at the tail of skb
(IEEE80211_ENCRYPT_TAILROOM = 18 bytes) and since no more tailroom is
available has to call pskb_expand_head() and request order-1
allocations.

This patch changes sk_stream_alloc_skb() so that only
sk-&gt;sk_prot-&gt;max_header bytes of headroom are reserved, and use a new
skb field, avail_size to hold the data payload limit.

This way, order-0 allocations done by TCP stack can leave more than 2 KB
of tailroom and no more allocation is performed in mac80211 layer (or
any layer needing some tailroom)

avail_size is unioned with mark/dropcount, since mark will be set later
in IP stack for output packets. Therefore, skb size is unchanged.

Reported-by: Marc MERLIN &lt;marc@merlins.org&gt;
Tested-by: Marc MERLIN &lt;marc@merlins.org&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[bwh: Correct commit hash for follow-on bug fix]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: allow splice() to build full TSO packets</title>
<updated>2012-05-11T12:14:17+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-04-25T02:12:06+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=0d63a9816f225228d72461444c6b1e5f58c6f90b'/>
<id>urn:sha1:0d63a9816f225228d72461444c6b1e5f58c6f90b</id>
<content type='text'>
[ This combines upstream commit
  2f53384424251c06038ae612e56231b96ab610ee and the follow-on bug fix
  commit 35f9c09fe9c72eb8ca2b8e89a593e1c151f28fc2 ]

vmsplice()/splice(pipe, socket) call do_tcp_sendpages() one page at a
time, adding at most 4096 bytes to an skb. (assuming PAGE_SIZE=4096)

The call to tcp_push() at the end of do_tcp_sendpages() forces an
immediate xmit when pipe is not already filled, and tso_fragment() try
to split these skb to MSS multiples.

4096 bytes are usually split in a skb with 2 MSS, and a remaining
sub-mss skb (assuming MTU=1500)

This makes slow start suboptimal because many small frames are sent to
qdisc/driver layers instead of big ones (constrained by cwnd and packets
in flight of course)

In fact, applications using sendmsg() (adding an additional memory copy)
instead of vmsplice()/splice()/sendfile() are a bit faster because of
this anomaly, especially if serving small files in environments with
large initial [c]wnd.

Call tcp_push() only if MSG_MORE is not set in the flags parameter.

This bit is automatically provided by splice() internals but for the
last page, or on all pages if user specified SPLICE_F_MORE splice()
flag.

In some workloads, this can reduce number of sent logical packets by an
order of magnitude, making zero-copy TCP actually faster than
one-copy :)

Reported-by: Tom Herbert &lt;therbert@google.com&gt;
Cc: Nandita Dukkipati &lt;nanditad@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Tom Herbert &lt;therbert@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: H.K. Jerry Chu &lt;hkchu@google.com&gt;
Cc: Maciej Żenczykowski &lt;maze@google.com&gt;
Cc: Mahesh Bandewar &lt;maheshb@google.com&gt;
Cc: Ilpo Järvinen &lt;ilpo.jarvinen@helsinki.fi&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>net: fix /proc/net/dev regression</title>
<updated>2012-05-11T12:14:16+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-04-02T22:33:02+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=205648388f15b1ccf92b09ba0cf141cdd62bf711'/>
<id>urn:sha1:205648388f15b1ccf92b09ba0cf141cdd62bf711</id>
<content type='text'>
[ Upstream commit 2def16ae6b0c77571200f18ba4be049b03d75579 ]

Commit f04565ddf52 (dev: use name hash for dev_seq_ops) added a second
regression, as some devices are missing from /proc/net/dev if many
devices are defined.

When seq_file buffer is filled, the last -&gt;next/show() method is
canceled (pos value is reverted to value prior -&gt;next() call)

Problem is after above commit, we dont restart the lookup at right
position in -&gt;start() method.

Fix this by removing the internal 'pos' pointer added in commit, since
we need to use the 'loff_t *pos' provided by seq_file layer.

This also reverts commit 5cac98dd0 (net: Fix corruption
in /proc/*/net/dev_mcast), since its not needed anymore.

Reported-by: Ben Greear &lt;greearb@candelatech.com&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Mihai Maruseac &lt;mmaruseac@ixiacom.com&gt;
Tested-by:  Ben Greear &lt;greearb@candelatech.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>KVM: unmap pages from the iommu when slots are removed</title>
<updated>2012-05-11T12:14:07+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2012-04-27T21:54:08+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=1e57aab4e6c549804298f07fac0b6fc77f10fab2'/>
<id>urn:sha1:1e57aab4e6c549804298f07fac0b6fc77f10fab2</id>
<content type='text'>
commit 32f6daad4651a748a58a3ab6da0611862175722f upstream.

We've been adding new mappings, but not destroying old mappings.
This can lead to a page leak as pages are pinned using
get_user_pages, but only unpinned with put_page if they still
exist in the memslots list on vm shutdown.  A memslot that is
destroyed while an iommu domain is enabled for the guest will
therefore result in an elevated page reference count that is
never cleared.

Additionally, without this fix, the iommu is only programmed
with the first translation for a gpa.  This can result in
peer-to-peer errors if a mapping is destroyed and replaced by a
new mapping at the same gpa as the iommu will still be pointing
to the original, pinned memory address.

Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched/x86: Fix overflow in cyc2ns_offset</title>
<updated>2012-04-13T15:33:50+00:00</updated>
<author>
<name>Salman Qazi</name>
<email>sqazi@google.com</email>
</author>
<published>2012-03-10T00:41:01+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=193bc3a0209aececb61262bd432e82710c2d499a'/>
<id>urn:sha1:193bc3a0209aececb61262bd432e82710c2d499a</id>
<content type='text'>
commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 upstream.

When a machine boots up, the TSC generally gets reset.  However,
when kexec is used to boot into a kernel, the TSC value would be
carried over from the previous kernel.  The computation of
cycns_offset in set_cyc2ns_scale is prone to an overflow, if the
machine has been up more than 208 days prior to the kexec.  The
overflow happens when we multiply *scale, even though there is
enough room to store the final answer.

We fix this issue by decomposing tsc_now into the quotient and
remainder of division by CYC2NS_SCALE_FACTOR and then performing
the multiplication separately on the two components.

Refactor code to share the calculation with the previous
fix in __cycles_2_ns().

Signed-off-by: Salman Qazi &lt;sqazi@google.com&gt;
Acked-by: John Stultz &lt;john.stultz@linaro.org&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: john stultz &lt;johnstul@us.ibm.com&gt;
Link: http://lkml.kernel.org/r/20120310004027.19291.88460.stgit@dungbeetle.mtv.corp.google.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>CIFS: Fix VFS lock usage for oplocked files</title>
<updated>2012-04-13T15:33:48+00:00</updated>
<author>
<name>Pavel Shilovsky</name>
<email>piastry@etersoft.ru</email>
</author>
<published>2012-03-28T17:56:19+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=2d5e264758e31dd1d1e609a5a4d3d6dc693b9d2e'/>
<id>urn:sha1:2d5e264758e31dd1d1e609a5a4d3d6dc693b9d2e</id>
<content type='text'>
commit 66189be74ff5f9f3fd6444315b85be210d07cef2 upstream.

We can deadlock if we have a write oplock and two processes
use the same file handle. In this case the first process can't
unlock its lock if the second process blocked on the lock in the
same time.

Fix it by using posix_lock_file rather than posix_lock_file_wait
under cinode-&gt;lock_mutex. If we request a blocking lock and
posix_lock_file indicates that there is another lock that prevents
us, wait untill that lock is released and restart our call.

Acked-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Pavel Shilovsky &lt;piastry@etersoft.ru&gt;
Signed-off-by: Steve French &lt;sfrench@us.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
