<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/arch/sparc/lib, branch v4.3-rc2</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=v4.3-rc2</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=v4.3-rc2'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2015-09-03T22:46:07+00:00</updated>
<entry>
<title>Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2015-09-03T22:46:07+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-09-03T22:46:07+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=ca520cab25e0e8da717c596ccaa2c2b3650cfa09'/>
<id>urn:sha1:ca520cab25e0e8da717c596ccaa2c2b3650cfa09</id>
<content type='text'>
Pull locking and atomic updates from Ingo Molnar:
 "Main changes in this cycle are:

   - Extend atomic primitives with coherent logic op primitives
     (atomic_{or,and,xor}()) and deprecate the old partial APIs
     (atomic_{set,clear}_mask())

     The old ops were incoherent with incompatible signatures across
     architectures and with incomplete support.  Now every architecture
     supports the primitives consistently (by Peter Zijlstra)

   - Generic support for 'relaxed atomics':

       - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
       - atomic_read_acquire()
       - atomic_set_release()

     This came out of porting qwrlock code to arm64 (by Will Deacon)

   - Clean up the fragile static_key APIs that were causing repeat bugs,
     by introducing a new one:

       DEFINE_STATIC_KEY_TRUE(name);
       DEFINE_STATIC_KEY_FALSE(name);

     which define a key of different types with an initial true/false
     value.

     Then allow:

       static_branch_likely()
       static_branch_unlikely()

     to take a key of either type and emit the right instruction for the
     case.  To be able to know the 'type' of the static key we encode it
     in the jump entry (by Peter Zijlstra)

   - Static key self-tests (by Jason Baron)

   - qrwlock optimizations (by Waiman Long)

   - small futex enhancements (by Davidlohr Bueso)

   - ... and misc other changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  jump_label/x86: Work around asm build bug on older/backported GCCs
  locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
  locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
  locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
  locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
  locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
  locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
  locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
  locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
  locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
  locking/static_keys: Make verify_keys() static
  jump label, locking/static_keys: Update docs
  locking/static_keys: Provide a selftest
  jump_label: Provide a self-test
  s390/uaccess, locking/static_keys: employ static_branch_likely()
  x86, tsc, locking/static_keys: Employ static_branch_likely()
  locking/static_keys: Add selftest
  locking/static_keys: Add a new static_key interface
  locking/static_keys: Rework update logic
  locking/static_keys: Add static_key_{en,dis}able() helpers
  ...
</content>
</entry>
<entry>
<title>sparc64: Fix userspace FPU register corruptions.</title>
<updated>2015-08-07T02:13:25+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-08-07T02:13:25+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=44922150d87cef616fd183220d43d8fde4d41390'/>
<id>urn:sha1:44922150d87cef616fd183220d43d8fde4d41390</id>
<content type='text'>
If we have a series of events from userpsace, with %fprs=FPRS_FEF,
like follows:

ETRAP
	ETRAP
		VIS_ENTRY(fprs=0x4)
		VIS_EXIT
		RTRAP (kernel FPU restore with fpu_saved=0x4)
	RTRAP

We will not restore the user registers that were clobbered by the FPU
using kernel code in the inner-most trap.

Traps allocate FPU save slots in the thread struct, and FPU using
sequences save the "dirty" FPU registers only.

This works at the initial trap level because all of the registers
get recorded into the top-level FPU save area, and we'll return
to userspace with the FPU disabled so that any FPU use by the user
will take an FPU disabled trap wherein we'll load the registers
back up properly.

But this is not how trap returns from kernel to kernel operate.

The simplest fix for this bug is to always save all FPU register state
for anything other than the top-most FPU save area.

Getting rid of the optimized inner-slot FPU saving code ends up
making VISEntryHalf degenerate into plain VISEntry.

Longer term we need to do something smarter to reinstate the partial
save optimizations.  Perhaps the fundament error is having trap entry
and exit allocate FPU save slots and restore register state.  Instead,
the VISEntry et al. calls should be doing that work.

This bug is about two decades old.

Reported-by: James Y Knight &lt;jyknight@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sparc: Provide atomic_{or,xor,and}</title>
<updated>2015-07-27T12:06:23+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-04-23T17:40:25+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=304a0d699a3c6103b61d5ea18d56820e7d8e3116'/>
<id>urn:sha1:304a0d699a3c6103b61d5ea18d56820e7d8e3116</id>
<content type='text'>
Implement atomic logic ops -- atomic_{or,xor,and}.

These will replace the atomic_{set,clear}_mask functions that are
available on some archs.

Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>sparc64: Fix several bugs in memmove().</title>
<updated>2015-03-23T16:22:10+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-03-23T16:22:10+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=2077cef4d5c29cf886192ec32066f783d6a80db8'/>
<id>urn:sha1:2077cef4d5c29cf886192ec32066f783d6a80db8</id>
<content type='text'>
Firstly, handle zero length calls properly.  Believe it or not there
are a few of these happening during early boot.

Next, we can't just drop to a memcpy() call in the forward copy case
where dst &lt;= src.  The reason is that the cache initializing stores
used in the Niagara memcpy() implementations can end up clearing out
cache lines before we've sourced their original contents completely.

For example, considering NG4memcpy, the main unrolled loop begins like
this:

     load   src + 0x00
     load   src + 0x08
     load   src + 0x10
     load   src + 0x18
     load   src + 0x20
     store  dst + 0x00

Assume dst is 64 byte aligned and let's say that dst is src - 8 for
this memcpy() call.  That store at the end there is the one to the
first line in the cache line, thus clearing the whole line, which thus
clobbers "src + 0x28" before it even gets loaded.

To avoid this, just fall through to a simple copy only mildly
optimized for the case where src and dst are 8 byte aligned and the
length is a multiple of 8 as well.  We could get fancy and call
GENmemcpy() but this is good enough for how this thing is actually
used.

Reported-by: David Ahern &lt;david.ahern@oracle.com&gt;
Reported-by: Bob Picco &lt;bpicco@meloft.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sparc32: Implement xchg and atomic_xchg using ATOMIC_HASH locks</title>
<updated>2014-11-07T20:51:44+00:00</updated>
<author>
<name>Andreas Larsson</name>
<email>andreas@gaisler.com</email>
</author>
<published>2014-11-05T14:52:08+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=1a17fdc4f4ed06b63fac1937470378a5441a663a'/>
<id>urn:sha1:1a17fdc4f4ed06b63fac1937470378a5441a663a</id>
<content type='text'>
Atomicity between xchg and cmpxchg cannot be guaranteed when xchg is
implemented with a swap and cmpxchg is implemented with locks.
Without this, e.g. mcs_spin_lock and mcs_spin_unlock are broken.

Signed-off-by: Andreas Larsson &lt;andreas@gaisler.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sparc64: Fix FPU register corruption with AES crypto offload.</title>
<updated>2014-10-15T02:37:58+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2014-10-15T02:37:58+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=f4da3628dc7c32a59d1fb7116bb042e6f436d611'/>
<id>urn:sha1:f4da3628dc7c32a59d1fb7116bb042e6f436d611</id>
<content type='text'>
The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the
key material is preloaded into the FPU registers, and then we loop
over and over doing the crypt operation, reusing those pre-cooked key
registers.

There are intervening blkcipher*() calls between the crypt operation
calls.  And those might perform memcpy() and thus also try to use the
FPU.

The sparc64 kernel FPU usage mechanism is designed to allow such
recursive uses, but with a catch.

There has to be a trap between the two FPU using threads of control.

The mechanism works by, when the FPU is already in use by the kernel,
allocating a slot for FPU saving at trap time.  Then if, within the
trap handler, we try to use the FPU registers, the pre-trap FPU
register state is saved into the slot.  Then at trap return time we
notice this and restore the pre-trap FPU state.

Over the long term there are various more involved ways we can make
this work, but for a quick fix let's take advantage of the fact that
the situation where this happens is very limited.

All sparc64 chips that support the crypto instructiosn also are using
the Niagara4 memcpy routine, and that routine only uses the FPU for
large copies where we can't get the source aligned properly to a
multiple of 8 bytes.

We look to see if the FPU is already in use in this context, and if so
we use the non-large copy path which only uses integer registers.

Furthermore, we also limit this special logic to when we are doing
kernel copy, rather than a user copy.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge branch 'locking-arch-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2014-10-13T13:48:00+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-10-13T13:48:00+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=dbb885fecc1b1b35e93416bedd24d21bd20f60ed'/>
<id>urn:sha1:dbb885fecc1b1b35e93416bedd24d21bd20f60ed</id>
<content type='text'>
Pull arch atomic cleanups from Ingo Molnar:
 "This is a series kept separate from the main locking tree, which
  cleans up and improves various details in the atomics type handling:

   - Remove the unused atomic_or_long() method

   - Consolidate and compress atomic ops implementations between
     architectures, to reduce linecount and to make it easier to add new
     ops.

   - Rewrite generic atomic support to only require cmpxchg() from an
     architecture - generate all other methods from that"

* 'locking-arch-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  locking,arch: Use ACCESS_ONCE() instead of cast to volatile in atomic_read()
  locking, mips: Fix atomics
  locking, sparc64: Fix atomics
  locking,arch: Rewrite generic atomic support
  locking,arch,xtensa: Fold atomic_ops
  locking,arch,sparc: Fold atomic_ops
  locking,arch,sh: Fold atomic_ops
  locking,arch,powerpc: Fold atomic_ops
  locking,arch,parisc: Fold atomic_ops
  locking,arch,mn10300: Fold atomic_ops
  locking,arch,mips: Fold atomic_ops
  locking,arch,metag: Fold atomic_ops
  locking,arch,m68k: Fold atomic_ops
  locking,arch,m32r: Fold atomic_ops
  locking,arch,ia64: Fold atomic_ops
  locking,arch,hexagon: Fold atomic_ops
  locking,arch,cris: Fold atomic_ops
  locking,arch,avr32: Fold atomic_ops
  locking,arch,arm64: Fold atomic_ops
  locking,arch,arm: Fold atomic_ops
  ...
</content>
</entry>
<entry>
<title>locking, sparc64: Fix atomics</title>
<updated>2014-09-10T09:45:04+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-09-02T09:40:16+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=caa17d49f9a5cc09b3bbb101dc640f914f3b4ff7'/>
<id>urn:sha1:caa17d49f9a5cc09b3bbb101dc640f914f3b4ff7</id>
<content type='text'>
The patch folding the atomic ops had a silly fail in the _return primitives.

Fixes: 4f3316c2b5fe ("locking,arch,sparc: Fold atomic_ops")
Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: sparclinux@vger.kernel.org
Link: http://lkml.kernel.org/r/20140902094016.GD31157@worktop.ger.corp.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sparc: Let memset return the address argument</title>
<updated>2014-09-09T23:38:10+00:00</updated>
<author>
<name>Andreas Larsson</name>
<email>andreas@gaisler.com</email>
</author>
<published>2014-08-29T15:08:21+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=74cad25c076a2f5253312c2fe82d1a4daecc1323'/>
<id>urn:sha1:74cad25c076a2f5253312c2fe82d1a4daecc1323</id>
<content type='text'>
This makes memset follow the standard (instead of returning 0 on success). This
is needed when certain versions of gcc optimizes around memset calls and assume
that the address argument is preserved in %o0.

Signed-off-by: Andreas Larsson &lt;andreas@gaisler.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>locking,arch,sparc: Fold atomic_ops</title>
<updated>2014-08-14T10:48:13+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-03-26T17:29:28+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=4f3316c2b5fe2062c26c9b66915b5a5c80c60a5c'/>
<id>urn:sha1:4f3316c2b5fe2062c26c9b66915b5a5c80c60a5c</id>
<content type='text'>
Many of the atomic op implementations are the same except for one
instruction; fold the lot into a few CPP macros and reduce LoC.

This also prepares for easy addition of new ops.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: Kirill Tkhai &lt;tkhai@yandex.ru&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Sam Ravnborg &lt;sam@ravnborg.org&gt;
Cc: sparclinux@vger.kernel.org
Link: http://lkml.kernel.org/r/20140508135852.825281379@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
