diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2012-03-19 16:37:28 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-03-19 16:37:28 -0700 |
commit | b0e37d7ac6ba937c3776ff5111ff6a7fa832fb4f (patch) | |
tree | fdb86783c464825a77223e49cc24f632e319d2df /fs/dcache.c | |
parent | 6d7d1a0dc735ea8412769edae7154885021107a9 (diff) | |
parent | bfcfaa77bdf0f775263e906015982a608df01c76 (diff) | |
download | lwn-b0e37d7ac6ba937c3776ff5111ff6a7fa832fb4f.tar.gz lwn-b0e37d7ac6ba937c3776ff5111ff6a7fa832fb4f.zip |
Merge branch 'dcache-word-accesses'
* branch 'dcache-word-accesses':
vfs: use 'unsigned long' accesses for dcache name comparison and hashing
This does the name hashing and lookup using word-sized accesses when
that is efficient, namely on x86 (although any little-endian machine
with good unaligned accesses would do).
It does very much depend on little-endian logic, but it's a very hot
couple of functions under some real loads, and this patch improves the
performance of __d_lookup_rcu() and link_path_walk() by up to about 30%.
Giving a 10% improvement on some very pathname-heavy benchmarks.
Because we do make unaligned accesses past the filename, the
optimization is disabled when CONFIG_DEBUG_PAGEALLOC is active, and we
effectively depend on the fact that on x86 we don't really ever have the
last page of usable RAM followed immediately by any IO memory (due to
ACPI tables, BIOS buffer areas etc).
Some of the bit operations we do are a bit "subtle". It's commented,
but you do need to really think about the code. Or just consider it
black magic.
Thanks to people on G+ for some of the optimized bit tricks.
Diffstat (limited to 'fs/dcache.c')
-rw-r--r-- | fs/dcache.c | 23 |
1 files changed, 23 insertions, 0 deletions
diff --git a/fs/dcache.c b/fs/dcache.c index 5f00a6f63c9e..11828de68dce 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -144,6 +144,28 @@ int proc_nr_dentry(ctl_table *table, int write, void __user *buffer, static inline int dentry_cmp(const unsigned char *cs, size_t scount, const unsigned char *ct, size_t tcount) { +#ifdef CONFIG_DCACHE_WORD_ACCESS + unsigned long a,b,mask; + + if (unlikely(scount != tcount)) + return 1; + + for (;;) { + a = *(unsigned long *)cs; + b = *(unsigned long *)ct; + if (tcount < sizeof(unsigned long)) + break; + if (unlikely(a != b)) + return 1; + cs += sizeof(unsigned long); + ct += sizeof(unsigned long); + tcount -= sizeof(unsigned long); + if (!tcount) + return 0; + } + mask = ~(~0ul << tcount*8); + return unlikely(!!((a ^ b) & mask)); +#else if (scount != tcount) return 1; @@ -155,6 +177,7 @@ static inline int dentry_cmp(const unsigned char *cs, size_t scount, tcount--; } while (tcount); return 0; +#endif } static void __d_free(struct rcu_head *head) |