From 6d7d1a0dc735ea8412769edae7154885021107a9 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Mon, 19 Mar 2012 16:19:53 -0700 Subject: [PATCH] vfs: get rid of batshit-insane pointless dentry hash calculations For some odd historical reason, the final mixing round for the dentry cache hash table lookup had an insane "xor with big constant" logic. In two places. The big constant that is being xor'ed is GOLDEN_RATIO_PRIME, which is a fairly random-looking number that is designed to be *multiplied* with so that the bits get spread out over a whole long-word. But xor'ing with it is insane. It doesn't really even change the hash - it really only shifts the hash around in the hash table. To make matters worse, the insane big constant is different on 32-bit and 64-bit builds, even though the name hash bits we use are always 32-bit (and the bits from the pointer we mix in effectively are too). It's all total voodoo programming, in other words. Now, some testing and analysis of the hash chains shows that the rest of the hash function seems to be fairly good. It does pick the right bits of the parent dentry pointer, for example, and while it's generally a bad idea to use an xor to mix down the upper bits (because if there is a repeating pattern, the xor can cause "destructive interference"), it seems to not have been a disaster. For example, replacing the hash with the normal "hash_long()" code (that uses the GOLDEN_RATIO_PRIME constant correctly, btw) actually just makes the hash worse. The hand-picked hash knew which bits of the pointer had the highest entropy, and hash_long() ends up mixing bits less optimally at least in some trivial tests. So the hash function overall seems fine, it just has that really odd "shift result around by a constant xor". So get rid of the silly xor, and replace the down-mixing of the bits with an add instead of an xor that tends to not have the same kind of destructive interference issues. Some stats on the resulting hash chains shows that they look statistically identical before and after, but the code is simpler and no longer makes you go "WTF?". Also, the incoming hash really is just "unsigned int", not a long, and there's no real point to worry about the high 26 bits of the dentry pointer for the 64-bit case, because they are all going to be identical anyway. So also change the hashing to be done in the more natural 'unsigned int' that is the real size of the actual hashed data anyway. Signed-off-by: Linus Torvalds --- fs/dcache.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index bcbdb33fcc20..5f00a6f63c9e 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -105,10 +105,10 @@ static unsigned int d_hash_shift __read_mostly; static struct hlist_bl_head *dentry_hashtable __read_mostly; static inline struct hlist_bl_head *d_hash(const struct dentry *parent, - unsigned long hash) + unsigned int hash) { - hash += ((unsigned long) parent ^ GOLDEN_RATIO_PRIME) / L1_CACHE_BYTES; - hash = hash ^ ((hash ^ GOLDEN_RATIO_PRIME) >> D_HASHBITS); + hash += (unsigned long) parent / L1_CACHE_BYTES; + hash = hash + (hash >> D_HASHBITS); return dentry_hashtable + (hash & D_HASHMASK); } -- 2.20.1