The table expansion operation seems complex. Is it really faster than just rebuilding the hash table (while readers continue to use the old one concurrently), switching to the new hash table, and then deleting the old table (using RCU to determine when it is safe to free)?
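For concreteness, here is a minimal userspace sketch of the rebuild-and-switch approach being proposed. All names are hypothetical, and the RCU calls a kernel version would need are only marked in comments:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout for illustration; not any kernel API. */
struct node {
    struct node *next;
    int key;
};

struct table {
    size_t nbuckets;
    struct node **buckets;
};

/* Readers dereference this; in the kernel it would be published with
 * rcu_assign_pointer() and read under rcu_read_lock() with
 * rcu_dereference(). */
static struct table *cur;

static struct table *make_table(size_t nbuckets)
{
    struct table *t = malloc(sizeof(*t));
    t->nbuckets = nbuckets;
    t->buckets = calloc(nbuckets, sizeof(struct node *));
    return t;
}

static void insert(struct table *t, struct node *n)
{
    size_t b = (size_t)n->key % t->nbuckets;
    n->next = t->buckets[b];
    t->buckets[b] = n;
}

static struct node *lookup(struct table *t, int key)
{
    for (struct node *n = t->buckets[(size_t)key % t->nbuckets]; n; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}

/* Rebuild-and-switch: copy every node into a fresh table, publish it,
 * and free the old one once no reader can still see it (that last
 * step is where synchronize_rcu() would go in the kernel). */
static void resize(size_t new_nbuckets)
{
    struct table *old = cur;
    struct table *t = make_table(new_nbuckets);

    for (size_t b = 0; b < old->nbuckets; b++)
        for (struct node *n = old->buckets[b]; n; n = n->next) {
            struct node *copy = malloc(sizeof(*copy));
            copy->key = n->key;   /* this copy is what invalidates held refs */
            insert(t, copy);
        }

    cur = t;                      /* rcu_assign_pointer(cur, t); */
    /* synchronize_rcu(); -- wait out readers of the old table */
    for (size_t b = 0; b < old->nbuckets; b++) {
        struct node *n = old->buckets[b];
        while (n) {
            struct node *next = n->next;
            free(n);
            n = next;
        }
    }
    free(old->buckets);
    free(old);
}
```

Note that after `resize()` returns, any pointer a caller held to a node in the old table points to freed memory, even though the key is still findable in the new table.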
Copying might be slow if the nodes are large, though. One way around that could be to keep two bucket pointers within each node and swap between them whenever the table is rebuilt.
In addition to the overhead of copying all the elements, you'd break any code that holds references to those elements. In quite a few hash tables in the kernel, other code holds long-running references to hash nodes and expects them to stay alive.
When I first started working on hash-table algorithms, years ago, I began by developing an algorithm to move a single node between buckets when its key changed. My first attempt involved making a copy of the node. That works fine in a standalone hash table, but it won't work in many real kernel data structures (for instance, dcache), because it breaks other pointers to the (reference-counted) node.
That's why the algorithm shown here does not copy any nodes.
As for having two bucket pointers within each node: we've looked at that approach, as have others, but it uses far more memory. If you have a hash table with millions of entries, you don't want to add 8 bytes to every one of them.
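To make that memory cost concrete, here is a sketch of the two-bucket-pointer layout next to a conventional single-pointer node (both struct names are made up for this comparison). On a typical 64-bit machine the second chain pointer adds 8 bytes to every node, which across millions of entries adds up to many megabytes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layouts, for size comparison only. */
struct node_single {
    struct node_single *next;     /* one chain pointer, hlist-style */
    unsigned long key;
};

struct node_double {
    struct node_double *next[2];  /* one chain per table generation */
    unsigned long key;
};
```

With the single-pointer layout, the resize algorithm has to relink nodes through the one chain they have; the payoff is that the per-node cost stays constant no matter how often the table is resized.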