Bug #1777

closed

asymmetrical af_packet hash function in recent kernel

Added by Eric Leblond over 8 years ago. Updated about 8 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -

Description

The in-kernel hash function has not been symmetrical since the 4.2 kernel. As a result, the two directions of a flow can be handled by different threads, so packets are processed out of order. A possible solution is to use the new cluster_ebpf load balancing mode to introduce our own hashing functions.
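To illustrate what "symmetrical" means here, the following is a minimal Python sketch, not the kernel's or Suricata's actual code. The function names are hypothetical, and zlib.crc32 merely stands in for whatever hash a real implementation would use. The point is that ordering the two endpoints before hashing makes both directions of a flow map to the same value:

```python
import zlib

def asymmetric_hash(src_ip, dst_ip, src_port, dst_port):
    # Hashes the tuple in wire order, so A->B and B->A will generally differ.
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key)

def symmetric_hash(src_ip, dst_ip, src_port, dst_port):
    # Put the two endpoints in a canonical order first, so that swapping
    # source and destination yields the same input and therefore the same hash.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a[0]}:{a[1]}>{b[0]}:{b[1]}".encode()
    return zlib.crc32(key)

def pick_thread(flow_hash, n_threads):
    # Hypothetical load balancer: the hash selects the capture thread.
    return flow_hash % n_threads
```

With symmetric_hash, pick_thread returns the same thread for ("10.0.0.1", "10.0.0.2", 40000, 80) and for the reply direction ("10.0.0.2", "10.0.0.1", 80, 40000), which is the property the pre-4.2 kernel hash provided.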

Actions #1

Updated by Victor Julien over 8 years ago

  • Status changed from New to Assigned
  • Priority changed from Normal to High
  • Target version set to 70
Actions #2

Updated by Derek Ditch over 8 years ago

I filed a bug with the kernel bugzilla. I don't have solid reproducible input at the moment, but I don't see any reason why the new hashing, which includes VLAN and flow labels, can't be symmetric as well.

https://bugzilla.kernel.org/show_bug.cgi?id=120441

Actions #3

Updated by Victor Julien over 8 years ago

If I'm not mistaken, we can rely on the NIC to do the balancing:

$ ethtool -n ens2f0 rx-flow-hash tcp4
TCP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

$ ethtool -n ens2f0 rx-flow-hash tcp6
TCP over IPV6 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

$ ethtool -n ens2f0 rx-flow-hash udp4
UDP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

$ ethtool -n ens2f0 rx-flow-hash udp6
UDP over IPV6 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

Then set 'cluster_qm' instead of 'cluster_flow' in the yaml.

Actions #4

Updated by Victor Julien over 8 years ago

It seems my thinking about RSS was wrong. As far as I understand now, the RSS hashing is not symmetric either. It makes sense if you consider that the promisc capture case is a bit of an anomaly: normally the RX of a NIC would not see both sides of a flow. See also http://www.ntop.org/pf_ring/going-beyond-rss-receive-side-scaling/
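A rough Python sketch of why that is, assuming the NIC uses the Toeplitz function for RSS, as many do. The 40-byte key below is the one from Microsoft's RSS verification documentation and is only assumed here to be representative of a driver default. The helper names are hypothetical. The sketch also shows that a key built from a repeating 16-bit pattern makes the hash symmetric, because swapping the addresses or the ports shifts each input bit by a multiple of 16:

```python
import socket
import struct

# Assumption: key taken from Microsoft's RSS verification documentation.
DEFAULT_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)
# A key with a 16-bit period; any repeating 16-bit pattern has this property.
SYMMETRIC_KEY = bytes.fromhex("6d5a") * 20

def toeplitz_hash(key, data):
    # For every set bit i of the input, XOR in the 32-bit window of the key
    # that starts at bit i (bits counted from the most significant end).
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result

def flow_input(src_ip, dst_ip, src_port, dst_port):
    # IPv4 + TCP/UDP hash input: source IP, destination IP, source port,
    # destination port, all in network byte order.
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))

fwd = flow_input("10.0.0.1", "10.0.0.2", 40000, 80)
rev = flow_input("10.0.0.2", "10.0.0.1", 80, 40000)
print(hex(toeplitz_hash(DEFAULT_KEY, fwd)), hex(toeplitz_hash(DEFAULT_KEY, rev)))
print(hex(toeplitz_hash(SYMMETRIC_KEY, fwd)), hex(toeplitz_hash(SYMMETRIC_KEY, rev)))
```

The second print always shows two equal values. The first is expected to show two different values for almost any flow, which is the asymmetry described above. Whether a given NIC and driver allow the key to be changed is hardware specific.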

Actions #5

Updated by Eric Leblond over 8 years ago

For the XL710, section 7.1.9.3 of xl710-10-40-controller-datasheet.pdf states that it is possible to get a symmetrical hash. From a conversation I had with Intel people at netdev 1.1, it was possible to have it.

Actions #6

Updated by Victor Julien over 8 years ago

David Miller has provided a fix that we're currently testing. It should go into stable kernels as well once it's ready, as the change does not seem intrusive. That should fix the issue.

However, it seems that using multiple RSS queues is generally harmful. Most NICs will not do symmetric hashing, so the two sides of a conversation may be sent to different queues. Timing issues can then lead to packet ordering problems. The solution here is to set the number of queues to 1 or to force the NIC to use symmetric hashing if possible.

Actions #7

Updated by Eric Leblond over 8 years ago

Actions #9

Updated by Victor Julien over 8 years ago

Kernel 4.7-rc7 contains the fix: http://lwn.net/Articles/694055/

Actions #10

Updated by Eric Leblond over 8 years ago

The patch has been queued for the 4.4 and 4.6 stable kernels.

Actions #12

Updated by Victor Julien about 8 years ago

It should be in Ubuntu 16.04's kernel soon: https://launchpad.net/ubuntu/+source/linux/4.4.0-34.53

Actions #13

Updated by Victor Julien about 8 years ago

Ubuntu 16.04 now has this fixed kernel: https://launchpad.net/ubuntu/+source/linux/4.4.0-36.55

Actions #14

Updated by Victor Julien about 8 years ago

  • Status changed from Assigned to Closed
  • Assignee deleted (Eric Leblond)
  • Priority changed from High to Normal
  • Target version deleted (70)