Bug #1777
Closed
asymmetrical af_packet hash function in recent kernel
Description
The in-kernel hash function has not been symmetrical since the 4.2 kernel, so the two directions of a flow can be sent to different capture threads. This leads to packets being processed out of order. A possible solution is to use the new cluster_ebpf load balancing mode to introduce our own hashing functions.
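To illustrate what "symmetrical" means here, the following is a minimal Python sketch, not the kernel's actual hash. The function names and the use of CRC32 are assumptions chosen for illustration only. Hashing the tuple in wire order gives each direction its own value, while putting the two endpoints into a canonical order first gives both directions the same value, which is the property a custom hashing function would need.

```python
import ipaddress
import zlib


def asymmetric_flow_hash(src_ip, dst_ip, src_port, dst_port):
    """Hash the tuple in wire order, so A->B and B->A generally differ."""
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + src_port.to_bytes(2, "big")
        + dst_port.to_bytes(2, "big")
    )
    return zlib.crc32(key)  # CRC32 is only a stand-in hash for this sketch


def symmetric_flow_hash(src_ip, dst_ip, src_port, dst_port):
    """Sort the two endpoints before hashing, so direction no longer matters."""
    a = ipaddress.ip_address(src_ip).packed + src_port.to_bytes(2, "big")
    b = ipaddress.ip_address(dst_ip).packed + dst_port.to_bytes(2, "big")
    lo, hi = sorted((a, b))
    return zlib.crc32(lo + hi)


def pick_thread(flow_hash, n_threads):
    """Map a hash to a capture thread index (hypothetical helper)."""
    return flow_hash % n_threads


if __name__ == "__main__":
    client = ("10.0.0.1", "192.168.1.5", 12345, 80)
    server = ("192.168.1.5", "10.0.0.1", 80, 12345)
    for name, fn in (("asymmetric", asymmetric_flow_hash),
                     ("symmetric", symmetric_flow_hash)):
        print(name, pick_thread(fn(*client), 4), pick_thread(fn(*server), 4))
```

With the symmetric variant both directions always land on the same thread index. With the wire-order variant they only do so by chance.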
Updated by Victor Julien over 8 years ago
- Status changed from New to Assigned
- Priority changed from Normal to High
- Target version set to 70
Updated by Derek Ditch over 8 years ago
I filed a bug with the kernel bugzilla. I don't have solid reproducible input at the moment, but I don't see any reason why the new hashing, which includes VLAN and flow labels, can't be symmetric as well.
Updated by Victor Julien over 8 years ago
If I'm not mistaken, we can rely on the NIC to do the balancing:
$ ethtool -n ens2f0 rx-flow-hash tcp4
TCP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

$ ethtool -n ens2f0 rx-flow-hash tcp6
TCP over IPV6 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

$ ethtool -n ens2f0 rx-flow-hash udp4
UDP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

$ ethtool -n ens2f0 rx-flow-hash udp6
UDP over IPV6 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]
Then set 'cluster_qm' instead of 'cluster_flow' in the yaml.
Updated by Victor Julien over 8 years ago
It seems my thinking about RSS was wrong. As far as I understand now, the RSS hashing is not symmetric either. That makes sense if you consider that the promisc capture case is a bit of an anomaly: normally the RX of a NIC would not see both sides of a flow. See also http://www.ntop.org/pf_ring/going-beyond-rss-receive-side-scaling/
Updated by Eric Leblond over 8 years ago
For xl710, section 7.1.9.3 of xl710-10-40-controller-datasheet.pdf states that it is possible to get a symmetrical hash. Intel people I talked to at netdev 1.1 also said it was possible.
Updated by Victor Julien over 8 years ago
David Miller has provided a fix that we're currently testing. It should go into stable kernels as well once it's ready, as the change does not seem intrusive. That should fix the issue.
However, it seems that using multiple RSS queues is generally harmful. Most NICs do not do symmetric hashing, so the two sides of a conversation may be sent to different queues. Timing issues between the queues can then lead to packet ordering problems. The solution here is to set the number of queues to 1, or to force the NIC to use symmetric hashing if possible.
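As a rough illustration of why the two sides can end up on different queues, here is a Python sketch of the Toeplitz hash that RSS implementations commonly use. This is an assumption about the NIC: the thread does not say which hash any given card uses. DEFAULT_KEY is the widely used default RSS key, and SYMMETRIC_KEY is a key with a 16-bit repeating pattern, a known trick that makes the hash direction-independent because swapping addresses or ports moves the fields by a multiple of 16 bits. The helper names and the modulo queue selection are hypothetical simplifications, since real NICs go through an indirection table.

```python
import ipaddress

# Widely used default RSS key (40 bytes).
DEFAULT_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)
# Key with a 16-bit period: yields the same hash for both flow directions.
SYMMETRIC_KEY = bytes.fromhex("6d5a" * 20)


def toeplitz_hash(key, data):
    """For every set input bit, XOR in the 32-bit key window at that offset."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    n_bits = len(data) * 8
    if n_bits > key_bits - 31:
        raise ValueError("key too short for this input")
    result = 0
    for i in range(n_bits):
        if data[i // 8] & (0x80 >> (i % 8)):
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def rss_input(src_ip, dst_ip, src_port, dst_port):
    """Build the hash input: src addr, dst addr, src port, dst port."""
    return (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + src_port.to_bytes(2, "big")
        + dst_port.to_bytes(2, "big")
    )


def rss_queue(key, src_ip, dst_ip, src_port, dst_port, n_queues):
    """Simplified queue choice; real hardware uses an indirection table."""
    data = rss_input(src_ip, dst_ip, src_port, dst_port)
    return toeplitz_hash(key, data) % n_queues


if __name__ == "__main__":
    fwd = ("10.0.0.1", "192.168.1.5", 12345, 80)
    rev = ("192.168.1.5", "10.0.0.1", 80, 12345)
    for name, key in (("default", DEFAULT_KEY), ("symmetric", SYMMETRIC_KEY)):
        print(name, rss_queue(key, *fwd, 8), rss_queue(key, *rev, 8))
```

With the repeating key both directions always map to the same queue. With the default key they generally produce different hash values, so they share a queue only by chance.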
Updated by Eric Leblond over 8 years ago
Patch provided by David Miller: http://marc.info/?l=linux-netdev&m=146740374418529&w=2
Updated by Victor Julien over 8 years ago
Updated by Victor Julien about 8 years ago
Kernel 4.7-rc7 contains the fix: http://lwn.net/Articles/694055/
Updated by Eric Leblond about 8 years ago
The patch has been queued for the 4.4 and 4.6 stable kernels.
Updated by Peter Manev about 8 years ago
Kernels 4.4 and 4.6 are now patched:
https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.16
https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.6.5
Updated by Victor Julien about 8 years ago
It should be in Ubuntu 16.04's kernel soon: https://launchpad.net/ubuntu/+source/linux/4.4.0-34.53
Updated by Victor Julien about 8 years ago
Ubuntu 16.04 now has this fixed kernel: https://launchpad.net/ubuntu/+source/linux/4.4.0-36.55
Updated by Victor Julien about 8 years ago
- Status changed from Assigned to Closed
- Assignee deleted (Eric Leblond)
- Priority changed from High to Normal
- Target version deleted (70)
General packet capture recommendations: https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Packet_Capture