Add new 'cluster_peer' runmode to allow for load balancing by IP header (src<->dst) only
I'm investigating an issue on a production deployment that is seeing a large number of 'tcp.pkt_on_wrong_thread' in stats.log.
My current theory is that this is caused by fragmented TCP packets not being hashed consistently by the kernel's RSS implementation, so packets belonging to the same flow are sent to different cores/threads.
One idea I had to address this was to add a new cluster runmode that load balances on the IP header only, so even if the packets arrived on the 'wrong' RSS queue, they would be directed to the same worker thread. However, it's still possible/likely that the fragments will end up in the wrong order on the worker thread, which may cause other issues.
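To make the IP-header-only idea concrete, here is a minimal sketch in Python of what such a balancer would compute. This is not Suricata code: the names `peer_hash` and `pick_worker` are hypothetical, and CRC32 is just a stand-in for whatever hash a real implementation would use. The point is that the hash covers only the two addresses, sorted so that both directions (src<->dst) give the same value, and never touches the TCP ports, which non-first IP fragments do not carry.

```python
import ipaddress
import zlib


def peer_hash(src: str, dst: str) -> int:
    """Direction-independent hash over the IP address pair only (no ports).

    Hypothetical helper for illustration; CRC32 is an arbitrary choice.
    """
    a = ipaddress.ip_address(src).packed
    b = ipaddress.ip_address(dst).packed
    # Sort the two addresses so (src, dst) and (dst, src) hash identically.
    lo, hi = sorted((a, b))
    return zlib.crc32(lo + hi)


def pick_worker(src: str, dst: str, n_workers: int) -> int:
    """Map an IP pair to a worker thread index in [0, n_workers)."""
    return peer_hash(src, dst) % n_workers


if __name__ == "__main__":
    # Both directions of the same IP pair land on the same worker.
    print(pick_worker("10.0.0.1", "192.168.1.5", 8))
    print(pick_worker("192.168.1.5", "10.0.0.1", 8))
```

Because the ports are ignored, every fragment of a packet maps to the same worker as the first fragment, at the cost of all flows between two hosts sharing one thread. It does nothing about the ordering concern.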
The 'right' way to fix this is to force the hashing on the NIC itself; however, I'm not sure that is possible in all cases.
Updated by Victor Julien over 1 year ago
It doesn't look like AF_PACKET has the support for this. See https://github.com/torvalds/linux/blob/master/net/packet/af_packet.c#L1419 for the built-in options. I think the way to do this would be through eBPF.
Updated by Eric Leblond over 1 year ago
By using cluster_ebpf and the provided lb.bpf file, you will have IP pair load balancing done by the kernel. The documentation on usage is here: https://suricata.readthedocs.io/en/suricata-4.1.4/capture-hardware/ebpf-xdp.html#setup-ebpf-load-balancing