Support #2900

alert 'SURICATA STREAM pkt seen on wrong thread' when run mode set to workers

Added by Anonymous almost 2 years ago. Updated 5 months ago.

Status:
Closed
Priority:
Normal
Assignee:
Affected Versions:
Label:

Description

This alert is constantly triggered when run mode 'workers' is enabled.
Everything works as expected when set to AutoFP.
Can this alert be safely suppressed, or is this something that should be considered an issue?

Setup:
Suricata 4.1.2_1 inline IPS mode with netmap on FreeBSD 11.2


Related issues

Related to Support #2725: stream/packet on wrong thread (status: Feedback; assignee: OISF Dev)
#1

Updated by Victor Julien almost 2 years ago

#2

Updated by Victor Julien almost 2 years ago

Unfortunately, this is a serious issue that can lead to missed alerts and logs. Resolving it should be a high priority. If autofp works well, I would recommend staying on that. We're tracking the larger issue in #2725.

#3

Updated by Andreas Herz over 1 year ago

We're trying to narrow this issue down as best as we can. Can you give us more details about your config/setup (I saw a pfsense/netgate post from you, I guess that's related to this?) and the traffic seen?
I have similar issues (but on Linux with AFPacketv3+workers mode) and I'm trying to find a pattern in the traffic that might produce those issues.
Thanks

#4

Updated by Andreas Herz over 1 year ago

  • Assignee set to OISF Dev
  • Target version set to Support
#5

Updated by Cooper Nelson over 1 year ago

Adding notes from my recent 'deep dive'.

The root cause appears to be the hardware implementation of RSS in some NICs, confirmed in the ixgbe driver.

Fragmented TCP packets will be hashed on 'sd' (source and destination address) only, as the TCP header is present only in the first fragment. Fragmented flows will therefore go to the same queue only if every TCP packet in the flow is fragmented.

In practice, however, it's very common for the handshake and first packets of a big TCP flow to be unfragmented, with fragmentation occurring later in the flow, particularly when packet rates increase due to receive window scaling.
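
To make the mismatch described above concrete, here is a minimal sketch. It is not the real RSS algorithm: `toy_hash` is a CRC32 stand-in, and `NUM_QUEUES`, `rss_queue`, `split_flows` and the addresses are hypothetical names and values chosen for illustration. It only models the behaviour described in this comment, where fragmented packets are hashed on 'sd' and unfragmented ones on the full 4-tuple.

```python
import zlib

NUM_QUEUES = 8  # hypothetical number of RSS queues


def toy_hash(fields):
    """Toy stand-in for the NIC hash (CRC32, not the real RSS algorithm)."""
    key = "|".join(str(f) for f in fields).encode()
    return zlib.crc32(key) % NUM_QUEUES


def rss_queue(src, dst, sport, dport, fragmented):
    """Pick a queue the way the comment above describes."""
    if fragmented:
        # Fragmented packet: only the address pair ('sd') is hashed.
        return toy_hash((src, dst))
    # Unfragmented packet: ports are visible, so the full 4-tuple is hashed.
    return toy_hash((src, dst, sport, dport))


def split_flows(src, dst, dport, sports):
    """Return the source ports whose fragments land on a different queue."""
    return [
        sport
        for sport in sports
        if rss_queue(src, dst, sport, dport, False)
        != rss_queue(src, dst, sport, dport, True)
    ]


if __name__ == "__main__":
    sports = range(40000, 40100)
    split = split_flows("192.0.2.10", "198.51.100.20", 443, sports)
    print(f"{len(split)} of {len(sports)} sample flows would be split across queues")
```

Any flow reported by `split_flows` would start on one queue and have its later, fragmented packets delivered to another, which is the situation that would raise the 'pkt seen on wrong thread' event in workers mode.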

The documentation for AF_PACKET suggests that it is supposed to handle this case properly, but either it doesn't, or perhaps Suricata isn't setting it up properly on all kernels:

http://man7.org/linux/man-pages/man7/packet.7.html

It may also be that the documentation describes a software implementation that is overridden by hardware RSS, if present. I think I remember regit mentioning that if a flow hash was generated on the NIC, that is what cluster_flow used.

I do not think it is possible to force an 'sd' hash on the older 10Gbit Intel NICs; however, I might be mistaken.

I'm thinking cluster_flow could be modified to handle fragmented TCP packets properly, or simply to hash on 'sd' only. However, the TCP packets would still be delivered out of order to the worker thread in many cases due to timing issues. I'm not sure how much of an issue this is for the stream tracker.
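
As a rough illustration of the "hash on 'sd' only" idea, the sketch below picks a worker from the address pair alone. This is not Suricata's or the kernel's cluster_flow code: `sd_worker` and `NUM_WORKERS` are hypothetical, CRC32 is a stand-in hash, and sorting the two addresses to make the result direction-independent is an added assumption.

```python
import zlib

NUM_WORKERS = 4  # hypothetical worker thread count


def sd_worker(src, dst):
    """Toy 'sd'-only selection: ports are ignored on purpose."""
    # Sorting makes the result direction-independent (an added assumption).
    lo, hi = sorted((src, dst))
    return zlib.crc32(f"{lo}|{hi}".encode()) % NUM_WORKERS


if __name__ == "__main__":
    client, server = "192.0.2.10", "198.51.100.20"
    # Unfragmented packets, fragments and replies all pick the same worker,
    # because none of them contributes anything but the address pair.
    print(sd_worker(client, server), sd_worker(server, client))
```

The trade-off is coarser load balancing, since every flow between the same two hosts ends up on one worker, and it does nothing about the out-of-order delivery mentioned above.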

#6

Updated by Anonymous over 1 year ago

Andreas Herz wrote:

We're trying to narrow this issue down as best as we can. Can you give us more details about your config/setup (I saw a pfsense/netgate post from you, I guess that's related to this?) and the traffic seen?
I have similar issues (but on Linux with AFPacketv3+workers mode) and I'm trying to find a pattern in the traffic that might produce those issues.
Thanks

I can no longer replicate the issue.
I have rebuilt (almost) exactly the same setup I had when I opened this issue:

Intel pro/1000 PT NIC
pfSense 2.4.4-p3 (FreeBSD 11.2)
Hardware checksum, tcp and large receive offloading disabled
Flow control disabled
Suricata 4.1.4_2
Netmap + worker mode

Changes:
pfSense 2.4.4-p2 -> 2.4.4-p3 (nothing major, still the same FreeBSD release.)
Suricata 4.1.2_1 -> 4.1.4_2

#7

Updated by Anonymous over 1 year ago

Disregard my last update; the issue still persists on FreeBSD 11.2 with netmap and worker mode. Intel pro/1000 PT NIC (em driver).

#8

Updated by Andreas Herz over 1 year ago

Karel Van Hecke wrote:

Disregard my last update; the issue still persists on FreeBSD 11.2 with netmap and worker mode. Intel pro/1000 PT NIC (em driver).

Could you check which options the NIC offers? On Linux we can use ethtool to control the relevant parts. I'm not sure how it's done on FreeBSD, and especially how this affects netmap. It would be nice to see what options are available.

#9

Updated by Andreas Herz 5 months ago

  • Status changed from New to Closed

Hi, we're closing this issue since there have been no further responses.
If you think this bug is still relevant, please test it again with the
most recent version of Suricata and reopen the issue. If you want to
improve the bug report, please take a look at
https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Reporting_Bugs
