Bug #5270


Flow hash table collision and flow state corruption between different capture interfaces

Added by Gatewatcher Dev Team over 2 years ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

Hi,

During quality assurance testing of our Suricata setup, we ran into a somewhat strange situation.
The tcp.pkt_on_wrong_thread counter was incrementing only on some of our test benches.
After some investigation, we were able to rule out issues with our load-balancing algorithm.

Just for the sake of completeness, we use af_packet with hash fanout and a patched kernel to ensure that only the 5-tuple (vlan ids, src/dst ip, ip proto) is hashed. Also, we ensured that none of our packets were fragmented during the tests that led to this bug report.
Also, as this may be relevant later on, mid-stream is disabled and we use Suricata in IDS mode only.
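
For completeness, the kind of symmetric 5-tuple hash our patched fanout relies on looks conceptually like this (a minimal sketch with hypothetical names, not the actual kernel patch): source and destination are combined with a commutative operation, so both directions of a flow captured on a given interface land in the same fanout bucket and therefore on the same worker thread.

#include <stdint.h>

/* Illustrative only: a symmetric hash over the 5-tuple. Both directions of
 * the same flow produce the same value because src/dst are combined with a
 * commutative operation. The capture device is not part of the input. */
struct five_tuple {
    uint32_t src_ip, dst_ip;      /* IPv4 addresses */
    uint16_t src_port, dst_port;
    uint16_t vlan_id;
    uint8_t  ip_proto;
};

static uint32_t symmetric_fanout_hash(const struct five_tuple *t)
{
    uint32_t h = t->src_ip ^ t->dst_ip;
    h ^= (uint32_t)(t->src_port ^ t->dst_port);
    h ^= ((uint32_t)t->vlan_id << 16) | t->ip_proto;
    return h;
}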

Digging deeper, we determined that the tcp.pkt_on_wrong_thread counter is incrementing because of a collision in the flow hash table between flows whose 7-tuples (vlan ids, src/dst ip, src/dst port and ip proto) are identical; the only difference is the capture interface.

We noticed that the FlowHashKey definitions do not include the livedev:

typedef struct FlowHashKey4_ {
    union {
        struct {
            uint32_t addrs[2];
            uint16_t ports[2];
            uint16_t proto; /**< u16 so proto and recur add up to u32 */
            uint16_t recur; /**< u16 so proto and recur add up to u32 */
            uint16_t vlan_id[2];
        };
        const uint32_t u32[5];
    };
} FlowHashKey4;

typedef struct FlowHashKey6_ {
    union {
        struct {
            uint32_t src[4], dst[4];
            uint16_t ports[2];
            uint16_t proto; /**< u16 so proto and recur add up to u32 */
            uint16_t recur; /**< u16 so proto and recur add up to u32 */
            uint16_t vlan_id[2];
        };
        const uint32_t u32[11];
    };
} FlowHashKey6;
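
To make the consequence concrete, here is a purely illustrative variant (not a proposed patch; the struct name and the livedev_id field are hypothetical): a key that also carried the capture device would send otherwise identical 7-tuples seen on different interfaces to distinct flow table entries.

/* Hypothetical sketch for illustration only, not actual Suricata code. */
typedef struct FlowHashKey4WithDev_ {
    union {
        struct {
            uint32_t addrs[2];
            uint16_t ports[2];
            uint16_t proto;
            uint16_t recur;
            uint16_t vlan_id[2];
            uint16_t livedev_id; /* hypothetical: id of the capture device */
            uint16_t pad0;       /* keep the key a whole number of u32 words */
        };
        const uint32_t u32[6];   /* one extra word hashed compared to u32[5] */
    };
} FlowHashKey4WithDev;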

We verified this theory in a test lab and were able to confirm that replaying the attached pcap twice, once on each capture interface, does lead to tcp.pkt_on_wrong_thread being incremented.

Digging even deeper, we realised that this means the flow entry is shared between capture interfaces, and that this could be exploited as a bypass: an attacker in network A can corrupt the state of flows in network B. This also means that Suricata cannot be used in a multi-tenant scenario if two tenants have overlapping network addressing, because there is always the risk that a flow gets corrupted by another tenant's network activity. Finally, this seems to prevent using Suricata to monitor traffic from a single network at different capture locations (e.g. near the source and near the destination).

We confirmed the flow state corruption in lab:

Date: 4/15/2022 -- 14:11:39 (uptime: 0d, 00h 00m 54s)
------------------------------------------------------------------------------------
Counter | TM Name | Value
------------------------------------------------------------------------------------
capture.kernel_packets | Total | 6
decoder.pkts | Total | 6
decoder.bytes | Total | 412
decoder.ipv4 | Total | 6
decoder.ethernet | Total | 6
decoder.tcp | Total | 6
decoder.avg_pkt_size | Total | 68
decoder.max_pkt_size | Total | 74
decoder.max_mac_addrs_src | Total | 1
decoder.max_mac_addrs_dst | Total | 1
flow.tcp | Total | 1
flow.wrk.spare_sync_avg | Total | 100
flow.wrk.spare_sync | Total | 1
tcp.sessions | Total | 1
tcp.syn | Total | 1
tcp.synack | Total | 1
tcp.pkt_on_wrong_thread | Total | 3
flow.mgr.full_hash_pass | Total | 1
flow.mgr.closed_pruned | Total | 1
flow.spare | Total | 1048501
flow.mgr.rows_maxlen | Total | 1
flow.mgr.flows_checked | Total | 1
flow.mgr.flows_timeout | Total | 1
flow.mgr.flows_evicted | Total | 1
tcp.memuse | Total | 6828326912
tcp.reassembly_memuse | Total | 2162688
flow.memuse | Total | 348135104

In this lab, we injected a 3-way handshake opening on one capture interface and the corresponding 3-way handshake closure on the other capture interface, for a total of six packets.
Three packets were identified by Suricata as being handled on the wrong thread: the 3-way handshake closure. Yet flow.mgr.closed_pruned was incremented, showing that the flow reached the CLOSED state thanks to the 3-way handshake closure (we used a tcp close timeout of 0 in this test).

We believe the correct behaviour would have been to consider the 3-way handshake closure as mid-stream traffic unrelated to the 3-way handshake observed on the other capture interface. The flow state should not have been altered.
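
For clarity, the behaviour we would have expected amounts to something like the following check in the flow lookup path (a rough sketch with hypothetical types, fields and function names, not the actual Suricata code): a packet captured on a different device than the one the flow was created on should not match that flow, and would instead be handled as new, mid-stream traffic on its own interface.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: reject a flow-table match when the packet's capture
 * device differs from the flow's, so closure packets from another interface
 * cannot mutate the existing flow's TCP state. */
struct flow_sketch   { uint16_t livedev_id; /* ... plus the real flow fields */ };
struct packet_sketch { uint16_t livedev_id; /* ... plus the real packet fields */ };

static bool FlowCompareSketch(const struct flow_sketch *f,
                              const struct packet_sketch *p)
{
    /* The usual 7-tuple comparison is omitted here; the point is the
     * additional capture-device check. */
    if (f->livedev_id != p->livedev_id)
        return false; /* same 7-tuple on another interface: different flow */
    return true;
}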

(BTW, the code responsible for incrementing flow.mgr.closed_pruned was probably deleted by mistake in commit b3599507f4eb891841417575587d690ea13fe6c0; we added it back for this test).

Please feel free to ask for any additional info.

Best regards,
Florian Maury
Gatewatcher


Files

3whs_open.pcap (286 Bytes), 3whs opening, Gatewatcher Dev Team, 04/15/2022 04:23 PM
3whs_close.pcap (270 Bytes), 3whs closure, Gatewatcher Dev Team, 04/15/2022 04:24 PM

Related issues (1 open, 1 closed)

Related to Suricata - Bug #5330: flow: vlan.use-for-tracking is not used for ICMPv4 (Closed, Philippe Antoine)
Related to Suricata - Optimization #2725: stream/packet on wrong thread (Feedback, OISF Dev)
