Bug #5270
Flow hash table collision and flow state corruption between different capture interfaces (Closed)
Description
Hi,
During quality assurance testing of our Suricata setup, we ran into a somewhat strange situation.
The tcp.pkt_on_wrong_thread counter was incrementing only on some of our test benches.
After some investigation, we were able to rule out issues with our load-balancing algorithm.
Just for the sake of completeness: we use af_packet with hash fanout and a patched kernel to ensure that only the 5-tuple (vlan ids, src/dst ip, ip proto) is hashed. We also ensured that none of our packets were fragmented during the tests that led to this bug report.
Also, as this may be relevant later on, mid-stream is disabled and we use Suricata in IDS mode only.
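For context, a minimal sketch of that kind of setup in suricata.yaml, assuming two capture interfaces (the interface names and cluster ids below are illustrative, not our exact configuration):

af-packet:
  - interface: eth0             # first capture interface (name is illustrative)
    cluster-id: 98
    cluster-type: cluster_flow  # kernel hash fanout load balancing
    defrag: yes
  - interface: eth1             # second capture interface (name is illustrative)
    cluster-id: 99
    cluster-type: cluster_flow
    defrag: yes

stream:
  midstream: false              # mid-stream session pickups disabled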
Digging deeper, we were able to determine that the tcp.pkt_on_wrong_thread counter is incrementing because of a collision in the flow hash table between several flows whose 7-tuple (vlan ids, src/dst ip, src/dst port and ip proto) are identical. The only difference was the capture interface.
We noticed that the FlowHashKey definitions do not include the livedev (capture device):
typedef struct FlowHashKey4_ {
    union {
        struct {
            uint32_t addrs[2];
            uint16_t ports[2];
            uint16_t proto; /**< u16 so proto and recur add up to u32 */
            uint16_t recur; /**< u16 so proto and recur add up to u32 */
            uint16_t vlan_id[2];
        };
        const uint32_t u32[5];
    };
} FlowHashKey4;

typedef struct FlowHashKey6_ {
    union {
        struct {
            uint32_t src[4], dst[4];
            uint16_t ports[2];
            uint16_t proto; /**< u16 so proto and recur add up to u32 */
            uint16_t recur; /**< u16 so proto and recur add up to u32 */
            uint16_t vlan_id[2];
        };
        const uint32_t u32[11];
    };
} FlowHashKey6;
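For illustration only, here is a minimal sketch of what folding the capture device into the IPv4 key could look like; the FlowHashKey4Ext name, the livedev_id field and the padding are hypothetical additions and are not present in the Suricata source:

typedef struct FlowHashKey4Ext_ {
    union {
        struct {
            uint32_t addrs[2];
            uint16_t ports[2];
            uint16_t proto;      /* u16 so proto and recur add up to u32 */
            uint16_t recur;
            uint16_t vlan_id[2];
            uint16_t livedev_id; /* hypothetical: id of the capture device */
            uint16_t pad;        /* keep the union a whole number of u32 words */
        };
        const uint32_t u32[6];   /* 24 bytes = 6 u32 words fed to the hash */
    };
} FlowHashKey4Ext;

With a key along these lines, two flows whose 7-tuples are identical but which are seen on different capture devices would hash to different values and would no longer share a flow table entry.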
We verified this theory in a test lab and were able to confirm that replaying the attached pcap twice, once on each capture interface, does lead to tcp.pkt_on_wrong_thread being incremented.
Digging even deeper, we realised that this means the flow entry is shared between capture interfaces, and that this could be exploited as a bypass: an attacker in network A can corrupt the flow state of flows in network B. It also means that Suricata cannot be used in a multi-tenant scenario where two tenants have overlapping network addressing, because there is always the risk that a flow is accidentally corrupted by another tenant's network activity. Finally, this seems to prevent using Suricata to monitor traffic from a single network at different capture locations (e.g. near the source and near the destination).
We confirmed the flow state corruption in lab:
Date: 4/15/2022 -- 14:11:39 (uptime: 0d, 00h 00m 54s)
------------------------------------------------------------------------------------
Counter | TM Name | Value
------------------------------------------------------------------------------------
capture.kernel_packets | Total | 6
decoder.pkts | Total | 6
decoder.bytes | Total | 412
decoder.ipv4 | Total | 6
decoder.ethernet | Total | 6
decoder.tcp | Total | 6
decoder.avg_pkt_size | Total | 68
decoder.max_pkt_size | Total | 74
decoder.max_mac_addrs_src | Total | 1
decoder.max_mac_addrs_dst | Total | 1
flow.tcp | Total | 1
flow.wrk.spare_sync_avg | Total | 100
flow.wrk.spare_sync | Total | 1
tcp.sessions | Total | 1
tcp.syn | Total | 1
tcp.synack | Total | 1
tcp.pkt_on_wrong_thread | Total | 3
flow.mgr.full_hash_pass | Total | 1
flow.mgr.closed_pruned | Total | 1
flow.spare | Total | 1048501
flow.mgr.rows_maxlen | Total | 1
flow.mgr.flows_checked | Total | 1
flow.mgr.flows_timeout | Total | 1
flow.mgr.flows_evicted | Total | 1
tcp.memuse | Total | 6828326912
tcp.reassembly_memuse | Total | 2162688
flow.memuse | Total | 348135104
In this lab, we injected a 3-way handshake opening on one capture interface and the corresponding 3-way handshake closure on the other capture interface, for a total of six packets.
Three packets were identified by Suricata as being handled on a wrong thread: those of the 3-way handshake closure. Yet flow.mgr.closed_pruned was incremented, showing that the flow had reached the CLOSED state thanks to that closure (we used a tcp close timeout of 0 in this test).
We believe the correct behaviour would have been to treat the 3-way handshake closure as mid-stream traffic unrelated to the 3-way handshake observed on the other capture interface; the flow state should not have been altered.
(By the way, the code responsible for incrementing flow.mgr.closed_pruned was probably deleted by mistake in commit b3599507f4eb891841417575587d690ea13fe6c0; we added it back for this test.)
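For reference, the tcp close timeout of 0 mentioned above corresponds to a flow-timeouts setting along these lines (a minimal sketch showing only the relevant key):

flow-timeouts:
  tcp:
    closed: 0    # prune flows in the CLOSED state immediately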
Please feel free to ask for any additional info.
Best regards,
Florian Maury
Gatewatcher
Files
Updated by Gatewatcher Dev Team over 2 years ago
- File 3whs_open.pcap added
- File 3whs_close.pcap added
The pcap files used for this test
Updated by Philippe Antoine over 2 years ago
- Status changed from New to In Review
Updated by Philippe Antoine about 2 years ago
- Assignee changed from OISF Dev to Philippe Antoine
- Target version changed from TBD to 7.0.0-beta1
Updated by Philippe Antoine about 2 years ago
- Related to Bug #5330: flow: vlan.use-for-tracking is not used for ICMPv4 added
Updated by Victor Julien about 2 years ago
- Target version changed from 7.0.0-beta1 to 7.0.0-rc1
Updated by Philippe Antoine about 2 years ago
- Related to Optimization #2725: stream/packet on wrong thread added
Updated by Victor Julien almost 2 years ago
- Target version changed from 7.0.0-rc1 to 8.0.0-beta1
Updated by Philippe Antoine over 1 year ago
- Target version changed from 8.0.0-beta1 to 7.0.1
Updated by Philippe Antoine over 1 year ago
- Target version changed from 7.0.1 to 7.0.0-rc2
Updated by Philippe Antoine over 1 year ago
- Status changed from In Review to Closed