Bug #8667
closed
af-packet: IPS copy-mode startup race causes permanent ENOTSOCK on peer socket (regression in 923ad6af)
Description
Summary
Commit "af-packet: speed up thread sync during startup" (923ad6af, introduced in 8.0.0) moved AFPPeersListReachedInc() from after AFPCreateSocket() returns to immediately after bind(), before AFPSetupRing() and AFPSwitchState(AFP_STATE_UP). A worker thread released by AFPSynchronizeStart() (which spins until peerslist.turn == 0) can now call AFPWritePacket() while its peer's socket fd is still 0 (zero-initialized). sendto(0, ...) returns ENOTSOCK, producing:
SFE_N_RX: sending packet failed on socket 0: Socket operation on non-socket
The send_errors rate limit silences all subsequent warnings, so every forwarded packet on that peer is silently dropped from then on. The engine never recovers on its own, because AFPTryReopen is triggered only by read-side failures.
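The errno in the warning can be reproduced in isolation. The following is a minimal sketch, not Suricata code: it assumes Linux and uses a pipe as a stand-in for "a valid file descriptor that is not a socket", which is what fd 0 typically is in a running process. The function name is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call sendto() on a pipe write end (a valid fd that
 * is not a socket) and return the resulting errno. Returns 0 if the call
 * unexpectedly succeeds and -1 if the pipe could not be created. */
int errno_for_sendto_on_non_socket(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    const char payload[] = "x";
    errno = 0;
    ssize_t n = sendto(fds[1], payload, sizeof(payload), 0, NULL, 0);
    int err = (n < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

The returned value can be compared against ENOTSOCK; strerror() of that value gives the "Socket operation on non-socket" text seen in the log line.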
Affected versions
- Suricata 8.0.0 and later (any version containing commit 923ad6af)
- Not present in Suricata 7.0.x: in 7.x, AFPPeersListReachedInc() was called in ReceiveAFPLoop after AFPCreateSocket() returned, i.e. after AFPSwitchState(AFP_STATE_UP) had already published the fd.
Root cause
AFPPeerUpdate(), the only site that publishes the peer socket fd, writes two atomics in this fixed order:
SC_ATOMIC_SET(ptv->mpeer->socket, ptv->socket); /* written first */
SC_ATOMIC_SET(ptv->mpeer->state, ptv->afp_state); /* written second */
It is called only from AFPSwitchState(ptv, AFP_STATE_UP), which runs at the end of AFPCreateSocket(), after AFPSetupRing(). In Suricata 8, AFPPeersListReachedInc() runs right after bind(), so turn == 0 now means "all peers have bind()-ed", not "all peer fds are published". AFPWritePacket() has no peer->state guard in either version and has always relied on the barrier to imply readiness. That implication no longer holds in 8.
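To illustrate the missing guard, here is a minimal sketch of what a state check on the write path could look like. It is not the upstream fix (the thread below resolves the bug by reverting the commit) and not Suricata's API: the struct, the enum, and both function names are hypothetical, and C11 atomics stand in for the SC_ATOMIC macros. It keeps the publish order described above, socket first and state second, and has the reader check the state before trusting the fd.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for the peer state values. */
enum peer_state { PEER_STATE_DOWN = 0, PEER_STATE_UP = 1 };

/* Hypothetical peer record; a zero-initialized peer is DOWN with fd 0. */
struct peer {
    atomic_int socket;
    atomic_int state;
};

/* Publisher side: store the fd first, then mark the peer UP.
 * The release store makes the fd visible to any reader that
 * observes the UP state with an acquire load. */
void peer_publish(struct peer *p, int fd)
{
    atomic_store_explicit(&p->socket, fd, memory_order_relaxed);
    atomic_store_explicit(&p->state, PEER_STATE_UP, memory_order_release);
}

/* Writer side: return the peer fd only once the peer is UP,
 * otherwise -1 so the caller can skip or defer the send
 * instead of calling sendto() on fd 0. */
int peer_writable_fd(struct peer *p)
{
    if (atomic_load_explicit(&p->state, memory_order_acquire) != PEER_STATE_UP)
        return -1;
    return atomic_load_explicit(&p->socket, memory_order_relaxed);
}
```

With a guard of this shape, a writer released early by the barrier would see -1 for an unpublished peer rather than sending on fd 0. What the caller should do with such a packet is a separate design decision that this sketch does not address.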
Trigger conditions
- Cold restart only: AFPTryReopen passes peer_update = false, so the barrier never re-runs on a hot reopen. Any forced cold restart suffices: a policy change that alters the config fingerprint, host patching/AMI refresh, etc.
- Timing-racy: the window is the interval between barrier release and AFPSwitchState(AFP_STATE_UP) on the slowest peer. Long uptime with memory fragmentation widens the window by slowing the mmap in AFPSetupRing().
Observed symptoms
- Runtime warning (often only one line per wedged peer due to rate-limiting): SFE_N_RX: sending packet failed on socket 0: Socket operation on non-socket
- TX-ok / RX-dead asymmetry in stats: SFE_N_TX: packets: <large> while SFE_N_RX: packets: 0
- Startup log sequence identical to a healthy start: AF_PACKET IPS mode activated and Engine started. both appear normally; the warning fires between them.
Reproduction results
Reproduced on Linux with Suricata 8.0.3, 6 AF-PACKET interface pairs (dummy interfaces, copy-mode: ips, runmode workers, 2 threads per pair), and SURICATA_RING_SETUP_DELAY_US=500000 (an env-gated widener that holds the existing window open; see the attached widener.patch):
| Metric | Value |
|---|---|
| Restart cycles | 10 |
| Cycles reaching "Engine started" | 8 |
| Cycles with ENOTSOCK | 7 (87.5% of the 8 that started) |
| Total ENOTSOCK lines | 28 |
TX/RX asymmetry on a wedged engine (last cycle):
| Interface | Packets |
|---|---|
| SFE_0_TX | 3,727,852 |
| SFE_0_RX | 279 |
| SFE_1_TX | 19,696,196 |
| SFE_1_RX | 279 |
| SFE_4_TX | 19,874,739 |
| SFE_4_RX | 279 |
| SFE_5_TX | 36,548,604 |
| SFE_5_RX | 558 |
ENOTSOCK lines land at the same second as or 1 second before "Engine started", confirming the race fires inside the startup window.
Attachments
- reproduce.sh: self-contained reproduction script (Linux, Suricata 8 binary + tcpreplay required)
- widener.patch: env-gated usleep in AFPSetupRing() for deterministic reproduction; no-op unless SURICATA_RING_SETUP_DELAY_US is set
- README.md: full technical writeup including race timeline and Fix A vs Fix B comparison
Updated by Victor Julien 12 days ago
- Status changed from New to Assigned
- Assignee set to Victor Julien
- Priority changed from Normal to High
- Target version changed from TBD to 9.0.0-beta1
- Label Needs backport to 8.0 added
I think I'll just revert that commit. Another issue was (privately) reported about it, not IPS related.
Updated by OISF Ticketbot 12 days ago
- Subtask #8668 added
Updated by OISF Ticketbot 12 days ago
- Label deleted (Needs backport to 8.0)
Updated by Victor Julien 12 days ago
- Status changed from Assigned to In Review
https://github.com/OISF/suricata/pull/15671 reverts the broken commit.
Updated by Victor Julien 11 days ago
- Status changed from In Review to Resolved
Updated by Victor Julien 7 days ago
- Status changed from Resolved to Closed