h2. Summary

Commit "af-packet: speed up thread sync during startup" ("923ad6af":https://github.com/OISF/suricata/commit/923ad6af7709c9eca6f0f5856ee267845b425ae5, introduced in 8.0.0) moved @AFPPeersListReachedInc()@ from after @AFPCreateSocket()@ returns to immediately after @bind()@, before @AFPSetupRing()@ and @AFPSwitchState(AFP_STATE_UP)@. A worker thread released by @AFPSynchronizeStart()@ (which spins until @peerslist.turn == 0@) can now call @AFPWritePacket()@ while its peer's published socket fd is still 0 (zero-initialized). @sendto(0, ...)@ then fails with @ENOTSOCK@ because fd 0 is not a socket, producing:

<pre>SFE_N_RX: sending packet failed on socket 0: Socket operation on non-socket</pre>

The @send_errors@ rate limit suppresses all subsequent warnings, so every packet forwarded to that peer is silently dropped until the engine is restarted. The engine never recovers on its own, because @AFPTryReopen@ is triggered only by read-side failures.

h2. Affected versions

* Suricata 8.0.0 and later (any version containing commit "923ad6af":https://github.com/OISF/suricata/commit/923ad6af7709c9eca6f0f5856ee267845b425ae5)
* *Not present in Suricata 7.0.x* — in 7.x, @AFPPeersListReachedInc()@ was called in @ReceiveAFPLoop@ *after* @AFPCreateSocket()@ returned, i.e. after @AFPSwitchState(AFP_STATE_UP)@ had already published the fd.

h2. Root cause

@AFPPeerUpdate()@ — the only site that publishes the peer socket fd — writes two atomics in this fixed order:

<pre>
SC_ATOMIC_SET(ptv->mpeer->socket, ptv->socket);   /* written first  */
SC_ATOMIC_SET(ptv->mpeer->state,  ptv->afp_state); /* written second */
</pre>

It is called only from @AFPSwitchState(ptv, AFP_STATE_UP)@, which runs at the *end* of @AFPCreateSocket()@, after @AFPSetupRing()@. In Suricata 8, @AFPPeersListReachedInc()@ runs right after @bind()@, so @turn == 0@ now means "all peers have bind()-ed", not "all peer fds are published". @AFPWritePacket()@ has no @peer->state@ guard in either version and has always relied on the barrier to imply readiness. That implication is broken in 8.
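The ordering problem can be illustrated with a small single-threaded model. This is only a sketch: @ModelPeer@, @ModelPeerPublish()@ and @ModelFdSeenAtBarrierRelease()@ are hypothetical names invented for illustration, not Suricata code, and the barrier is reduced to a single counter with one peer. It returns the fd a released writer would read at the instant @turn@ reaches 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not Suricata's types. */
enum { MODEL_STATE_DOWN = 0, MODEL_STATE_UP = 1 };

typedef struct {
    atomic_int socket; /* 0 until the owner publishes its fd */
    atomic_int state;  /* MODEL_STATE_DOWN until the owner is fully set up */
} ModelPeer;

/* Mirrors the publication order described above: socket first, then state. */
static void ModelPeerPublish(ModelPeer *peer, int fd)
{
    assert(fd > 0); /* 0 is the "not yet published" value in this model */
    atomic_store(&peer->socket, fd);
    atomic_store(&peer->state, MODEL_STATE_UP);
}

/*
 * Walks one peer through startup and returns the fd that a writer released
 * by the barrier would read at the moment turn reaches 0.
 */
static int ModelFdSeenAtBarrierRelease(bool barrier_after_bind, int fd)
{
    ModelPeer peer;
    atomic_int turn;
    int seen = -1;

    atomic_init(&peer.socket, 0);
    atomic_init(&peer.state, MODEL_STATE_DOWN);
    atomic_init(&turn, 1); /* one peer still has to check in */

    /* bind() would happen here */

    if (barrier_after_bind) {
        /* 8.x ordering: check in right after bind() */
        if (atomic_fetch_sub(&turn, 1) == 1) { /* previous value 1 => now 0 */
            seen = atomic_load(&peer.socket);
        }
    }

    /* the slow ring setup would happen here */
    ModelPeerPublish(&peer, fd);

    if (!barrier_after_bind) {
        /* 7.x ordering: check in only after the fd is published */
        if (atomic_fetch_sub(&turn, 1) == 1) {
            seen = atomic_load(&peer.socket);
        }
    }
    return seen;
}
```

With the 8.x ordering (@barrier_after_bind = true@) the model returns 0, the unpublished fd; with the 7.x ordering it returns the published fd.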

h2. Trigger conditions

* *Cold restart only* — @AFPTryReopen@ passes @peer_update = false@, so the barrier never re-runs on a hot reopen. Any forced cold restart suffices: policy change that alters the config fingerprint, host patching/AMI refresh, etc.
* *Timing-racy* — the window is the interval between barrier release and @AFPSwitchState(AFP_STATE_UP)@ on the slowest peer. Long uptime with memory fragmentation widens the window by slowing the @mmap@ in @AFPSetupRing()@.

h2. Observed symptoms

* Runtime warning (often only one line per wedged peer due to rate-limiting): @SFE_N_RX: sending packet failed on socket 0: Socket operation on non-socket@
* TX-ok / RX-dead asymmetry in the interface stats: @SFE_N_TX: packets: <large>@ while @SFE_N_RX: packets: 0@
* Startup log sequence *identical* to a healthy start — @AF_PACKET IPS mode activated@ and @Engine started.@ both appear normally; the warning fires between them

h2. Reproduction results

Reproduced on Linux with Suricata 8.0.3, 6 AF-PACKET interface pairs (dummy interfaces, @copy-mode: ips@, @runmode workers@, 2 threads per pair), and @SURICATA_RING_SETUP_DELAY_US=500000@ (env-gated widener holding the existing window open — see attached @widener.patch@):

|_. Metric |_. Value |
| Restart cycles | 10 |
| Cycles reaching "Engine started" | 8 |
| Cycles with ENOTSOCK | 7 of 8 started (87.5%) |
| Total ENOTSOCK lines | 28 |

TX/RX asymmetry on a wedged engine (last cycle):

|_. Interface |_. Packets |
| SFE_0_TX | 3,727,852 |
| SFE_0_RX | 279 |
| SFE_1_TX | 19,696,196 |
| SFE_1_RX | 279 |
| SFE_4_TX | 19,874,739 |
| SFE_4_RX | 279 |
| SFE_5_TX | 36,548,604 |
| SFE_5_RX | 558 |

The ENOTSOCK lines are logged in the same second as "Engine started" or one second before it, which confirms that the race fires inside the startup window.

h2. Proposed fix

Add a peer-state guard at the top of @AFPWritePacket()@ in @src/source-af-packet.c@, before the socket fd is read:

<pre>
if (SC_ATOMIC_GET(p->afp_v.peer->state) != AFP_STATE_UP) {
    return;  /* peer fd not yet published — drop cleanly during startup window */
}
</pre>

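Because @AFPPeerUpdate()@ always writes @socket@ before @state@, a reader that observes @AFP_STATE_UP@ and only then loads @socket@ sees the published fd, provided the @SC_ATOMIC_SET@ / @SC_ATOMIC_GET@ pair gives at least release/acquire ordering. The guard therefore closes the startup race while preserving the parallel @AFPSetupRing()@ optimization introduced by the original commit.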
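The guard pattern can be sketched in isolation with C11 atomics. All names here (@ModelPeer@, @ModelPeerInit()@, @ModelPeerPublish()@, @ModelPeerGetSendFd()@) are hypothetical and exist only for this illustration. The sketch assumes the default sequentially consistent ordering of @atomic_store@ and @atomic_load@, and does not claim to match how Suricata's atomic macros are implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not Suricata's types. */
enum { MODEL_STATE_DOWN = 0, MODEL_STATE_UP = 1 };

typedef struct {
    atomic_int socket; /* 0 until published */
    atomic_int state;  /* MODEL_STATE_DOWN until published */
} ModelPeer;

static void ModelPeerInit(ModelPeer *peer)
{
    atomic_init(&peer->socket, 0);
    atomic_init(&peer->state, MODEL_STATE_DOWN);
}

/* Publisher side: store the fd first, then flip the state. */
static void ModelPeerPublish(ModelPeer *peer, int fd)
{
    assert(fd > 0); /* 0 is the "not yet published" value in this model */
    atomic_store(&peer->socket, fd);
    atomic_store(&peer->state, MODEL_STATE_UP);
}

/*
 * Writer side: check the state first and load the fd only afterwards.
 * Returns true and stores the fd in *fd_out when the peer is up.
 * Returns false, leaving *fd_out untouched, when the packet should be dropped.
 */
static bool ModelPeerGetSendFd(ModelPeer *peer, int *fd_out)
{
    if (atomic_load(&peer->state) != MODEL_STATE_UP) {
        return false;
    }
    *fd_out = atomic_load(&peer->socket);
    return true;
}
```

Before @ModelPeerPublish()@ runs, @ModelPeerGetSendFd()@ returns false and the caller never touches fd 0. Afterwards it returns the published fd. Reversing either the store order or the load order would reopen the window.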

*Fix verification:* with the fix applied and the 500 ms widener still active, 10 of 10 restart cycles produced zero ENOTSOCK lines.

h2. Attachments

* @reproduce.sh@ — self-contained reproduction script (Linux, Suricata 8 binary + tcpreplay required)
* @widener.patch@ — env-gated @usleep@ in @AFPSetupRing()@ for deterministic reproduction; no-op unless @SURICATA_RING_SETUP_DELAY_US@ is set
* @README.md@ — full technical writeup including race timeline and Fix A vs Fix B comparison
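For readers without the attachments, an env-gated delay of the kind described for @widener.patch@ could look roughly like the sketch below. It is not the contents of the patch: the helper names @RingSetupDelayParse()@ and @RingSetupDelayFromEnv()@ are hypothetical, and it uses @nanosleep()@ in place of @usleep()@ so that delays of one second or more are also valid. Only the environment variable name comes from this report.

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */

#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/*
 * Parses a delay in microseconds. Returns 0 (meaning "no delay") for NULL,
 * empty, non-numeric, partially numeric, or out-of-range input.
 */
static unsigned long RingSetupDelayParse(const char *value)
{
    char *end = NULL;
    unsigned long us;

    /* Require a leading digit: rejects NULL, "", signs and whitespace. */
    if (value == NULL || value[0] < '0' || value[0] > '9') {
        return 0;
    }
    errno = 0;
    us = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0') {
        return 0;
    }
    return us;
}

/* Sleeps for the configured time; a no-op when the variable is unset. */
static void RingSetupDelayFromEnv(void)
{
    struct timespec ts;
    unsigned long us = RingSetupDelayParse(getenv("SURICATA_RING_SETUP_DELAY_US"));

    if (us == 0) {
        return;
    }
    ts.tv_sec = (time_t)(us / 1000000UL);
    ts.tv_nsec = (long)(us % 1000000UL) * 1000L;
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    (void)nanosleep(&ts, NULL); /* an early wake-up on a signal is acceptable */
}
```

Calling such a helper inside the ring setup path holds the existing window open without changing the order of operations, which is why the reproduction still exercises the original race.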
