Feature #8472: firewall: Auto-Accept Prior States syntax for firewall mode intent rules

Added by Yash Datre about 2 months ago. Updated 2 days ago.

Status: Resolved
Priority: Normal

Description

We'd like to propose a syntax addition to Suricata's firewall mode that reduces the rule authoring burden for common firewall use cases while preserving the precision of the state machine.

Currently, writing a firewall rule to allow TLS outbound by SNI requires 13 hand-written rules covering the TCP handshake, all client states, and all server states. The author must know every protocol state name, which state carries which keyword, and the correct direction for each. This pattern repeats for HTTP, DNS, and every other app-layer protocol.

We're exploring a < prefix operator on the hook name that tells the engine to automatically accept all prerequisite states before the specified hook:

accept:tx <tls:client_hello_done $HOME_NET any -> $EXTERNAL_NET any (tls.sni; content:".amazon.com"; endswith; sid:1003;)

This collapses the 13-rule TLS example into a single rule. The author still names the exact hook where the condition is evaluated — the engine handles the transport setup and prior state accepts. Internally this would be a pre-processing step that emits the same state-based rules, so no new architecture is needed.

An alternative expression as a keyword (accept-prior-states;) was also considered for readability. We evaluated several other approaches (full keyword-derived state resolution, keyword-based templates, structured blocks, YAML config) but believe the < operator strikes the right balance between usability and precision.

We also have a few open questions we'd appreciate OISF's perspective on:

  • For rules matching keywords at multiple states (e.g., tls.sni at client_hello_done and tls.cert_subject at server_cert_done), should the engine use the latest state, or should such rules be disallowed?
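The two options in this question can be sketched as follows. This is a minimal illustration only: the `KEYWORD_STATE` table and its ordinals are hypothetical (only the two keyword/state pairs come from the example above), and comparing ordinals across client and server states is itself an assumption the real engine may not share.

```python
# Hypothetical keyword -> (state name, ordinal) table. The two pairs mirror the
# example in the question; the ordinals are made up for illustration.
KEYWORD_STATE = {
    "tls.sni": ("client_hello_done", 1),
    "tls.cert_subject": ("server_cert_done", 4),
}


def resolve_hook(keywords, allow_multi_state=True):
    """Return the hook a rule would be evaluated at, given its keywords.

    allow_multi_state=True  -> option 1: use the latest state.
    allow_multi_state=False -> option 2: disallow rules spanning several states.
    """
    states = {KEYWORD_STATE[k] for k in keywords}
    if len(states) > 1 and not allow_multi_state:
        names = sorted(name for name, _ in states)
        raise ValueError(f"rule matches keywords at multiple states: {names}")
    # Pick the state with the highest ordinal (the "latest" one).
    return max(states, key=lambda s: s[1])[0]
```

With the table above, `resolve_hook(["tls.sni", "tls.cert_subject"])` picks `server_cert_done`, while passing `allow_multi_state=False` rejects the rule instead.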

We're looking for feedback on this approach and whether it aligns with the direction OISF envisions for firewall mode. Happy to iterate on the syntax or semantics based on your input.


Subtasks 1 (1 open, 0 closed)

Feature #8611: firewall: Auto-Accept Prior States syntax for firewall mode intent rules (8.0.x backport) (In Review, Victor Julien)

Related issues 2 (2 open, 0 closed)

Related to Suricata - Task #8491: firewall: support multi hook rules (Triaged, OISF Dev)
Related to Suricata - Bug #8645: firewall: accept-prior states logic doesn't work for built-in hooks (Resolved, Victor Julien)

Updated by Yash Datre about 2 months ago #1

  • Tracker changed from Bug to Feature
  • Affected Versions deleted (8.0.4)

Updated by Aneesh Patel about 2 months ago #2

We could also implement this as a keyword, something like pass_prior_hooks, that customers add to their rules. Either way should work; we just wanted to throw in an additional idea in case it makes more sense to you all.

Updated by Victor Julien about 2 months ago #3

  • Subject changed from Feature Request: Auto-Accept Prior States syntax for firewall mode intent rules to firewall: Auto-Accept Prior States syntax for firewall mode intent rules

Updated by Yash Datre about 1 month ago #4

Design Proposal

One author-visible rule. The Rule_Loader auto-synthesises the accept chain from the protocol's registered state machine. Two equivalent syntaxes:

accept:tx <tls:client_hello_done $HOME_NET any -> $EXTERNAL_NET any \
    (tls.sni; content:".amazon.com"; endswith; sid:1003;)

or, using a keyword form:

accept:tx tls:client_hello_done $HOME_NET any -> $EXTERNAL_NET any \
    (tls.sni; content:".amazon.com"; endswith; accept-prior-states; sid:1003;)

Both forms produce the same expansion. All protocol knowledge (state ordinals, directions, registered transports, keyword→state mappings) comes from the existing AppLayerParser* and DetectBufferType registries — no per-protocol code in the loader. Multi-transport protocols are handled by the expander iterating every transport the registry reports for the protocol, so a single DNS rule emits both the TCP handshake block and the UDP transport block.
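The expansion described above can be sketched roughly as follows. This is an illustration of the shape only, not the proposed loader: the `REGISTRY` dictionary stands in for the AppLayerParser* and DetectBufferType registries, its state names are placeholders, the emitted transport rules are simplified (no flow keywords, no derived SIDs), and only the `<` prefix form is handled.

```python
import re

# Hypothetical stand-in for the engine registries: ordered states and the
# transports each protocol is registered on. Names are illustrative only.
REGISTRY = {
    "tls": {
        "transports": ["tcp"],
        "states": ["client_in_progress", "client_hello_done", "client_cert_done"],
    },
    "dns": {
        "transports": ["tcp", "udp"],
        "states": ["request_started", "request_complete"],
    },
}

# action, then "<proto:hook", then the rest of the rule (header + options).
RULE_RE = re.compile(r"^(?P<action>\S+)\s+<(?P<proto>\w+):(?P<hook>\w+)\s+(?P<rest>.*)$")


def expand(rule):
    """Expand one '<proto:hook' rule into transport accepts, prior-state
    accepts, and the original decision rule (with the '<' removed)."""
    m = RULE_RE.match(rule)
    if m is None:
        return [rule]  # not a prior-state rule: pass through unchanged
    proto, hook = m["proto"], m["hook"]
    entry = REGISTRY[proto]
    states = entry["states"]
    if hook not in states:
        raise ValueError(f"unknown hook {proto}:{hook}")
    header = m["rest"].split("(", 1)[0].strip()  # addresses, ports, direction
    out = []
    # One transport block per registered transport (e.g. TCP and UDP for DNS).
    for transport in entry["transports"]:
        out.append(f"accept:hook {transport}:all {header} (noalert;)")
    # One accept per state that precedes the decision hook.
    for state in states[: states.index(hook)]:
        out.append(f"accept:hook {proto}:{state} {header} (noalert;)")
    # The decision rule keeps the author's options and SID.
    out.append(f"{m['action']} {proto}:{hook} {m['rest']}")
    return out
```

Because the transports are read from the registry entry, a DNS rule yields both a TCP and a UDP transport rule without any DNS-specific branch in `expand`.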

Design decision 1 — Load-time pre-processor expansion (chosen) vs lazy evaluation

The Rule_Loader expands each Prior_State_Rule into concrete accept-rule strings before DetectFirewallRuleAppendNew sees them. Runtime sees ordinary Signature objects.

Pros (chosen approach).

  • Zero added per-packet cost: the expanded rules run the same hot path as equivalent hand-written rules.
  • Round-trip equivalence with hand-written rules is structural, not behavioural — same code path from parsing onward.
  • No cross-table coupling: the TCP handshake rules must live in packet_filter and the state rules in app_filter; load-time expansion handles this naturally.
  • Pure loader-side change, no reach into detect.c / alert / tagging — easier upstream review surface.

Cons.

  • Engine-internal rule count grows (10-13× per Prior_State_Rule). This is invisible to customers but real in the MPM compilation budget. Assembling the expanded rule strings is measured in microseconds per rule; MPM compilation dominates the load-time budget either way.

Alternative considered — native lazy evaluation. Keep the Prior_State_Rule as a single Signature with a cached prerequisite bitmap, and check "does this packet's current state satisfy the bitmap?" lazily on every detection pass.

Pros (alternative).

  • No rule-count explosion in the detection engine.
  • Expansion is a runtime concept, not a load-time artefact.

Cons (alternative — why rejected).

  • Permanent per-packet branch on every Signature evaluation (is this Prior_State? does current state satisfy?). Adds non-zero cost to the hot path.
  • Round-trip equivalence becomes a behavioural claim that two different evaluators have to keep producing identical verdicts across every future engine change.
  • packet_filter ↔ app_filter coupling for the TCP handshake side either requires an O(N-rules) per-packet scan or a load-time shadow into packet_filter — which is effectively pre-processor expansion at a different layer.
  • Touches detect.c, detect-engine-alert.c, possibly detect-engine-tag.c — larger patch, more upstream review, longer path to merge.

Net: lazy evaluation is workable but pays a permanent runtime cost and a recurring correctness burden. Pre-processor expansion costs nothing at runtime and is a single loader-side change.
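For comparison, the per-packet check that the lazy alternative would add can be sketched like this. It is a minimal sketch under assumptions: the `STATES` order is a placeholder for the protocol's registered state ordinals, and the bitmap layout (one bit per state that precedes the decision hook) is one possible reading of "cached prerequisite bitmap".

```python
# Illustrative state order only; real ordinals would come from the parser registry.
STATES = ["client_in_progress", "client_hello_done", "client_cert_done"]


def prerequisite_bitmap(hook):
    """Bitmap with one bit set for every state that precedes `hook`."""
    return (1 << STATES.index(hook)) - 1


def auto_accepted(current_state, hook):
    """Per-packet check: is the flow's current state one of the prerequisites?

    This is the branch the lazy design would run on every detection pass.
    At the hook itself the rule's own condition is evaluated instead.
    """
    return bool(prerequisite_bitmap(hook) & (1 << STATES.index(current_state)))
```

The bitmap can be computed once at load time and cached on the Signature, but `auto_accepted` still runs per packet, which is the hot-path cost listed in the cons above.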

Design decision 2 — Customer-facing SID abstraction

Every customer-visible record (eve.alert, fast.log, syslog, any SID-keyed tooling) shows only the author's Parent_SID. Derived Sub_SIDs exist, but only as engineer-facing attribution in --dump-expanded-rules output and --engine-analysis JSON.

Two structural pieces enforce this:

  1. Every loaded Signature keeps a unique uint32_t runtime SID. Suricata's existing uniqueness invariant (detect-parse.c:DetectEngineSignatureIsDuplicate, relied on by thresholding / tagging / dedup) stays intact. Auto-accepted rules get derived runtime SIDs under the deterministic formula 0x80000000 | (fnv1a32(file) ^ parent_sid ^ sub_index) & 0x7FFFFFFF.
  2. Every auto-accepted Expanded_Rule carries noalert;. Today accept:* is already silent at the alert layer (PacketAlertFinalize skips (action & (ACTION_ALERT | ACTION_PASS)) == 0), but noalert; makes the silence a durable contract instead of a default that could change upstream.

The Decision_Hook rule is not marked noalert; — it is the single rule carrying the Parent_SID and is the only place customer-facing output from the expansion is allowed to originate.
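The derived-SID formula from item 1 can be sketched as below. FNV-1a (32-bit) is implemented with its standard offset basis and prime; what exactly is hashed as `file` is not specified in this thread, so the sketch assumes the UTF-8 bytes of the rule file name.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193


def fnv1a32(data: bytes) -> int:
    """Standard 32-bit FNV-1a hash."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h


def derived_sid(file: str, parent_sid: int, sub_index: int) -> int:
    """0x80000000 | ((fnv1a32(file) ^ parent_sid ^ sub_index) & 0x7FFFFFFF).

    Assumption: `file` is hashed as the UTF-8 bytes of the rule file name.
    """
    mixed = fnv1a32(file.encode("utf-8")) ^ parent_sid ^ sub_index
    return 0x80000000 | (mixed & 0x7FFFFFFF)
```

Setting the top bit keeps derived SIDs out of the range below 0x80000000, and the result is stable across reloads of the same file. XOR mixing alone does not rule out collisions between different (file, parent_sid, sub_index) combinations, so a duplicate check at load time would presumably still be needed.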

Alternative considered — expose Sub_SIDs via optional eve metadata (firewall.log-expanded-from: yes, metadata.expanded_from: { parent_sid, sub_index } on each alert record).

Pros. Operators debugging a dropped flow can tell which sub-rule matched without --dump-expanded-rules.

Cons (why rejected). Leaks an implementation detail to a customer-facing surface the feature otherwise keeps invisible. Every operator tool keyed on sid would have to decide whether to group on sid or on metadata.expanded_from.parent_sid. Rejected; kept as engineer-facing only.

--dump-expanded-rules — testing tool, not a hard requirement

The spec defines a RUNMODE_DUMP_EXPANDED_RULES run mode that loads the ruleset (classifier → parser → validator → expander) and dumps the post-expansion concrete rules to stdout. It's an engineer-facing inspection tool with three uses:

  • Demo-able visibility: the 12-rule TLS expansion and 10-rule DNS expansion are observable on stdout before any packets are processed, so reviewers can see what the pre-processor produces.
  • Golden-file CI fixture: the Suricata-verify test fw-prior-state-dump-expanded diffs the dump against a checked-in dump.expected so rule-shape regressions surface on every build.
  • Debugging handle for authors: paste the output back into a rule file to verify round-trip equivalence with hand-written rules.

This is not a hard user-facing requirement: no customer needs it to use the feature. It's scaffolding that makes the pre-processor approach auditable and keeps the engineer-inspection deliverables (--engine-analysis integration) cheap. If OISF prefers a different engineer-tooling surface, the expansion work stands on its own and the dump run mode can be dropped or replaced.
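The golden-file comparison mentioned in the list above amounts to a text diff between the dump and the checked-in dump.expected. A minimal sketch using Python's standard difflib follows; how the actual suricata-verify test performs the comparison is not shown in this thread, so the helper name and the way the two texts are obtained are assumptions.

```python
import difflib


def diff_dump(actual: str, expected: str) -> list:
    """Return unified-diff lines between the expected and actual dump text.

    An empty list means the expansion output matches the golden file.
    """
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="dump.expected",
            tofile="dump.actual",
            lineterm="",
        )
    )
```

A CI step would fail the build whenever `diff_dump` returns a non-empty list and print the lines, so a change in rule shape shows up as a readable diff.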

Updated by Victor Julien 26 days ago #5

  • Status changed from New to In Review
  • Assignee set to Victor Julien
  • Target version changed from TBD to 9.0.0-beta1

https://github.com/OISF/suricata/pull/15402

Implemented this using stateful rules. It uses tls:<client_hello_done and http1:<request_headers as notation.

Updated by Yash Datre 24 days ago #6

Thanks for sharing PR #15402 — we've reviewed it and the referenced suricata-verify tests. Looks like the lazy-evaluation alternative from this issue: the < operator keeps the rule as a single Signature and walks prerequisite states at runtime on the decision direction.

We're committed to implementing this feature ourselves, and intend to extend your work in feature/fw-updates/v26 to close the remaining gaps below. Looking for your feedback on the direction before we start.

The way we read the operator today: when an author writes accept:flow tls:<client_hello_done ... (tls.sni; ...), the operator auto-accepts every TLS app-layer state up to client_hello_done, going forward on the client-to-server direction. That works great for the TLS SNI case. Two places we still need to write rules by hand:

Gap 1 — The TCP/UDP underneath the app-layer is not covered

Before TLS can run at all, the TCP 3-way handshake has to be allowed. The < operator stops at the app-layer — it doesn't reach down into TCP. Test ruletype-firewall-89-lt-sni/ shows this:

# allow session setup
accept:hook tcp:all $HOME_NET any <> $EXTERNAL_NET any (flow:not_established; sid:1021;)
accept:hook tcp:all $HOME_NET any <> $EXTERNAL_NET any (flow:established; sid:1022;)

accept:flow tls:<client_hello_done ... (sid:9999;)

Two hand-written TCP rules just to let SYN / SYN-ACK / ACK and post-handshake packets through packet_filter. Same shape for UDP-carried protocols (you'd write 1-2 UDP rules instead).

Gap 2 — Protocols that run on multiple transports stack Gap 1 per transport

DNS, NTP, SIP run on both TCP and UDP. Gap 1 says we hand-write transport rules; Gap 2 says we hand-write them once for each transport. So "allow DNS queries to .amazon.com over UDP and TCP" ends up needing:

  • 4 TCP handshake rules (Gap 1, TCP)
  • 2 UDP transport rules (Gap 1, UDP)
  • 1 < operator rule (the only thing the author intended to write)

= 7 author-visible rules for one intent. The operator contributes 1 of them.

Rule-count summary

| Scenario                                | Without < | With < today | If Gaps 1-2 closed |
| TLS SNI allow-list (single transport)   | ~13       | 4            | 1                  |
| DNS query allow-list, UDP only          | ~4        | 2            | 1                  |
| DNS query allow-list, UDP + TCP         | ~11       | 6            | 1                  |
| HTTP host allow-list (single transport) | ~14       | 4            | 1                  |

The operator's value is highest on TLS SNI (its sweet spot) and falls off as protocols span more transports — exactly the protocols operators most want to allow-list.

Direction we're considering

A runtime predicate at the transport / lower-layer hooks (tcp:flow_started, tcp:flow_established, UDP packet:filter) that runs after explicit rules at the same hook and accepts the packet if any <-operator app-layer rule on the flow's prospective ALPROTO matches the 5-tuple. No synthesised rules in firewall.json; one author-visible rule per intent. Preserves explicit drop rules, threshold modifiers, stream engine validation.
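The predicate we have in mind looks roughly like the sketch below. Everything in it is hypothetical naming: `FiveTuple`, `IntentRule`, and `transport_verdict` are not engine types, and the 5-tuple match is reduced to transport plus destination port to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FiveTuple:
    proto: str  # "tcp" or "udp"
    src: str
    sport: int
    dst: str
    dport: int


@dataclass(frozen=True)
class IntentRule:
    """A '<'-operator app-layer rule, reduced to what the predicate needs."""
    sid: int
    alproto: str
    transport: str
    dport: int


def transport_verdict(
    pkt: FiveTuple,
    explicit_verdict: Optional[str],
    prospective_alproto: str,
    intent_rules: Sequence[IntentRule],
) -> str:
    """Verdict at a transport hook: explicit rules first, predicate second."""
    # Explicit rules at the same hook (accept or drop) always win.
    if explicit_verdict is not None:
        return explicit_verdict
    # Otherwise accept only if some '<' rule for the prospective app-layer
    # protocol matches this packet's (simplified) 5-tuple.
    for rule in intent_rules:
        if (
            rule.alproto == prospective_alproto
            and rule.transport == pkt.proto
            and rule.dport == pkt.dport
        ):
            return "accept"
    return "drop"  # default packet policy
```

Running the predicate after the explicit rules is what preserves explicit drop rules: a packet that an author has already dropped never reaches the implicit accept.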

Questions before we start coding

  1. Are the gaps framed correctly, or have we missed mechanisms in feature/fw-updates/v26 that already address them?
  2. Is the runtime-predicate direction acceptable in principle, or would you prefer the gaps closed differently (e.g., load-time synthesis, or keeping transport rules permanently author-written)?

Happy to share our experiment suricata-verify tests (TLS SNI & DNS over UDP/TCP) showing the gaps if useful.

Updated by Victor Julien 23 days ago · Edited #7

fw-updates/v26 does not consider the packet layer.

Couple of thoughts on this:

The packet and app fw layers are quite distinct, so having a single rule to accept traffic in both layers is something that will need careful consideration. I think #7704 (handling session setup + established traffic in a single rule) needs to be addressed first.

I think it also needs to be very strict on the app-layer protocol. If we have a rule like

accept:hook tcp:all ... 443

it will effectively accept all TCP packets on port 443, without considering whether they are TLS or not.

If we had an implied rule like this as part of an app-layer TLS rule, we'd need to enforce that the packet rule only accepts TLS. In other words, we need to address #7705 first.

Once we have #7704 and #7705 addressed, we can see how to support this in here. At this point I don't see this being supported in some implicit way as part of `tls:<client_hello_done`, as this is too much of a boundary bridge (or violation) to be implicit.

Some corner cases to consider

1. masking by explicit rules

When a single rule would accept both packet and app, the relationship between the packet matching logic and the app logic needs to be very clearly defined. What happens if another packet rule already accepts the packet layer with accept:hook?

accept:hook tcp:all ... 443 (msg:"independent explicit packet rule"; sid:1;)  
accept:tx tls:<client_hello_done ... 443 (msg:"includes implicit packet rule"; sid:10;)  

This could be implemented by injecting sid 11:

accept:hook tcp:all ... 443 (msg:"independent explicit packet rule"; sid:1;)
accept:hook tcp:all ... 443 (msg:"implicit packet rule for sid:10"; sid:11;)
accept:tx tls:<client_hello_done ... 443 (msg:"includes implicit packet rule"; sid:10;)

If sid 1 matches, the engine will not evaluate sid 11 anymore, so the fact that sid 11 doesn't match shouldn't affect sid 10.
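The masking case can be illustrated with a small first-match model of the packet layer. This is a simplified sketch: rules are modelled as (sid, predicate) pairs evaluated in order, and the assumption that an accept:hook match stops evaluation at that hook is taken from the description above, not from the engine code.

```python
def first_accepting_sid(packet_rules, dport):
    """Return the sid of the first packet-layer rule that accepts, or None.

    Evaluation stops at the first match, so later rules are never evaluated.
    """
    for sid, matches in packet_rules:
        if matches(dport):
            return sid
    return None


def port_443(dport):
    return dport == 443


# sid 1: independent explicit packet rule; sid 11: implicit rule injected for sid 10.
with_explicit = [(1, port_443), (11, port_443)]
only_implicit = [(11, port_443)]
```

With `with_explicit`, a packet to port 443 is accepted by sid 1 and sid 11 is never reached; with `only_implicit`, sid 11 accepts it. In both cases the packet layer accepts, so the app-layer evaluation of sid 10 should not depend on which packet rule matched.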

2. TCP/UDP protocols like DNS

For a protocol like DNS the packet rules would either have to allow both UDP and TCP (while enforcing session setup, app proto, etc), or there would need to be rule syntax to explicitly specify which one to use.

Updated by Victor Julien 22 days ago #8

  • Related to Task #8491: firewall: support multi hook rules added

Updated by Victor Julien 14 days ago #9

  • Status changed from In Review to Resolved

Updated by Victor Julien 14 days ago #10

  • Label Needs backport to 8.0 added

Updated by OISF Ticketbot 14 days ago #11

  • Subtask #8611 added

Updated by OISF Ticketbot 14 days ago #12

  • Label deleted (Needs backport to 8.0)

Updated by Yash Datre 7 days ago #13

Found an edge case: accept:flow dns:<request_complete / dns:<response_complete (the auto-accept-prior-hooks < syntax applied to DNS) corrupts the packet:filter table, causing ALL packets to be dropped by the default packet policy — even though packet-layer accept rules are present and loaded.

The same ruleset with explicit per-state DNS rules (no <) works (test-23). TLS < and HTTP < together work fine; the bug is specific to DNS (registered for both TCP and UDP).

Evidence: suricata-verify test https://github.com/OISF/suricata-verify/pull/3146 — fails with:

Sub test #1: FAIL: expected 3 alerts sid:102, got 0
Sub test #3: FAIL: expected 0 drops "firewall default packet policy", got 8
PASSED: 0  FAILED: 1

Updated by Victor Julien 2 days ago #14

  • Related to Bug #8645: firewall: accept-prior states logic doesn't work for built-in hooks added

Updated by Victor Julien 2 days ago #15

Yash Datre wrote in #note-13:

Found an edge case: accept:flow dns:<request_complete / dns:<response_complete (the auto-accept-prior-hooks < syntax applied to DNS) corrupts the packet:filter table, causing ALL packets to be dropped by the default packet policy — even though packet-layer accept rules are present and loaded.

The same ruleset with explicit per-state DNS rules (no <) works (test-23). TLS < and HTTP < together work fine; the bug is specific to DNS (registered for both TCP and UDP).

Evidence: suricata-verify test https://github.com/OISF/suricata-verify/pull/3146 — fails with:

[...]

Tracking this in #8645.
