Bug #7004
openapp-layer: wrong tx may be logged for stream rules
Description
[Private as it discusses an issue where DNS transactions are missed]
In IDS mode, transaction handling appears to be off, and there exists a discrepancy between IDS and IPS modes. While pointed out in PGSQL, I believe DNS presents a simpler reproduction case just being a simpler protocol with less state.
The test PCAP (dns-tcp-multi-5.pcap
) contains 1 TCP session with 3 DNS requests/responses, in order.
- 1) Request/response for suricata.io
- 2) Request/response for oisf.net
- 3) Request/response for google.com
One rule is used:
alert dns any any -> any any (msg:"DNS suricata"; content:"suricata"; sid:1; rev:1;)
Issue 1. IDS Mode
Using a default Suricata configuration:
suricata -S test.rules -r dns-tcp-multi-5.pcap
Observations:
- 2 alerts, logged matching on the "suricata" in the DNS stream
- The transaction in both alerts contain the data from the 3rd transaction, which is for "google.com", which is misleading.
Issue:
- The wrong transaction data is logged. This is very similar to what is being seen in #7000.
Issue 2. IPS Mode
Run the same test above, but in IPS simulation mode:
suricata -S test.rules -r dns-tcp-multi-5.pcap --simulate-ips
Observations:
- 6 alerts logged, one for each request and response
- Given the alert, and pcap_cnt, the tx data seems to be correct. TX data for each request and response is seen.
Issue:
- One might expect to get only 2 alerts in this scenario.
Summary
Being a stream rule there is no direct correlation between the match and the transaction logged. Generally logging the wrong data is worse than not logging it at all, so one idea would be to not log the transaction data in this case. For DNS, most TCP DNS sessions contain one transaction, so maybe in some cases we know we're going to log the correct tx, but in others we know it might not be the correct transaction.
I'm not sure about the IPS issue, I brought it up as it differs from IDS mode, but may be a separate issue.
Files
Updated by Juliana Fajardini Reichow 8 months ago
Victor has suggested that I try to map the whole lifetime of a few transactions for DNS, to help visualize this. I'll be focusing on that this week, and share my findings asap.
Updated by Juliana Fajardini Reichow 8 months ago
- File eve-more-tx-ids-et-rules.json eve-more-tx-ids-et-rules.json added
- File eve-rules-2-3.json eve-rules-2-3.json added
Not sure where's the best place to keep WIP results, but since this is private for now, I'll share them here.
I've made temporary changes to the code, to:
- dns_log_json_[query/answer]: log tx_id in DNS logs in places where either it wasn't logged or it was first decremented before being logged.
- Alert event: create it adding tx_id that comes from what we track as log_id
- Alert header: add the tx_id used from the PacketAlert info to the output
- dns-json: create event headers using the tx_id that is passed to the function (originally, we don't add this info to this event header)
I have also added 2 more rules:
alert dns any any -> any any (msg:"DNS oisf client"; content:"oisf"; flow:to_client; sid:2; rev:1;) alert dns any any -> any any (msg:"DNS suricata server"; content:"suricata"; flow:to_server; sid:3; rev:1;)
Debugging with dns-tcp-multi.pcap
:
Suricata only starts to append rules to the AlertQueue
after parsing packet 12 - something that I think is related to the tx_id tracking - it seems that the engine only catches up with a packet where it could match the rule with what the packet has around the transactions 4 or 5 - which would explain why we seem to miss the alert for the first transaction, and seemingly alert on wrong data, too - but could be wrong.
The DNS transactions logged don't seem to be necessarily the ones that triggered the alerts. It also doesn't seem to align with the tx_id
that is saved in the PacketAlert
- at least for some cases?
If I keep only rules sid: 2 and sid:3 (eve-rules-2.-3.json
), we only see two alerts - but will log query and answer from what I think should be transactions 5 and 6 (query and answer to suricata.org). When we look at the tx_id
logged for the alert for sid:3, from the PacketAlert
and from the engine tracking, we see that it's 4. Maybe this could work if we tracked all txs as starting with index 0, but as we use the State's tx_id for many functions that control tx_id
handling and tracking - and that one starts from 1, things get messed up, I think. It gets more confusing to me for sid:2, as the oisf.net
query tx_id seems to indicate it was detected after the suricata.org
one. I must investigate more, I guess.
I'm attaching the resulting eve log (s), including eve-with-more-tx-ids-et-rules.json
.
I'll share soon a drawing of what I understand is happening while the engine processes dns-tcp-multi.pcap
along with what I thought should be seen at each step - at least in terms of tx_id tracking.
I think it should be possible for us to simplify the tx_id tracking to avoid app-layer having to increment or decrement tx_id info when sending or receiving it from the engine.
TODO: for all cases where we use the tx_id, where are we getting it from, and is it incremented or decremented before we save/log/return?
Updated by Juliana Fajardini Reichow 7 months ago
- File 20240512_184916.jpg 20240512_184916.jpg added
One attempt to represent what I saw happening for the DNS flow for pcap dns-tcp-multi, with the three rules indicated above, running GDB.
I will bring whatever findings to discuss with Victor tomorrow, and see if any of that makes sense, to know what should be the next step.
Haven't gotten to run this with simulate-ips
, yet.
Updated by Juliana Fajardini Reichow 7 months ago
- Related to Optimization #7018: dns/tcp: allow triggering raw stream reassembly added
Updated by Jason Ish 7 months ago
Just noting that there are a handful of rules out there, and rules still being written that don't use the DNS keywords, as we don't provide all the coverage needed. While these rules will catch most DNS traffic as its over UDP, we appear to be blind to single request/response TCP DNS traffic or at least the first/request response in a TCP DNS flow.
Updated by Juliana Fajardini Reichow 7 months ago
Jason Ish wrote in #note-7:
Just noting that there are a handful of rules out there, and rules still being written that don't use the DNS keywords, as we don't provide all the coverage needed. While these rules will catch most DNS traffic as its over UDP, we appear to be blind to single request/response TCP DNS traffic or at least the first/request response in a TCP DNS flow.
And this may be the case for other protos with similar characteristics in similar scenarios, right?
How do we approach this with trusted rule providers?
Updated by Jason Ish 7 months ago
Juliana Fajardini Reichow wrote in #note-8:
Jason Ish wrote in #note-7:
Just noting that there are a handful of rules out there, and rules still being written that don't use the DNS keywords, as we don't provide all the coverage needed. While these rules will catch most DNS traffic as its over UDP, we appear to be blind to single request/response TCP DNS traffic or at least the first/request response in a TCP DNS flow.
And this may be the case for other protos with similar characteristics in similar scenarios, right?
How do we approach this with trusted rule providers?
I think we need to fix it, as we don't have a work-around that I know of, and its expected to work I believe.
Updated by Juliana Fajardini Reichow 7 months ago
- Status changed from New to In Progress
- Assignee changed from OISF Dev to Juliana Fajardini Reichow
- if we trigger raw inspection data for TCP app-protos, the engine properly detects the alerts from stream in IDS mode;
- the false positives happen in IPS mode because of the stream nature of the rule. So the alert is actually due to previous traffic on the stream matching;
- we'll thus address this by triggering raw stream inspection. I also think I should go back to finishing the frame's documentation from a user perspective, and plan to submit a lightning talk to SuriCon to discuss the types of rules and impacts - such as what we saw with this issue.
Updated by Jason Ish 7 months ago
- File dns-tcp-multi-5.pcap dns-tcp-multi-5.pcap added
- Subject changed from app-layer: tx issues and discrepancy between ips and ids modes to app-layer: wrong tx may be logged for stream rules
- Description updated (diff)
Updated by Philippe Antoine 7 months ago
Another solution could be to log the tx data only when there is only one tx
Updated by Juliana Fajardini Reichow 7 months ago
For IPS mode, the logging of the wrong transaction seems even more critical, as for stream rules we get more alerts triggered...
Updated by Juliana Fajardini Reichow 3 months ago
- Related to Optimization #7026: app-protos: trigger raw stream reassembly added